Computer Engineering

• Architecture and Software Technology •

A System-Level Fault Injection Platform Based on Simics

胡 倩,王 超,王海霞,汪东升   

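  1. (National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084)
  • Received: 2014-03-28 Online: 2015-02-15 Published: 2015-02-13
  • About the authors: HU Qian (b. 1988), male, master's student, research interests: multicore architecture and system fault tolerance techniques; WANG Chao, assistant researcher and postdoctoral fellow; WANG Haixia, associate professor; WANG Dongsheng, professor and doctoral supervisor.
  • Funding:
    National Natural Science Foundation of China (61373025, 61303002); National "863" Program of China (2012AA0100905); Specialized Research Fund for the Doctoral Program of Higher Education (20120002110032).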

Simics-based System Level Fault Injection Platform

HU Qian, WANG Chao, WANG Haixia, WANG Dongsheng

  1. (National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China)
  • Received: 2014-03-28 Online: 2015-02-15 Published: 2015-02-13

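Abstract (Chinese version): Fault injection is an effective method for evaluating system reliability. Most existing simulation-based fault injection platforms are implemented on Field Programmable Gate Arrays (FPGA) or in Very High Speed Integrated Circuit Hardware Description Language (VHDL), and their support for fault models is very limited. To address this, a system-level hardware fault injection platform is designed and implemented on the Simics structure-level simulator. At the upper level, the platform supports different firmware, operating systems, and applications; at the lower level, it supports fault injection into typical processor pipeline components. It implements transient, permanent, and intermittent fault models together with a fairly comprehensive set of fault types, and integrates a set of system-level fault detection mechanisms. By monitoring how hardware faults propagate at the system level, the experiments compare and analyze the system-level effects of faults on different components. The results show that transient faults have little impact on the system, permanent faults readily cause system failure, and intermittent faults interfere with the components to varying degrees.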

Key words (Chinese version): fault injection, system reliability, fault model, fault detection, structure-level simulator

Abstract: Fault injection is an effective method for evaluating system reliability, which is a complex problem in multicore systems. Many simulation-based fault injection tools exist, but most are implemented on Field Programmable Gate Arrays (FPGA) or in Very High Speed Integrated Circuit Hardware Description Language (VHDL) and support only a limited set of fault models. Based on Simics, a system simulator widely used in computer architecture research, this paper designs and implements a system-level fault injection platform that supports different firmware, operating systems, and applications. The platform can inject faults into several components, covering transient, permanent, and intermittent fault models and most fault types. Furthermore, it integrates a fault detection module. By observing how hardware faults propagate through the system, the paper analyzes the system-level effects of different components and fault models, which can inform the design of fault detection. The results show that transient faults have little impact on the system, permanent faults readily cause system failure, and intermittent faults affect different components to different degrees.

Key words: fault injection, system reliability, fault model, fault detection, structure level simulator
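
The three fault models named in the abstract differ mainly in when the fault is active. The following is a minimal Python sketch of that distinction, not the platform's implementation and not Simics API: the `Fault` class, the `fault_active` and `apply_fault` helpers, the `burst_prob` parameter, and the choice of a single-bit flip as the fault type are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Fault:
    """Hypothetical description of one single-bit fault (not a Simics type)."""
    model: str                # "transient", "permanent", or "intermittent"
    bit: int                  # index of the affected bit
    start_cycle: int          # first cycle at which the fault can appear
    burst_prob: float = 0.5   # intermittent only: chance of being active in a cycle


def fault_active(fault: Fault, cycle: int, rng: random.Random) -> bool:
    """Decide whether the fault corrupts the value in the given cycle."""
    if cycle < fault.start_cycle:
        return False
    if fault.model == "transient":
        # Active for exactly one cycle, then gone.
        return cycle == fault.start_cycle
    if fault.model == "permanent":
        # Active in every cycle from start_cycle onward.
        return True
    if fault.model == "intermittent":
        # Comes and goes; modeled here as an independent draw per cycle.
        return rng.random() < fault.burst_prob
    raise ValueError(f"unknown fault model: {fault.model}")


def apply_fault(value: int, fault: Fault, cycle: int, rng: random.Random) -> int:
    """Return the value with the target bit flipped if the fault is active."""
    if fault_active(fault, cycle, rng):
        return value ^ (1 << fault.bit)
    return value


if __name__ == "__main__":
    rng = random.Random(0)
    for model in ("transient", "permanent", "intermittent"):
        fault = Fault(model, bit=3, start_cycle=2)
        # Observe a register that would hold 0 in every cycle without faults.
        trace = [apply_fault(0, fault, cycle, rng) for cycle in range(6)]
        print(model, trace)
```

In this sketch a transient fault corrupts the value in one cycle only, a permanent fault corrupts it in every cycle after onset, and an intermittent fault corrupts it in a random subset of later cycles. A real injector would apply the corruption to simulated hardware state rather than to a plain integer.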

CLC number: