
Computer Engineering

• Development Research and Engineering Application •

Study and Application of an Adaptive Failure Detection Algorithm

DUAN Wen-jia, LIU Xiao-jie

  1. (School of Computer, Sichuan University, Chengdu 610065, China)
  • Received: 2013-03-20 Online: 2014-03-15 Published: 2014-03-13
  • About the authors: DUAN Wen-jia (b. 1988), male, master's student, research interests: data storage, disaster tolerance and survivability; LIU Xiao-jie, associate professor.
  • Supported by:
    National Natural Science Foundation of China (61173159); Ministry of Education Major Project Cultivation Fund (708075).

Abstract: Failure detection is one of the key techniques for ensuring the high availability of disaster recovery and backup systems, but classical adaptive failure detection algorithms suffer from long failure detection times and high error rates. To address this problem, this paper proposes λ-FD, an adaptive failure detection algorithm based on the exponential distribution, which implements its re-check strategy by combining the Push and Pull heartbeat modes. Experimental results show that λ-FD performs best with a threshold of 0.68, where its failure detection time is 1339.5 ms and its error rate is 0.0557%, far below the 15.19% of Φ-FD and the 24.92% of Chen-FD at the same failure detection time. In general, λ-FD has a lower error rate than the classical adaptive failure detection algorithms at the same failure detection time, and a shorter failure detection time at the same error rate. It therefore improves failure detection performance and better meets the needs of disaster recovery systems in a Wide Area Network (WAN).

Key words: adaptive, failure detection, exponential distribution, disaster recovery, heartbeat, threshold
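
The abstract does not give the λ-FD formulas, so the following Python sketch is only an illustration of the general idea, not the authors' algorithm. It assumes that heartbeat inter-arrival times are modeled by an exponential distribution whose mean is estimated over a sliding window, that the suspicion level is the exponential CDF of the silence since the last heartbeat, and that a Pull probe is sent as a re-check before a failure is declared. The class name `ExponentialFailureDetector`, the `pull_probe` callback, and the window size are hypothetical; only the threshold value 0.68 comes from the abstract.

```python
import math
from collections import deque


class ExponentialFailureDetector:
    """Illustrative sketch only; NOT the paper's lambda-FD definition."""

    def __init__(self, threshold=0.68, window_size=100):
        self.threshold = threshold                  # suspicion level that triggers a re-check
        self.intervals = deque(maxlen=window_size)  # recent inter-arrival times (assumed window)
        self.last_arrival = None

    def heartbeat(self, now):
        """Record a Push heartbeat that arrived at time `now` (seconds)."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def suspicion(self, now):
        """Exponential CDF of the silence since the last heartbeat (assumed model)."""
        if not self.intervals:
            return 0.0  # not enough history to suspect anything
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = max(0.0, now - self.last_arrival)
        if mean <= 0.0:
            return 1.0 if elapsed > 0.0 else 0.0
        return 1.0 - math.exp(-elapsed / mean)

    def check(self, now, pull_probe):
        """Return True if the peer is judged failed at time `now`.

        `pull_probe` is a hypothetical callable that sends a Pull request
        and returns True if the peer answered.
        """
        if self.suspicion(now) < self.threshold:
            return False
        # Re-check: actively probe before declaring a failure.
        if pull_probe():
            self.heartbeat(now)  # treat the answer as a fresh heartbeat
            return False
        return True


fd = ExponentialFailureDetector(threshold=0.68)
for t in (0.0, 1.0, 2.0, 3.0):
    fd.heartbeat(t)

print(fd.check(3.5, pull_probe=lambda: False))  # silence too short to suspect
print(fd.check(5.0, pull_probe=lambda: False))  # suspected, and the Pull re-check fails
```

Under this assumed model, a threshold of 0.68 corresponds to a timeout of about -ln(0.32) ≈ 1.14 times the estimated mean heartbeat interval, and the Pull re-check is what keeps a single late heartbeat from being counted as a failure.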

CLC Number: