[1] SARI A,AKKAYA M.Fault tolerance mechanisms in distributed systems[J].International Journal of Communica-tions,Network and System Sciences,2015,8(12):471-482. [2] MARIANI L,PEZZE M,RIGANELLI O.Predicting failures in multi-tier distributed systems[EB/OL].[2020-02-15].https://arxiv.org/abs/1911.09561. [3] ITANI M,SHARAFEDDINE S,ELKABANI I.Dynamic single node failure recovery in distributed storage systems[J].Computer Networks,2017,113:84-93. [4] GUO Baolong,WANG Jian,YAN Yunyi,et al.Optimal design of DSP protection based on multi-target PSO algorithm[J].Computer Engineering,2018,44(4):74-80.(in Chinese)郭宝龙,王健,闫允一,等.基于多目标PSO算法的DSP防护优化设计[J].计算机工程,2018,44(4):74-80. [5] LEI Changjian,LIN Yaping,LI Jinguo,et al.Research on Byzantine fault tolerance under volunteer cloud environ-ment[J].Computer Engineering,2016,42(5):1-7.(in Chinese)雷长剑,林亚平,李晋国,等.志愿云环境下的拜占庭容错研究[J].计算机工程,2016,42(5):1-7. [6] BERROCAL E,BAUTISTA-GOMEZ L,DI S,et al.Toward general software level silent data corruption detection for parallel applications[J].IEEE Transactions on Parallel and Distributed Systems,2017,28(12):3642-3655. [7] LI S Z,MADDAH-ALI M A,QIAN Y,et al.A fundamental tradeoff between computation and communication in dis-tributed computing[J].IEEE Transactions on Information Theory,2018,64(1):109-128. [8] REISIZADEH A,PRAKASH S,PEDARSANI R,et al.Coded computation over heterogeneous clusters[J].IEEE Transactions on Information Theory,2019,65(7):4227-4242. [9] KONSTANTINIDIS K,RAMAMOORTHY A.Leveraging coding techniques for speeding up distributed computing[C]//Proceedings of 2018 IEEE Global Communications Conference.Washington D.C.,USA:IEEE Press,2018:1-6. [10] DEAN J,GHEMAWAT S.MapReduce:simplified data processing on large clusters[J].Communications of the ACM,2008,51(1):107-113. [11] LI S Z,QIAN Y,MADDAH-ALI M A,et al.Coded distributed computing:fundamental limits and practical challenges[C]//Proceedings of the 50th Asilomar Conference on Signals,Systems and Computers.Washington D.C.,USA:IEEE Press,2016:509-513. [12] D'ANGELO G,FERRETTI S,MARZOLLA M.Fault tolerant adaptive parallel and distributed simulation through functional replication[J].Simulation Modelling Practice and Theory,2019,93:192-207. [13] LEDMI A,BENDJENNA H,HEMAM S M.Fault tolerance in distributed systems:a survey[C]//Proceedings of the 3rd International Conference on Pattern Analysis and Intelligent Systems.Washington D.C.,USA:IEEE Press,2018:1-5. [14] LIAO Weicheng,WU Janjan.Replica-aware job scheduling in distributed systems[C]//Proceedings of Advances in Grid and Pervasive Computing.Berlin,Germany:Springer,2010:290-299. [15] BARKAHOUM K,HAMOUDI K.A fault-tolerant scheduling algorithm based on check pointing and redundancy for distributed real-time systems[J].International Journal of Distributed Systems and Technologies,2019,10:58-75. [16] LYONS R E,VANDERKULK W.The use of triple-modular redundancy to improve computer reliability[J].IBM Journal of Research and Development,1962,6(2):200-209. [17] FU M,HAN S J,LEE P P C,et al.A simulation analysis of redundancy and reliability in primary storage deduplication[J].IEEE Transactions on Computers,2018,67(9):1259-1272. [18] SALEHI M,KHAVARI TAVANA M,REHMAN S,et al.Energy-efficient permanent fault tolerance in hard real-time systems[J].IEEE Transactions on Computers,2019,68(10):1539-1545. [19] XU Wenfang,LIU Hongwei,SHU Yanjun,et al.Management board for triple module redundant fault-tolerance system[J].Journal of Tsinghua University(Science and Technology),2011,51(S1):1434-1439.(in Chinese)徐文芳,刘宏伟,舒燕君,等.三模冗余容错系统管理板[J].清华大学学报(自然科学版),2011,51(S1):1434-1439. [20] ZHOU Ao,WANG Shangguang,CHENG Bo,et al.Cloud service reliability enhancement via virtual machine placement optimization[J].IEEE Transactions on Services Computing,2017,10(6):902-913. [21] LI Xin,LIN Yufei,GUO Xiaowei.A triple modular eager redundancy fault-tolerant technique for distributed stream architecture[J].Computer Engineering and Science,2015,37(12):2233-2241.(in Chinese)李鑫,林宇斐,郭晓威.面向分布式流体系结构的多副本积极容错技术[J].计算机工程与科学,2015,37(12):2233-2241. [22] O'MALLEY O.TeraByte sort on Apache Hadoop[EB/OL].[2020-02-15].http://sortbenchmark.org/YahooHadoop.pdf. |