Abstract: To address the single point of failure (SPOF) of the Namenode in the Hadoop Distributed File System (HDFS), this paper studies the Secondary Namenode, Backup Node, and Facebook Avatar mechanisms and proposes an improved Avatar scheme. The primary node forwards client requests to the standby node, and Zookeeper handles failover, which removes the Namenode as a single point of failure. A Petri net model is used to prove the correctness of the scheme in theory, and a finite-source storage network fault repair model is used to analyze its availability quantitatively. Experimental results show that the scheme loses no data, switches over quickly, and recovers from faults automatically.
Key words:
cloud computing,
single point of failure (SPOF),
Hadoop Distributed File System (HDFS),
high availability (HA),
Petri net,
fault recovery
CLC number:
DENG Peng (邓鹏), LI Mei-Yi (李枚毅), HE Cheng (何诚). Research on Namenode Single Point of Fault Solution[J]. Computer Engineering (计算机工程), 2012, 38(21): 40-44.