
Computer Engineering, 2012, Vol. 38, Issue (11): 48-50. doi: 10.3969/j.issn.1000-3428.2012.11.015

• Software Technology and Database •

Database Engine for Large Scale Data Processing

WANG Yi, LIU Chang-cheng, MA Jian-qing

  1. (School of Computer Science, Fudan University, Shanghai 200433, China)
  • Received: 2011-12-19 Online: 2012-06-05 Published: 2012-06-05
  • About the authors: WANG Yi (born 1985), male, M.S., main research interests: data warehouse technology and parallel computing; LIU Chang-cheng, M.S.; MA Jian-qing, lecturer, Ph.D.
  • Supported by:
    National Natural Science Foundation of China (60803117)

Abstract: As data volumes grow from the GB level to the TB or even PB level, high-performance parallel databases incur a high computational cost in order to maintain scalability and fault tolerance. To address this problem, this paper designs FlexDB, a parallel database engine for large-scale data processing. FlexDB uses the MapReduce parallel computing framework as its communication layer to schedule and coordinate computation and communication among the nodes of a cluster. Experimental results show that the performance of FlexDB is close to that of a parallel database, and that the system also offers good scalability and fault tolerance.

Key words: massive data, scalability, fault tolerance, MapReduce framework, parallel database

CLC Number: