Computer Engineering

• Advanced Computing and Data Processing

Real-time Database Partitioning System Based on Streaming Computing Framework

GUO Mengyu, KANG Hong, YUAN Xiaojie

  1. (College of Computer and Control Engineering, Nankai University, Tianjin 300350, China)
  • Received: 2016-09-23 Online: 2017-11-15 Published: 2017-11-15
  • About the authors: GUO Mengyu (born 1993), female, master's degree, main research interest: streaming big data processing; KANG Hong (corresponding author), lecturer, Ph.D.; YUAN Xiaojie, professor, Ph.D.
  • Funding:
    Tianjin Research Program of Application Foundation and Advanced Technology (14JCYBJC15500); Specialized Research Fund for the Doctoral Program of Higher Education (20130031120029).

Abstract: To process large-scale, dynamic partitioning information efficiently in big data environments, a real-time database partitioning system built on a streaming computing framework is proposed. The system uses stream computing techniques to handle large-scale, dynamic workloads, and a real-time data partitioning algorithm generates data partitions automatically and immediately. The horizontal scaling mechanism of the streaming computing framework is used to improve the system's scalability and throughput. Experimental results show that the system achieves efficient, real-time database partitioning in big data environments. Compared with traditional partitioning algorithms, it delivers higher partition quality and shorter partitioning time.

Key words: database partitioning, streaming computing framework, big data management, distributed storage, dynamic load

CLC Number: