
Computer Engineering ›› 2015, Vol. 41 ›› Issue (1): 1-5. doi: 10.3969/j.issn.1000-3428.2015.01.001

• Special Column •

High Energy Physics Data Processing System with Parallel Heterogeneous Clusters

HUO Jing1,2, LEI Xiaofeng1,2, LI Qiang1,2, SUN Gongxing1

  1. Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China
  2. Graduate University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2014-02-17 Revised: 2014-03-20 Online: 2015-01-15 Published: 2015-01-16
  • About the authors: HUO Jing (born 1985), male, Ph.D. candidate, main research interests: distributed computing and cluster resource management; LEI Xiaofeng and LI Qiang, Ph.D. candidates; SUN Gongxing, research fellow.
  • Supported by:

    National Natural Science Foundation of China (11375223, 11375221); National Natural Science Foundation of China A3 Foresight Program (61161140454)

Abstract:

Traditional cluster computing systems cannot make full use of the storage and I/O capacity of local disks, so heavy network I/O becomes the bottleneck of the whole system. This lowers resource utilization and drives up the cost of storage and network equipment. Running analysis jobs on Hadoop makes effective use of local disk storage and I/O, which relieves the pressure on network I/O. Mesos, a unified cluster resource manager with a lightweight design and an efficient communication mechanism, can dynamically share resources among different computing clusters. This paper analyzes the characteristics of High Energy Physics (HEP) data processing, uses Mesos to build an HEP experimental data processing system that shares resources among heterogeneous clusters, and implements the integration of Torque/Maui and Hadoop clusters. Test results show that the system can dynamically allocate resources among clusters, significantly reduce network I/O by using local storage and disk I/O, and improve cluster resource utilization.

Key words: High Energy Physics (HEP), cluster resource management, resource sharing, Mesos tool, Hadoop platform, Torque/Maui system

CLC Number: