
Computer Engineering ›› 2012, Vol. 38 ›› Issue (24): 37-41. doi: 10.3969/j.issn.1000-3428.2012.24.009

• Software Technology and Database •

A Performance Analysis System for Large-scale Distributed Application

ZANG Dong-song 1,3, Vincent Garonne 2, SUN Gong-xing 1   

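  1. (1. Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China; 2. European Organization for Nuclear Research, Geneva CH-1211, Switzerland; 3. Graduate University of Chinese Academy of Sciences, Beijing 100049, China)
  • Received: 2012-02-13 Revised: 2012-04-08 Online: 2012-12-20 Published: 2012-12-18
  • About the authors: ZANG Dong-song (born 1981), male, Ph.D. candidate, main research interests: distributed computing and massive data management; Vincent Garonne, Ph.D.; SUN Gong-xing, research fellow
  • Supported by:
    Key Program of the National Natural Science Foundation of China (90912004)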

A Performance Analysis System for Large-scale Distributed Application

ZANG Dong-song 1,3, Vincent Garonne 2, SUN Gong-xing 1   

  1. (1. Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China; 2. European Organization for Nuclear Research, Geneva CH-1211, Switzerland; 3. Graduate University of Chinese Academy of Sciences, Beijing 100049, China)
  • Received: 2012-02-13 Revised: 2012-04-08 Online: 2012-12-20 Published: 2012-12-18

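Abstract: In grid and cloud computing environments, the complexity of the platforms and networks makes effective monitoring and performance analysis of large-scale distributed applications very difficult. To address this, a performance analysis system for large-scale distributed applications is proposed, based on data stream management. It uses message queues to collect, buffer and distribute trace messages, and a distributed real-time processing framework to analyze the trace messages. The system is deployed in a Petabyte-scale distributed data management system, and the importance of trace messages is demonstrated through examples. Application results show that the system meets the requirements for high-volume data processing and real-time response in large-scale distributed application environments, and provides good platform support for monitoring and analyzing system performance and for predicting user behavior.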

Key words: distributed application, performance analysis, data stream management, message trace, message queue, NoSQL database

Abstract: Monitoring and analyzing large-scale distributed applications in grid or cloud environments is very difficult because of the complexity of the platforms and networks involved. This paper describes a system for monitoring and analyzing such applications, based on the concept of data stream management. It uses message queues to collect, buffer and distribute trace messages, and a distributed computing framework to analyze the trace messages in real time. The prototype is deployed in a real Petabyte-scale distributed data management system, and the usefulness of the collected trace messages is demonstrated by examples. Application results show that the system is easy to deploy and has little effect on the applications, meets the requirements of big data analysis and real-time computation, and provides a platform for analyzing the performance of large-scale distributed systems and predicting user behavior.
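
The pipeline outlined in the abstract (applications emit trace messages, a message queue buffers them, an analyzer consumes them as they arrive) can be illustrated with a minimal single-process sketch. This is not the authors' implementation: the abstract does not name the message broker or the processing framework, so Python's standard `queue` and `threading` modules stand in for both, and the trace fields `op` and `duration_s`, as well as the function names, are hypothetical.

```python
import queue
import threading
from collections import defaultdict

SENTINEL = None  # hypothetical end-of-stream marker for this toy example


def emit_traces(q, traces):
    """Producer: an application pushes its trace messages into the queue."""
    for trace in traces:
        q.put(trace)
    q.put(SENTINEL)  # tell the analyzer this producer is done


def analyze(q, stats, n_producers):
    """Consumer: read traces as they arrive and update running aggregates."""
    finished = 0
    while finished < n_producers:
        trace = q.get()  # blocks until a message is available
        if trace is SENTINEL:
            finished += 1
            continue
        entry = stats[trace["op"]]
        entry["count"] += 1
        entry["total_s"] += trace["duration_s"]


def run_pipeline(batches):
    """Run one producer thread per batch and one analyzer thread."""
    q = queue.Queue()  # buffers messages between producers and the analyzer
    stats = defaultdict(lambda: {"count": 0, "total_s": 0.0})
    consumer = threading.Thread(target=analyze, args=(q, stats, len(batches)))
    consumer.start()
    producers = [
        threading.Thread(target=emit_traces, args=(q, batch)) for batch in batches
    ]
    for p in producers:
        p.start()
    for p in producers:
        p.join()
    consumer.join()
    # Summarize: number of operations and mean duration per operation type.
    return {
        op: {"count": s["count"], "mean_s": s["total_s"] / s["count"]}
        for op, s in stats.items()
    }


if __name__ == "__main__":
    batches = [
        [{"op": "get", "duration_s": 2.0}, {"op": "put", "duration_s": 1.0}],
        [{"op": "get", "duration_s": 4.0}],
    ]
    print(run_pipeline(batches))
```

The queue decouples the producers from the analyzer, so a slow analysis step delays processing without blocking the applications that emit traces. A production deployment would replace the in-process queue with a networked message broker and the single analyzer thread with a distributed stream processor.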

Key words: distributed application, performance analysis, data stream management, message trace, message queue, NoSQL database

CLC Number: