
Computer Engineering, 2021, Vol. 47, Issue (2): 226-232. doi: 10.19678/j.issn.1000-3428.0057292

• Architecture and Software Technology •

Dynamic Scheduling and Collaborative Allocation System for Distributed Resource Based on YARN

HAO Zhifeng1,2, HUANG Zelin1, CAI Ruichu1, FU Zhengjia1, WEN Wen1, TANG Kailin1

  1. College of Computer, Guangdong University of Technology, Guangzhou 510006, China;
  2. College of Mathematics and Big Data, Foshan University, Foshan, Guangdong 528000, China
  • Received: 2020-01-27 Revised: 2020-03-26 Online: 2021-02-15 Published: 2020-03-31
  • About the authors: HAO Zhifeng (1968-), male, professor, Ph.D., main research interests: machine learning and artificial intelligence; HUANG Zelin (corresponding author), master's student; CAI Ruichu, professor, Ph.D.; FU Zhengjia, Ph.D.; WEN Wen, associate professor, Ph.D.; TANG Kailin, M.S.
  • Funding:
    National Natural Science Foundation of China (61876043); NSFC-Guangdong Joint Fund (U1501254); Natural Science Foundation of Guangdong Province (2014A030306004, 2014A030308008); Guangdong Special Support Program (2015TQ01X140); Pearl River S&T Nova Program of Guangzhou (201610010101); Science and Technology Program of Guangzhou (201902010058).

Abstract: Storm on YARN is one of the mainstream distributed resource scheduling frameworks. However, it requires manual intervention and cannot adjust system resources in real time according to resource availability. To address these shortcomings, this paper designs a YARN-based distributed resource scheduling and collaborative allocation system on the Storm platform, in which the system load is estimated from the real-time latency of stream data processing. The system is built on a two-layer scheduling model comprising a system layer and a task layer. The system layer predicts resource allocation by monitoring the stream data processing load in real time, and the task layer manages resources dynamically by drawing on the efficient cluster resource management of ZooKeeper and YARN. Experimental results show that the proposed system can adjust the distribution of cluster resources in real time and effectively reduce system delay.

Key words: distributed cluster, dynamic scheduling, collaborative allocation, stream data processing, resource allocation
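
The abstract describes a system layer that derives a resource allocation from the observed processing latency. The paper's actual prediction method is not given here, so the following is only a minimal sketch of the general idea, assuming a simple threshold rule. The function name `target_workers`, the thresholds `low_ms` and `high_ms`, and the worker limits are all hypothetical and do not come from the paper, Storm, or YARN.

```python
from statistics import mean


def target_workers(latencies_ms, current, low_ms=50.0, high_ms=200.0,
                   min_workers=1, max_workers=16):
    """Return a hypothetical worker count for the next scheduling period.

    latencies_ms: recent processing latencies (milliseconds) in one window.
    current: number of workers currently allocated.
    All thresholds and limits are illustrative assumptions.
    """
    if not latencies_ms:
        return current  # no measurements: keep the current allocation

    avg = mean(latencies_ms)
    if avg > high_ms:
        proposed = current + 1  # latency too high: request one more worker
    elif avg < low_ms:
        proposed = current - 1  # latency low: release one worker
    else:
        proposed = current      # within the target band: no change

    # Clamp to the assumed cluster limits.
    return max(min_workers, min(max_workers, proposed))
```

In a two-layer design like the one described, a value such as this would be handed to the task layer, which would then carry out the change through the cluster manager; that step is omitted here because it depends on deployment details the abstract does not specify.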
