
Computer Engineering (计算机工程)

• Advanced Computing

Adaptive Task Scheduling Strategy for Heterogeneous Spark Cluster

YANG Zhiwei, ZHENG Quan, WANG Song, YANG Jian, ZHOU Lele

  1. (Department of Automation, University of Science and Technology of China, Hefei 230027, China)
  • Received: 2015-01-09 Online: 2016-01-15 Published: 2016-01-15
  • About the authors: YANG Zhiwei (born 1989), male, master's student, whose main research interests are task scheduling strategies and network communication and control; ZHENG Quan, associate professor, Ph.D.; WANG Song, lecturer, Ph.D.; YANG Jian, associate professor and doctoral supervisor; ZHOU Lele, master's student.
  • Supported by:
    National Natural Science Foundation of China (61174062).

Abstract: Spark is an efficient in-memory big data processing platform similar to Hadoop MapReduce. However, its default task scheduling strategy does not account for differences in node capability in a heterogeneous Spark cluster, which degrades system performance. To address this problem, this paper presents an adaptive task scheduling strategy for heterogeneous Spark clusters. The strategy monitors the load and resource utilization of the nodes, analyzes the monitored parameters, and dynamically adjusts the task allocation weights of the nodes. Experimental results show that, with heterogeneous nodes, the strategy outperforms the default task scheduling strategy in job completion time, node working state, and resource utilization.

Key words: Spark platform, heterogeneous cluster, adaptive, task scheduling, monitoring, weight
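
The abstract does not give the weight-update rule, so the following is only a minimal sketch of the general idea (monitor each node, adjust its task allocation weight, then distribute tasks by weight). It is not the authors' algorithm and does not use any Spark API. The names NodeStats, adjust_weight and allocate_tasks, the thresholds low and high, the step size, and the sample values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    """One monitoring sample for a worker node (all fields hypothetical)."""
    cpu_util: float   # fraction of CPU in use, 0.0 to 1.0
    mem_util: float   # fraction of memory in use, 0.0 to 1.0
    load: float       # run-queue length divided by core count


def adjust_weight(weight, stats, low=0.5, high=0.8, step=0.1,
                  min_weight=0.1, max_weight=10.0):
    """Raise the weight of an underused node, lower it for an overloaded one."""
    # Assumed heuristic: the busiest resource is treated as the bottleneck.
    pressure = max(stats.cpu_util, stats.mem_util, min(stats.load, 1.0))
    if pressure > high:
        weight *= 1.0 - step
    elif pressure < low:
        weight *= 1.0 + step
    # Clamp so that no node is starved or grows without bound.
    return max(min_weight, min(max_weight, weight))


def allocate_tasks(num_tasks, weights):
    """Split num_tasks across nodes in proportion to their weights."""
    total = sum(weights.values())
    quotas = {n: num_tasks * w / total for n, w in weights.items()}
    shares = {n: int(q) for n, q in quotas.items()}
    # Hand out tasks lost to rounding, largest fractional remainder first.
    leftover = num_tasks - sum(shares.values())
    by_remainder = sorted(quotas, key=lambda n: quotas[n] - shares[n],
                          reverse=True)
    for n in by_remainder[:leftover]:
        shares[n] += 1
    return shares


# Illustrative run: two nodes start with equal weights (made-up samples).
weights = {"fast": 1.0, "slow": 1.0}
samples = {
    "fast": NodeStats(cpu_util=0.30, mem_util=0.40, load=0.2),
    "slow": NodeStats(cpu_util=0.95, mem_util=0.70, load=1.5),
}
weights = {n: adjust_weight(w, samples[n]) for n, w in weights.items()}
shares = allocate_tasks(20, weights)
print(shares)
```

In this sketch the lightly loaded node gains weight and the saturated node loses weight after one monitoring round, so the next batch of tasks is skewed toward the node with spare capacity. Repeating the adjustment every monitoring interval is what would make such a scheme adaptive.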
