
Computer Engineering ›› 2026, Vol. 52 ›› Issue (3): 318-331. doi: 10.19678/j.issn.1000-3428.0069968

• High-Performance Computing and Big Data •

A Novel Spark Job Scheduling Algorithm Oriented Toward Deadline-Cost Balance

HE Yulin1,2,*, MO Peiheng1,2, HUANG Zhexue1,2, Philippe Fournier-Viger2

  1. Guangdong Laboratory of Artificial Intelligence and Digital Economy (Shenzhen), Shenzhen 518107, Guangdong, China
  2. College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, Guangdong, China
  • Received: 2024-06-06 Revised: 2024-08-04 Online: 2026-03-15 Published: 2024-10-30
  • Contact: HE Yulin

  • Author profiles:

    HE Yulin, male, researcher, Ph.D.; his main research interests are big data system computing technology and applications of big data engineering technology.

    MO Peiheng, master's student.

    HUANG Zhexue, distinguished professor, Ph.D.

    Philippe Fournier-Viger, distinguished professor, Ph.D.

  • Funding:
    General Program of the Natural Science Foundation of Guangdong Province (2023A1515011667); Shenzhen Science and Technology Major Project (202302D074); Shenzhen Basic Research General Program (JCYJ20210324093609026)

Abstract:

Big data computation frameworks such as Apache Spark are increasingly important for large-scale data analysis, but local computing resources alone are often insufficient for data-intensive jobs. A feasible solution is to rent resources from public cloud service providers and deploy the Spark cluster entirely in the cloud; however, this incurs high deployment costs. To reduce costs, a growing number of users combine local and cloud resources to build hybrid cloud computing clusters. In a Spark cluster deployed on a hybrid cloud, scheduling jobs while simultaneously satisfying multiple Service-Level Agreement (SLA) requirements, such as minimizing cost while guaranteeing job deadlines, is challenging. Existing research focuses mainly on reducing cluster usage costs or improving the job deadline satisfaction rate, without considering the balance between these two goals. This paper proposes a Deadline-Cost Aware Ant Colony Optimization (DC-ACO) algorithm to solve the job scheduling problem in hybrid clouds. DC-ACO minimizes the cluster's Virtual Machine (VM) usage cost under the differing VM instance prices of a hybrid-cloud deployment while maximizing the percentage of jobs that meet their deadlines. Extensive simulation experiments compare DC-ACO with baseline methods. The results demonstrate that the proposed algorithm scales well, increasing the job deadline fulfillment percentage by approximately 20% while reducing the VM usage cost of the hybrid cluster by approximately 10%.
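The abstract does not detail DC-ACO's construction rules or pheromone model. As a rough illustration of the general idea, the following toy sketch shows how an ant-colony search can balance the two objectives the paper names: deadline satisfaction and VM cost. All numbers, the job/VM model, and the fitness weights here are invented for illustration; this is not the paper's algorithm.

```python
import random

# Toy model: each job is assigned to one VM type. A VM has a speed factor
# and a cost per time unit; a job meets its deadline if runtime/speed <= deadline.
JOBS = [(10, 12), (8, 9), (20, 15), (5, 6)]   # (base_runtime, deadline)
VMS = [(1.0, 0.0), (2.0, 0.5), (4.0, 2.0)]    # (speed, cost_rate); index 0 = free local node

def evaluate(assign):
    """Return (deadlines_met, total_cost) for an assignment job -> VM index."""
    met, cost = 0, 0.0
    for (runtime, deadline), v in zip(JOBS, assign):
        speed, rate = VMS[v]
        t = runtime / speed
        met += t <= deadline
        cost += t * rate
    return met, cost

def fitness(assign, w_deadline=1.0, w_cost=0.05):
    # Balance the two SLA goals: reward deadlines met, penalize VM cost.
    met, cost = evaluate(assign)
    return w_deadline * met - w_cost * cost

def aco_schedule(n_ants=20, n_iters=50, rho=0.1, seed=0):
    rng = random.Random(seed)
    # Pheromone trail per (job, VM) pair, initially uniform.
    tau = [[1.0] * len(VMS) for _ in JOBS]
    best, best_fit = None, float("-inf")
    for _ in range(n_iters):
        for _ in range(n_ants):
            # Each ant samples a full assignment, biased by pheromone.
            assign = [rng.choices(range(len(VMS)), weights=tau[j])[0]
                      for j in range(len(JOBS))]
            f = fitness(assign)
            if f > best_fit:
                best, best_fit = assign, f
        # Evaporate all trails, then reinforce the best-so-far assignment.
        for j in range(len(JOBS)):
            for v in range(len(VMS)):
                tau[j][v] *= (1 - rho)
            tau[j][best[j]] += best_fit if best_fit > 0 else 0.1
    return best, evaluate(best)
```

In this toy instance, only the third job misses its deadline on the free local node, so the search converges toward paying for a faster VM for that job while keeping the rest local, which is the deadline-cost trade-off in miniature.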

Key words: Spark cluster, job scheduling, hybrid cloud, Ant Colony Optimization (ACO) algorithm, job deadline, Virtual Machine (VM)
