[1] VAVILAPALLI V K, MURTHY A C, DOUGLAS C, et al. Apache Hadoop YARN: Yet Another Resource Negotiator[C]//Proc of the 4th Annual Symposium on Cloud Computing, New York: ACM, 2013: 1-16.
[2] ZAHARIA M, XIN R S, WENDELL P, et al. Apache Spark: A Unified Engine for Big Data Processing[J]. Communications of the ACM, 2016, 59(11): 56-65.
[3] ZAHARIA M, CHOWDHURY M, DAS T, et al. Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing[C]//Proc of the 9th USENIX Conference on Networked Systems Design and Implementation, Berkeley: USENIX Association, 2012: 15-28.
[4] CAI Y J, JIN S, YU W W, et al. Cooperative Distributed Resource Allocation in Heterogeneous Networks with D2D Communication[J]. IEEE Transactions on Vehicular Technology, 2023, 72(12): 16426-16440.
[5] LI H J, ZHU L S, WANG S C, et al. Cost-Aware Scheduling and Data Skew Alleviation for Big Data Processing in Heterogeneous Cloud Environment[J]. Journal of Grid Computing, 2023, 21: 33.
[6] SENAPATI D, RAJESH K, KARFA C, et al. TMDS: Temperature-aware Makespan Minimizing DAG Scheduler for Heterogeneous Distributed Systems[J]. ACM Transactions on Design Automation of Electronic Systems, 2023, 28(6): 22.
[7] SONG Y X, YU J Y, WANG J J, et al. Memory Management Optimization Strategy in Spark Framework Based on Less Contention[J]. The Journal of Supercomputing, 2023, 79: 1504-1525.
[8] KENNEDY J, EBERHART R. Particle Swarm Optimization[C]//Proc of the 1995 International Conference on Neural Networks, Perth, WA, Australia: IEEE, 1995: 1942-1948.
[9] HUSSAIN M, LUO M X, HUSSAIN A, et al. Deadline-Constrained Cost-Aware Workflow Scheduling in Hybrid Cloud[J]. Simulation Modelling Practice and Theory, 2023, 129: 102819.
[10] 严磊, 张功萱, 王添, 等. 混合云下具有交付期约束的众包任务调度算法[J]. 计算机科学, 2022, 49(05): 244-249.
YAN L, ZHANG G X, WANG T, et al. Crowdsourcing Task Scheduling Algorithm with Delivery Time Constraints in Hybrid Cloud[J]. Computer Science, 2022, 49(05): 244-249.
[11] RAMESH D, RIZVI N, SRINIVASA R, et al. Improved Chemical Reaction Optimization with Fitness-Based Quasi-Reflection Method for Scheduling in Hybrid Cloud-Fog Environment[J]. IEEE Transactions on Network and Service Management, 2024, 21(1): 653-669.
[12] MIKRAM H, KAFHALI S E, SAADI Y. HEPGA: A New Effective Hybrid Algorithm for Scientific Workflow Scheduling in Cloud Computing Environment[J]. Simulation Modelling Practice and Theory, 2024, 130: 102864.
[13] SUN Z X, HUANG H J, LI Z K, et al. Efficient, Economical and Energy-Saving Multi-Workflow Scheduling in Hybrid Cloud[J]. Expert Systems with Applications, 2023, 228: 120401.
[14] PAL S, JHANJHI N Z, ABDULBAQI A S, et al. An Intelligent Task Scheduling Model for Hybrid Internet of Things and Cloud Environment for Big Data Applications[J]. Sustainability, 2023, 15(6): 5104.
[15] STAVRINIDES G L, KARATZA H D. Dynamic Scheduling of Bags-Of-Tasks with Sensitive Input Data and End-To-End Deadlines in a Hybrid Cloud[J]. Multimedia Tools and Applications, 2021, 80: 16781-16803.
[16] 林莉, 毛新雅, 储振兴, 等. 混合云环境下面向数据生命周期的自适应访问控制[J]. 软件学报, 2024, 35(3): 1357-1376.
LIN L, MAO X Y, CHU Z X, et al. Adaptive Access Control for Data Lifecycle in Hybrid Cloud Environment[J]. Journal of Software, 2024, 35(3): 1357-1376.
[17] XIE Y, WANG X Y, SHEN Z J, et al. A Two-Stage Estimation of Distribution Algorithm with Heuristics for Energy-Aware Cloud Workflow Scheduling[J]. IEEE Transactions on Services Computing, 2023, 16(6): 4183-4197.
[18] YE L J, XIA Y Q, YANG L W, et al. Dynamic Scheduling Stochastic Multiworkflows with Deadline Constraints in Clouds[J]. IEEE Transactions on Automation Science and Engineering, 2023, 20(4): 2594-2606.
[19] YE L J, XIA Y Q, TAO S Y, et al. Reliability-Aware and Energy-Efficient Workflow Scheduling in IaaS Clouds[J]. IEEE Transactions on Automation Science and Engineering, 2023, 20(3): 2156-2169.
[20] TONG Y L, LIU J Z, WANG H, et al. DAG-Aware Harmonizing Job Scheduling and Data Caching for Disaggregated Analytics Frameworks[J]. Future Generation Computer Systems, 2024, 156: 116-129.
[21] LU S X, ZHAO M M, LI C L, et al. Time-Aware Data Partition Optimization and Heterogeneous Task Scheduling Strategies in Spark Clusters[J]. The Computer Journal, 2024, 67(2): 762-776.
[22] ZHOU Y F, LI X J, LUO J H, et al. Learning to Optimize DAG Scheduling in Heterogeneous Environment[C]//Proc of the 2022 23rd IEEE International Conference on Mobile Data Management (MDM), Washington D. C.: IEEE, 2022: 137-146.
[23] DUAN Y B, WANG N, WU J. Accelerating DAG-Style Job Execution via Optimizing Resource Pipeline Scheduling[J]. Journal of Computer Science and Technology, 2022, 37: 852-868.
[24] WANG Q Y, BIN G, ZHI Z, et al. DAG-Aware Optimization for Geo-Distributed Data Analytics[C]//Proc of the 52nd International Conference on Parallel Processing, New York: ACM, 2023: 472-481.
[25] UETER N, GÜNZEL M, VON DER BRÜGGEN G, et al. Parallel Path Progression DAG Scheduling[J]. IEEE Transactions on Computers, 2023, 72(10): 3002-3016.
[26] FU Z M, HE M S, TANG Z, et al. Optimizing Data Locality by Executor Allocation in Spark Computing Environment[J]. Computer Science and Information Systems, 2023, 20(1): 491-512.
[27] HERODOTOU H, KAKOULLI E. Cost-based Data Prefetching and Scheduling in Big Data Platforms over Tiered Storage Systems[J]. ACM Transactions on Database Systems, 2023, 48(4): 40.
[28] RAJPUT K Y, LI X P, ABDULLAH L, et al. Task Scheduling in Multi-Cloud Environments for Spark Workflow under Performance Uncertainty[C]//Proc of the 2024 27th International Conference on Computer Supported Cooperative Work in Design (CSCWD), Washington D. C.: IEEE, 2024: 2752-2757.
[29] FU Z M, HE M S, YI Y, et al. Improving Data Locality of Tasks by Executor Allocation in Spark Computing Environment[J]. IEEE Transactions on Cloud Computing, 2024, 12(3): 876-888.
[30] ABDELRAHMAN E, YAN J H, ZHANG M Y. A Parallel Distributed Genetic Algorithm Using Apache Spark for Flexible Scheduling of Multitasks in a Cloud Manufacturing Environment[J]. International Journal of Computer Integrated Manufacturing, 2023, 37: 652-667.
[31] 何玉林, 莫沛恒, 黄哲学, 等. 一种新的期限与成本平衡为导向的Spark作业调度算法[J/OL]. 计算机工程, 1-15 [2025-01-05]. https://doi.org/10.19678/j.issn.1000-3428.0069968.
HE Y L, MO P H, HUANG Z X, et al. A Novel Spark Job Scheduling Algorithm Based on Deadline-Cost Balance[J/OL]. Computer Engineering, 1-15 [2025-01-05]. https://doi.org/10.19678/j.issn.1000-3428.0069968.
[32] JUVE G, CHERVENAK A, DEELMAN E, et al. Characterizing and Profiling Scientific Workflows[J]. Future Generation Computer Systems, 2013, 29(3): 682-692.
[33] Apache Spark. Spark Scheduling. https://spark.apache.org/docs/latest/job-scheduling.html
[34] Apache Spark. Fair Scheduler. https://spark.apache.org/docs/latest/job-scheduling.html#fair-scheduler-pools
[35] ISLAM M T, WU H M, KARUNASEKERA S, et al. SLA-Based Scheduling of Spark Jobs in Hybrid Cloud Computing Environments[J]. IEEE Transactions on Computers, 2021, 71(5): 1117-1132.
[36] MAO H Z, SCHWARZKOPF M, VENKATAKRISHNAN S B, et al. Learning Scheduling Algorithms for Data Processing Clusters[C]//Proc of the ACM Special Interest Group on Data Communication, New York: ACM, 2019: 270-288.
[37] VERMA V P, KUMAR S, KUMAR S, et al. Optimizing Spark Job Scheduling with Distributional Deep Learning in Cloud Environments[J]. Journal of Cloud Computing, 2025, 14(1): article number 59.
[38] CAI J, LU L J. A Deep Reinforcement Learning Approach with Attention Mechanism for DAG Task Scheduling in Data Centers[J]. Concurrency and Computation: Practice and Experience, 2025, 37(25-26): article number e70279.
[39] RAJPUT K Y, LI X P, LAKHAN A. Spark Workflow Task Scheduling with Deadline and Privacy Constraints in Hybrid Cloud Networks[J]. Soft Computing, 2025, 29(2): 783-801.
[40] 何玉林, 吴东彤, Fournier-Viger Philippe, 等. 基于优先填补策略的Spark数据均衡分区方法[J]. 电子学报, 2024, 52(10): 3322-3335.
HE Y L, WU D T, FOURNIER-VIGER P, et al. First Filling Strategy-Based Partitioning Method to Balance Data in Spark[J]. Acta Electronica Sinica, 2024, 52(10): 3322-3335.