Computer Engineering

   

Resource Scheduling Algorithm for Space Science Satellite Ground Data Processing

  

  • Published: 2026-06-12

Chinese title (translated): Resource Scheduling Algorithm for Space Science Satellite Data Processing

Abstract: As the number of space science satellites grows, onboard scientific payloads are becoming more diverse and the volume of downlinked scientific data keeps increasing. The computational resources of ground data processing systems for these satellites, however, remain limited, so the data processing tasks generated during in-orbit operations must be completed under constrained resources. Tasks also differ significantly in their timeliness requirements and computational resource consumption, and the system workload and resource state vary over time. Scheduling strategies therefore need to adjust the execution order of data processing tasks and the resource allocation scheme according to the real-time system state (including task load and computational resource utilization) in order to improve overall processing efficiency and system responsiveness.

To address these challenges, we propose DeepRL-Sched, a deep reinforcement learning resource scheduling algorithm that supports online decision-making. It is built on Proximal Policy Optimization (PPO) and models the satellite data processing task scheduling problem as a Markov Decision Process (MDP). Reinforcement learning methods that rely solely on the current system state tend to make short-sighted decisions, and their training often converges slowly and unstably. To mitigate both problems, we design two key components: a computational resource demand prediction module and an imitation learning module. The former predicts future task workloads and resource demands to give the scheduling policy foresight information, alleviating the short-sighted decisions caused by partial observability. The latter uses imitation learning to extract prior knowledge from high-quality expert scheduling strategies and guide the training of the policy network, significantly improving convergence speed and training stability.

Experimental results demonstrate that the proposed algorithm improves the scheduling efficiency of space science satellite ground data processing systems, reduces the overall task completion time, and significantly improves the timeliness of processing high-priority tasks.
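To make the combination of PPO and imitation learning more concrete, the following is a minimal sketch of how a clipped PPO surrogate and a behavior-cloning term could be combined into one training loss. It is an illustration only: the abstract does not state how DeepRL-Sched combines the two signals, and the function names, the clipping range `eps`, and the weight `bc_weight` are all assumptions introduced here.

```python
import math

def clipped_surrogate(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate term for one sample (to be maximized).

    ratio     -- new-policy probability / old-policy probability of the action
    advantage -- estimated advantage of the action
    eps       -- clipping range (0.2 is a common default, assumed here)
    """
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)

def imitation_nll(expert_action_probs):
    """Behavior-cloning loss: mean negative log-likelihood that the current
    policy assigns to the actions chosen by an expert scheduler."""
    total = sum(math.log(p) for p in expert_action_probs)
    return -total / len(expert_action_probs)

def combined_loss(ratios, advantages, expert_action_probs,
                  eps=0.2, bc_weight=0.5):
    """Loss to minimize: negated mean PPO surrogate plus a weighted
    imitation term. The weighting scheme is hypothetical."""
    surrogate = sum(
        clipped_surrogate(r, a, eps) for r, a in zip(ratios, advantages)
    ) / len(ratios)
    return -surrogate + bc_weight * imitation_nll(expert_action_probs)
```

In this sketch the imitation term pulls the policy toward expert scheduling decisions, while clipping limits how far a single update can move the policy. Both are commonly used to stabilize training.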

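Abstract (translated from the Chinese): With the continuing increase in the number of space science satellites and the diversification of scientific payload types, the volume of downlinked scientific data keeps growing, and the complexity of data processing tasks in terms of number, type, and processing workflow has risen markedly. Different data processing tasks differ significantly in their timeliness requirements and computational resource usage characteristics, which places greater computing and scheduling pressure on ground data processing systems. It is therefore necessary to study computational resource scheduling strategies tailored to the characteristics of space science satellite data processing tasks, so that the execution order of these tasks and the computational resources can be scheduled and allocated more efficiently, improving overall processing efficiency and system responsiveness. This paper proposes DeepRL-Sched, a deep reinforcement learning resource scheduling algorithm that supports online decision-making. The algorithm is centered on Proximal Policy Optimization (PPO) and models the satellite data processing task scheduling process as a Markov decision process. To address the tendency of reinforcement learning methods to make short-sighted decisions when relying only on the current system state, as well as slow convergence and poor stability during training, two key components are designed: a computational resource demand prediction module and an imitation learning module. The former predicts future task load and resource demand to construct an extended state representation, strengthening the policy's awareness of how the system is likely to evolve and alleviating short-sighted decisions caused by partial observation. The latter uses imitation learning to extract prior knowledge from high-quality expert scheduling strategies and guide the training of the policy network, effectively improving the algorithm's convergence speed and training stability. Experimental results show that the algorithm effectively improves the scheduling efficiency of the space science satellite ground data processing system, reduces overall task completion time, and significantly improves the processing timeliness of time-critical tasks.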
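The idea of feeding predicted demand into the policy's state can be illustrated with a small sketch. The prediction module's actual model and the state features used by DeepRL-Sched are not given in the abstract. The moving-average forecaster, the feature choice (queue length, CPU utilization), and all names below are hypothetical stand-ins.

```python
def forecast_demand(history, window=3):
    """Moving-average forecast of next-step resource demand.

    A deliberately simple stand-in for a learned prediction module;
    returns 0.0 when no history is available.
    """
    if not history:
        return 0.0
    recent = history[-window:]
    return sum(recent) / len(recent)

def build_extended_state(queue_length, cpu_utilization, demand_history,
                         window=3):
    """Append the predicted demand to the current observation so the
    policy sees both the present state and a short-term forecast."""
    predicted = forecast_demand(demand_history, window)
    return [float(queue_length), float(cpu_utilization), predicted]

# Example: 5 queued tasks, 75% CPU utilization, recent demand samples.
state = build_extended_state(5, 0.75, [2.0, 4.0, 6.0, 8.0])
```

A policy network would take `state` as input in place of the raw observation. With the forecast included, the policy could, for example, hold back resources when a spike in demand is expected.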