
Computer Engineering ›› 2022, Vol. 48 ›› Issue (11): 284-290, 298. doi: 10.19678/j.issn.1000-3428.0063375

• Development Research and Engineering Application •

QMix-Based Method for Dynamic Resource Allocation Leveraging Vehicular Cloudlet Computing

LIU Jinshi, Manzoor Ahmed, LIN Qing

  1. College of Computer Science and Technology, Qingdao University, Qingdao, Shandong 266071, China
  • Received: 2021-11-28 Revised: 2022-01-04 Published: 2022-01-12
  • About the authors: LIU Jinshi (born 1997), male, master's student, whose main research interests are vehicular cloud computing and task offloading; Manzoor Ahmed (corresponding author), associate professor; LIN Qing, lecturer.
  • Funding:
    National Key Research and Development Program of China, Key Special Project (2018YFB2100303); National Natural Science Foundation of China (61802216); Shandong Province Higher Education Youth Innovation Science and Technology Program, Innovation Team Project (2020KJN011); Natural Science Foundation of Shandong Province (ZR2020MF060).

Abstract: Advances in intelligent urban transportation and communication technology are giving rise to a large number of vehicle-based applications. However, the limited computing resources of today's vehicles cannot meet the computing requirements and latency constraints of these applications. A Vehicular Cloudlet (VC) can schedule resources efficiently and thereby significantly reduce the delay and transmission cost of task requests. To address task offloading and computing resource allocation in a VC environment, a computing resource allocation algorithm that accounts for heterogeneous vehicles and heterogeneous tasks is proposed. An M/M/1 queue model and a computing model are built for arriving tasks, and a utility function is defined so that the overall utility of the system can be maximized. To cope with the highly dynamic changes caused by the geographical distribution of vehicles, a Secondary Resource Allocation (SRA) mechanism based on dual time scales is proposed. In this mechanism, resource allocation decisions are made on two different time scales, and a partially observable Markov decision process is constructed for each. The two decision-making actions are linked through the rewards obtained by executing their respective policies, and the problem is modeled as a two-layer computing resource allocation problem. On this basis, a multi-agent algorithm built on the SRA mechanism, SRA-QMix, is proposed to solve for the optimal policy. Simulation results show that, compared with the deep deterministic policy gradient algorithm, the proposed algorithm improves the overall utility value by 70% and the task completion rate by 6%. In addition, applying the SRA mechanism to the QMix and MADDPG algorithms improves their task completion rates by 13% and 15%, respectively. These results indicate that the SRA-based scheme is suitable for dynamic computing resource allocation environments.
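As a brief illustration of the M/M/1 queue model mentioned in the abstract, the sketch below computes the standard textbook steady-state quantities for a single-server queue with Poisson arrivals and exponential service times. The function name `mm1_metrics`, the dictionary keys, and the example rates (in tasks per second) are hypothetical and are not taken from the paper, whose queue and computing models may differ in detail.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state metrics of an M/M/1 queue (textbook formulas).

    arrival_rate: mean task arrival rate (lambda), assumed in tasks per second.
    service_rate: mean service rate (mu), assumed in tasks per second.
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        # With lambda >= mu the queue grows without bound; no steady state exists.
        raise ValueError("unstable queue: arrival_rate must be below service_rate")

    utilization = arrival_rate / service_rate          # rho = lambda / mu
    gap = service_rate - arrival_rate                  # mu - lambda
    return {
        "utilization": utilization,
        "mean_sojourn_time": 1.0 / gap,                # waiting + service time
        "mean_waiting_time": utilization / gap,        # time in queue only
        "mean_in_system": utilization / (1.0 - utilization),
    }


# Hypothetical example: 4 tasks/s arriving at a server that handles 5 tasks/s.
metrics = mm1_metrics(arrival_rate=4.0, service_rate=5.0)
print(metrics)
```

The mean sojourn time is the quantity that would typically be compared against a task's latency constraint; by Little's law, the mean number of tasks in the system equals the arrival rate multiplied by that sojourn time.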

Key words: Vehicular Cloudlet (VC), Multi-Agent Reinforcement Learning (MARL), QMix algorithm, task offloading, queuing theory
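For readers unfamiliar with the QMix algorithm listed in the key words, its central idea is that the joint value Q_tot is a monotonic function of the per-agent Q-values, which is enforced by keeping the mixing weights non-negative. The following is a minimal pure-Python sketch of that idea only. It is not the paper's SRA-QMix: the class `MonotonicMixer`, its linear stand-ins for the hypernetworks, and all sizes and values are assumptions made for illustration, and no training is performed.

```python
import math
import random


def elu(x: float) -> float:
    """ELU activation; it is monotonically increasing."""
    return x if x > 0 else math.exp(x) - 1.0


class MonotonicMixer:
    """Toy QMix-style mixer: Q_tot is non-decreasing in every agent's Q-value."""

    def __init__(self, n_agents: int, state_dim: int, hidden: int = 4, seed: int = 0):
        rng = random.Random(seed)

        def mat(rows: int, cols: int) -> list:
            return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

        self.n_agents, self.hidden = n_agents, hidden
        # Simplified linear "hypernetworks" (an assumption of this sketch):
        # each maps the global state to one group of mixer parameters.
        self.hyper_w1 = mat(n_agents * hidden, state_dim)
        self.hyper_b1 = mat(hidden, state_dim)
        self.hyper_w2 = mat(hidden, state_dim)
        self.hyper_b2 = mat(1, state_dim)

    @staticmethod
    def _linear(matrix: list, state: list) -> list:
        return [sum(w * s for w, s in zip(row, state)) for row in matrix]

    def q_tot(self, agent_qs: list, state: list) -> float:
        # abs() keeps the mixing weights non-negative, which enforces monotonicity.
        w1 = [abs(v) for v in self._linear(self.hyper_w1, state)]
        b1 = self._linear(self.hyper_b1, state)
        w2 = [abs(v) for v in self._linear(self.hyper_w2, state)]
        b2 = self._linear(self.hyper_b2, state)[0]
        hidden = [
            elu(sum(agent_qs[a] * w1[a * self.hidden + h] for a in range(self.n_agents)) + b1[h])
            for h in range(self.hidden)
        ]
        return sum(h * w for h, w in zip(hidden, w2)) + b2


# Hypothetical example: three agents and a two-dimensional global state.
mixer = MonotonicMixer(n_agents=3, state_dim=2)
print(mixer.q_tot([1.0, 2.0, 3.0], state=[0.5, -1.0]))
```

Because raising any single agent's Q-value can never lower Q_tot, each agent can pick its own greedy action and the resulting joint action still maximizes Q_tot, which is what allows decentralized execution after centralized training.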
