
Computer Engineering ›› 2024, Vol. 50 ›› Issue (8): 142-152. doi: 10.19678/j.issn.1000-3428.0069352

• Cyberspace Security •

Trusted Task Offloading Scheme Based on Deep Reinforcement Learning

Qiong SHI, Hui DUAN, Zhibin SHI*()

  1. School of Computer Science and Technology, North University of China, Taiyuan 030051, Shanxi, China
  • Received: 2024-02-04 Online: 2024-08-15 Published: 2024-06-13
  • Contact: Zhibin SHI
  • Supported by:
    Shanxi Province Applied Basic Research Program (Free Exploration General Project) (20210302123075); Key Laboratory Fund of the Ministry of Public Security (2023FMKFKT01)



Abstract:

To address the security concern of whether edge servers in Mobile Edge Computing (MEC) can be trusted, as well as the slow convergence and large fluctuations of task offloading schemes based on Deep Reinforcement Learning (DRL), this study proposes a task offloading scheme based on trust perception and a DRL algorithm. First, a multi-source feedback trust fusion model is constructed that uses the combined weighting of objective information entropy and historical offloading counts to aggregate trust feedback and assess the credibility of edge servers. Then, the Prioritized Experience Replay (PER)-SAC algorithm, which samples experiences by priority, treats the base station as the agent responsible for task offloading decisions. Experimental results show that the proposed scheme outperforms the TASACO, SRTO-DDPG, and I-PPO schemes in both performance and convergence: its cumulative reward, delay, and energy consumption are all optimal, and it converges faster with smaller fluctuations. Across multiple test scenarios, compared with TASACO, its energy consumption performance improves by at least 5.8% and at most 32.2%, and its latency performance improves by at least 8.5% and at most 21.3%.
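The entropy-based combined weighting summarized above can be sketched as follows. This is a minimal illustration, not the paper's actual formulation: the function names, the feedback-matrix layout, and the `alpha` blending parameter are all assumptions made for the example.

```python
import numpy as np

def entropy_weights(feedback):
    """Objective information-entropy weights for feedback sources.

    feedback: (n_sources, n_servers) matrix of trust feedback in [0, 1].
    Sources whose feedback varies more across servers (lower entropy)
    receive larger weights, following the standard entropy-weight method.
    """
    # Normalize each source's feedback into a probability distribution.
    p = feedback / (feedback.sum(axis=1, keepdims=True) + 1e-12)
    # Shannon entropy per source, scaled to [0, 1] by log(n_servers).
    e = -(p * np.log(p + 1e-12)).sum(axis=1) / np.log(feedback.shape[1])
    d = 1.0 - e                       # degree of divergence per source
    return d / (d.sum() + 1e-12)      # normalized objective weights

def fused_trust(feedback, history_counts, alpha=0.5):
    """Fuse multi-source feedback into one trust score per server.

    history_counts: historical offloading counts per source; alpha
    (a hypothetical parameter) blends the two weighting schemes.
    """
    w_entropy = entropy_weights(feedback)
    w_history = history_counts / history_counts.sum()
    w = alpha * w_entropy + (1.0 - alpha) * w_history
    return w @ feedback               # fused trust score per server
```

With two sources and two servers, a source that sharply distinguishes the servers dominates the entropy weights, while a frequently used source dominates the history weights; the blend yields per-server trust scores in [0, 1].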

Key words: Mobile Edge Computing(MEC), task offloading, network security, Deep Reinforcement Learning(DRL), trust mechanism
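The priority-based experience sampling underlying PER-SAC can be sketched as a minimal proportional replay buffer. The class name and the hyperparameters (`alpha`, `beta`, `eps`) follow the standard PER formulation and are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

class PrioritizedReplay:
    """Minimal proportional prioritized experience replay (PER) buffer.

    Each transition is stored with priority (|TD error| + eps) ** alpha
    and sampled with probability proportional to that priority;
    importance-sampling weights correct the resulting bias.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.buffer, self.priorities = [], []

    def add(self, transition, td_error):
        # Evict the oldest transition once the buffer is full.
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)
            self.priorities.pop(0)
        self.buffer.append(transition)
        self.priorities.append((abs(td_error) + self.eps) ** self.alpha)

    def sample(self, batch_size, beta=0.4, rng=None):
        if rng is None:
            rng = np.random.default_rng()
        probs = np.asarray(self.priorities)
        probs = probs / probs.sum()
        idx = rng.choice(len(self.buffer), size=batch_size, p=probs)
        # Importance-sampling weights, normalized so the largest is 1.
        weights = (len(self.buffer) * probs[idx]) ** (-beta)
        weights = weights / weights.max()
        return [self.buffer[i] for i in idx], idx, weights
```

In the scheme described above, the agent (the base station) would draw its SAC training minibatches from such a buffer so that transitions with large TD errors are replayed more often, which is what speeds up convergence relative to uniform replay.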