
Computer Engineering ›› 2023, Vol. 49 ›› Issue (8): 122-129. doi: 10.19678/j.issn.1000-3428.0065169

• Mobile Internet and Communication Technology •

Service Function Chain Deployment Method for Large-Scale Network

Guanying ZHANG1,2, Peng YI2, Dan LI2, Di ZHU2, Ming MAO2

  1. School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450000, China
    2. Information Technology Institute, PLA Strategic Support Force Information Engineering University, Zhengzhou 450000, China
  • Received: 2022-07-07    Online: 2023-08-15    Published: 2023-08-15
  • About the authors:

    Guanying ZHANG (b. 1997), female, master's degree candidate; her research interests include novel network architectures and deep learning

    Peng YI, research fellow

    Dan LI, associate research fellow

    Di ZHU, master's degree candidate

    Ming MAO, Ph.D. candidate

  • Supported by:
    National Key Research and Development Program of China (2022YFB2901304); National Natural Science Foundation of China (62002382); Songshan Laboratory Project (221100210900-03)

Abstract:

Network Function Virtualization (NFV) decouples network functions from hardware middleboxes, deploying function instances and orchestrating them into Service Function Chains (SFCs) to realize network services. To address the dynamic deployment of SFCs in large-scale, resource-constrained network environments, a multi-agent group-strategy deployment method is proposed that combines the advantages of centralized Deep Reinforcement Learning (DRL) and traditional distributed methods. The SFC deployment problem is modeled as a Partially Observable Markov Decision Process (POMDP), with an Actor-Critic (AC) agent deployed at each node; a globally trained policy is obtained from local node observations alone, retaining the flexibility and adaptability of DRL. The local agents control the interaction process, which addresses the control complexity and slow response that centralized DRL methods exhibit in large-scale networks. Following a multithreading design, the experience of every node is collected and integrated for centralized training, avoiding the insufficient training and inapplicable policies that arise at lightly loaded nodes during fully distributed training. Experimental results demonstrate that the proposed method is independent of network scale, does not rely on specific scenarios, and adapts well to the complex and ever-changing network environments found in practice. In relatively complex traffic environments, compared with the CDRL and GCASP methods, the proposed method improves the deployment success rate by more than 20% under multiple traffic patterns while also reducing deployment costs.
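The paper's full algorithm is not reproduced on this page; purely as an illustrative aid, the sketch below shows in PyTorch the general pattern the abstract describes: one Actor-Critic agent per node acting on its local observation (the POMDP view), with the experience of all nodes merged into a single centralized update so that lightly loaded nodes still receive a well-trained policy. Every name here (NodeActorCritic, obs_dim, n_actions, the random stand-in observations and rewards) is a hypothetical placeholder, not the paper's actual state, action, or reward design.

    import torch
    import torch.nn as nn
    from torch.distributions import Categorical

    class NodeActorCritic(nn.Module):
        # Agent placed at one node; it sees only that node's local state.
        def __init__(self, obs_dim, n_actions, hidden=64):
            super().__init__()
            self.shared = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
            self.actor = nn.Linear(hidden, n_actions)  # logits over deployment actions
            self.critic = nn.Linear(hidden, 1)         # local value estimate

        def forward(self, obs):
            h = self.shared(obs)
            return self.actor(h), self.critic(h)

    def act(model, obs):
        # Sample a deployment decision from the local observation only.
        logits, value = model(obs)
        dist = Categorical(logits=logits)
        action = dist.sample()
        return action, dist.log_prob(action), value

    def centralized_update(model, optimizer, trajectories, gamma=0.99):
        # Merge the experience gathered at every node (one trajectory per
        # node/thread) into a single actor-critic update, so nodes with
        # little request traffic still benefit from the global policy.
        policy_loss, value_loss = 0.0, 0.0
        for traj in trajectories:
            ret = 0.0
            for reward, log_prob, value in reversed(traj):
                ret = reward + gamma * ret             # discounted return
                advantage = ret - value.item()         # baseline-corrected
                policy_loss = policy_loss - log_prob * advantage
                value_loss = value_loss + (value.squeeze() - ret) ** 2
        optimizer.zero_grad()
        (policy_loss + 0.5 * value_loss).backward()
        optimizer.step()

    if __name__ == "__main__":
        model = NodeActorCritic(obs_dim=8, n_actions=4)   # shared by all nodes
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        trajectories = []
        for _ in range(2):                                # e.g. two nodes
            traj = []
            for _ in range(5):                            # toy episode steps
                obs = torch.randn(8)                      # stand-in local state
                action, log_prob, value = act(model, obs)
                reward = torch.rand(1).item()             # stand-in reward
                traj.append((reward, log_prob, value))
            trajectories.append(traj)
        centralized_update(model, optimizer, trajectories)

In this sketch all node agents share one parameter set, which is one common way to realize a globally trained policy obtained from local observations; the paper's own aggregation scheme may differ.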

Key words: Network Function Virtualization (NFV), Service Function Chain (SFC), Deep Reinforcement Learning (DRL), Partially Observable Markov Decision Process (POMDP), multi-agent