SD-IoT Controller Placement Based on Multi-Agent Deep Reinforcement Learning

doi:10.19678/j.issn.1000-3428.0068958

Abstract:

The surge in Internet of Things (IoT) traffic degrades data transmission for devices such as sensors. Software-Defined Networking (SDN) can optimize network performance and improve data transmission quality. However, continually changing network conditions in the IoT, such as traffic, affect the performance of the SDN control plane. This study addresses the dynamic controller placement problem in the Software-Defined Internet of Things (SD-IoT) so that control plane performance is maintained as traffic changes. Given the energy consumption of IoT nodes and the characteristics of their data transmission, controller placement jointly considers delay, control reliability, and energy consumption, and the problem is formulated as a Markov game. To account for both the performance of each individual controller and that of the control plane as a whole, multi-agent deep reinforcement learning is used to solve the problem. During the placement phase, action masks exclude some nodes so that controllers are not placed on nodes with insufficient performance or an inconvenient power supply. Simulation experiments show that, compared with placement algorithms based on Louvain community partitioning and on a single-agent Deep Q-Network (DQN), the proposed algorithm is better at finding high-performance placement solutions.

Key words: Software-Defined Internet of Things (SD-IoT), controller placement, multi-agent deep reinforcement learning, action mask, Markov game
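
The action-masking step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each agent scores candidate nodes with Q-values and that unsuitable nodes (insufficient performance or inconvenient power supply) are marked in a Boolean mask. The function names, the epsilon-greedy exploration rule, and the five-node example are hypothetical and are not taken from the paper.

```python
import math
import random


def masked_greedy_action(q_values, mask):
    """Return the index of the best-scoring node among those the mask allows."""
    if not any(mask):
        raise ValueError("action mask excludes every node")
    # Masked-out nodes are scored -inf so they can never win the comparison.
    masked = [q if ok else -math.inf for q, ok in zip(q_values, mask)]
    return max(range(len(masked)), key=masked.__getitem__)


def masked_epsilon_greedy(q_values, mask, epsilon, rng):
    """With probability epsilon explore uniformly over allowed nodes only."""
    allowed = [i for i, ok in enumerate(mask) if ok]
    if not allowed:
        raise ValueError("action mask excludes every node")
    if rng.random() < epsilon:
        return rng.choice(allowed)
    return masked_greedy_action(q_values, mask)


# Hypothetical 5-node example: nodes 1 and 3 are unsuitable controller hosts,
# even though they carry the two highest Q-values.
q_values = [0.2, 0.9, 0.4, 1.5, 0.1]
mask = [True, False, True, False, True]

best = masked_greedy_action(q_values, mask)
explored = masked_epsilon_greedy(q_values, mask, 1.0, random.Random(0))
print(best, explored)
```

Applying the mask during both exploration and greedy selection means an excluded node is never chosen, so the agent does not have to learn to avoid such nodes through penalties.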

LÜ Chaofeng, XU Pengfei, LUO Di, LIU Jinping. SD-IoT Controller Placement Based on Multi-Agent Deep Reinforcement Learning[J]. Computer Engineering, 2025, 51(5): 83-92.

https://www.ecice06.com/CN/Y2025/V51/I5/83

Figures/Tables: 10

Fig.1 Execution process of multi-agent DQN

Fig.2 Oxford topology

Fig.3 Average latency from sensor nodes to controllers

Fig.4 Controller load difference

Fig.5 Average control reliability

Fig.6 Average controller energy consumption

Fig.7 Average load

Fig.8 Maximum load

Fig.9 Average hop count

Fig.10 Reward convergence curve


