| 1 | LIU X, DING H, HU S. Uplink resource allocation for NOMA-based hybrid spectrum access in 6G-enabled cognitive Internet of Things. IEEE Internet of Things Journal, 2021, 8(20): 15049- 15058.  doi: 10.1109/JIOT.2020.3007017
 | 
																													
																							| 2 | ZAHEER K, OTHMAN M, REHMANI M H, et al. A survey of decision-theoretic models for Cognitive Internet of Things (CIoT). IEEE Access, 2018, 6, 22489- 22512.  doi: 10.1109/ACCESS.2018.2825282
 | 
																													
																							| 3 | WANG D Y, QI P H, FU Q F, et al. Multiple high-order cumulants-based spectrum sensing in full-duplex-enabled cognitive IoT networks. IEEE Internet of Things Journal, 2021, 8(11): 9330- 9343.  doi: 10.1109/JIOT.2021.3055782
 | 
																													
																							| 4 | EJAZ W, IBNKAHLA M. Multiband spectrum sensing and resource allocation for IoT in cognitive 5G networks. IEEE Internet of Things Journal, 2018, 5(1): 150- 163.  doi: 10.1109/JIOT.2017.2775959
 | 
																													
																							| 5 | LI F, LAM K Y, LI X H, et al. Advances and emerging challenges in cognitive Internet-of-Things. IEEE Transactions on Industrial Informatics, 2020, 16(8): 5489- 5496.  doi: 10.1109/TII.2019.2953246
 | 
																													
																							| 6 | BALA I, AHUJA K. Energy-efficient framework for throughput enhancement of cognitive radio network by exploiting transmission mode diversity. Journal of Ambient Intelligence and Humanized Computing, 2023, 14(3): 2167- 2184.  doi: 10.1007/s12652-021-03428-x
 | 
																													
																							| 7 | HE T J, CHIN K W, SOH S, et al. A novel distributed resource allocation scheme for wireless-powered cognitive radio Internet of Things networks. IEEE Internet of Things Journal, 2021, 8(20): 15486- 15499.  doi: 10.1109/JIOT.2021.3071396
 | 
																													
																							| 8 | DENG J H, CHEN S H, KU M L. Multiuser MIMO precoders with proactive primary interference cancelation and link quality enhancement for cognitive radio relay systems. IEEE Access, 2017, 5, 17701- 17712.  doi: 10.1109/ACCESS.2017.2749122
 | 
																													
																							| 9 | 张梓扬, 常军, 黄一帆, 等. 基于强化学习的空间引力波探测望远镜系统外杂光抑制研究. 光电工程, 2024, 51(2): 71- 81. | 
																													
																							|  | ZHANG Z Y, CHAGN J, HUAGN Y F, et al. Reinforcement learning-based stray light suppression study for space-based gravitational wave detection telescope system. Opto-Electronic Engineering, 2024, 51(2): 71- 81. | 
																													
																							| 10 | SUN C, DING H, LIU X. Multichannel spectrum access based on reinforcement learning in cognitive Internet of Things. Ad Hoc Networks, 2020, 106, 102200.  doi: 10.1016/j.adhoc.2020.102200
 | 
																													
																							| 11 | TAN X, ZHOU L, WANG H J, et al. Cooperative multi-agent reinforcement-learning-based distributed dynamic spectrum access in cognitive radio networks. IEEE Internet of Things Journal, 2022, 9(19): 19477- 19488.  doi: 10.1109/JIOT.2022.3168296
 | 
																													
																							| 12 | MOAYEDIAN N S, SALEHI S, KHABBAZIAN M. Fair resource allocation in cooperative cognitive radio IoT networks. IEEE Access, 2020, 8, 191067- 191079.  doi: 10.1109/ACCESS.2020.3032204
 | 
																													
																							| 13 | SAFDAR MALIK T, RAZZAQ MALIK K, AFZAL A, et al. RL-IoT: reinforcement learning-based routing approach for cognitive radio-enabled IoT communications. IEEE Internet of Things Journal, 2023, 10(2): 1836- 1847.  doi: 10.1109/JIOT.2022.3210703
 | 
																													
																							| 14 | XU L, YIN W X, ZHANG X L, et al. Fairness-aware throughput maximization over cognitive heterogeneous NOMA networks for industrial cognitive IoT. IEEE Transactions on Communications, 2020, 68(8): 4723- 4733.  doi: 10.1109/TCOMM.2020.2992720
 | 
																													
																							| 15 | TULI S, ILAGER S, RAMAMOHANARAO K, et al. Dynamic scheduling for stochastic edge-cloud computing environments using A3C learning and residual recurrent neural networks. IEEE Transactions on Mobile Computing, 2022, 21(3): 940- 954.  doi: 10.1109/TMC.2020.3017079
 | 
																													
																							| 16 | 王毅然, 经小川, 贾福凯, 等. 基于多智能体协同强化学习的多目标追踪方法. 计算机工程, 2020, 46(11): 90- 96.  doi: 10.3778/j.issn.1002-8331.1911-0132
 | 
																													
																							|  | WANG Y R, JING X C, JIA F K, et al. Multi-target tracking method based on multi-agent collaborative reinforcement learning. Computer Engineering, 2020, 46(11): 90- 96.  doi: 10.3778/j.issn.1002-8331.1911-0132
 | 
																													
																							| 17 | 杨思明, 单征, 丁煜, 等. 深度强化学习研究综述. 计算机工程, 2021, 47(12): 19- 29.  URL
 | 
																													
																							|  | YANG S M, SHAN Z, DING Y, et al. Survey of research on deep reinforcement learning. Computer Engineering, 2021, 47(12): 19- 29.  URL
 | 
																													
																							| 18 | LIU Y, YUAN X J, LIANG Y C, et al. Machine learning based iterative detection and multi-interference cancellation for cognitive IoT. IEEE Communications Letters, 2020, 24(9): 1995- 1999.  doi: 10.1109/LCOMM.2020.2997048
 | 
																													
																							| 19 | LIU Y, KUAI X Y, YUAN X J, et al. Learning-based iterative interference cancellation for cognitive Internet of Things. IEEE Internet of Things Journal, 2019, 6(4): 7213- 7224.  doi: 10.1109/JIOT.2019.2915598
 | 
																													
																							| 20 | LIU X, JIA M, DING H. Uplink resource allocation for multicarrier grouping cognitive Internet of Things based on k-means learning. Ad Hoc Networks, 2020, 96, 102002.  doi: 10.1016/j.adhoc.2019.102002
 | 
																													
																							| 21 | ZHAO K, XU H W, HUANG L Y, et al. Research on wireless communication distance test for mobile IoT devices[C]//Proceedings of the IEEE MTT-S International Microwave Workshop Series on Advanced Materials and Processes for RF and THz Applications (IMWS-AMP). Washington D. C., USA: IEEE Press, 2022: 1-3. | 
																													
																							| 22 | 宋佰霖, 许华, 齐子森, 等. 一种基于深度强化学习的协同通信干扰决策算法. 电子学报, 2022, 50(6): 1301- 1309. | 
																													
																							|  | SONG B L, XU H, QI Z S, et al. A collaborative communication jamming decision algorithm based on deep reinforcement learning. Acta Electronica Sinica, 2022, 50(6): 1301- 1309. | 
																													
																							| 23 | GENDERS W, RAZAVI S. Evaluating reinforcement learning state representations for adaptive traffic signal control. Procedia Computer Science, 2018, 130, 26- 33.  doi: 10.1016/j.procs.2018.04.008
 | 
																													
																							| 24 | WANG H J, GAO W, WANG W, et al. Research on obstacle avoidance planning for UUV based on A3C algorithm. Journal of Marine Science and Engineering, 2023, 12(63): 1- 14. | 
																													
																							| 25 | POKHREL S R. Learning from data streams for automation and orchestration of 6G industrial IoT: toward a semantic communication framework. Neural Computing and Applications, 2022, 34(18): 15197- 15206.  doi: 10.1007/s00521-022-07065-z
 | 
																													
																							| 26 | ALI SHAH H, ZHAO L, KIM I M. Joint network control and resource allocation for space-terrestrial integrated network through hierarchal deep actor-critic reinforcement learning. IEEE Transactions on Vehicular Technology, 2021, 70(5): 4943- 4954.  doi: 10.1109/TVT.2021.3071983
 | 
																													
																							| 27 | 罗志强, 王伟, 朱晓荣. 基于A3C的无线异构网络自适应视频流传输控制方法. 电信科学, 2020, 36(12): 65- 76. | 
																													
																							|  | LUO Z Q, WANG W, ZHU X R. An adaptive video stream transmission control method for wireless heterogeneous networks based on A3C. Telecommunications Science, 2020, 36(12): 65- 76. | 
																													
																							| 28 | HE Y, WANG Y H, QIU C, et al. Blockchain-based edge computing resource allocation in IoT: a deep reinforcement learning approach. IEEE Internet of Things Journal, 2021, 8(4): 2226- 2237.  doi: 10.1109/JIOT.2020.3035437
 | 
																													
																							| 29 | 邹玮琦, 牛朝阳, 刘伟, 等. 基于A3C的多功能雷达认知干扰决策方法. 系统工程与电子技术, 2023, 45(1): 86- 92. | 
																													
																							|  | ZOU W Q, NIU C Y, LIU W, et al. Cognitive jamming decision-making method against multifunctional radar based on A3C. Systems Engineering and Electronics, 2023, 45(1): 86- 92. | 
																													
																							| 30 | WEI Q L, WANG L X, LIU Y, et al. Optimal elevator group control via deep asynchronous actor-critic learning. IEEE Transactions on Neural Networks and Learning Systems, 2020, 31(12): 5245- 5256.  doi: 10.1109/TNNLS.2020.2965208
 | 
																													
																							| 31 | GUO S A, ZHAO X H. Deep reinforcement learning optimal transmission algorithm for cognitive Internet of Things with RF energy harvesting. IEEE Transactions on Cognitive Communications and Networking, 2022, 8(2): 1216- 1227.  doi: 10.1109/TCCN.2022.3142727
 |