1 |
黄柯棣, 刘宝全, 黄健, 等. 作战仿真技术综述[C]//全球化制造高级论坛暨21世纪仿真技术研讨会论文集. 北京: 中国系统仿真学会, 2004: 80-89.
|
|
HUANG K D, LIU B Q, HUANG J, et al. A survey of military simulation technologies[C]//Global Manufacturing Advanced Forum and 21st Century Simulation Technology Seminar. Beijing: China System Simulation Society, 2004: 80-89.
|
2 |
赵慧赟, 张东戈. 战场指挥控制时效性影响因素分析. 军事运筹与系统工程, 2015, 29(2): 12-16, 49.
|
|
ZHAO H Y, ZHANG D G. Analysis of influencing factors on timeliness of battlefield command and control. Military Operations Research and Assessment, 2015, 29(2): 12-16, 49.
|
3 |
尹强, 叶雄兵. 作战筹划方法研究. 国防科技, 2016, 37(1): 95-99.
|
|
YIN Q, YE X B. The initially research for the method of operational design. National Defense Science & Technology, 2016, 37(1): 95-99.
|
4 |
曹占广, 陶帅, 胡晓峰, 等. 国外兵棋推演及系统研究进展. 系统仿真学报, 2021, 33(9): 2059-2065.
|
|
CAO Z G, TAO S, HU X F, et al. Abroad wargaming deduction and system research. Journal of System Simulation, 2021, 33(9): 2059-2065.
|
5 |
刘海洋, 唐宇波, 胡晓峰, 等. 基于兵棋推演的联合作战方案评估框架研究. 系统仿真学报, 2018, 30(11): 4115-4122, 4131.
|
|
LIU H Y, TANG Y B, HU X F, et al. Research on evaluation framework of COA based on wargaming. Journal of System Simulation, 2018, 30(11): 4115-4122, 4131.
|
6 |
SURDU J R. The deep green concept[C]//Proceedings of the 2008 Spring Simulation Multiconference. Berlin, Germany: Springer, 2008: 623-631.
|
7 |
李承兴, 高桂清, 鞠金鑫, 等. 基于人工智能深度增强学习的装备维修保障兵棋研究. 兵器装备工程学报, 2018, 39(2): 61-65.
|
|
LI C X, GAO G Q, JU J X, et al. Study on equipment maintenance and security based on artificial intelligence depth enhancement. Journal of Ordnance Equipment Engineering, 2018, 39(2): 61-65.
|
8 |
张晓海, 操新文, 耿松涛, 等. 基于深度学习的军事辅助决策智能化研究. 兵器装备工程学报, 2018, 39(10): 162-167.
|
|
ZHANG X H, CAO X W, GENG S T, et al. Research on intelligence of military auxiliary decision-making system based on deep learning. Journal of Ordnance Equipment Engineering, 2018, 39(10): 162-167.
|
9 |
杨思明, 单征, 丁煜, 等. 深度强化学习研究综述. 计算机工程, 2021, 47(12): 19-29.
|
|
YANG S M, SHAN Z, DING Y, et al. Survey of research on deep reinforcement learning. Computer Engineering, 2021, 47(12): 19-29.
|
10 |
徐佳乐, 张海东, 赵东海, 等. 基于卷积神经网络的陆战兵棋战术机动策略学习. 系统仿真学报, 2022, 34(10): 2181-2193.
|
|
XU J L, ZHANG H D, ZHAO D H, et al. Learning tactical maneuver strategies for land wargames based on convolutional neural network. Journal of System Simulation, 2022, 34(10): 2181-2193.
|
11 |
|
12 |
刘全, 翟建伟, 章宗长, 等. 深度强化学习综述. 计算机学报, 2018, 41(1): 1-27.
|
|
LIU Q, ZHAI J W, ZHANG Z Z, et al. A survey on deep reinforcement learning. Chinese Journal of Computers, 2018, 41(1): 1-27.
|
13 |
WILLIAMS R J. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 1992, 8(3/4): 229-256.
|
14 |
RIEDMILLER M. Neural fitted Q iteration - first experiences with a data efficient neural reinforcement learning method[C]//Proceedings of European Conference on Machine Learning. Berlin, Germany: Springer, 2005: 317-328.
|
15 |
MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning. Nature, 2015, 518(7540): 529-533.
|
16 |
SUTTON R S. Learning to predict by the methods of temporal differences. Machine Learning, 1988, 3(1): 9-44.
|
17 |
CAO J Q, LIU Q, ZHU F, et al. Gradient temporal-difference learning for off-policy evaluation using emphatic weightings. Information Sciences, 2021, 580: 311-330.
|
18 |
YANG Z Y, MERRICK K, JIN L W, et al. Hierarchical deep reinforcement learning for continuous action control. IEEE Transactions on Neural Networks and Learning Systems, 2018, 29(11): 5174-5184.
|
19 |
姚桐, 王越, 董岩, 等. 深度强化学习在作战任务规划中的应用. 飞航导弹, 2020(4): 16-21.
|
|
YAO T, WANG Y, DONG Y, et al. Application of deep reinforcement learning in operational mission planning. Aerospace Technology, 2020(4): 16-21.
|
20 |
MNIH V, BADIA A P, MIRZA M, et al. Asynchronous methods for deep reinforcement learning[C]//Proceedings of the 33rd International Conference on Machine Learning. New York, USA: ACM Press, 2016: 1928-1937.
|
21 |
ZHAO T T, HACHIYA H, NIU G, et al. Analysis and improvement of policy gradient estimation. Neural Networks, 2012, 26: 118-129.
|
22 |
|
23 |
|
24 |
SCHULMAN J, LEVINE S, MORITZ P, et al. Trust region policy optimization[C]//Proceedings of the 32nd International Conference on Machine Learning. New York, USA: ACM Press, 2015: 1889-1897.
|
25 |
|
26 |
SILVER D, HUANG A, MADDISON C J, et al. Mastering the game of Go with deep neural networks and tree search. Nature, 2016, 529(7587): 484-489.
|
27 |
李昊. 五子棋人机博弈算法优化研究与实现[D]. 大连: 大连海事大学, 2020.
|
|
LI H. Research and implementation of man-machine game algorithm optimization for gobang[D]. Dalian: Dalian Maritime University, 2020. (in Chinese)
|