[1] HAO J Y, YANG T P, TANG H Y, et al. Exploration in deep reinforcement learning: from single-agent to multiagent domain[EB/OL]. [2022-06-04]. https://arxiv.org/abs/2109.06668.
[2] PATHAK D, AGRAWAL P, EFROS A A, et al. Curiosity-driven exploration by self-supervised prediction[C]//Proceedings of IEEE Conference on Computer Vision and Pattern Recognition Workshops. Washington D. C., USA: IEEE Press, 2017: 488-489.
[3]
[4]
[5]
[6]
[7] BELLEMARE M G, SRINIVASAN S, OSTROVSKI G, et al. Unifying count-based exploration and intrinsic motivation[C]//Proceedings of the 30th International Conference on Neural Information Processing Systems. New York, USA: ACM Press, 2016: 1479-1487.
[8] PARISI S, DEAN V, PATHAK D, et al. Interesting object, curious agent: learning task-agnostic exploration[EB/OL]. [2022-06-04]. https://arxiv.org/abs/2111.13119.
[9] DAYAN P, HINTON G E. Feudal reinforcement learning[C]//Proceedings of NIPS'92. New York, USA: ACM Press, 1992: 271-278.
[10] SCHAUL T, HORGAN D, GREGOR K, et al. Universal value function approximators[C]//Proceedings of the 32nd International Conference on Machine Learning. New York, USA: ACM Press, 2015: 1312-1320.
[11] KULKARNI T D, NARASIMHAN K R, SAEEDI A, et al. Hierarchical deep reinforcement learning: integrating temporal abstraction and intrinsic motivation[EB/OL]. [2022-06-04]. https://arxiv.org/abs/1604.06057.
[12]
[13]
[14] PATERIA S, SUBAGDJA B, TAN A H, et al. End-to-end hierarchical reinforcement learning with integrated subgoal discovery[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(12): 7778-7790. doi: 10.1109/TNNLS.2021.3087733.
[15] LI R J, CAI Z L, HUANG T Y, et al. Anchor: the achieved goal to replace the subgoal for hierarchical reinforcement learning[J]. Knowledge-Based Systems, 2021, 225: 107128. doi: 10.1016/j.knosys.2021.107128.
[16] SUTTON R S, BARTO A G. Reinforcement learning: an introduction[M]. London, UK: MIT Press, 2018.
[17] LIU Q, ZHAI J W, ZHANG Z Z, et al. A survey on deep reinforcement learning[J]. Chinese Journal of Computers, 2018, 41(1): 1-27. (in Chinese)
[18]
[19] LEVINE S, FINN C, DARRELL T, et al. End-to-end training of deep visuomotor policies[J]. Journal of Machine Learning Research, 2016, 17(39): 1-40.
[20] ARADI S. Survey of deep reinforcement learning for motion planning of autonomous vehicles[J]. IEEE Transactions on Intelligent Transportation Systems, 2022, 23(2): 740-759. doi: 10.1109/TITS.2020.3024655.
[21] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533. doi: 10.1038/nature14236.
[22] BABAEIZADEH M, FROSIO I, TYREE S, et al. Reinforcement learning through asynchronous advantage actor-critic on a GPU[EB/OL]. [2022-06-04]. https://arxiv.org/abs/1611.06256.
[23] ESPEHOLT L, MARINIER R, STANCZYK P, et al. SEED RL: scalable and efficient deep-RL with accelerated central inference[EB/OL]. [2022-06-04]. https://arxiv.org/abs/1910.06591.
[24] WIJMANS E, KADIAN A, MORCOS A, et al. DD-PPO: learning near-perfect PointGoal navigators from 2.5 billion frames[EB/OL]. [2022-06-04]. https://arxiv.org/abs/1911.00357.
[25] ESPEHOLT L, SOYER H, MUNOS R, et al. IMPALA: scalable distributed deep-RL with importance weighted actor-learner architectures[EB/OL]. [2022-06-04]. https://arxiv.org/abs/1802.01561.
[26] STANTON C, CLUNE J. Deep curiosity search: intra-life exploration can improve performance on challenging deep reinforcement learning problems[EB/OL]. [2022-06-04]. https://arxiv.org/abs/1806.00553.