[1] D'ANDREA R. Guest editorial: a revolution in the warehouse: a retrospective on Kiva Systems and the grand challenges ahead[J]. IEEE Transactions on Automation Science and Engineering, 2012, 9(4): 638-639.
[2] YUAN Z, GONG Y M. Improving the speed delivery for robotic warehouses[J]. IFAC-PapersOnLine, 2016, 49(12): 1164-1168.
[3] BALLESTÍN F, PÉREZ Á, QUINTANILLA S. A multistage heuristic for storage and retrieval problems in a warehouse with random storage[J]. International Transactions in Operational Research, 2020, 27(3): 1699-1728.
[4] YU H N, BAI H, LI C. Research and simulation on path planning of warehouse multi-AGV system[J]. Computer Engineering and Applications, 2020, 56(2): 233-241. (in Chinese)
[5] LU J S, CHEN S W, YI W C, et al. Path planning of compound operations in tier-to-tier multi-shuttle warehouse system[J]. Computer Integrated Manufacturing Systems, 2021, 27(6): 1799-1808. (in Chinese)
[6] SCHULMAN J, WOLSKI F, DHARIWAL P, et al. Proximal policy optimization algorithms[EB/OL]. [2022-01-10]. https://arxiv.org/abs/1707.06347.
[7] KONDA V R, TSITSIKLIS J N. Actor-Critic algorithms[C]//Proceedings of Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press, 2000: 1008-1014.
[8] KUSHMERICK N, HANKS S, WELD D S. An algorithm for probabilistic planning[J]. Artificial Intelligence, 1995, 76(1/2): 239-286.
[9] RAO D N, GUO H F, JIANG Z H. Stock index simulation based on parallel probabilistic planning[J]. Chinese Journal of Computers, 2019, 42(6): 1334-1350. (in Chinese)
[10] CUI H, KELLER T, KHARDON R. Stochastic planning with lifted symbolic trajectory optimization[C]//Proceedings of the International Conference on Automated Planning and Scheduling. Washington D.C., USA: IEEE Press, 2021: 119-127.
[11] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529-533.
[12] SCHULMAN J, LEVINE S, ABBEEL P, et al. Trust region policy optimization[C]//Proceedings of the International Conference on Machine Learning. Washington D.C., USA: IEEE Press, 2015: 1889-1897.
[13] KAHN G, VILLAFLOR A, DING B S, et al. Self-supervised deep reinforcement learning with generalized computation graphs for robot navigation[C]//Proceedings of the 2018 IEEE International Conference on Robotics and Automation. Washington D.C., USA: IEEE Press, 2018: 5129-5136.
[14] KIM B, PINEAU J. Socially adaptive path planning in human environments using inverse reinforcement learning[J]. International Journal of Social Robotics, 2016, 8(1): 51-66.
[15] WU Y X, SONG W, CAO Z G, et al. Learning improvement heuristics for solving routing problems[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(9): 5057-5069.
[16] ZHANG C, SONG W, CAO Z G, et al. Learning to dispatch for job shop scheduling via deep reinforcement learning[C]//Proceedings of Advances in Neural Information Processing Systems. Cambridge, USA: MIT Press, 2020: 1621-1632.
[17] YANG S M, SHAN Z, DING Y, et al. Survey of research on deep reinforcement learning[J]. Computer Engineering, 2021, 47(12): 19-29. (in Chinese)
[18] DUAN Y, CHEN X, HOUTHOOFT R, et al. Benchmarking deep reinforcement learning for continuous control[C]//Proceedings of the International Conference on Machine Learning. Washington D.C., USA: IEEE Press, 2016: 1329-1338.
[19] MNIH V, BADIA A P, MIRZA M, et al. Asynchronous methods for deep reinforcement learning[C]//Proceedings of the International Conference on Machine Learning. Washington D.C., USA: IEEE Press, 2016: 1928-1937.
[20] SCHULMAN J, MORITZ P, LEVINE S, et al. High-dimensional continuous control using generalized advantage estimation[EB/OL]. [2022-01-10]. https://arxiv.org/abs/1506.02438.
[21] RUDER S. An overview of multi-task learning in deep neural networks[EB/OL]. [2022-01-10]. https://arxiv.org/abs/1706.05098.
[22] ZHOU W X, LAN W F. Summarization model using multi-task learning fused with text classification[J]. Computer Engineering, 2021, 47(4): 48-55. (in Chinese)
[23] KIM S, HORI T, WATANABE S. Joint CTC-attention based end-to-end speech recognition using multi-task learning[C]//Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing. Washington D.C., USA: IEEE Press, 2017: 4835-4839.
[24] YU F, CHEN H F, WANG X, et al. BDD100K: a diverse driving dataset for heterogeneous multitask learning[C]//Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Washington D.C., USA: IEEE Press, 2020: 2633-2642.
[25] JIANG Y, CAO Z G, ZHANG J. Learning to solve 3-D bin packing problem via deep reinforcement learning and constraint programming[J]. IEEE Transactions on Cybernetics, 2021, 22(9): 1-12.
[26] IOFFE S, SZEGEDY C. Batch normalization: accelerating deep network training by reducing internal covariate shift[C]//Proceedings of the 32nd International Conference on Machine Learning. Washington D.C., USA: IEEE Press, 2015: 448-456.
[27] TANG H T, YAN W J, CHEN Q F, et al. Integrated optimization of location assignment and job scheduling in automated storage and retrieval system[J]. Computer Science, 2020, 47(5): 204-211. (in Chinese)