[1] CHEN D, GUAN X P, LONG C N, CHEN C L. Regional monitoring system based on multi-agent[J]. Computer Engineering, 2010, 36(21): 191-193. (in Chinese)
[2] GRONAUER S, DIEPOLD K. Multi-agent deep reinforcement learning: a survey[J]. Artificial Intelligence Review, 2022, 55(2): 895-943.
[3] HUANG C Q, ZHONG Y H, WANG X Z, et al. From single agent to multi-agent: design and empirical study of motivational learning activities supported by large model agents[J]. Journal of East China Normal University (Educational Sciences), 2025, 43(5): 44-56. (in Chinese)
[4] HAN S, ZHANG Q, YAO Y, et al. LLM multi-agent systems: challenges and open problems [EB/OL]. [2024-02-05]. https://arxiv.org/abs/2402.03578.
[5] WANG S, ZHANG G, YU M, et al. G-safeguard: A topology-guided security lens and treatment on LLM-based multi-agent systems [EB/OL]. [2025-02-16]. https://arxiv.org/abs/2502.11127.
[6] CEMRI M, PAN M Z, YANG S, et al. Why do multi-agent LLM systems fail? [EB/OL]. [2025-03-17]. https://arxiv.org/abs/2503.13657.
[7] WANG Z, LI J, ZHOU Q, et al. A Survey on AgentOps: Categorization, Challenges, and Future Directions [EB/OL]. [2025-08-04]. https://arxiv.org/abs/2508.02121.
[8] DONG Z N, ZHANG Q X, HU J, et al. A multi-dimensional evaluation method for large language model-powered multi-agent systems[J]. Command Control & Simulation, 2025, 47(2): 121-131. (in Chinese)
[9] LI X, WANG S, ZENG S, et al. A survey on LLM-based multi-agent systems: workflow, infrastructure, and challenges [J]. Vicinagearth, 2024, 1(1): 9.
[10] HUANG J T, ZHOU J, JIN T, et al. On the resilience of LLM-based multi-agent collaboration with faulty agents [EB/OL]. [2024-08-02]. https://arxiv.org/abs/2408.00989.
[11] BARBI O, YORAN O, GEVA M. Preventing rogue agents improves multi-agent collaboration [EB/OL]. [2025-02-09]. https://arxiv.org/abs/2502.05986.
[12] SUNG Y Y, KIM H, ZHANG D. Verila: a human-centered evaluation framework for interpretable verification of LLM agent failures [EB/OL]. [2025-03-16]. https://arxiv.org/abs/2503.12651.
[13] EPPERSON W, BANSAL G, DIBIA V C, et al. Interactive debugging and steering of multi-agent AI systems[C]//Proceedings of the 2025 CHI Conference on Human Factors in Computing Systems. 2025: 1-15.
[14] LI G, HAMMOUD H, ITANI H, et al. Camel: communicative agents for "mind" exploration of large language model society[J]. Advances in Neural Information Processing Systems, 2023, 36: 51991-52008.
[15] FOURNEY A, BANSAL G, MOZANNAR H, et al. Magentic-one: a generalist multi-agent system for solving complex tasks [EB/OL]. [2024-11-07]. https://arxiv.org/abs/2411.04468.
[16] DONG L, LU Q, ZHU L. AgentOps: enabling observability of LLM agents [EB/OL]. 2024. https://arxiv.org/abs/2411.05285.
[17] AgentOS [EB/OL]. [2025-05-27]. https://ag2.ai/.
[18] LI Q, CUI L, ZHAO X, et al. GSM-plus: a comprehensive benchmark for evaluating the robustness of LLMs as mathematical problem solvers [EB/OL]. [2024-02-29]. https://arxiv.org/abs/2402.19255.
[19] TRIVEDI H, KHOT T, HARTMANN M, et al. AppWorld: a controllable world of apps and people for benchmarking interactive coding agents [EB/OL]. [2024-07-26]. https://arxiv.org/abs/2407.18901.
[20] LI Y, XU J, HAN L, et al. Q-star meets scalable posterior sampling: bridging theory and practice via HyperAgent[C]//Proceedings of the 41st International Conference on Machine Learning: vol. 235. Vienna, Austria: JMLR.org, 2024: 29022-29062.
[21] JIMENEZ C E, YANG J, WETTIG A, et al. SWE-bench: can language models resolve real-world GitHub issues?[C/OL]//The Twelfth International Conference on Learning Representations. Vienna, Austria: ICLR, 2024: 1-14.
[22] OpenManus - open-source robotics control framework [EB/OL]. [2025-05-27]. https://open-manus.org.
[23] MIALON G, FOURRIER C, WOLF T, et al. GAIA: a benchmark for general AI assistants[C/OL]//The Twelfth International Conference on Learning Representations. Vienna, Austria: ICLR, 2024: 1-15.
[24] HONG S, ZHUGE M, CHEN J, et al. MetaGPT: meta programming for a multi-agent collaborative framework[C/OL]//The Twelfth International Conference on Learning Representations. Vienna, Austria: ICLR, 2024.
[25] QIAN C, CONG X, YANG C, et al. Communicative agents for software development [EB/OL]. [2023-07-16]. https://arxiv.org/abs/2307.07924.
[26] HENDRYCKS D, BURNS C, BASART S, et al. Measuring massive multitask language understanding[C/OL]//International Conference on Learning Representations. Vienna, Austria: ICLR, 2021: 1-10.
[27] MAST/traces [EB/OL]. [2025-05-27]. https://github.com/multi-agent-systems-failure-taxonomy/MAST/tree/main/traces.
[28] Gemini 2.5 Pro [EB/OL]. [2025-05-27]. https://deepmind.google/models/gemini/pro/.
[29] LUNE H, BERG B L. Qualitative research methods for the social sciences[M]. 2017.
[30] COHEN J. A coefficient of agreement for nominal scales[J]. Educational and Psychological Measurement, 1960, 20(1): 37-46.