[1] ZHAI J, LI Y H, LI B B, et al. Personalized experiment report comments auto-generation and application based on large language models. Computer Engineering, 2024, 50(7): 42-52. doi: 10.19678/j.issn.1000-3428.0069593. (in Chinese)
[2] CHANG Y P, WANG X, WANG J D, et al. A survey on evaluation of large language models. ACM Transactions on Intelligent Systems and Technology, 2024, 15(3): 1-45.
[3] XIE T, KUANG Y Y, TANG Y, et al. Using LLM-supported lecture summarization system to improve knowledge recall and student satisfaction. Expert Systems with Applications, 2025, 269: 126371. doi: 10.1016/j.eswa.2024.126371
[4] DING J H, NGUYEN H, CHEN H H. Evaluation of question-answering based text summarization using LLM invited paper[C]//Proceedings of the IEEE International Conference on Artificial Intelligence Testing (AITest). Washington D.C., USA: IEEE Press, 2024: 142-149.
[5]
[6] GAO D H, CHEN K D, CHEN B, et al. LLMs-based machine translation for E-commerce. Expert Systems with Applications, 2024, 258: 125087. doi: 10.1016/j.eswa.2024.125087
[7] LIU J S, WEN Y. Auto-generation and auto-tuning framework of stencil operation code. Computer Engineering, 2024, 50(6): 35-47. doi: 10.19678/j.issn.1000-3428.0068234. (in Chinese)
[8] MU F W, SHI L, WANG S, et al. ClarifyGPT: a framework for enhancing LLM-based code generation via requirements clarification. Proceedings of the ACM on Software Engineering, 2024, 1: 2332-2354. doi: 10.1145/3660810
[9] CHENG T T, YAO C L, YU X Q, et al. Empathetic dialogue generation by incorporating commonsense knowledge based on multi-head attention mechanism. Computer Engineering, 2024, 50(6): 94-101. doi: 10.19678/j.issn.1000-3428.0068404. (in Chinese)
[10] ZHUANG Y C, YU Y, WANG K, et al. ToolQA: a dataset for LLM question answering with external tools. Advances in Neural Information Processing Systems, 2023, 36: 50117-50143.
[11] OH H, KIM K, KIM J, et al. ExeGPT: constraint-aware resource scheduling for LLM inference[C]//Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems. New York, USA: ACM Press, 2024: 369-384.
[12]
[13]
[14] SHENG Y, ZHENG L M, YUAN B H, et al. FlexGen: high-throughput generative inference of large language models with a single GPU[EB/OL]. [2024-10-05]. https://arxiv.org/abs/2303.06865.
[15] LIAO J J, LI M Z, YANG H L, et al. Exploiting input tensor dynamics in activation checkpointing for efficient training on GPU[C]//Proceedings of the IEEE International Parallel and Distributed Processing Symposium (IPDPS). Washington D.C., USA: IEEE Press, 2023: 156-166.
[16]
[17] PENG X, SHI X H, DAI H L, et al. Capuchin: tensor-based GPU memory management for deep learning[C]//Proceedings of the 25th International Conference on Architectural Support for Programming Languages and Operating Systems. New York, USA: ACM Press, 2020: 891-905.
[18] SUN Z B, CAO H Q, WANG Y W, et al. AdaPipe: optimizing pipeline parallelism with adaptive recomputation and partitioning[C]//Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems. New York, USA: ACM Press, 2024: 86-100.
[19]
[20]
[21] DALE R. GPT-3: what's it good for? Natural Language Engineering, 2021, 27(1): 113-118. doi: 10.1017/S1351324920000601
[22]
[23] KIM H, YU Y, JIANG L W, et al. ProsocialDialog: a prosocial backbone for conversational agents[C]//Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. [S.l.]: ACL, 2022: 4005-4029.
[24] WANG Y Z, KORDI Y, MISHRA S, et al. Self-Instruct: aligning language models with self-generated instructions[C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics. [S.l.]: ACL, 2023: 13484-13508.
[25] WANG S Q, YANG H L, WANG X Z, et al. Minions: accelerating large language model inference with aggregated speculative execution[EB/OL]. [2024-10-05]. https://arxiv.org/abs/2402.15678v2.
[26] KWON W, LI Z H, ZHUANG S Y, et al. Efficient memory management for large language model serving with PagedAttention[C]//Proceedings of the 29th Symposium on Operating Systems Principles. New York, USA: ACM Press, 2023: 611-626.
[27] HOLMES C, TANAKA M, WYATT M, et al. DeepSpeed-FastGen: high-throughput text generation for LLMs via MII and DeepSpeed-Inference[EB/OL]. [2024-10-05]. https://arxiv.org/abs/2401.08671v1.