[1] XIAO R. Well-known and influential corpora[M]//LÜDELING A, KYTÖ M. Corpus linguistics: an international handbook. Berlin, Germany: Walter de Gruyter, 2008: 383-457.
[2] HAO X Y, LI J H, YOU L P, et al. A research on building of Chinese reading comprehension corpus[J]. Journal of Chinese Information Processing, 2007, 21(6): 29-35. (in Chinese)
[3] KONG F, GE H Z, ZHOU G D. Corpus construction for Chinese zero anaphora from discourse perspective[J]. Journal of Software, 2021, 32(12): 3782-3801. (in Chinese)
[4] GUO L J, LI Z H, PENG X, et al. Annotation guideline of Chinese dependency treebank from multi-domain and multi-source texts[J]. Journal of Chinese Information Processing, 2018, 32(10): 28-35, 52. (in Chinese)
[5] CHOWDHERY A, NARANG S, DEVLIN J, et al. PaLM: scaling language modeling with pathways[J]. Journal of Machine Learning Research, 2023, 24(240): 1-113.
[6] BROWN T, MANN B, RYDER N, et al. Language models are few-shot learners[C]//Proceedings of NIPS'20. Cambridge, USA: MIT Press, 2020: 1877-1901.
[7] TAI C Y, CHEN Z R, ZHANG T S, et al. Exploring chain-of-thought style prompting for Text-to-SQL[EB/OL]. (2023-10-27)[2024-09-01]. https://arxiv.org/abs/2305.14215.
[8] LI H T, AI Q Y, CHEN J, et al. BLADE: enhancing black-box large language models with small domain-specific models[EB/OL]. (2024-03-27)[2024-09-01]. https://arxiv.org/abs/2403.18365.
[9] ZHONG V, XIONG C, SOCHER R. Seq2SQL: generating structured queries from natural language using reinforcement learning[EB/OL]. (2023-11-09)[2024-09-01]. https://arxiv.org/abs/1709.00103.
[10] LI F, JAGADISH H V. Constructing an interactive natural language interface for relational databases[J]. Proceedings of the VLDB Endowment, 2014, 8(1): 73-84.
[11] BAHDANAU D, CHO K, BENGIO Y. Neural machine translation by jointly learning to align and translate[EB/OL]. (2016-05-19)[2024-09-01]. https://arxiv.org/abs/1409.0473.
[12] LI H Y, ZHANG J, LI C P, et al. RESDSQL: decoupling schema linking and skeleton parsing for Text-to-SQL[C]//Proceedings of the 37th AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI Press, 2023: 13067-13075.
[13] RAJKUMAR N, LI R, BAHDANAU D. Evaluating the Text-to-SQL capabilities of large language models[EB/OL]. (2022-03-15)[2024-09-01]. https://arxiv.org/abs/2204.00498.
[14] LIU A W, HU X M, WEN L J, et al. A comprehensive evaluation of ChatGPT's zero-shot Text-to-SQL capability[EB/OL]. (2023-03-12)[2024-09-01]. https://arxiv.org/abs/2303.13547.
[15] DONG X M, ZHANG C, GE Y H, et al. C3: zero-shot Text-to-SQL with ChatGPT[EB/OL]. (2023-07-14)[2024-09-01]. https://arxiv.org/abs/2307.07306.
[16] POURREZA M, RAFIEI D. DIN-SQL: decomposed in-context learning of Text-to-SQL with self-correction[C]//Proceedings of NIPS'23. Cambridge, USA: MIT Press, 2023: 36339-36348.
[17] GAO D W, WANG H B, LI Y L, et al. Text-to-SQL empowered by large language models: a benchmark evaluation[EB/OL]. (2023-11-20)[2024-09-01]. https://arxiv.org/abs/2308.15363.
[18] ZHANG H C, CAO R S, CHEN L, et al. ACT-SQL: in-context learning for Text-to-SQL with automatically-generated chain-of-thought[EB/OL]. (2023-10-26)[2024-09-01]. https://arxiv.org/abs/2310.17342.
[19] LI M H, LV T C, CHEN J Y, et al. TrOCR: Transformer-based optical character recognition with pre-trained models[C]//Proceedings of the 37th AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI Press, 2023: 13094-13102.
[20] OUYANG L, WU J, JIANG X, et al. Training language models to follow instructions with human feedback[C]//Proceedings of NIPS'22. Cambridge, USA: MIT Press, 2022: 27730-27744.
[21] SCHAEFFER R, MIRANDA B, KOYEJO S. Are emergent abilities of large language models a mirage?[C]//Proceedings of NIPS'23. Cambridge, USA: MIT Press, 2023: 55565-55581.
[22] BESTA M, BLACH N, KUBICEK A, et al. Graph of thoughts: solving elaborate problems with large language models[C]//Proceedings of the 38th AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI Press, 2024: 17682-17690.
[23] CHANG Y P, WANG X, WANG J D, et al. A survey on evaluation of large language models[J]. ACM Transactions on Intelligent Systems and Technology, 2024, 15(3): 1-45.
[24] JIANG A Q, SABLAYROLLES A, MENSCH A, et al. Mistral 7B[EB/OL]. (2023-10-10)[2024-09-01]. https://arxiv.org/abs/2310.06825.
[25] TOUVRON H, LAVRIL T, IZACARD G, et al. LLaMA: open and efficient foundation language models[EB/OL]. (2023-02-27)[2024-09-01]. https://arxiv.org/abs/2302.13971.
[26] BROWN H, LEE K, MIRESHGHALLAH F, et al. What does it mean for a language model to preserve privacy?[C]//Proceedings of the 2022 ACM Conference on Fairness, Accountability, and Transparency. New York, USA: ACM Press, 2022: 2280-2292.
[27] NAM D, MACVEAN A, HELLENDOORN V, et al. Using an LLM to help with code understanding[C]//Proceedings of the IEEE/ACM 46th International Conference on Software Engineering. New York, USA: ACM Press, 2024: 1-13.
[28] KOIKE R, KANEKO M, OKAZAKI N. OUTFOX: LLM-generated essay detection through in-context learning with adversarially generated examples[C]//Proceedings of the 38th AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI Press, 2024: 21258-21266.
[29] DONG Q X, LI L, DAI D M, et al. A survey on in-context learning[EB/OL]. (2024-06-18)[2024-09-01]. https://arxiv.org/abs/2301.00234.
[30] DOES J, NIESTADT J, DEPUYDT K. Creating research environments with BlackLab[EB/OL]. (2017-12-28)[2024-09-01]. https://github.com/INL/BlackLab.
[31] NEELAKANTAN A, XU T, PURI R, et al. Text and code embeddings by contrastive pre-training[EB/OL]. (2022-01-24)[2024-09-01]. https://arxiv.org/abs/2201.10005.
[32] LEE S, SHAKIR A, KOENIG D, et al. Open source strikes bread - new fluffy embeddings model[EB/OL]. (2024-03-19)[2024-09-01]. https://www.mixedbread.ai/blog/mxbai-embed-large-v1.
[33] LI X M, LI J. AnglE-optimized text embeddings[EB/OL]. (2024-07-17)[2024-09-01]. https://arxiv.org/abs/2309.12871.
[34] WANG L, YANG N, HUANG X L, et al. Multilingual E5 text embeddings: a technical report[EB/OL]. (2024-02-08)[2024-09-01]. https://arxiv.org/abs/2402.05672.
[35] HUANG L, YU W J, MA W T, et al. A survey on hallucination in large language models: principles, taxonomy, challenges, and open questions[EB/OL]. (2023-11-09)[2024-09-01]. https://arxiv.org/abs/2311.05232.
[36] LIU N F, LIN K, HEWITT J, et al. Lost in the middle: how language models use long contexts[EB/OL]. (2023-11-20)[2024-09-01]. https://arxiv.org/abs/2307.03172.