DARE-SQL: A Multi-Candidate SQL Generation and Selection Framework Based on Question Ambiguity

doi:10.19678/j.issn.1000-3428.0253372

Abstract

The Text-to-SQL task converts natural language queries (NLQ) into Structured Query Language (SQL). Although Large Language Models (LLMs) have redefined the paradigm of this task, most existing studies use prompt engineering to improve the model's schema awareness and SQL generation capabilities, and often neglect the semantic ambiguity that is prevalent in natural language. As a result, models tend to misinterpret questions in complex scenarios. To address this, we propose DARE-SQL, a Text-to-SQL framework with Disambiguation, Analysis, Refinement, and Election. The framework first uses the semantic reasoning capabilities of LLMs to build a semantic expansion module, which generates an expanded set of questions covering the user's potential intent space, making ambiguous semantics explicit so they can be captured. Differentiated generation strategies are then applied to questions from different sources, and a refinement mechanism based on execution feedback improves the generated results, yielding a high-quality set of candidate SQLs. Finally, a two-stage selection strategy based on question consensus selects the SQL that best balances accuracy and execution performance. Experimental results show that DARE-SQL achieves an Execution Accuracy (EX) of 71.71% and a Valid Efficiency Score (VES) of 70.41 on the challenging BIRD benchmark, and reaches 88.10% EX on the classic Spider dataset. These results validate the effectiveness of explicitly modeling semantic ambiguity for complex Text-to-SQL tasks.
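The abstract does not detail how the two-stage selection works. As one possible reading, the sketch below groups candidate SQLs by their execution result, keeps the result backed by the most distinct source questions, and then returns the fastest query in that group. Everything here is an assumption for illustration rather than the authors' method: the helper names `run` and `select_sql`, the use of SQLite, grouping by execution output, and wall-clock time as the performance measure.

```python
import sqlite3
import time
from collections import defaultdict


def run(conn, sql):
    """Execute sql; return (result_key, seconds), or None if execution fails."""
    start = time.perf_counter()
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    elapsed = time.perf_counter() - start
    # Sort rows so queries differing only in row order share one key.
    return tuple(sorted(rows, key=repr)), elapsed


def select_sql(conn, candidates):
    """Pick one SQL from (question_id, sql) pairs (hypothetical input shape).

    Stage 1 (assumed): keep the execution result supported by the most
    distinct questions, breaking ties by the number of candidates.
    Stage 2 (assumed): among SQLs producing that result, return the fastest.
    """
    groups = defaultdict(list)
    for qid, sql in candidates:
        outcome = run(conn, sql)
        if outcome is None:
            continue  # drop candidates that fail to execute
        key, elapsed = outcome
        groups[key].append((qid, sql, elapsed))
    if not groups:
        return None
    best = max(
        groups.values(),
        key=lambda g: (len({qid for qid, _, _ in g}), len(g)),
    )
    return min(best, key=lambda item: item[2])[1]
```

In this sketch, a candidate that raises an execution error is simply discarded; a refinement step driven by execution feedback, as the abstract describes, would instead try to repair it before selection.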

Liu Mingkai, He Peiwen, Liu Mengchi. DARE-SQL: A Multi-Candidate SQL Generation and Selection Framework Based on Question Ambiguity[J]. Computer Engineering, doi: 10.19678/j.issn.1000-3428.0253372.

URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0253372

