Text-to-SQL Method Based on Relation-aware Graph Neural Network

doi:10.19678/j.issn.1000-3428.0069410

Abstract:

The Text-to-SQL semantic parsing task aims to translate natural language questions into executable SQL statements. In recent years, many studies have applied pre-trained models to this task and made some progress. However, existing pre-trained models are not retrained for Text-to-SQL, so they adapt poorly to the scenario-specific semantic features of the task, which degrades parsing performance. In addition, many methods tend to overlook the relations between the natural language question and the database schema, which leads to semantic ambiguity during parsing. To address these problems, this paper proposes RGA-T5, a new model for Text-to-SQL semantic parsing. The model introduces a relation-aware Heterogeneous Graph Neural Network (HGNN) into the pre-trained model T5: the input entities and relations are constructed as nodes of a heterogeneous graph, and a Graph Neural Network (GNN) is applied so that the model becomes aware of the semantic relations in the input sequence. The paper also proposes a spatial gating adapter, whose parameters are trained to fine-tune the pre-trained model, so that the model can adapt to the semantic features of different scenarios in this task and introduce less irrelevant information. Experimental results on the Spider dataset show that the model improves performance over other advanced Text-to-SQL parsing methods, which verifies the effectiveness of the proposed method.

Key words: semantic parsing, pre-trained model, Heterogeneous Graph Neural Network (HGNN), spatial gating unit, adapter
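To make the graph construction in the abstract more concrete, the following is a minimal sketch of a heterogeneous graph in which both entities (question tokens, tables, columns) and relations are stored as nodes. It is not the authors' implementation: the class `HeteroGraph`, the function `build_graph`, the relation labels, the dictionary schema format, and the exact-match linking rule are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class HeteroGraph:
    """Toy heterogeneous graph: relations are nodes, not edge labels (assumed design)."""
    entity_nodes: list = field(default_factory=list)    # (kind, text)
    relation_nodes: list = field(default_factory=list)  # (label, src_idx, dst_idx)

    def add_entity(self, kind, text):
        self.entity_nodes.append((kind, text))
        return len(self.entity_nodes) - 1

    def add_relation(self, label, src, dst):
        # A relation node records which two entity nodes it connects.
        self.relation_nodes.append((label, src, dst))
        return len(self.relation_nodes) - 1


def build_graph(question_tokens, schema):
    """Build the graph from question tokens and a schema.

    `schema` maps a table name to its column names (hypothetical input format).
    """
    g = HeteroGraph()
    q_ids = [g.add_entity("question", tok) for tok in question_tokens]
    for table, columns in schema.items():
        t_id = g.add_entity("table", table)
        for col in columns:
            c_id = g.add_entity("column", col)
            g.add_relation("has_column", t_id, c_id)
    # Naive schema linking (assumption): exact, case-insensitive name match.
    for q_id in q_ids:
        tok = g.entity_nodes[q_id][1].lower()
        for e_id, (kind, text) in enumerate(g.entity_nodes):
            if kind != "question" and text.lower() == tok:
                g.add_relation(f"question_{kind}_match", q_id, e_id)
    return g


schema = {"singer": ["name", "age"], "concert": ["year"]}
graph = build_graph(["average", "age", "of", "each", "singer"], schema)
for label, src, dst in graph.relation_nodes:
    print(label, graph.entity_nodes[src], graph.entity_nodes[dst])
```

Storing relations as nodes, rather than as edge labels, is what would let a GNN layer update entity and relation representations from each other; the message-passing step itself is omitted here.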

CAO Yukun, WANG Tianhao, LI Yunfeng, CHEN Ming, LI Jingjing, LIU Yuanmin. Text-to-SQL Method Based on Relation-aware Graph Neural Network[J]. Computer Engineering, 2025, 51(9): 129-138.

https://www.ecice06.com/CN/Y2025/V51/I9/129

Figures/Tables: 9

Fig.1 RGA-T5 model

Fig.2 Mutual update of entity and relation nodes

Fig.3 Convergence of RGA-T5 and CodeT5 on the Spider development set
