Math Word Problems Solving Model Based on Analogical Learning

doi:10.19678/j.issn.1000-3428.0069517

Abstract:

Current research on solving Math Word Problems (MWP) with analogical learning mostly selects samples according to semantic similarity or shallow logic. As a result, the selected samples match the original problem poorly, and sample selection is confined to the dataset. To address these issues, this study proposes a novel MWP solving model based on Analogical Learning (MWP-AL). The model performs analogical learning on MWPs from two perspectives. From the perspective of text encoding, it screens samples along three dimensions: cosine similarity, the top node of the solution tree, and tree depth. Because this screening considers both semantics and deeper logic, the selected samples match the original problem more closely. From the perspective of equation solving, it constructs samples by starting from the equation itself and generating logical variants for different types of equations. This construction is not limited to selecting samples from the dataset and therefore generalizes well. Analogical learning over the two kinds of samples is carried out by computing a cross-entropy loss. Experimental results show that adding MWP-AL to two baseline models improves accuracy on the English dataset MathQA and the Chinese dataset Math23K by 1.8 and 2.5 and by 2.8 and 1.3 percentage points, respectively, and that the resulting models also outperform the other baseline models.

Key words: analogical learning, Math Word Problems (MWP) solving, semantic similarity, sample screening, sample construction
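The three screening dimensions named in the abstract (cosine similarity, top node of the solution tree, tree depth) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the names `Problem`, `parse_prefix`, and `screen_samples`, the prefix-notation equations, the exact-match rule for the top node and depth, and the 0.8 similarity threshold.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence

OPERATORS = {"+", "-", "*", "/", "^"}


@dataclass
class Node:
    """Node of a binary expression tree; leaves hold operands."""
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


@dataclass
class Problem:
    """Hypothetical container: a text embedding plus a prefix-order equation."""
    embedding: List[float]
    prefix_equation: List[str]


def parse_prefix(tokens: Sequence[str]) -> Node:
    """Build an expression tree from a prefix (pre-order) token list."""
    it = iter(tokens)

    def build() -> Node:
        tok = next(it)
        if tok in OPERATORS:
            left = build()
            right = build()
            return Node(tok, left, right)
        return Node(tok)

    return build()


def depth(node: Optional[Node]) -> int:
    """Number of levels in the tree (a single leaf has depth 1)."""
    if node is None:
        return 0
    return 1 + max(depth(node.left), depth(node.right))


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def screen_samples(query: Problem, pool: Sequence[Problem],
                   min_similarity: float = 0.8) -> List[Problem]:
    """Keep candidates that pass all three filters (assumed rules):
    similar text encoding, same top operator, same tree depth."""
    q_tree = parse_prefix(query.prefix_equation)
    selected = []
    for cand in pool:
        c_tree = parse_prefix(cand.prefix_equation)
        if (cosine(query.embedding, cand.embedding) >= min_similarity
                and c_tree.value == q_tree.value
                and depth(c_tree) == depth(q_tree)):
            selected.append(cand)
    return selected
```

In this sketch a candidate is kept only when it passes all three filters at once, so a textually similar problem whose equation has a different top operator or a different depth is rejected. How the paper actually combines or thresholds the three dimensions is not stated in the abstract.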

Jiayi LIN, Hongbin XIA, Yuan LIU. Math Word Problems Solving Model Based on Analogical Learning[J]. Computer Engineering, 2024, 50(7): 63-70.

https://www.ecice06.com/CN/Y2024/V50/I7/63

Figures/Tables (9)

Fig.1 Structure of the MWP-AL model

Fig.2 Example of samples screened by cosine similarity

Fig.3 Example of samples screened by tree-top node

Fig.4 Visualization of samples encoded by the model

Fig.5 Impact of the number of samples on model performance
