
Computer Engineering ›› 2024, Vol. 50 ›› Issue (11): 399-408. doi: 10.19678/j.issn.1000-3428.0068419

• Development Research and Engineering Application •

Solving Electrical Text Problems Based on Sequence to Formula Tree Model

JIAN Pengpeng1, LIU Haoyu1,*, YAN Ming1, WANG Yanli2, YANG Yangrui1, LIU Xuemei1

  1. School of Information Engineering, North China University of Water Resources and Electric Power, Zhengzhou 450046, Henan, China
  2. Henan University of Economics and Law, Zhengzhou 450046, Henan, China
  • Received: 2023-09-20 Published: 2024-11-15 Online: 2024-03-05
  • Contact: LIU Haoyu
  • Supported by:
    National Natural Science Foundation of China (62107014); Young Talent Support Project of Henan Province (2023HYTP046); Philosophy and Social Science Foundation of Henan Province (2022ZSZ008); Major Project of Higher Education Teaching Reform Research and Practice of Henan Province (2021SJGLX017)

Abstract:

Adaptively understanding and solving problem texts whose semantics vary widely is the key challenge in machine solving of electrical problems. Existing methods focus mainly on semantic and structural analysis of the problem text and cannot parse it into a human-like solution form. To address this, an electrical problem-solving method based on the sequence to formula tree model is proposed. First, problem text preprocessing standardizes the text elements and extracts relations, generating a pre-coded sequence and a direct relation sequence. Second, a bidirectional gated recurrent encoder performs feature encoding on the pre-coded sequence to generate a hidden state sequence. Next, an electrical theorem graph is constructed, and a Graph Convolutional Neural Network (GCNN) encoder establishes the associations between the direct relation sequence and the theorems, transforms the relation nodes in the theorem graph into vector representations, generates a formula node embedding state sequence, and extracts the implicit electrical relations among variables. Finally, a tree-structured decoder decodes the hidden state sequence and the formula node embedding state sequence into a solution expression in sequence-formula tree form, yielding a readable solution to the electrical problem. A dataset, TexPE-3K, containing 3 027 electrical problems is constructed, standardized, and annotated. Experiments on TexPE-3K show that the method achieves an average accuracy of 96.8% in relation extraction and 55.57% in readable solutions, which verifies its feasibility and effectiveness.

Key words: electrical problems, sequence-to-formula tree model, relation extraction, readable solution, Graph Neural Network (GNN)
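
To make the idea of a formula-tree solution expression concrete, the following is a minimal sketch, not the paper's implementation. It assumes a decoder has already emitted a pre-order (prefix) token sequence; the token format, the names `Node`, `build_tree`, `evaluate`, and `to_infix`, and the series-circuit example are all hypothetical illustrations. The sketch rebuilds a binary tree from the sequence, prints it in readable infix form, and evaluates it with known quantities.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

# Binary operators allowed at internal nodes (an assumption for this sketch).
OPERATORS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


@dataclass
class Node:
    token: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(prefix_tokens: Iterable[str]) -> Node:
    """Rebuild a binary formula tree from a pre-order token sequence."""
    tokens = iter(prefix_tokens)

    def parse() -> Node:
        token = next(tokens)
        if token in OPERATORS:
            left = parse()   # the left subtree comes first in pre-order
            right = parse()
            return Node(token, left, right)
        return Node(token)   # leaf: a variable name or a numeric literal

    root = parse()
    if next(tokens, None) is not None:
        raise ValueError("unused tokens after a complete tree")
    return root


def evaluate(node: Node, known: Dict[str, float]) -> float:
    """Evaluate the tree, looking up variable leaves in `known`."""
    if node.token in OPERATORS:
        return OPERATORS[node.token](
            evaluate(node.left, known), evaluate(node.right, known)
        )
    if node.token in known:
        return known[node.token]
    return float(node.token)


def to_infix(node: Node) -> str:
    """Render the tree as a fully parenthesized, human-readable expression."""
    if node.token in OPERATORS:
        return f"({to_infix(node.left)} {node.token} {to_infix(node.right)})"
    return node.token


# Hypothetical example: current in a series circuit, I = U / (R1 + R2).
tree = build_tree(["/", "U", "+", "R1", "R2"])
readable = to_infix(tree)
current = evaluate(tree, {"U": 12.0, "R1": 2.0, "R2": 4.0})
print(readable, "=", current)
```

Keeping the expression as a tree rather than a flat string is what allows the same structure to be both evaluated numerically and rendered as a readable derivation step.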