Computer Engineering

   

Integrating Large Language Models with Hybrid Strategies for Geometry Problem Understanding

  

Published: 2026-05-26


Abstract: Problem understanding is a prerequisite for automated geometric theorem proving. Existing approaches, however, rely heavily on feature engineering and generalize poorly, so they cannot adequately support automated problem solving. To address this, the paper proposes a large language model-based method for geometry problem understanding that fine-tunes the Qwen2.5 base model and combines chain-of-thought reasoning with k-nearest-neighbor (KNN) retrieval-augmented generation. To improve the accuracy of semantic translation, it also introduces an agent-based hallucination detection and correction mechanism that mitigates hallucinations during problem understanding. On a self-constructed dataset, the method achieves an accuracy of 88.85% and a recall of 89.12%, significantly outperforming several baseline models. On the public Geometry3K benchmark, it attains an accuracy of 94.86% and a recall of 94.18%, outperforming existing methods such as Inter-GPS. Ablation studies and comparisons across parameter configurations further support the performance and adaptability of the multi-strategy hybrid approach.
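
The abstract does not give implementation details, so the following is only a minimal sketch of how KNN retrieval might feed a chain-of-thought prompt, with a simple rule-based check standing in for the agent-based hallucination detector. Everything in it is an assumption for illustration: the bag-of-words similarity (a real system would more likely use learned embeddings), the predicate syntax, the example pool, and the function names `knn_retrieve`, `build_prompt`, and `unsupported_points`. The fine-tuned Qwen2.5 model call is deliberately left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a learned embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def knn_retrieve(query, examples, k=2):
    """Return the k solved examples whose problem text is most similar to the query."""
    q = embed(query)
    ranked = sorted(
        examples, key=lambda ex: cosine(q, embed(ex["problem"])), reverse=True
    )
    return ranked[:k]


def build_prompt(query, neighbors):
    """Assemble a few-shot prompt with a step-by-step (chain-of-thought) instruction."""
    parts = [
        "Translate the geometry problem into formal predicates. "
        "Reason step by step before giving the final answer."
    ]
    for ex in neighbors:
        parts.append(f"Problem: {ex['problem']}\nPredicates: {ex['predicates']}")
    parts.append(f"Problem: {query}\nPredicates:")
    return "\n\n".join(parts)


def point_labels(text):
    """Collect the letters of all-uppercase tokens such as 'DEF' or 'D' (toy heuristic)."""
    return {ch for token in re.findall(r"\b[A-Z]+\b", text) for ch in token}


def unsupported_points(problem, predicates):
    """Flag point labels in the model output that never occur in the problem text."""
    return sorted(point_labels(predicates) - point_labels(problem))


# Hypothetical example pool; the predicate syntax is illustrative only.
EXAMPLES = [
    {
        "problem": "In triangle ABC, AB equals AC.",
        "predicates": "Triangle(A,B,C); Equals(LengthOf(AB), LengthOf(AC))",
    },
    {
        "problem": "Line AB is parallel to line CD.",
        "predicates": "Parallel(Line(A,B), Line(C,D))",
    },
    {
        "problem": "Circle O has radius 5.",
        "predicates": "Circle(O); Equals(RadiusOf(O), 5)",
    },
]

query = "In triangle DEF, DE equals DF."
neighbors = knn_retrieve(query, EXAMPLES, k=1)
prompt = build_prompt(query, neighbors)
```

In this sketch the prompt would be passed to the language model, and a non-empty result from `unsupported_points` on the model's output would trigger a correction round. The uppercase-token heuristic is crude (a sentence-initial "A" would be read as a point label), which is one reason a learned or agent-based checker is a more plausible choice in practice.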