
Computer Engineering, 2024, Vol. 50, Issue (8): 363-371. doi: 10.19678/j.issn.1000-3428.0068333

• Development Research and Engineering Application •

Automatic Essay Scoring Method Based on Topic Perception and Semantic Enhancement

Yuhang CHEN1, Yong YANG1, Xianmusiya·Maimaitiming2, Palidan·Tuerxun1,*, Xiaochao FAN1,2, Ge REN1, Yufeng DIAO3

  1. School of Computer Science and Technology, Xinjiang Normal University, Urumqi 830054, Xinjiang, China
  2. School of Mathematics and Informatics, Hetian Normal College, Hetian 848000, Xinjiang, China
  3. School of Computer Science and Technology, Inner Mongolia University for Nationalities, Tongliao 028000, Inner Mongolia, China
  • Received: 2023-09-05 Published: 2024-08-15 Online: 2024-02-22
  • Corresponding author: Palidan·Tuerxun
  • Funding:
    Natural Science Foundation of Xinjiang Uygur Autonomous Region (2021D01B72); National Natural Science Foundation of China (62066044); National Natural Science Foundation of China (62167008); National Natural Science Foundation of China (62006130)

Abstract:

Automatic Essay Scoring (AES) is an important application of Natural Language Processing (NLP) in education. It aims to improve scoring efficiency and to make evaluation more objective and reliable. To address the lack of topic relevance and the loss of information in long texts, and to exploit the fact that different layers of the pre-trained language model Bidirectional Encoder Representations from Transformers (BERT) extract features of different dimensions, this study proposes an AES model based on topic perception and semantic enhancement. The model uses a multi-head attention mechanism to extract the shallow semantic features of an essay and to perceive its topic features. It also uses the intermediate-layer syntactic features and deep-layer semantic features of BERT to enhance its understanding of essay semantics. The features from these different dimensions are then fused and used for scoring. Experimental results show that the model has a significant performance advantage on all eight subsets of the public ASAP dataset. Compared with baseline models such as Qwen-7B, it effectively improves AES performance, reaching an average Quadratic Weighted Kappa (QWK) of 80.25%.

Keywords: Automatic Essay Scoring (AES), semantic enhancement, topic perception, feature fusion, pre-trained language model
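
The abstract reports results as an average Quadratic Weighted Kappa (QWK), which measures agreement between model scores and human scores while penalizing large disagreements more heavily than small ones. The following is a minimal sketch of the standard QWK definition in plain Python, not the authors' evaluation code; the function name, its parameters, and the toy scores in the usage note are illustrative assumptions.

```python
def quadratic_weighted_kappa(human, predicted, min_score, max_score):
    """Quadratic Weighted Kappa between two equal-length lists of integer scores.

    Hypothetical helper for illustration: scores must lie in
    [min_score, max_score], and the range must contain at least two values.
    """
    if len(human) != len(predicted) or not human:
        raise ValueError("score lists must be non-empty and of equal length")
    n = max_score - min_score + 1
    if n < 2:
        raise ValueError("score range must contain at least two values")
    total = len(human)

    # Observed confusion matrix: rows are human scores, columns are predictions.
    observed = [[0] * n for _ in range(n)]
    for h, p in zip(human, predicted):
        observed[h - min_score][p - min_score] += 1

    # Marginal histograms of each rater.
    hist_h = [sum(row) for row in observed]
    hist_p = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    numerator = 0.0    # weighted observed disagreement
    denominator = 0.0  # weighted disagreement expected by chance
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            expected = hist_h[i] * hist_p[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected

    if denominator == 0.0:
        # Both raters used one identical score for every essay, so kappa is
        # formally undefined; treating it as perfect agreement is an assumption.
        return 1.0
    return 1.0 - numerator / denominator
```

For example, `quadratic_weighted_kappa([1, 2, 3], [1, 2, 2], 1, 3)` compares three human scores with three predicted scores on a 1 to 3 scale. An average such as the one quoted in the abstract would be obtained by computing this value separately for each ASAP subset, each with its own score range, and taking the mean of the results.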