Computer Engineering ›› 2024, Vol. 50 ›› Issue (3): 259-266. doi: 10.19678/j.issn.1000-3428.0067224

• Development Research and Engineering Application •

Research on Automatic Scoring for English Essay Based on Multi-Scale Context

Mingcheng YU, Yagu DANG*, Qilin WU, Xu JI, Kexin BI

  1. School of Chemical Engineering, Sichuan University, Chengdu 610041, Sichuan, China
  • Received: 2023-03-22 Published: 2024-03-15 Online: 2023-06-16
  • Contact: Yagu DANG
  • Supported by:
    National Key Research and Development Program of China (2021YFB40005)

Abstract:

Existing automatic essay scoring models do not extract semantic features from context at different scales, and they do not compute sentence-level features that measure how closely an essay relates to its topic. This study proposes MSC, an automatic scoring method for English essays based on multi-scale context. An XLNet English pre-trained model extracts word and sentence embeddings from the original essay text. This avoids the failure to capture context-appropriate vector embeddings when processing long text sequences, improves the quality of the dynamic semantic representations, and addresses polysemy. A one-dimensional convolution module then extracts phrase-level embeddings at different scales. The multi-scale context network combines Built-in Self-Attention Simple Recurrence Units (BSASRU) with a global attention mechanism to capture high-dimensional latent contextual semantic associations at the word, phrase, and sentence levels. Semantic similarity between the sentence vectors and the essay topic is computed to extract discourse-level topic features. All features are fed into a fusion layer, and a linear layer produces the score. Experiments on ASAP, a publicly available standard dataset for English essay scoring, show that MSC achieves an average Quadratic Weighted Kappa (QWK) of 80.5% and obtains the best results on several subsets, outperforming the deep learning scoring models used for comparison. These results demonstrate the effectiveness of MSC for automatic scoring of English essays.

Key words: automatic scoring for English essay, pre-training model, multi-scale context, global attention, topic level characteristics
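
The abstract reports its results as Quadratic Weighted Kappa (QWK), which measures agreement between predicted and human scores and penalizes each disagreement by the squared distance between the two scores. The following is a minimal sketch of the standard QWK formula in plain Python. It is not the authors' evaluation code, and the function name `quadratic_weighted_kappa` and its parameters are illustrative assumptions.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Return the QWK between two equal-length lists of integer scores.

    Hypothetical helper for illustration; scores are assumed to lie in
    the inclusive range [min_rating, max_rating].
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("score lists must be non-empty and of equal length")
    num_ratings = max_rating - min_rating + 1
    if num_ratings < 2:
        raise ValueError("the rating scale needs at least two levels")

    total = len(rater_a)
    observed = Counter(zip(rater_a, rater_b))  # joint counts O[i, j]
    hist_a = Counter(rater_a)                  # marginal counts, rater A
    hist_b = Counter(rater_b)                  # marginal counts, rater B

    numerator = 0.0    # weighted observed disagreement
    denominator = 0.0  # weighted disagreement expected by chance
    for i in range(min_rating, max_rating + 1):
        for j in range(min_rating, max_rating + 1):
            weight = (i - j) ** 2 / (num_ratings - 1) ** 2
            expected = hist_a[i] * hist_b[j] / total
            numerator += weight * observed[(i, j)]
            denominator += weight * expected

    if denominator == 0.0:
        # Only possible when both raters gave one identical constant score.
        return 1.0
    return 1.0 - numerator / denominator


if __name__ == "__main__":
    human = [1, 2, 3]
    model = [1, 2, 2]
    print(quadratic_weighted_kappa(human, model, 1, 3))
```

A value of 1 indicates perfect agreement, 0 indicates chance-level agreement, and negative values indicate agreement worse than chance. An average QWK such as the one reported here would typically be obtained by computing the metric separately on each ASAP subset and averaging the results, although the abstract does not state the exact averaging procedure.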