Automatic Essay Scoring Method Based on Topic Perception and Semantic Enhancement

doi:10.19678/j.issn.1000-3428.0068333

Abstract

Automatic Essay Scoring (AES) is an important application of Natural Language Processing (NLP) in education. It aims to improve scoring efficiency and to make evaluation more objective and reliable. Existing approaches tend to neglect how relevant an essay is to its topic and to lose information when encoding long texts, while different layers of the pre-trained language model Bidirectional Encoder Representations from Transformers (BERT) capture features of different dimensions. Motivated by these observations, this study proposes an AES model based on topic perception and semantic enhancement. The model uses a multi-head attention mechanism to extract the shallow semantic features of an essay and to perceive its topic features, and it uses the middle-layer syntactic features and deep-layer semantic features of BERT to strengthen its understanding of the essay's semantics. The features from these different dimensions are then fused and used for scoring. Experimental results show that the model exhibits significant performance advantages on all eight subsets of the public ASAP dataset and improves AES performance over baseline models such as Qwen-7B, reaching an average Quadratic Weighted Kappa (QWK) of 80.25%.

Key words: Automatic Essay Scoring (AES), semantic enhancement, topic perception, feature fusion, pre-trained language model
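The abstract reports results as Quadratic Weighted Kappa (QWK), a standard agreement measure between human and model scores. As a minimal sketch of how that metric is usually computed (this is the standard definition, not the authors' evaluation code; the function name and arguments are illustrative assumptions):

```python
import numpy as np

def quadratic_weighted_kappa(human, model, min_score, max_score):
    """QWK between two integer score lists on the scale [min_score, max_score]."""
    n = max_score - min_score + 1
    if n < 2:
        raise ValueError("the score scale needs at least two levels")
    human = np.asarray(human, dtype=int) - min_score
    model = np.asarray(model, dtype=int) - min_score

    # observed[i, j]: number of essays scored i by the human and j by the model.
    observed = np.zeros((n, n), dtype=float)
    np.add.at(observed, (human, model), 1.0)

    # Expected counts if the two raters were independent (outer product of marginals).
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

    # Quadratic penalty: disagreement cost grows with the squared score distance.
    idx = np.arange(n)
    weights = (idx[:, None] - idx[None, :]) ** 2 / (n - 1) ** 2

    denom = (weights * expected).sum()
    if denom == 0:
        return 1.0  # both raters gave one identical score to every essay
    return 1.0 - (weights * observed).sum() / denom

# One essay out of four is off by one point on a 1-4 scale.
print(round(quadratic_weighted_kappa([1, 2, 3, 4], [1, 2, 3, 3], 1, 4), 3))  # 0.875
```

A value of 1 means perfect agreement, 0 means chance-level agreement, and negative values mean systematic disagreement. An average such as the 80.25% quoted above would correspond to computing this value separately on each ASAP subset and taking the mean.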

Yuhang CHEN, Yong YANG, Xianmusiya·Maimaitiming, Palidan·Tuerxun, Xiaochao FAN, Ge REN, Yufeng DIAO. Automatic Essay Scoring Method Based on Topic Perception and Semantic Enhancement[J]. Computer Engineering, 2024, 50(8): 363-371.

https://www.ecice06.com/CN/Y2024/V50/I8/363

Figures/Tables (7)

Fig.1 TASE model structure

Fig.2 Impact of using different middle layers of BERT as syntactic features on model performance

Fig.3 Impact of using different levels of BERT as semantic features on model performance
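The abstract describes the model only at a high level: multi-head attention for shallow semantic and topic features, a middle BERT layer for syntactic features, a deep BERT layer for semantic features, and a fusion step before scoring. The following NumPy sketch illustrates that data flow under loudly stated assumptions: the weights are random and untrained, the BERT hidden states are simulated with random arrays, the prompt is used as the attention query to obtain the topic feature, and fusion is plain concatenation followed by a sigmoid regressor. None of these choices is confirmed by the source, and all function names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(query, key_value, num_heads, rng):
    """Scaled dot-product multi-head attention with random (untrained) projections."""
    d_model = query.shape[-1]
    assert d_model % num_heads == 0
    d_head = d_model // num_heads
    w_q, w_k, w_v, w_o = (
        rng.normal(0.0, d_model ** -0.5, (d_model, d_model)) for _ in range(4)
    )

    def split(x):  # (length, d_model) -> (heads, length, d_head)
        return x.reshape(x.shape[0], num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(query @ w_q), split(key_value @ w_k), split(key_value @ w_v)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head))  # (heads, q_len, kv_len)
    heads = attn @ v                                            # (heads, q_len, d_head)
    merged = heads.transpose(1, 0, 2).reshape(query.shape[0], d_model)
    return merged @ w_o

def score_essay(essay_emb, prompt_emb, mid_layer, last_layer, rng, num_heads=4):
    """Fuse four feature vectors and map them to a normalized score in (0, 1)."""
    # Shallow semantic feature: self-attention over essay word embeddings, mean-pooled.
    shallow = multi_head_attention(essay_emb, essay_emb, num_heads, rng).mean(axis=0)
    # Topic feature (assumption): prompt tokens attend to essay tokens.
    topic = multi_head_attention(prompt_emb, essay_emb, num_heads, rng).mean(axis=0)
    # Syntactic and semantic features (assumption): first-token vector of each layer.
    syntactic, semantic = mid_layer[0], last_layer[0]
    # Fusion by concatenation (assumption), then an untrained linear + sigmoid regressor.
    fused = np.concatenate([shallow, topic, syntactic, semantic])
    w = rng.normal(0.0, fused.size ** -0.5, fused.size)
    return 1.0 / (1.0 + np.exp(-(fused @ w)))

rng = np.random.default_rng(0)
d_model = 16
essay_emb = rng.normal(size=(50, d_model))   # 50 essay tokens (simulated embeddings)
prompt_emb = rng.normal(size=(8, d_model))   # 8 prompt tokens (simulated embeddings)
mid_layer = rng.normal(size=(50, d_model))   # stand-in for a middle BERT layer
last_layer = rng.normal(size=(50, d_model))  # stand-in for the final BERT layer
score = score_essay(essay_emb, prompt_emb, mid_layer, last_layer, rng)
```

In a real pipeline the two stand-in arrays would be replaced by per-layer hidden states from a BERT encoder, and the choice of which middle layer to use is exactly what Fig.2 and Fig.3 investigate. The normalized score would then be rescaled to each prompt's score range before computing QWK.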

