
Computer Engineering, 2024, Vol. 50, Issue (9): 344-355. doi: 10.19678/j.issn.1000-3428.0069597

• Development Research and Engineering Application •

Student Classroom Behavior Detection Algorithm Based on Improved YOLOv8 in Smart Education

ZENG Yuqi, LIU Bo*, ZHONG Baichang, ZHONG Jin

  1. School of Information Technology in Education, South China Normal University, Guangzhou 510631, Guangdong, China
  • Received: 2024-03-18 Online: 2024-08-13 Published: 2024-09-15
  • Contact: LIU Bo
  • Funding:
    China Scholarship Council Program (202206755007); South China Normal University Teaching Reform Project (20210217); South China Normal University Golden Seed Project (23JXKA04)

Abstract:

To accelerate the digital transformation of education, the precise analysis and empirical application of artificial intelligence across the entire teaching and learning process has become a research hotspot. Student classroom behavior detection currently suffers from low detection accuracy, densely packed bounding boxes, severe overlap and occlusion, large scale variation, and imbalanced data volume. To address these problems, this paper builds a student classroom behavior dataset (DBS Dataset) and proposes VWE-YOLOv8, a student classroom behavior detection algorithm based on an improved YOLOv8. First, the CSWin-Transformer attention mechanism is introduced to strengthen the model's extraction of global image information and improve the network's detection accuracy. Second, the Large Separable Kernel Attention (LSKA) module is integrated into the SPPF structure to improve recognition of multi-scale targets. Third, an occlusion-aware attention mechanism is incorporated into the design of the detection head, changing the original Head structure to SEAMHead, so that occluded objects are detected effectively. Finally, the weight adjustment function Slide Loss is introduced to handle sample imbalance. Experimental results show that, compared with YOLOv8, VWE-YOLOv8 improves mAP@0.50 by 1.16% and 1.70%, mAP@0.50:0.95 by 7.36% and 2.13%, precision by 4.17% and 6.74%, and recall by 1.96% and 3.13% on the DBS Dataset and the public SCB Dataset, respectively. These results indicate that the algorithm achieves higher detection accuracy and generalizes well, making it suitable for student classroom behavior detection and able to support smart education applications and the digital transformation of education.
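The abstract reports mAP@0.50 and mAP@0.50:0.95 without defining them. The sketch below assumes the usual COCO-style convention: a predicted box counts as a true positive when its intersection over union (IoU) with a ground-truth box reaches the threshold, and mAP@0.50:0.95 averages the result over thresholds from 0.50 to 0.95 in steps of 0.05. The function names `iou` and `matched_thresholds` and the `(x1, y1, x2, y2)` box layout are illustrative choices, not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumed COCO-style thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which pred_box would count as a match for gt_box."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


if __name__ == "__main__":
    pred = (2, 0, 12, 10)
    gt = (0, 0, 10, 10)
    print(iou(pred, gt))
    print(matched_thresholds(pred, gt))
```

A box that matches at 0.50 but not at the stricter thresholds raises mAP@0.50 without contributing much to mAP@0.50:0.95, so the second metric rewards tighter localization.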

Key words: smart education, student behavior detection, object detection, attention mechanism, Large Separable Kernel Attention (LSKA) module
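The abstract names Slide Loss as the weight adjustment function for sample imbalance but does not give its form. As a minimal sketch, the code below assumes the formulation commonly cited from the YOLO-FaceV2 work: each sample is weighted by its IoU with the ground truth relative to `mu`, the mean IoU over all boxes, so that samples near the easy/hard boundary receive the largest weight. The 0.1 margin, the use of the mean IoU as `mu`, and the helper `weighted_bce` are assumptions for illustration and may differ from the paper's implementation.

```python
import math


def slide_weight(iou, mu):
    """Slide weighting (assumed form): emphasize samples near the threshold mu."""
    if iou <= mu - 0.1:
        # Clearly below the boundary: leave the loss unchanged.
        return 1.0
    if iou < mu:
        # Just below the boundary: constant boost that depends on mu.
        return math.exp(1.0 - mu)
    # At or above the boundary: boost decays as IoU grows.
    return math.exp(1.0 - iou)


def weighted_bce(prob, target, iou, mu):
    """Binary cross-entropy for one sample, scaled by the slide weight."""
    eps = 1e-7  # keep log() finite at prob = 0 or 1
    prob = min(max(prob, eps), 1.0 - eps)
    bce = -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))
    return slide_weight(iou, mu) * bce


if __name__ == "__main__":
    ious = [0.2, 0.45, 0.55, 0.9]
    mu = sum(ious) / len(ious)  # mean IoU used as the boundary (assumption)
    for value in ious:
        print(value, slide_weight(value, mu))
```

Under this form the weight peaks for samples just below and at `mu` and falls back toward 1 as IoU approaches 1, which shifts training effort toward borderline examples.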