Classroom Head-Up Rate Detection Algorithm Based on Attention Mechanism and Feature Fusion

doi:10.19678/j.issn.1000-3428.0061107

Abstract

Classroom teaching is a key part of education, and the growth of educational informatization offers more options for improving teaching management. To strengthen positive feedback on teaching and improve the accuracy of classroom head-up rate detection, a new detection algorithm combining an attention mechanism with feature fusion is proposed. The network takes both the original image and a second visual feature, the RGB difference, as input; after the feature extraction network, the two inputs are fused to obtain richer deep features. On this basis, an improved attention model, the Improved Convolutional Block Attention Module (ICBAM), is proposed and attached to the feature extraction network. ICBAM arranges its channel attention and spatial attention modules in a parallel two-stream structure, which strengthens the network's feature extraction ability. Dilated convolution is added to both the channel attention and the spatial attention to filter redundant features from the input, reducing the network's attention to uninformative features such as the background. In addition, a refinement module is designed to further optimize the prediction results, and classroom behavior analysis software is developed and deployed on the basis of the proposed algorithm. Experimental results show that the algorithm achieves an average head-up rate error of 15.648% on the head-up rate detection dataset RDS, lower than that of SolvePnP and other mainstream detection algorithms.

Key words: head-up rate, classroom video, attention mechanism, feature fusion, dilated convolution
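
The abstract reports an average head-up rate error but does not define it. As a minimal sketch of one plausible reading, the Python below assumes the head-up rate of a frame is the fraction of students with their heads up, and that the error is the mean absolute difference between predicted and ground-truth rates, in percent. The function names and the sample counts are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def head_up_rate(heads_up: int, total: int) -> float:
    """Fraction of students in one frame whose heads are up (assumed definition)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return heads_up / total


def mean_head_up_rate_error(predicted: Sequence[float],
                            actual: Sequence[float]) -> float:
    """Mean absolute difference between predicted and true rates, in percent.

    This is an assumed reading of the metric, not the paper's stated formula.
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total_error = sum(abs(p - a) for p, a in zip(predicted, actual))
    return 100.0 * total_error / len(predicted)


# Made-up counts for two frames of a 30-student class (illustration only).
predicted = [head_up_rate(18, 30), head_up_rate(12, 30)]
actual = [head_up_rate(21, 30), head_up_rate(15, 30)]
error = mean_head_up_rate_error(predicted, actual)
print(f"mean head-up rate error: {error:.3f}%")
```

Under this reading, a lower value means the predicted per-frame rates track the annotated rates more closely, which is the sense in which the abstract compares the algorithm with SolvePnP.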

CLC number: TP391

NI Tong (倪童), SANG Qingbing (桑庆兵). Class Head up Rate Detection Algorithm Based on Attention Mechanism and Feature Fusion[J]. Computer Engineering (计算机工程), 2022, 48(4): 262-268.

http://www.ecice06.com/CN/Y2022/V48/I4/262

Figures/Tables: 17

