Automatic Escalator Pedestrian Safety Detection Algorithm Based on SwinT-YOLOX Model

doi:10.19678/j.issn.1000-3428.0067416

Abstract

Escalators are widely used in public places. If a passenger fall is not detected and handled promptly, it can cause serious personal injury, so intelligent monitoring and management of escalators is imperative. Because escalators operate in complex environments with many pedestrians and frequent partial occlusion, traditional fall detection models based on human posture features perform poorly and detect slowly. This paper proposes an escalator pedestrian fall detection algorithm based on the SwinT-YOLOX network model, which combines the strengths of the Swin Transformer and the YOLOX object detection algorithm. The Swin Transformer model serves as the backbone network, and the neck network uses the YOLOX model with an added attention mechanism to further enhance the diversity and expressive power of the feature maps. In addition, a CBF module is built with the Funnel Rectified Linear Unit (FReLU) visual activation function and is used to improve the structure of the neck and Head networks, yielding better feature detection performance. Experiments on a self-built escalator pedestrian fall database and on footage of actual escalator fall accidents collected from the Internet show that the algorithm clearly outperforms algorithms such as AlphaPose, OpenPose, and YOLOv5. The average detection precision for pedestrian falls reaches 95.92% at a detection frame rate of 24.08 frames/s, so passenger falls can be detected quickly and accurately and the monitoring and management platform can immediately trigger an emergency stop to protect passengers.

Keywords: escalator, fall detection, deep learning, YOLOX model, Swin Transformer model, Funnel Rectified Linear Unit (FReLU) visual activation function
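The abstract names the FReLU visual activation function as the building block of the CBF module. As a rough illustration only, the sketch below shows the funnel activation y = max(x, T(x)) on a single channel in pure Python, where T is a 3x3 depthwise spatial condition. The helper names `depthwise_conv3x3` and `frelu` are hypothetical, the batch normalization normally applied inside T is omitted for brevity, and nothing here is taken from the authors' implementation.

```python
from typing import List

Matrix = List[List[float]]


def depthwise_conv3x3(x: Matrix, kernel: Matrix) -> Matrix:
    """3x3 cross-correlation on one channel with zero padding and stride 1."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii, jj = i + di, j + dj
                    # Positions outside the map count as zero (zero padding).
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += kernel[di + 1][dj + 1] * x[ii][jj]
            out[i][j] = acc
    return out


def frelu(x: Matrix, kernel: Matrix) -> Matrix:
    """Funnel activation: element-wise max of x and its spatial condition T(x).

    Assumption: T(x) is only the depthwise convolution; batch norm is left out.
    """
    t = depthwise_conv3x3(x, kernel)
    return [[max(a, b) for a, b in zip(row_x, row_t)] for row_x, row_t in zip(x, t)]


# With an all-zero kernel T(x) = 0, so FReLU reduces to the ordinary ReLU.
zero_kernel = [[0.0] * 3 for _ in range(3)]
relu_like = frelu([[-1.0, 2.0], [3.0, -4.0]], zero_kernel)

# With only a centre weight of 0.5, T(x) = 0.5 * x, so negatives are halved.
half_kernel = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]]
leaky_like = frelu([[-1.0, 2.0], [3.0, -4.0]], half_kernel)
```

Unlike ReLU, the threshold in this activation depends on the neighbourhood of each pixel, which is why it is described as a visual activation. In a CBF-style block it would presumably follow a convolution and a batch normalization layer, although the abstract does not spell out that layout.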

Ying HOU, Lin YANG, Xin HU, Shun HE, Wanying SONG, Qian ZHAO. Automatic Escalator Pedestrian Safety Detection Algorithm Based on SwinT-YOLOX Model[J]. Computer Engineering, 2024, 50(3): 277-289.

http://www.ecice06.com/CN/Y2024/V50/I3/277

Figures/Tables (19)

Fig.1 Network structure of the improved SwinT-YOLOX algorithm

Fig.2 Structure of the Swin Transformer module

Fig.3 Structure of the CBAM attention mechanism

Fig.4 Visual heat maps of the SwinT-YOLOX and YOLOX algorithms

Fig.5 Schematic diagram of the FReLU visual activation function

Fig.6 Schematic diagram of feature integration in the output branches of the Head prediction network

Fig.7 Workflow of the escalator intelligent monitoring system

Fig.8 Camera installation position and acquired images

Fig.9 Preprocessing results of histogram equalization enhancement for low-brightness images

Fig.10 Processing results of the Mosaic and Mixup data augmentation methods

Fig.11 Loss and mAP curves of the SwinT-YOLOX algorithm

Fig.12 Comparison of escalator pedestrian fall detection results of different algorithms

Fig.13 Visual comparison of pedestrian fall detection results on images of actual escalator accidents

Fig.14 Visual comparison of pedestrian fall detection results on video frames of actual escalator accidents

