Fall Detection Method Integrating Multi-Feature and Semantic Graph Convolution Network

doi:10.19678/j.issn.1000-3428.0064047

Abstract

Abstract: Falls seriously threaten the lives and health of elderly people. Detecting falls can reduce the risk of an elderly person falling again, thereby safeguarding their ability to live and improving their quality of life. Current vision-based fall detection methods achieve good accuracy on experimental datasets, but they generalize poorly to real-world environments and often do not conform to action judgment logic in practical applications. To address this problem, this paper compares the optical flow method with methods based on human pose estimation and proposes a robust fall detection method built on 2D human pose estimation. An optimized fall detection framework is designed, and a detection model that integrates multiple features with semantic graph convolution is constructed. The model is trained with a strategy that better matches action judgment logic, improving the generalization of the fall detection system to real environments. Experiments on three public datasets (Le2i Fall Detection Dataset, UP Fall Detection Dataset, and Multiple Cameras Fall Detection Dataset) and on self-collected datasets show that the overall detection accuracy of the model reaches 98.3%. Combined with YOLOv3 and Alpha_pose, the model based on the proposed optimization framework and training strategy forms a complete fall detection method that runs at approximately 25 FPS on a GTX1060 graphics card. The method shows good robustness in real-world tests and, compared with previous vision-based detection methods, is better suited to deployment in practical application environments.

Key words: graph convolution network, lightweight network, fall detection, multi-feature fusion, semantic information, training strategy
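The abstract names multi-feature fusion and graph convolution over 2D pose keypoints but does not specify the features or the network layout. The following is only a minimal sketch of the general idea, not the authors' model: the 5-joint skeleton, the choice of position, velocity, and bone-vector features, the symmetric adjacency normalization, and all function names are assumptions made for illustration.

```python
import numpy as np

# Hypothetical 5-joint skeleton (head, neck, hip, left foot, right foot).
# Each edge is (parent, child); this layout is an assumption, not the paper's.
EDGES = [(0, 1), (1, 2), (2, 3), (2, 4)]
NUM_JOINTS = 5


def normalized_adjacency(num_joints, edges):
    """Return D^-1/2 (A + I) D^-1/2 for an undirected skeleton graph."""
    a = np.eye(num_joints)
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def multi_features(poses, edges):
    """Fuse per-joint features from poses of shape (T, J, 2).

    Returns an array of shape (T, J, 6): position, frame-to-frame
    velocity, and bone vector (child joint minus parent joint).
    """
    velocity = np.zeros_like(poses)
    velocity[1:] = poses[1:] - poses[:-1]  # first frame has zero velocity
    bones = np.zeros_like(poses)
    for parent, child in edges:
        bones[:, child] = poses[:, child] - poses[:, parent]
    return np.concatenate([poses, velocity, bones], axis=-1)


def graph_conv(x, a_hat, w):
    """One graph convolution with ReLU: x is (T, J, C_in), w is (C_in, C_out)."""
    aggregated = np.einsum("ij,tjc->tic", a_hat, x)  # mix neighbouring joints
    return np.maximum(aggregated @ w, 0.0)


rng = np.random.default_rng(0)
poses = rng.random((8, NUM_JOINTS, 2))   # 8 frames of stand-in 2D keypoints
feats = multi_features(poses, EDGES)
a_hat = normalized_adjacency(NUM_JOINTS, EDGES)
out = graph_conv(feats, a_hat, rng.standard_normal((6, 16)))
print(out.shape)  # (8, 5, 16)
```

In a full pipeline of the kind the abstract describes, the keypoints would come from a person detector and a pose estimator rather than random numbers, and the output features would feed a classifier that decides whether a fall occurred.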


CLC Number: TP391

CHEN Wenxuan, ZENG Bi, GUO Zhixing. Fall Detection Method Integrating Multi-Feature and Semantic Graph Convolution Network[J]. Computer Engineering, 2023, 49(5): 277-285,294.



URL: http://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0064047

http://www.ecice06.com/EN/Y2023/V49/I5/277


