Research on Cyberbullying Detection Based on Multimodal Spatial Feature Fusion

Computer Engineering, 2026, Vol. 52, Issue 3: 255-263. doi: 10.19678/j.issn.1000-3428.0070114

• Multimodal Information Fusion •

CHEN Guolian¹, FENG Ziyang², CAO Junkuo¹,³,*

1. Key Laboratory of Data Science and Smart Education, Ministry of Education, Haikou 571158, Hainan, China
2. School of Information Science and Technology, Hainan Normal University, Haikou 571158, Hainan, China
3. Information Network and Data Center, Hainan Normal University, Haikou 571158, Hainan, China

Received: 2024-07-12 Revised: 2024-08-30 Online: 2026-03-15 Published: 2024-12-03
Contact: CAO Junkuo

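About the authors:
CHEN Guolian, female, senior engineer, M.S.; main research interest: statistical machine learning
FENG Ziyang, master's student
CAO Junkuo (corresponding author), researcher, Ph.D.
Funding:
Hainan Provincial Natural Science Foundation (625MS081); Haikou Science and Technology Special Project (2025-008); Hainan Provincial Higher Education Teaching Reform Research Project (Hnjg2024ZD-19); National Natural Science Foundation of China (61867001); National Natural Science Foundation of China (61363032)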

Abstract

Cyberbullying content posted on social media platforms often combines multimodal information, such as text, voice, and images, to spread faster and more widely. Although multimodal information lets publishers express their emotions more fully, it also gives researchers multidimensional information sources for automatic cyberbullying detection. Current multimodal cyberbullying detection models primarily focus on complex fusion over large-scale interaction spaces and lack an analysis of the potential commonalities and differences between modalities. Consequently, multimodal cyberbullying detection models based on simple feature fusion do not achieve ideal performance, and their training is time-consuming and difficult to converge. To address this issue, this study proposes a multimodal detection model based on spatial features. First, features are extracted from each individual modality. Shared and specific feature spaces are then constructed, and the features are fused using a hierarchical attention mechanism based on the Hadamard product. Rather than simply weighting features by the output attention scores, the fusion process reassigns attention weights independently, so that the modalities do not interfere with each other and the feature integrity of the shared and specific spaces is preserved. Finally, a two-layer perceptron is used to detect cyberbullying speech. Results show that the model achieves good detection performance and convergence on both the CMCAD and CMU-MOSI datasets.

Key words: cyberbullying detection, multimodal learning, multimodal feature fusion, hierarchical attention mechanism, two-layer perceptron
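
The pipeline summarized in the abstract (unimodal features, shared and specific feature spaces, independently weighted attention, Hadamard-product fusion, two-layer perceptron) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no equations, so the scoring function, the way the two space summaries are combined, the dimensions, the random untrained weights, and every name below (`attend`, `detect`, `make_params`, `q_shared`, `q_specific`) are assumptions made for illustration only.

```python
import math
import random

def linear(x, W, b):
    # Affine map y = W x + b, with W stored as a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) + bj for row, bj in zip(W, b)]

def init_layer(rng, n_out, n_in):
    # Small random weights and zero biases (untrained, illustration only).
    W = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(vectors, query):
    # Attention over modalities inside ONE space. Each space has its own
    # query (assumed design), so its weights never depend on the other space.
    scores = [sum(q * math.tanh(v) for q, v in zip(query, vec)) for vec in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    summary = [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]
    return summary, weights

def detect(features, params):
    # features: modality name -> unimodal feature vector (equal lengths assumed).
    names = sorted(features)
    # Shared space: one projection reused by every modality.
    shared = [linear(features[m], *params["shared"]) for m in names]
    # Specific space: one projection per modality.
    specific = [linear(features[m], *params["specific"][m]) for m in names]
    s_sum, s_w = attend(shared, params["q_shared"])
    p_sum, p_w = attend(specific, params["q_specific"])
    # Hadamard (element-wise) product of the two space summaries (assumed fusion).
    fused = [a * b for a, b in zip(s_sum, p_sum)]
    # Two-layer perceptron: ReLU hidden layer, then a sigmoid output unit.
    hidden = [max(0.0, h) for h in linear(fused, *params["layer1"])]
    logit = linear(hidden, *params["layer2"])[0]
    return 1.0 / (1.0 + math.exp(-logit)), s_w, p_w

def make_params(rng, modalities, d_in, d_space, d_hidden):
    return {
        "shared": init_layer(rng, d_space, d_in),
        "specific": {m: init_layer(rng, d_space, d_in) for m in modalities},
        "q_shared": [rng.uniform(-1, 1) for _ in range(d_space)],
        "q_specific": [rng.uniform(-1, 1) for _ in range(d_space)],
        "layer1": init_layer(rng, d_hidden, d_space),
        "layer2": init_layer(rng, 1, d_hidden),
    }

rng = random.Random(0)
modalities = ["audio", "image", "text"]
# Stand-ins for the outputs of the unimodal feature extractors.
features = {m: [rng.gauss(0.0, 1.0) for _ in range(8)] for m in modalities}
params = make_params(rng, modalities, d_in=8, d_space=4, d_hidden=6)
prob, shared_w, specific_w = detect(features, params)
print(round(prob, 3), shared_w, specific_w)
```

In this sketch the attention weights of the shared and specific spaces are produced by separate softmax calls with separate query vectors, which is one possible reading of "independently reassigns attention weights"; the paper itself may define the hierarchy differently.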

CHEN Guolian, FENG Ziyang, CAO Junkuo. Research on Cyberbullying Detection Based on Multimodal Spatial Feature Fusion[J]. Computer Engineering, 2026, 52(3): 255-263.


URL: https://www.ecice06.com/EN/10.19678/j.issn.1000-3428.0070114

https://www.ecice06.com/EN/Y2026/V52/I3/255
