
Computer Engineering, 2026, Vol. 52, Issue 5: 60-80. doi: 10.19678/j.issn.1000-3428.0070287

• Frontier Perspectives and Reviews •

Technologies for Detecting Disinformation in Social Media: A Comprehensive Review

XU Minchen1, QU Dan1,2,*, SI Nianwen1,3, PENG Sisi1, CHEN Yaqi1

  1. School of Information Systems Engineering, Information Engineering University, Zhengzhou 450000, Henan, China
  2. Laboratory for Advanced Computing and Intelligence Engineering, Zhengzhou 450000, Henan, China
  3. Department of Electronic Engineering, Tsinghua University, Beijing 100084, China
  • Received: 2024-08-23 Revised: 2024-11-28 Published: 2026-05-15 Released: 2025-01-03
  • Corresponding author: QU Dan
  • About the authors:

    XU Minchen, male, master's student; main research interests: fake news detection and natural language processing

    QU Dan (corresponding author), professor, Ph.D.

    SI Nianwen, lecturer, Ph.D.

    PENG Sisi, Ph.D. candidate

    CHEN Yaqi, Ph.D. candidate

  • Funding:
    National Natural Science Foundation of China (62171470); Henan Zhongyuan Science and Technology Innovation Leading Talent Project (234200510019); General Program of the Natural Science Foundation of Henan Province (232300421240)

Abstract:

Timely and effective disinformation detection helps curb the spread of disinformation and reduce social harm. Numerous deep learning methods have been applied to disinformation detection, and summarizing the detection principles and paradigms of existing research is essential for identifying directions for technical optimization. This paper therefore comprehensively reviews existing research in terms of the principles and implementation paths of disinformation detection and, for the first time, summarizes and compares the applications of large language models in this field. First, the relevant concepts of the disinformation detection task are introduced, and the data structures of commonly used disinformation detection datasets are summarized and analyzed. Then, based on detection principles and implementation methods, the paper describes how textual and multimodal disinformation is detected through semantic feature representation, auxiliary task design, internal knowledge inference, and fact checking, refines these approaches into ten subcategories, and analyzes the potential characteristics of the detection methods in each subcategory. Finally, the paper summarizes the disinformation detection paradigms based on deep neural networks and large language models, compares the detection performance of representative methods from the two paradigms on seven disinformation detection datasets, and outlines the advantages and limitations of large language models in detecting disinformation. It also discusses the opportunities and challenges that large language models bring to the field, providing a reference for future research.

Key words: deep learning, natural language processing, disinformation detection, large language models, fact checking