
Computer Engineering ›› 2021, Vol. 47 ›› Issue (1): 66-71. doi: 10.19678/j.issn.1000-3428.0056606

• Artificial Intelligence and Pattern Recognition •

Contextual Sarcasm Detection Model for Social Media Comments

HAN Hu1,2, ZHAO Qitao1, SUN Tianyue1, LIU Guoli1

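  1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China;
  2. Gansu Provincial Engineering Research Center for Artificial Intelligence and Graphic and Image Processing, Lanzhou 730070, China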
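  • Received: 2019-11-15 Revised: 2020-01-07 Published: 2020-01-17
  • About the authors: HAN Hu (born 1977), male, associate professor, Ph.D.; main research interests are machine learning and data mining. ZHAO Qitao, SUN Tianyue, and LIU Guoli are master's students.
  • Funding:
    National Natural Science Foundation of China (61562057); National Social Science Fund of China (17BXW071); Science and Technology Program of Gansu Province (18JR3RA104).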

Contextual Sarcasm Detection Model for Social Media Comments

HAN Hu1,2, ZHAO Qitao1, SUN Tianyue1, LIU Guoli1   

  1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China;
    2. Gansu Provincial Engineering Research Center for Artificial Intelligence and Graphic and Image Processing, Lanzhou 730070, China
  • Received: 2019-11-15 Revised: 2020-01-07 Published: 2020-01-17

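Abstract: Sarcasm is a common pragmatic phenomenon in everyday communication; it enriches a speaker's viewpoint and conveys the speaker's deeper meaning indirectly. The goal of sarcasm detection is to determine whether a target sentence is sarcastic. Because sarcastic expression varies widely with context, and the meaning of sarcasm differs across users and topics, this paper builds a contextual sarcasm detection model that fuses user embeddings with forum topic embeddings. The model uses the sequence learning ability of the ParagraphVector method to encode user comment documents and forum topic documents, yielding user sarcasm features and topic features for the sentence to be classified, and uses a bidirectional gated recurrent unit neural network to obtain the encoding of the target sentence. Experimental results on a standard sarcasm detection dataset show that, compared with traditional models such as Bag-of-Words and CNN, the model effectively extracts the contextual information of sentences and achieves higher sarcasm classification accuracy.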

Key words: natural language processing, contextual sarcasm detection, deep learning, ParagraphVector model, bidirectional gated recurrent unit model

Abstract: Sarcasm is a common pragmatic phenomenon in daily communication that enriches speakers' views and indirectly expresses their deeper meaning. The goal of the sarcasm detection task is to identify the sarcastic tendency of target sentences. Because the contexts and expressions of sarcasm are diverse, and the meaning of sarcasm varies across users and topics, this paper proposes a contextual sarcasm detection model that fuses user embeddings and forum topic embeddings. The model uses the sequence learning ability of the ParagraphVector method to encode the documents of user comments and forum topics, obtaining the user sarcasm features and topic features of the target sentence. A Bi-directional Gated Recurrent Unit (Bi-GRU) neural network is then used to obtain the encoding of the target sentence. Experimental results on a standard sarcasm detection dataset show that, compared with traditional Bag-of-Words, CNN, and other models, the proposed model effectively extracts the contextual information of sentences and achieves higher sarcasm classification accuracy.

Key words: Natural Language Processing (NLP), contextual sarcasm detection, deep learning, ParagraphVector model, Bi-directional Gated Recurrent Unit (Bi-GRU) model
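
The pipeline described in the abstract (a Bi-GRU encoding of the target sentence, combined with a user embedding and a forum topic embedding) can be illustrated with a minimal forward-pass sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the dimensions, the random untrained weights, the omission of bias terms, the use of the final hidden states as the sentence encoding, concatenation as the fusion operator, and the single sigmoid output unit. The user and topic vectors are random stand-ins for ParagraphVector outputs; in practice they could come from a ParagraphVector implementation such as gensim's Doc2Vec.

```python
import math
import random

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class GRUCell:
    """One GRU cell with random, untrained weights (biases omitted for brevity)."""
    def __init__(self, in_dim, hid_dim, rng):
        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
        self.hid_dim = hid_dim
        self.W = {g: mat(hid_dim, in_dim) for g in "zrn"}   # input weights per gate
        self.U = {g: mat(hid_dim, hid_dim) for g in "zrn"}  # recurrent weights per gate

    def step(self, x, h):
        # Update gate z and reset gate r.
        z = [sigmoid(a + b) for a, b in zip(matvec(self.W["z"], x), matvec(self.U["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.W["r"], x), matvec(self.U["r"], h))]
        # Candidate state computed from the reset-gated previous state.
        rh = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.W["n"], x), matvec(self.U["n"], rh))]
        # Interpolate between the previous state and the candidate state.
        return [(1 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]

    def run(self, seq):
        h = [0.0] * self.hid_dim
        for x in seq:
            h = self.step(x, h)
        return h

def bi_gru_encode(fwd, bwd, seq):
    """Concatenate the final forward state with the final backward state."""
    return fwd.run(seq) + bwd.run(list(reversed(seq)))

def sarcasm_probability(sentence_vecs, user_vec, topic_vec, fwd, bwd, out_w, out_b):
    # Assumed fusion: concatenate sentence encoding, user embedding, topic embedding.
    features = bi_gru_encode(fwd, bwd, sentence_vecs) + user_vec + topic_vec
    logit = sum(w * f for w, f in zip(out_w, features)) + out_b
    return features, sigmoid(logit)

# Hypothetical sizes and random inputs, for illustration only.
rng = random.Random(0)
WORD_DIM, HID_DIM, USER_DIM, TOPIC_DIM = 4, 3, 5, 5
fwd, bwd = GRUCell(WORD_DIM, HID_DIM, rng), GRUCell(WORD_DIM, HID_DIM, rng)
sentence = [[rng.uniform(-1, 1) for _ in range(WORD_DIM)] for _ in range(6)]
user_vec = [rng.uniform(-1, 1) for _ in range(USER_DIM)]    # stand-in for a user embedding
topic_vec = [rng.uniform(-1, 1) for _ in range(TOPIC_DIM)]  # stand-in for a topic embedding
out_w = [rng.uniform(-0.1, 0.1) for _ in range(2 * HID_DIM + USER_DIM + TOPIC_DIM)]

features, prob = sarcasm_probability(sentence, user_vec, topic_vec, fwd, bwd, out_w, 0.0)
print(len(features), 0.0 < prob < 1.0)  # prints "16 True"
```

The fused feature vector has 2 × 3 + 5 + 5 = 16 entries: the bidirectional sentence encoding followed by the user and topic vectors. Because the weights are untrained, the output probability carries no meaning; the sketch only shows how the three sources of information could be combined before classification.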

CLC Number: