Computer Engineering (计算机工程)


Humor Recognition Based on Pronunciation, Spelling and Meaning Features and a Hierarchical Attention Mechanism

  

  • Published: 2020-03-25

Abstract: Deep learning methods for humor recognition rarely take the linguistic characteristics of humor into account. To address this, this paper proposes a neural network model for humor recognition that exploits linguistic features of humorous text along three dimensions: pronunciation, spelling and meaning. In addition, because different words contribute differently to each humor feature, and different linguistic features correlate differently with a humorous sentence, a hierarchical attention mechanism is adopted to further improve the model's performance. In the feature extraction stage, the text is represented in a phonemic form, a character form, and a semantic form that carries ambiguity information; a convolutional neural network, a bidirectional gated recurrent unit network, and an attention mechanism are then used to extract the phonetic, graphemic and semantic features of humor, respectively. In the feature fusion stage, the hierarchical attention mechanism adjusts the influence of the different linguistic features on model performance. Experiments on two benchmark datasets, pun-of-the-day and onliner-16000, show that the proposed model achieves F1 values of 91.03 and 91.11, respectively.
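The two-level fusion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the attention formulas, so plain dot-product attention is assumed here, the per-token vectors stand in for the outputs of the CNN, BiGRU and attention encoders, and the names `attention_pool`, `hierarchical_attention`, `word_queries` and `feature_query` are hypothetical. In a trained model the query vectors would be learned parameters; here they are random.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Dot-product attention (an assumption, not taken from the paper).

    Scores each vector against the query, normalises the scores with
    softmax, and returns the weighted sum together with the weights.
    """
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


def hierarchical_attention(views, word_queries, feature_query):
    """Two-level attention over several feature views of one sentence.

    Level 1 pools the token vectors inside each view (which words matter
    for this feature); level 2 pools the resulting view vectors (which
    feature matters for this sentence).
    """
    names = sorted(views)
    view_vectors = []
    for name in names:
        pooled, _ = attention_pool(views[name], word_queries[name])
        view_vectors.append(pooled)
    fused, feature_weights = attention_pool(view_vectors, feature_query)
    return fused, dict(zip(names, feature_weights))


# Toy input: random vectors standing in for encoder outputs.
random.seed(0)
DIM = 4
lengths = {"pronunciation": 6, "spelling": 9, "meaning": 5}
views = {
    name: [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(n)]
    for name, n in lengths.items()
}
word_queries = {name: [random.uniform(-1, 1) for _ in range(DIM)] for name in views}
feature_query = [random.uniform(-1, 1) for _ in range(DIM)]

fused, feature_weights = hierarchical_attention(views, word_queries, feature_query)
print("fused sentence vector:", fused)
print("feature-level weights:", feature_weights)
```

The fused vector would then be passed to a classifier that decides whether the sentence is humorous. The feature-level weights show how much each linguistic view contributed for that sentence.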