
Computer Engineering, 2019, Vol. 45, Issue (1): 165-171, 177. doi: 10.19678/j.issn.1000-3428.0049403

• Artificial Intelligence and Recognition Technology •

Question Fine-grained Classification Based on Semantic Expansion and Attention Network

XIE Yufei, LYU Zhao

  1. Department of Computer Science and Technology, East China Normal University, Shanghai 200062, China
  • Received: 2017-11-22  Online: 2019-01-15  Published: 2019-01-15
  • About the authors: XIE Yufei (born 1993), male, master's degree candidate; his main research interest is big data analysis. LYU Zhao, associate professor.
  • Supported by:

    Scientific Research Program of Science and Technology Commission of Shanghai Municipality (16511102702)

Abstract:

In the fine-grained classification of question texts, the text features are sparse, the overall features of different texts are similar, and the locally discriminative features are difficult to extract. To address these problems, a classification method combining semantic expansion with an attention network is proposed. Semantic units are extracted with a dependency syntax analysis tree, and the similar semantic regions around each semantic unit are computed in a vector space model and used to expand the text. A Long Short-Term Memory (LSTM) network encodes the words of the expanded text, an attention mechanism is introduced to generate the vector representation of the question text, and a Softmax classifier assigns the question text to its class. Experimental results show that, compared with traditional text classification methods based on deep learning networks, the proposed method extracts more important classification features and achieves a better classification effect.
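The following is a minimal PyTorch sketch of the encoding and classification stage described above; it is an illustration of the general technique, not the authors' implementation. It assumes the semantic expansion step has already produced a sequence of word indices for each question, and all dimensions, hyper-parameters, and the class count are illustrative assumptions.

# Minimal PyTorch sketch (illustrative only): an LSTM encodes the expanded
# question text, additive attention pools the hidden states into a single
# question vector, and a linear layer + Softmax performs the fine-grained
# classification. All sizes below are assumptions, not values from the paper.
import torch
import torch.nn as nn


class AttentiveLSTMClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=128, num_classes=50):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.attn_score = nn.Linear(hidden_dim, 1)        # scores each hidden state
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) word indices of the semantically expanded text
        embedded = self.embedding(token_ids)              # (batch, seq_len, embed_dim)
        hidden, _ = self.lstm(embedded)                   # (batch, seq_len, hidden_dim)
        scores = self.attn_score(torch.tanh(hidden))      # (batch, seq_len, 1)
        weights = torch.softmax(scores, dim=1)            # attention weights over words
        question_vec = (weights * hidden).sum(dim=1)      # weighted sum -> (batch, hidden_dim)
        return self.classifier(question_vec)              # class logits; apply Softmax for probabilities


if __name__ == "__main__":
    model = AttentiveLSTMClassifier(vocab_size=10000)
    batch = torch.randint(1, 10000, (4, 20))              # 4 toy questions, 20 tokens each
    probs = torch.softmax(model(batch), dim=-1)           # Softmax classification
    print(probs.shape)                                    # torch.Size([4, 50])

In this sketch the attention weights let the classifier emphasize the locally discriminative words that motivate the method, while the Softmax output gives the distribution over fine-grained categories.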

Key words: fine-grained classification, dependency syntax, semantic expansion, Long Short-Term Memory (LSTM) network, attention network

CLC Number: