Computer Engineering ›› 2018, Vol. 44 ›› Issue (7): 172-176. doi: 10.19678/j.issn.1000-3428.0047156

• Artificial Intelligence and Recognition Technology •

Evolutionary Multi-document Summary Generation Based on Sub-theme Enhancement

JIANG Lulu, HU Po, BEI Chao

  1. School of Computer, Central China Normal University, Wuhan 430079, China
  • Received: 2017-05-11 Online: 2018-07-15 Published: 2018-07-15
  • About the authors: JIANG Lulu (b. 1993), female, master's student, whose main research interest is natural language processing; HU Po (corresponding author), associate professor, Ph.D.; BEI Chao, master's student.
  • Funding:

    Youth Fund of the National Natural Science Foundation of China (61402191); Research Project of the State Language Commission (WT135-11); Fundamental Research Funds for the Central Universities, Central China Normal University (CCNU16JYKX15).

Abstract:

Timeline summarization (TS) helps users follow how a news topic of interest develops over time. However, most existing work scores and ranks sentences using only the relationships between sentences and ignores topic-level information in the documents. To address this, a new summarization algorithm based on sub-theme enhancement is proposed. In addition to the relationships between sentences, the algorithm analyzes how the sub-themes within each time period influence the sentences, so that sentences more closely related to important sub-themes receive higher scores. Sentences are then scored and ranked comprehensively at the topic level through mutual reinforcement between sentences and sub-themes. Experimental results show that, compared with existing timeline summarization algorithms, the proposed algorithm has better portability and can accurately capture the evolution trajectory of news.
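The abstract gives no formulas, so the following is only a minimal sketch of what mutual reinforcement between sentences and sub-themes could look like for a single time period. It is a HITS-style iteration, not the authors' algorithm. The names `sent_sim` (sentence-to-sentence similarity), `affinity` (sentence-to-sub-theme relevance, which might for instance come from a topic model such as the hierarchical Dirichlet process named in the keywords), and the mixing weight `lam` are all assumptions made for illustration.

```python
def normalize(vec):
    """Scale a non-negative vector so that its entries sum to 1."""
    total = sum(vec)
    if total == 0:
        return [1.0 / len(vec)] * len(vec)
    return [v / total for v in vec]


def mutual_reinforcement(sent_sim, affinity, lam=0.5, iters=50):
    """Jointly score sentences and sub-themes (hypothetical formulation).

    sent_sim: n x n sentence similarity matrix (assumed input).
    affinity: n x k sentence-to-sub-theme relevance matrix (assumed input).
    lam:      weight of the sentence graph versus the sub-theme signal.
    """
    n, k = len(affinity), len(affinity[0])
    sent = [1.0 / n] * n
    theme = [1.0 / k] * k
    for _ in range(iters):
        # A sub-theme is important if important sentences relate to it.
        theme = normalize(
            [sum(affinity[i][j] * sent[i] for i in range(n)) for j in range(k)]
        )
        # A sentence gains score from similar important sentences ...
        from_sents = normalize(
            [sum(sent_sim[i][m] * sent[m] for m in range(n)) for i in range(n)]
        )
        # ... and from the important sub-themes it is related to.
        from_themes = normalize(
            [sum(affinity[i][j] * theme[j] for j in range(k)) for i in range(n)]
        )
        sent = normalize(
            [lam * a + (1 - lam) * b for a, b in zip(from_sents, from_themes)]
        )
    return sent, theme


def rank_sentences(scores):
    """Return sentence indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy data: three sentences, two sub-themes (values are made up).
sent_sim = [[0.0, 0.5, 0.1],
            [0.5, 0.0, 0.1],
            [0.1, 0.1, 0.0]]
affinity = [[0.9, 0.1],
            [0.8, 0.2],
            [0.1, 0.9]]

scores, themes = mutual_reinforcement(sent_sim, affinity)
print(rank_sentences(scores), themes)
```

In this toy setup the first sub-theme is supported by two mutually similar sentences, so it accumulates more weight, and the sentence tied mainly to the weaker sub-theme ends up ranked last. A timeline summary would repeat such a ranking for each time period and keep the top-ranked sentences.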

Key words: static summarization, dynamic evolutionary summarization, hierarchical Dirichlet process, sub-theme, timeline summarization (TS)

CLC Number: