Computer Engineering ›› 2018, Vol. 44 ›› Issue (10): 292-297, 302. doi: 10.19678/j.issn.1000-3428.0050770

• Development Research and Engineering Application •

Semantic subgraph predictive summary algorithm based on weighted AMR graph

MING Tuosiyu, CHEN Hongchang, HUANG Ruiyang, LIU Yang

  1. National Digital Switching System Engineering and Technological R&D Center, Zhengzhou 450002, China
  • Received: 2018-03-14 Online: 2018-10-15 Published: 2018-11-14
  • About the authors: MING Tuosiyu (born 1994), male, M.S., main research interest: automatic text summarization; CHEN Hongchang, professor and doctoral supervisor; HUANG Ruiyang, assistant researcher, Ph.D.; LIU Yang, M.S.
  • Supported by:

    National Natural Science Foundation of China (61601513)

Abstract:

Most existing text summarization methods mine only shallow semantic relations between words and make little use of the complete semantic information carried by words and sentences. To address this, an improved semantic subgraph prediction algorithm for summarization is proposed. The algorithm converts the original text into Abstract Meaning Representation (AMR) graphs, merges them into a single overall AMR graph, and filters redundant information from it using the WordNet semantic dictionary. On this basis, it uses combined statistical features to assign weights to the otherwise unweighted AMR graph nodes, builds a semantic summary subgraph by selecting the parts with high importance, and evaluates the quality of the generated summaries with both the ROUGE and Smatch metrics. Experimental results show that, compared with baseline text summarization algorithms that mine only shallow semantic relations, the proposed algorithm achieves clearly higher ROUGE and Smatch scores.

Key words: Abstract Meaning Representation (AMR) graph, semantic summary subgraph, semantic information, redundant information, summary evaluation metrics
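
The pipeline outlined in the abstract (merge per-sentence AMR graphs, filter redundancy, weight nodes, select a subgraph) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: graphs are simplified to lists of `(source, relation, target)` triples, the `SYNONYMS` table is a toy stand-in for a WordNet lookup, and the weighting features (mention count plus a bonus for early first mention) are hypothetical, since the abstract does not specify which statistical features are combined.

```python
from collections import Counter

# Toy stand-in for a WordNet lookup: maps a concept to a canonical synonym.
# (Hypothetical; a real system would query WordNet synsets.)
SYNONYMS = {"automobile": "car", "purchase": "buy"}


def canonical(concept):
    """Return the canonical form of a concept, collapsing known synonyms."""
    return SYNONYMS.get(concept, concept)


def merge_graphs(sentence_graphs):
    """Merge per-sentence triple lists into one edge set, collapsing synonyms."""
    edges = set()
    occurrences = []  # (sentence index, concept), used by the weighting step
    for idx, triples in enumerate(sentence_graphs):
        for src, rel, dst in triples:
            src, dst = canonical(src), canonical(dst)
            edges.add((src, rel, dst))  # the set drops duplicate edges
            occurrences.append((idx, src))
            occurrences.append((idx, dst))
    return edges, occurrences


def weight_nodes(occurrences, n_sentences):
    """Assumed features: mention count plus a bonus for an early first mention."""
    freq = Counter(concept for _, concept in occurrences)
    first = {}
    for idx, concept in occurrences:
        first.setdefault(concept, idx)  # remember the first sentence index
    return {c: freq[c] + (n_sentences - first[c]) / n_sentences for c in freq}


def summary_subgraph(edges, weights, k):
    """Keep the k highest-weighted nodes and the edges between them."""
    ranked = sorted(weights, key=lambda c: (-weights[c], c))  # ties: alphabetical
    keep = set(ranked[:k])
    return {e for e in edges if e[0] in keep and e[2] in keep}


# Two hand-written sentence graphs with simplified relation labels.
graphs = [
    [("buy", "ARG0", "person"), ("buy", "ARG1", "car")],
    [("purchase", "ARG1", "automobile"), ("automobile", "mod", "red")],
]
edges, occ = merge_graphs(graphs)
weights = weight_nodes(occ, len(graphs))
subgraph = summary_subgraph(edges, weights, k=2)
```

In this toy input, the synonym table collapses `purchase`/`buy` and `automobile`/`car`, so the two sentences contribute a single shared edge between the two highest-weighted concepts. A real system would also need an AMR parser to produce the graphs and a generation or evaluation step (such as ROUGE or Smatch scoring) after subgraph selection, both of which are outside this sketch.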

CLC number: