Computer Engineering (计算机工程) ›› 2010, Vol. 36 ›› Issue (21): 260-261,264. doi: 10.3969/j.issn.1000-3428.2010.21.093

• Development Research and Design Technology •

A Tag-based Web Page Summarization Method

SHANG Shu-jie, WANG Can, ZHU Jun-yan

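  1. (College of Computer Science, Zhejiang University, Hangzhou 310027)
  • Publication date: 2010-11-05 Release date: 2010-11-03
  • About the authors: SHANG Shu-jie (born 1985), male, master's student, main research interest: information retrieval; WANG Can, lecturer; ZHU Jun-yan, master's student
  • Supported by:
    National Key Technology R&D Program of China (2008BAH26B00)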

Tag-based Web Page Summarization Approach

SHANG Shu-jie, WANG Can, ZHU Jun-yan   

  1. (College of Computer Science, Zhejiang University, Hangzhou 310027, China)
  • Online: 2010-11-05 Published: 2010-11-03

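Abstract: A tag-based Web page summarization method is proposed. Based on the mutually reinforcing relationship between high-quality users and high-quality tags, a bipartite graph ranking algorithm is used to rank and score the tags. A tag-document graph is then constructed, and the Manifold Ranking algorithm is applied to rank sentences by importance; the top-ranked sentences form the Web page summary. Experimental results show that the method clearly improves summary accuracy.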

Key words: tag, summarization, graph ranking

Abstract: This paper proposes a two-stage Web page summarization approach that exploits both the page content and the tags annotated on the page. In the first stage, it uses a bipartite ranking algorithm to score tags, based on the mutually reinforcing relationship between quality tags and quality users. In the second stage, it builds a graph over the tags and sentences of a Web page and applies the Manifold Ranking algorithm to rank the sentences, from which the summary is generated. Experimental results show that the method clearly improves summary accuracy.

Key words: tag, summarization, graph-ranking

CLC Number:
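
The two-stage procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the update rules, edge weights, or parameters, so everything below is an assumption made for illustration. Stage one uses a HITS-style iteration as a stand-in for the bipartite ranking algorithm; stage two uses the standard Manifold Ranking iteration f ← αSf + (1 − α)y with S = D^(-1/2) W D^(-1/2). The function names, the (user, tag) input format, the tag-sentence edge weight of 1 when a tag occurs in a sentence, the Jaccard sentence-sentence weight, and α = 0.85 are all hypothetical choices.

```python
import math


def _normalise(scores):
    """Scale a score dict so its values sum to 1 (unchanged if the sum is 0)."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()} if total else dict(scores)


def bipartite_rank(annotations, iterations=50):
    """Stage 1 (assumed HITS-style): mutual reinforcement of users and tags.

    annotations: iterable of (user, tag) pairs (hypothetical input format).
    Returns (user_scores, tag_scores).
    """
    annotations = list(annotations)
    user_score = {u: 1.0 for u, _ in annotations}
    tag_score = {t: 1.0 for _, t in annotations}
    for _ in range(iterations):
        # A tag scores highly when high-scoring users apply it.
        new_tag = dict.fromkeys(tag_score, 0.0)
        for u, t in annotations:
            new_tag[t] += user_score[u]
        # A user scores highly when they apply high-scoring tags.
        new_user = dict.fromkeys(user_score, 0.0)
        for u, t in annotations:
            new_user[u] += new_tag[t]
        tag_score, user_score = _normalise(new_tag), _normalise(new_user)
    return user_score, tag_score


def manifold_rank(weights, prior, alpha=0.85, iterations=100):
    """Stage 2: iterate f <- alpha * S f + (1 - alpha) * y on a symmetric graph."""
    n = len(weights)
    degree = [sum(row) for row in weights]
    # Symmetric normalisation S = D^(-1/2) W D^(-1/2); isolated nodes stay at 0.
    s = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if weights[i][j] and degree[i] and degree[j]:
                s[i][j] = weights[i][j] / math.sqrt(degree[i] * degree[j])
    f = list(prior)
    for _ in range(iterations):
        f = [alpha * sum(s[i][j] * f[j] for j in range(n)) + (1 - alpha) * prior[i]
             for i in range(n)]
    return f


def summarize(sentences, annotations, top_k=1):
    """Rank sentences on an assumed tag-sentence graph and return the top_k."""
    _, tag_score = bipartite_rank(annotations)
    tags = sorted(tag_score)
    words = [set(s.lower().split()) for s in sentences]
    n_t, n_s = len(tags), len(sentences)
    w = [[0.0] * (n_t + n_s) for _ in range(n_t + n_s)]
    # Assumed tag-sentence edge: weight 1 when the tag occurs in the sentence.
    for i, tag in enumerate(tags):
        for j, ws in enumerate(words):
            if tag in ws:
                w[i][n_t + j] = w[n_t + j][i] = 1.0
    # Assumed sentence-sentence edge: Jaccard overlap of the word sets.
    for a in range(n_s):
        for b in range(a + 1, n_s):
            union = words[a] | words[b]
            sim = len(words[a] & words[b]) / len(union) if union else 0.0
            w[n_t + a][n_t + b] = w[n_t + b][n_t + a] = sim
    # Tag nodes carry their stage-1 scores as the prior; sentences start at 0.
    prior = [tag_score[t] for t in tags] + [0.0] * n_s
    scores = manifold_rank(w, prior)
    ranked = sorted(range(n_s), key=lambda j: scores[n_t + j], reverse=True)
    return [sentences[j] for j in sorted(ranked[:top_k])]
```

Calling `summarize(sentences, annotations, top_k=k)` with a list of sentence strings and a list of (user, tag) pairs returns the k highest-ranked sentences in their original page order. Seeding the prior only on tag nodes means a sentence's score comes entirely from its connection to well-scored tags, directly or through similar sentences, which is one plausible reading of how the tag scores feed the second stage.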