
Computer Engineering ›› 2008, Vol. 34 ›› Issue (7): 81-83. doi: 10.3969/j.issn.1000-3428.2008.07.028

• Software Technology and Database •

Approach for Extracting Thematic Terms Based on Association Rules

LIU Fei, HUANG Xuan-jing, WU Li-de

  1. (Department of Computer Science and Engineering, Fudan University, Shanghai 200433)
  • Online: 2008-04-05 Published: 2008-04-05

Abstract: Thematic term extraction is an active research topic in information retrieval and is closely related to a range of data mining tasks. This paper presents a new approach that uses association rules to mine thematic terms from Chinese text, where the thematic terms comprise both keyphrases and related terms. Building on keyphrase extraction, the approach applies an association rule mining algorithm from data mining to extract related terms, which can be used for extended search and related information retrieval and which give users a better understanding of the document. Experimental results show that the method is effective in extracting both keyphrases and related terms.

Key words: keyphrase extraction, association rule mining, text mining
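
The abstract does not spell out the mining procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes each sentence or document has already been reduced to a set of terms (a "transaction"), and it scores rules of the form keyphrase -> term by the standard support and confidence measures. The function name `related_terms`, the thresholds, and the toy data are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def related_terms(transactions, keyphrases, min_support=0.2, min_confidence=0.6):
    """Find terms associated with each keyphrase via rules keyphrase -> term.

    support    = fraction of transactions containing both keyphrase and term
    confidence = fraction of keyphrase transactions that also contain the term
    """
    n = len(transactions)
    item_count = Counter()
    pair_count = Counter()
    for transaction in transactions:
        items = set(transaction)
        item_count.update(items)
        # Count every unordered pair of terms that co-occur in this transaction.
        pair_count.update(frozenset(p) for p in combinations(sorted(items), 2))

    result = {}
    for keyphrase in keyphrases:
        rules = []
        if item_count[keyphrase] > 0:
            for pair, count in pair_count.items():
                if keyphrase not in pair:
                    continue
                (term,) = pair - {keyphrase}
                support = count / n
                confidence = count / item_count[keyphrase]
                if support >= min_support and confidence >= min_confidence:
                    rules.append((term, support, confidence))
        # Strongest rules first; ties broken alphabetically for stable output.
        rules.sort(key=lambda r: (-r[2], -r[1], r[0]))
        result[keyphrase] = rules
    return result


# Toy transactions (hypothetical): each set holds the terms of one sentence.
transactions = [
    {"association rules", "data mining", "support"},
    {"association rules", "data mining"},
    {"association rules", "text mining"},
    {"keyphrase extraction", "text mining"},
    {"association rules", "data mining", "text mining"},
]

print(related_terms(transactions, ["association rules"]))
```

In this sketch the keyphrases are taken as given, which mirrors the abstract's description of related-term extraction as a step that builds on keyphrase extraction. A full system would also need Chinese word segmentation and a keyphrase extractor, neither of which is shown here.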
