Computer Engineering ›› 2009, Vol. 35 ›› Issue (7): 46-48. doi: 10.3969/j.issn.1000-3428.2009.07.015

• Software Technology and Database •

Search Results Clustering Algorithm Based on Named Entities

CHEN Yong-chao, LIU Gui-quan   

  1. (Department of Computer Science and Technology, University of Science and Technology of China, Hefei 230027)
  • Received:1900-01-01 Revised:1900-01-01 Online:2009-04-05 Published:2009-04-05

一种基于命名实体的搜索结果聚类算法

陈永超,刘贵全   

  1. (中国科学技术大学计算机科学技术系,合肥 230027)

Abstract: This paper introduces NEC, a new method for clustering search results based on named entities, which improves the readability of cluster labels. Named entities carry concrete meaning in their own right, indicate the themes of the documents in which they appear, and are more readable for users. The algorithm uses the named entities found in the result documents as cluster labels and obtains the final clusters after label-selection and cluster-merging steps. Experiments show that this is a feasible approach to search results clustering.

Key words: named entities, search results clustering, index

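Abstract (from the Chinese): To address the poor readability of the cluster labels produced by existing search results clustering methods, this paper proposes NEC, a search results clustering method based on named entities. As basic information elements of a text, named entities carry concrete meaning, characterize topics better than ordinary words, and are more readable. The algorithm takes the named entities found in the search result documents as cluster labels and, after label-selection and cluster-merging strategies, forms the final clustering result, improving the readability of the cluster labels. Experiments show that the method is a feasible approach to search results clustering.

Key words (from the Chinese): named entities, search results clustering, index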
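
The pipeline outlined in the abstract (named entities as cluster labels, followed by label selection and cluster merging) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual selection or merging rules, so everything here is an assumption made for illustration: the function name `cluster_by_entities`, the parameters `min_docs` and `merge_threshold`, the document-frequency filter, and the Jaccard-overlap merge. Named-entity extraction is assumed to have been done already.

```python
from collections import defaultdict


def cluster_by_entities(doc_entities, min_docs=2, merge_threshold=0.5):
    """Group documents under named-entity labels.

    doc_entities maps a document id to the set of named entities already
    extracted from that document. min_docs and merge_threshold are
    illustrative parameters, not values taken from the paper.
    """
    # Candidate clusters: one per entity, holding the documents that mention it.
    candidates = defaultdict(set)
    for doc_id, entities in doc_entities.items():
        for entity in entities:
            candidates[entity].add(doc_id)

    # Label selection (assumed rule): keep entities found in enough documents.
    selected = {e: docs for e, docs in candidates.items() if len(docs) >= min_docs}

    # Cluster merging (assumed rule): visit clusters from largest to smallest
    # and fold one into an earlier cluster when their document sets overlap
    # strongly, measured by Jaccard similarity.
    ordered = sorted(selected.items(), key=lambda item: (-len(item[1]), item[0]))
    clusters = []  # each entry is (list of labels, set of document ids)
    for label, docs in ordered:
        for labels, merged_docs in clusters:
            overlap = len(docs & merged_docs) / len(docs | merged_docs)
            if overlap >= merge_threshold:
                labels.append(label)
                merged_docs |= docs  # in-place update of the stored set
                break
        else:
            clusters.append(([label], set(docs)))
    return clusters
```

Calling `cluster_by_entities` with a dictionary such as `{"d1": {"Hefei", "USTC"}, "d2": {"Hefei", "USTC"}}` returns a list of `(labels, documents)` pairs. Because every label is an entity that occurs in the documents themselves, the labels stay readable, which is the property the abstract emphasizes.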
