Abstract: To address the semi-structured nature of XML documents, a new approach to modeling XML retrieval result fragments is proposed, and a method is designed that combines content and structural semantic information, including the relatedness of keywords, to measure the similarity between the corresponding documents. A new algorithm, Dynamic k-means Soft Clustering (DKMSC), is presented to meet the requirements of clustering retrieval results. Experiments show that the XML-oriented retrieval result clustering method achieves better clustering quality than the traditional method.
Key words: XML retrieval result clustering, structure semantic similarity, content similarity, clustering algorithm
CLC number:
YU Hong (余宏); WAN Chang-xuan (万常选). Retrieval Result Clustering Method Based on XML[J]. Computer Engineering (计算机工程), 2010, 36(1): 85-86,9.