Abstract:
This paper presents a metadata clustering approach based on maximal frequent paths and a feature vector matrix. Each metadata record is first represented as a metadata tree. Maximal frequent paths are then mined from the metadata trees and treated as their common features. These paths are weighted to construct a feature vector matrix, from which the similarity between metadata trees is calculated for clustering. Worked examples indicate that the approach improves both the efficiency and the effectiveness of metadata clustering by reducing the number of paths involved in clustering and by assigning weights to paths.
Key words: metadata clustering, metadata tree, frequent path, feature vector matrix
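Abstract (from the Chinese version): This paper explores the maximal frequent paths of metadata trees and an effective way to cluster metadata. After the metadata trees are constructed, maximal frequent paths are taken as their common features; the relevant paths are weighted to build a feature matrix, the similarity between metadata trees is calculated, and the metadata are clustered. Case analysis shows that, by reducing the number of paths involved in clustering and assigning weights to paths, the method improves the efficiency and effectiveness of metadata clustering.
Key words: metadata clustering, metadata tree, frequent path, feature vector matrix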
FENG Xiu-Zhen, CHEN Ni. Metadata Clustering Method Based on Maximal Frequent Path[J]. Computer Engineering, 2010, 36(21): 40-42.
冯秀珍, 陈旎. 基于最大频繁路径的元数据聚类方法[J]. 计算机工程, 2010, 36(21): 40-42.
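The pipeline summarized in the abstract (metadata tree, maximal frequent paths, weighted feature matrix, similarity) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: trees are nested dictionaries, a path is a label sequence starting at the root, a path is frequent when it occurs in at least `min_support` trees, a frequent path is maximal when it is not a proper prefix of another frequent path, the weight of a path is simply its length, and similarity is the cosine of two rows of the matrix. The abstract does not specify the weighting scheme or the similarity measure, so all function names and the sample trees below are hypothetical.

```python
from math import sqrt


def root_paths(tree, prefix=()):
    """Yield every root-to-node path of a nested-dict tree as a tuple of labels."""
    for label, children in tree.items():
        path = prefix + (label,)
        yield path
        yield from root_paths(children, path)


def maximal_frequent_paths(trees, min_support):
    """Return the maximal frequent paths and the path set of each tree.

    Assumption: a path is frequent if it occurs in at least min_support trees,
    and maximal if it is not a proper prefix of another frequent path.
    """
    path_sets = [set(root_paths(t)) for t in trees]
    counts = {}
    for paths in path_sets:
        for p in paths:
            counts[p] = counts.get(p, 0) + 1
    frequent = {p for p, c in counts.items() if c >= min_support}
    maximal = sorted(
        p for p in frequent
        if not any(q != p and q[:len(p)] == p for q in frequent)
    )
    return maximal, path_sets


def feature_matrix(trees, min_support):
    """Build one weighted row per tree over the maximal frequent paths."""
    maximal, path_sets = maximal_frequent_paths(trees, min_support)
    # Hypothetical weighting: a longer path carries more structure, so weight = length.
    weights = [len(p) for p in maximal]
    matrix = [
        [w if p in paths else 0 for p, w in zip(maximal, weights)]
        for paths in path_sets
    ]
    return maximal, matrix


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Three small hypothetical metadata trees.
trees = [
    {"record": {"title": {}, "creator": {"name": {}}}},
    {"record": {"title": {}, "creator": {"name": {}, "email": {}}}},
    {"record": {"title": {}, "date": {}}},
]

paths, matrix = feature_matrix(trees, min_support=2)
sim_01 = cosine(matrix[0], matrix[1])
sim_02 = cosine(matrix[0], matrix[2])
```

With `min_support=2`, only two paths remain as features (`record/creator/name` and `record/title`), so each tree is described by a two-element row instead of its full path set. This mirrors the abstract's point that restricting the comparison to maximal frequent paths reduces the number of paths involved. The resulting pairwise similarities could then be passed to any standard clustering procedure.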