Abstract:
Distinguishing similar topics and handling topic drift are two important factors that affect the performance of topic detection. To address both problems, this paper proposes a topic detection approach based on an adaptive center vector. The approach uses named-entity information in the feature representation, combining a named-entity vector with a keyword vector to construct the topic center vector so that similar topics can be distinguished effectively. Following the idea of single-pass clustering, the algorithm revises the topic center dynamically during clustering to counter topic drift. Experimental results show that the approach effectively improves topic detection performance.
Key words:
topic detection,
topic drift,
named entity,
topic center vector
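Abstract (translated from the Chinese): To address two important factors that affect topic detection performance, namely discriminating similar topics and topic drift, a topic detection method based on an adaptive center vector is proposed. The method applies named-entity information to the feature representation and combines a named-entity vector with a keyword vector to represent the topic center vector, so that similar topics can be distinguished effectively. Topics are detected by incremental clustering, and the topic center is revised continuously during clustering to counter topic drift. Experimental results and performance comparisons show that the method effectively improves topic detection performance.
Key words (translated from the Chinese):
topic detection,
topic drift,
named entity,
topic center vector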
PAN Yuan; LI Bi-cheng; ZHANG Xian-fei. Topic Detection Approach Based on Adaptive Center Vector[J]. Computer Engineering, 2009, 35(3): 80-82.
潘 渊;李弼程;张先飞. 一种基于自适应重心向量的主题检测方法[J]. 计算机工程, 2009, 35(3): 80-82.
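
The abstract describes the approach only at a high level, so the following is a minimal sketch of how single-pass clustering with a two-part, adaptive topic center might look, not the authors' implementation. Several things here are assumptions that do not come from the source: the use of raw term counts as weights, cosine similarity, the linear weight `alpha` between the named-entity and keyword similarities, the `threshold` for opening a new topic, and the running-mean update of the center. The names `Topic`, `detect_topics`, `alpha`, and `threshold` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class Topic:
    """A topic whose center is a named-entity vector plus a keyword vector."""

    def __init__(self, entities, keywords):
        self.entity_center = Counter(entities)
        self.keyword_center = Counter(keywords)
        self.size = 1  # number of stories absorbed so far

    def similarity(self, entities, keywords, alpha):
        # Assumed combination rule: weighted sum of the two cosine scores.
        entity_sim = cosine(self.entity_center, Counter(entities))
        keyword_sim = cosine(self.keyword_center, Counter(keywords))
        return alpha * entity_sim + (1 - alpha) * keyword_sim

    def absorb(self, entities, keywords):
        # Assumed adaptive update: move each center to the running mean
        # of all stories assigned to the topic so far.
        n = self.size
        pairs = ((self.entity_center, entities), (self.keyword_center, keywords))
        for center, terms in pairs:
            doc = Counter(terms)
            for term in set(center) | set(doc):
                center[term] = (center[term] * n + doc[term]) / (n + 1)
        self.size = n + 1


def detect_topics(stories, alpha=0.5, threshold=0.3):
    """Single-pass clustering: each story is seen once, in arrival order.

    `stories` is a list of (entities, keywords) pairs, each a list of strings.
    Returns the topic index assigned to each story and the list of topics.
    """
    topics, labels = [], []
    for entities, keywords in stories:
        best, best_sim = None, 0.0
        for index, topic in enumerate(topics):
            sim = topic.similarity(entities, keywords, alpha)
            if sim > best_sim:
                best, best_sim = index, sim
        if best is not None and best_sim >= threshold:
            topics[best].absorb(entities, keywords)  # revise the center
            labels.append(best)
        else:
            topics.append(Topic(entities, keywords))  # open a new topic
            labels.append(len(topics) - 1)
    return labels, topics
```

In this sketch a larger `alpha` gives named entities more say in separating topics that share general vocabulary, while the center update in `absorb` lets a topic follow its stories as their wording shifts. Both values would need tuning on real data.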