Abstract: Based on the characteristics of Chinese short message text classification, this paper proposes methods for merging synonymous concepts, focusing hypernym and hyponym concepts, and determining the key words of a short message. A topic sentence selection algorithm extracts the theme of each short message, and the KNN algorithm then classifies the messages by theme. Simulation results show that the approach effectively improves the classification speed for short message texts.
Key words:
short message text,
KNN algorithm,
topic sentence
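The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the synonym table, the training messages, and the function names are hypothetical, and a simple bag-of-concepts vector with cosine similarity stands in for the topic sentence selection step, whose details the abstract does not give. English tokens are used for readability; Chinese text would first need word segmentation.

```python
from collections import Counter
import math

# Hypothetical synonym table: maps each surface word to one canonical concept.
SYNONYMS = {"film": "movie", "cinema": "movie", "match": "game", "soccer": "football"}

# Hypothetical labeled training messages.
TRAINING = [
    ("great film tonight", "entertainment"),
    ("new movie in the cinema", "entertainment"),
    ("football match on sunday", "sports"),
    ("soccer game score", "sports"),
    ("watch the game tonight", "sports"),
]

def normalize(tokens):
    """Merge synonymous words into a single canonical concept."""
    return [SYNONYMS.get(token, token) for token in tokens]

def theme_vector(message):
    """Bag-of-concepts counts, used here as a stand-in for the message theme."""
    return Counter(normalize(message.lower().split()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_classify(message, training, k=3):
    """Return the majority label among the k most similar training messages."""
    query = theme_vector(message)
    ranked = sorted(
        training,
        key=lambda item: cosine(query, theme_vector(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    print(knn_classify("any good cinema this week", TRAINING))
    print(knn_classify("soccer score tonight", TRAINING))
```

Merging synonyms before vectorizing shrinks the vocabulary, which is one plausible reason a theme-based representation could speed up the nearest-neighbor search on very short texts.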
LIU Jin-ling (刘金岭). Study on Chinese Short Message Text Classification Based on Theme[J]. Computer Engineering, 2010, 36(4): 30-32.