Computer Engineering (计算机工程) ›› 2006, Vol. 32 ›› Issue (20): 183-184. doi: 10.3969/j.issn.1000-3428.2006.20.067

• Artificial Intelligence and Recognition Technology •

Document Classification in Different Granularity

ZHAO Xinxin, ZHU Tiedan, LIU Yushu

  1. (Department of Computer Science & Engineering, School of Information Science & Technology, Beijing Institute of Technology, Beijing 100081)
  • Online: 2006-10-20 Published: 2006-10-20

Abstract: This paper proposes a sentence space model (SSM) and a classification algorithm based on it. The vector space model represents a document at word granularity, whereas SSM represents it at sentence granularity; the paper compares the recall and precision of the two models on the same classification problem. Experiments show that SSM achieves better classification performance than the vector space model in many cases.

Key words: granularity, vector space model, sentence space model
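
The abstract does not give the details of the sentence space model, so the following is only a minimal illustrative sketch, not the authors' algorithm. It shows what representing one document at word granularity versus sentence granularity might look like, and how recall and precision, the two measures named in the abstract, are computed for one class. The function names (`word_units`, `sentence_units`, `precision_recall`) and the simple regular-expression tokenization are assumptions made for illustration.

```python
import re
from collections import Counter


def word_units(text):
    """Word granularity: the document as one bag of lowercase word tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def sentence_units(text):
    """Sentence granularity: one bag of words per sentence (hypothetical layout)."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]
    return [word_units(s) for s in sentences]


def precision_recall(true_labels, predicted_labels, target):
    """Precision and recall of the predictions for the class `target`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == target and p == target)
    fp = sum(1 for t, p in pairs if t != target and p == target)
    fn = sum(1 for t, p in pairs if t == target and p != target)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


doc = "Stocks fell today. The market closed lower."
words = word_units(doc)          # a single bag covering the whole document
sentences = sentence_units(doc)  # a list with one bag per sentence

# Toy labels, invented purely to exercise the metric.
p, r = precision_recall(["a", "a", "b", "b"], ["a", "b", "b", "b"], "a")
```

In this toy run, class "a" is predicted once and correctly, so precision is 1.0, but only one of its two true members is found, so recall is 0.5. Comparing two representations as the paper does would mean computing these two numbers for each model on the same labeled test set.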