Improved Algorithm of Web Document Representation Based on Vector Space Model

doi:10.3969/j.issn.1000-3428.2006.03.048

Computer Engineering, 2006, Vol. 32, Issue (3): 134-135, 139.

• Networks and Communications •

ZENG Zhiyuan, ZHANG Li

College of Hydroelectricity and Digital Engineering, Huazhong University of Science and Technology, Wuhan 430074

Online: 2006-02-05 Published: 2006-02-05

Abstract

This paper introduces a new text representation algorithm for Web document filtering systems. Compared with the traditional vector space model (VSM), this improved VSM-based algorithm filters faster and with higher precision. The algorithm takes feature terms directly from the feature set of the filtering template and processes only the places in the Web document where those terms appear. It then assigns each term a weighting coefficient according to the Web page tag that encloses it, which gives a more accurate measure of the term's importance in the document. Finally, it builds the Web document representation model from these weights.

Key words: Web document; Text representation; Vector space model (VSM); Feature term; Weighting
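The steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the tag coefficients or the exact weighting formula, so the values in `TAG_COEFF`, the rule of taking the largest coefficient among the enclosing tags, and the names `TemplateTermWeigher` and `represent` are all assumptions made for illustration. The sketch also assumes whitespace-delimited terms, so it would need a word segmenter for Chinese text.

```python
import re
from html.parser import HTMLParser

# Hypothetical tag coefficients; the abstract does not state the paper's values.
TAG_COEFF = {"title": 3.0, "h1": 2.0, "b": 1.5}
DEFAULT_COEFF = 1.0
SKIPPED_TAGS = {"script", "style"}  # assumption: ignore non-visible text


class TemplateTermWeigher(HTMLParser):
    """Accumulates a tag-dependent weight for each filtering-template term."""

    def __init__(self, template_terms):
        super().__init__()
        self.weights = dict.fromkeys((t.lower() for t in template_terms), 0.0)
        self.open_tags = []
        # Search only for the template terms; longer terms are tried first.
        alternatives = sorted(map(re.escape, self.weights), key=len, reverse=True)
        self.pattern = re.compile(r"\b(?:%s)\b" % "|".join(alternatives),
                                  re.IGNORECASE)

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Close the most recent matching tag; tolerate unbalanced markup.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i] == tag:
                del self.open_tags[i:]
                break

    def handle_data(self, data):
        if SKIPPED_TAGS.intersection(self.open_tags):
            return
        # Assumed rule: use the largest coefficient among the enclosing tags.
        coeff = max((TAG_COEFF.get(t, DEFAULT_COEFF) for t in self.open_tags),
                    default=DEFAULT_COEFF)
        for match in self.pattern.finditer(data):
            self.weights[match.group(0).lower()] += coeff


def represent(html, template_terms):
    """Return the document vector, one weight per template term, in order."""
    if not template_terms:
        return []
    weigher = TemplateTermWeigher(template_terms)
    weigher.feed(html)
    weigher.close()
    return [weigher.weights[t.lower()] for t in template_terms]


page = ("<html><head><title>Flood forecast</title></head>"
        "<body><p>The flood model and <b>forecast</b> data.</p></body></html>")
print(represent(page, ["flood", "forecast", "dam"]))  # [4.0, 4.5, 0.0]
```

Because the vector has one component per template term, it can be compared directly with the template's own vector (for example by cosine similarity) without first extracting every term in the page, which is presumably where the speed gain claimed in the abstract comes from.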

ZENG Zhiyuan, ZHANG Li. Improved Algorithm of Web Document Representation Based on Vector Space Model[J]. Computer Engineering, 2006, 32(3): 134-135, 139.

曾致远，张莉. 基于向量空间模型的网页文本表示改进算法[J]. 计算机工程, 2006, 32(3): 134-135，139.

URL:

https://www.ecice06.com/EN/Y2006/V32/I3/134
