Computer Engineering ›› 2011, Vol. 37 ›› Issue (01): 78-80. doi: 10.3969/j.issn.1000-3428.2011.01.027

• Networks and Communications •

English Texts Retrieval Algorithm Based on SVD

GAO Shi-long   

  1. (Department of Mathematics, Leshan Normal University, Leshan 614000, China)
  • Online: 2011-01-05 Published: 2010-12-31

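  • About the author: GAO Shi-long (b. 1975), male, associate professor and Ph.D. candidate; main research interests: weak signal detection and data mining
  • Supported by:
    Sichuan Provincial Department of Education Fund project "Detection and Parameter Estimation of Linear Frequency-Modulated Signals Based on Chaotic Systems" (09ZB026)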

Abstract: A retrieval algorithm for English texts is proposed. Keywords are extracted from each text, a state matrix of the keywords is computed from their transition probabilities, and the first singular vector obtained through Singular Value Decomposition (SVD) is used as the complex feature vector. The cosine similarity between these vectors measures the similarity between the query and the documents. Experimental results indicate that the algorithm outperforms the traditional LSA algorithm in both retrieval precision and computational efficiency.

Key words: text retrieval, transition probability, Singular Value Decomposition (SVD), state matrix
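
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how the state matrix is built, so everything below is an assumption made for illustration only: transitions are counted between consecutive keywords, rows are normalized to probabilities, the first left singular vector is taken as the feature, and the vectors are real-valued (the abstract speaks of a "complex feature vector"). The function names `transition_matrix`, `feature_vector`, `cosine`, and `rank` are hypothetical and not taken from the paper.

```python
import numpy as np


def transition_matrix(tokens, vocab):
    """Assumed state matrix: P[i, j] = share of times keyword i is followed by keyword j."""
    index = {word: k for k, word in enumerate(vocab)}
    seq = [index[t] for t in tokens if t in index]  # keep keywords only, in order
    counts = np.zeros((len(vocab), len(vocab)))
    for a, b in zip(seq, seq[1:]):
        counts[a, b] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows without outgoing transitions are left as all zeros.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)


def feature_vector(tokens, vocab):
    """First left singular vector of the assumed state matrix."""
    matrix = transition_matrix(tokens, vocab)
    if not matrix.any():  # no transitions observed: no usable direction
        return np.zeros(len(vocab))
    u, _, _ = np.linalg.svd(matrix)
    v = u[:, 0]
    # Singular vectors are defined up to sign; make the largest component positive.
    if v[np.argmax(np.abs(v))] < 0:
        v = -v
    return v


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def rank(query_tokens, docs, vocab):
    """Return document indices sorted by decreasing similarity, plus the scores."""
    q = feature_vector(query_tokens, vocab)
    scores = [cosine(q, feature_vector(d, vocab)) for d in docs]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Toy usage with a made-up three-keyword vocabulary.
vocab = ["svd", "matrix", "text"]
docs = [
    ["svd", "matrix", "svd", "matrix", "text"],
    ["text", "text", "text", "svd"],
]
order, scores = rank(["svd", "matrix", "svd", "matrix", "text"], docs, vocab)
```

In this sketch each text is reduced to a single vector whose length equals the number of keywords, so comparing a query with a document costs one dot product. When the largest singular value is repeated, the first singular vector is not unique, and a real implementation would need a rule for that case.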
