Computer Engineering ›› 2007, Vol. 33 ›› Issue (18): 95-96,1. doi: 10.3969/j.issn.1000-3428.2007.18.034

• Software Technology and Database •

Lucene Search Engine

ZHOU Deng-peng, XIE Kang-lin

  1. (Department of Computer Science and Engineering, Shanghai Jiaotong University, Shanghai 200240)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-09-20 Published: 2007-09-20

Abstract: Lucene is a high-performance, easily extensible full-text information retrieval toolkit written in Java that makes it straightforward to add full-text indexing and search capabilities to applications. This paper discusses the vector space model used in Lucene, analyzes the structure of Lucene's index files and its search ranking algorithm, and describes Lucene's compression algorithm. An experiment is carried out to verify Lucene's index-building process.

Key words: Lucene, vector space model, ranking algorithm, information retrieval

CLC number: