An Information Retrieval Method Based on Language Concept Space Using Clustering Method

doi:10.3969/j.issn.1000-3428.2007.08.017

Computer Engineering, 2007, Vol. 33, Issue (08): 51-53.

WU Chen1,2, ZHANG Quan2

(1. Graduate School, Chinese Academy of Sciences, Beijing 100039; 2. Institute of Acoustics, Chinese Academy of Sciences, Beijing 100080)

Received: 1900-01-01 Revised: 1900-01-01 Online: 2007-04-20 Published: 2007-04-20

Abstract

Abstract: This paper proposes an information retrieval method that takes concepts in the language concept space as the objects of clustering, together with a clustering algorithm suited to that method. The clustering algorithm uses curve fitting to determine the threshold automatically and to partition the texts into clusters, and then completes the clustering process through iteration among clusters and a result-revision phase. Introducing concepts helps to resolve word synonymy and polysemy. Experiments indicate that a retrieval system using this method achieves higher precision and recall than the Jelinek-Mercer smoothing model and the k-means model.

Key words: Information retrieval, Language concept space, Clustering, Automatic threshold detection and cluster partitioning
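
For readers unfamiliar with the Jelinek-Mercer baseline named in the abstract, the following is a minimal sketch of query-likelihood scoring with Jelinek-Mercer smoothing, which interpolates a document's term frequencies with collection-wide frequencies. It is not the paper's method or its experimental setup: the function name `jm_log_score`, the toy documents, and the value `lam=0.5` are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def jm_log_score(query, doc, collection_counts, collection_len, lam=0.5):
    """Log query likelihood of `query` under `doc` with Jelinek-Mercer smoothing.

    P(w|d) = (1 - lam) * tf(w, d) / |d| + lam * cf(w) / |C|
    `lam` is an assumed interpolation weight, not a value from the paper.
    """
    doc_counts = Counter(doc)
    doc_len = len(doc)
    score = 0.0
    for w in query:
        p_doc = doc_counts[w] / doc_len if doc_len else 0.0
        p_coll = collection_counts[w] / collection_len
        p = (1.0 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            # The term never occurs anywhere in the collection.
            return float("-inf")
        score += math.log(p)
    return score


# Toy collection (hypothetical data, for illustration only).
docs = [
    ["concept", "space", "clustering"],
    ["query", "likelihood", "retrieval"],
]
collection_counts = Counter(w for d in docs for w in d)
collection_len = sum(len(d) for d in docs)

query = ["concept", "clustering"]
scores = [jm_log_score(query, d, collection_counts, collection_len) for d in docs]

# Rank document indices from most to least likely to generate the query.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Because the collection term keeps every probability nonzero for words seen anywhere in the collection, a document that lacks a query word is penalized rather than excluded outright. This baseline matches surface words only, which is the limitation that concept-level clustering is meant to address.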

WU Chen; ZHANG Quan. An Information Retrieval Method Based on Language Concept Space Using Clustering Method[J]. Computer Engineering, 2007, 33(08): 51-53.

http://www.ecice06.com/CN/Y2007/V33/I08/51

