Computer Engineering (计算机工程) ›› 2006, Vol. 32 ›› Issue (21): 6-8. doi: 10.3969/j.issn.1000-3428.2006.21.003

• Doctoral Thesis •

Study on Dimensionality Curse in the Nearest Neighbor Queries Based on Statistics

BO Shukui (薄树奎), LI Shengyang (李盛阳), ZHU Chongguang (朱重光)

  1. (Institute of Remote Sensing Applications, Chinese Academy of Sciences, Beijing 100101)
  • Received:1900-01-01 Revised:1900-01-01 Online:2006-11-05 Published:2006-11-05

Abstract: This paper studies how dimensionality affects the results of nearest neighbor queries in high-dimensional data spaces and proposes a method for evaluating that effect. Using a statistical argument, it shows that under certain conditions the distances between the query point and the data points converge toward one another as dimensionality increases, so that similarity queries become unstable and the “nearest neighbor” loses its meaning. It also describes how the degree of this degradation is distributed as dimensionality grows. The distributions of two distance statistics are given, which allow the nearest neighbor query problem to be estimated theoretically, and experimental results are presented that agree with the two distributions.

Key words: Instability, Statistics, Dimensionality curse, Similarity, Nearest neighbor
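
The distance concentration described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method, and it does not reproduce the two distance statistics of the paper, which the abstract does not specify. It assumes data drawn uniformly from the unit hypercube, Euclidean distance, and the relative contrast (d_max - d_min) / d_min as an indicator of instability. The function name `relative_contrast` and its parameters are hypothetical.

```python
import math
import random


def relative_contrast(dim, n_points=500, seed=0):
    """Return (d_max - d_min) / d_min for one random query point.

    Assumption (not from the paper): points are drawn uniformly from
    the unit hypercube [0, 1]^dim and compared by Euclidean distance.
    """
    rng = random.Random(seed)  # local generator, so runs are repeatable
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    # A small value means the farthest point is barely farther away
    # than the nearest one, so the "nearest neighbor" is poorly separated.
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions the printed contrast should shrink as the dimension grows, which is the qualitative behavior the abstract describes. Other data distributions or distance measures may behave differently.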
