Incremental Selection of Neighborhood Size Parameter for Manifold Learning Algorithms

doi:10.3969/j.issn.1000-3428.2014.08.037

Computer Engineering

SHAO Chao, WAN Chun-hong, ZHAO Jing-yu

(School of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou 450002, China)

Received: 2013-10-22  Online: 2014-08-15  Published: 2014-08-15
About the authors: SHAO Chao (b. 1977), male, associate professor, Ph.D.; main research interests: machine learning and data visualization. WAN Chun-hong and ZHAO Jing-yu, lecturers.
Supported by:
National Natural Science Foundation of China (61202285); Basic and Frontier Technology Research Fund of Henan Province (112300410201); Key Basic Research Program for Science and Technology Research of the Education Department of Henan Province (13B520899).

Abstract

Whether a manifold learning algorithm succeeds depends heavily on choosing a suitable neighborhood size parameter, yet this parameter is usually hard to select efficiently in practice. This paper therefore proposes a method that selects the neighborhood size incrementally. By the local Euclidean property of a manifold, every neighborhood in the neighborhood graph is linear or nearly linear when the neighborhood size is suitable, so the linearity measures of all neighborhoods stay small and fall into a single cluster. Once the neighborhood size becomes unsuitable, some neighborhoods are no longer linear, and the linearity measures no longer fall into a single cluster. The method runs weighted Principal Component Analysis (PCA) on each neighborhood in the neighborhood graph, uses the reconstruction error as that neighborhood's linearity measure, and computes the corresponding Bayesian Information Criterion (BIC) to detect the number of clusters among all reconstruction errors, which allows the neighborhood size to be selected incrementally. Experimental results show that the method requires no extra parameters and runs efficiently.

Key words: manifold learning, neighborhood size, local Euclidean property, weighted Principal Component Analysis (PCA), reconstruction error, Bayesian Information Criterion (BIC)
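The procedure outlined in the abstract can be sketched roughly as follows. This is only an illustration under several assumptions, not the authors' implementation: the abstract does not give the weighting scheme for weighted PCA, so a Gaussian-style weight on the distance to the neighborhood center is assumed; the BIC is assumed to take an X-means-style form with one shared variance; and the cluster test only compares one cluster against the best two-way split. All function names (weighted_pca_error, bic, count_clusters, select_k) are hypothetical.

```python
import numpy as np

def weighted_pca_error(P, center, d):
    """Share of weighted variance left outside the top-d principal directions."""
    dist2 = ((P - center) ** 2).sum(axis=1)
    # Assumed weighting: points nearer the neighborhood center count more.
    w = np.exp(-dist2 / (dist2.mean() + 1e-12))
    w /= w.sum()
    Q = P - w @ P                           # subtract the weighted mean
    C = (Q * w[:, None]).T @ Q              # weighted covariance matrix
    ev = np.clip(np.linalg.eigvalsh(C), 0.0, None)   # ascending eigenvalues
    return ev[:-d].sum() / (ev.sum() + 1e-12)

def bic(n, sizes, sse):
    """Assumed X-means-style BIC for 1-D clusters with one shared variance
    (larger is better)."""
    K = len(sizes)
    var = max(sse / (n - K), 1e-300)        # guard against log(0)
    loglik = (sum(m * np.log(m / n) for m in sizes)
              - 0.5 * n * np.log(2 * np.pi * var) - 0.5 * (n - K))
    n_params = (K - 1) + K + 1              # mixing weights, means, variance
    return loglik - 0.5 * n_params * np.log(n)

def count_clusters(values):
    """Return 2 if the best two-way split beats one cluster on BIC, else 1.
    Needs at least 3 values."""
    s = np.sort(np.asarray(values, dtype=float))
    n = len(s)
    bic_one = bic(n, [n], s.var() * n)
    # Exact 1-D 2-means: try every cut of the sorted values.
    sse_two, m = min((s[:m].var() * m + s[m:].var() * (n - m), m)
                     for m in range(1, n))
    bic_two = bic(n, [m, n - m], sse_two)
    return 2 if bic_two > bic_one else 1

def select_k(X, d, k_min, k_max):
    """Grow k from k_min; return the last k whose reconstruction errors
    still form one cluster, or None if even k_min fails the test."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    order = np.argsort(d2, axis=1)          # sorted once, reused for every k
    chosen = None
    for k in range(k_min, k_max + 1):
        # Each neighborhood holds the point itself plus its k nearest neighbors.
        errs = [weighted_pca_error(X[order[i, :k + 1]], X[i], d)
                for i in range(len(X))]
        if count_clusters(errs) > 1:
            break
        chosen = k
    return chosen
```

Given an (n, D) array X and an assumed intrinsic dimension d, a call such as select_k(X, d=1, k_min=3, k_max=20) walks k upward and stops at the first value where the errors split into two clusters. Sorting the distances once and slicing a longer prefix for each k is what keeps the search incremental in this sketch. The bounds k_min and k_max exist only to make the loop finite here and are not parameters named in the abstract.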

CLC number: TP18

SHAO Chao, WAN Chun-hong, ZHAO Jing-yu. Incremental Selection of Neighborhood Size Parameter for Manifold Learning Algorithms[J]. Computer Engineering, doi: 10.3969/j.issn.1000-3428.2014.08.037.

http://www.ecice06.com/CN/Y2014/V40/I8/194

Editor: SUO Shu-zhi

