Denoising and Sample Reduction for Large-scale Sample Set Based on Distance of Nearest Neighbors

doi:10.3969/j.issn.1000-3428.2011.05.062

Computer Engineering, 2011, Vol. 37, Issue (5): 184-186. doi: 10.3969/j.issn.1000-3428.2011.05.062

• Networks and Communications •

CHEN Sheng-bing¹, LI Long-shu²

(1. Department of Computer Science and Technology, Hefei University, Hefei 230601, China; 2. Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, Anhui University, Hefei 230039, China)

Online: 2011-03-05 Published: 2012-10-31

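About the authors: CHEN Sheng-bing (b. 1973), male, Ph.D. candidate, main research interests: machine learning algorithms and artificial intelligence; LI Long-shu, professor and doctoral supervisor
Funding:
National Natural Science Foundation of China (60273043); Natural Science Foundation of Anhui Province (090412054)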

Abstract

Abstract: After analyzing the limitations of traditional sample reduction methods, this paper proposes a new distance model and gives methods for measuring the intra-class distance and inter-class distance of samples. Using this distance model, it describes methods for noise identification and sample importance evaluation, and proposes a training sample reduction algorithm. The algorithm removes noise samples and, according to sample similarity, the inter-class distance of each sample, and the number of already deleted samples around it, removes less important training samples directly from the original sample space. Simulation results show that the distance model is less subject to chance and more robust to noise, and that the reduction performance of the algorithm is better than that of traditional sample reduction methods.

Key words: support vector, denoising, sample reduction, large-scale sample set
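
The abstract does not give the formulas of the distance model, so the following is only a minimal sketch of how nearest-neighbor intra-class and inter-class distances could be used to flag noise. Everything in it is an assumption for illustration, not the authors' method: the use of the mean Euclidean distance to the k nearest neighbors, the rule "noise if the intra-class distance exceeds the inter-class distance", and the hypothetical names mean_knn_distance, class_distances, and flag_noise. The importance evaluation and reduction steps are not sketched because the abstract does not specify them.

```python
import math


def mean_knn_distance(point, others, k):
    """Mean Euclidean distance from `point` to its k nearest points in `others`.

    Returns infinity when `others` is empty (assumed convention).
    """
    nearest = sorted(math.dist(point, o) for o in others)[:k]
    return sum(nearest) / len(nearest) if nearest else math.inf


def class_distances(samples, labels, k=3):
    """Return two lists (intra, inter) with one entry per sample.

    intra[i]: mean distance to the k nearest samples of the same class.
    inter[i]: mean distance to the k nearest samples of any other class.
    This definition is an assumption, not the paper's distance model.
    """
    intra, inter = [], []
    for i, (x, y) in enumerate(zip(samples, labels)):
        same = [s for j, (s, l) in enumerate(zip(samples, labels))
                if l == y and j != i]
        diff = [s for s, l in zip(samples, labels) if l != y]
        intra.append(mean_knn_distance(x, same, k))
        inter.append(mean_knn_distance(x, diff, k))
    return intra, inter


def flag_noise(samples, labels, k=3):
    """Assumed rule: a sample is noise if it lies closer to another class
    than to its own class (intra-class distance > inter-class distance)."""
    intra, inter = class_distances(samples, labels, k)
    return [a > b for a, b in zip(intra, inter)]
```

For example, with two well-separated clusters and one point placed inside the first cluster but labeled as the second, flag_noise marks only that point. Averaging over k neighbors rather than using the single nearest neighbor is one simple way to make the measure less sensitive to an individual outlier.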

CLC Number: TP181

CHEN Sheng-bing, LI Long-shu. Denoising and Sample Reduction for Large-scale Sample Set Based on Distance of Nearest Neighbors[J]. Computer Engineering, 2011, 37(5): 184-186.

URL: http://www.ecice06.com/EN/10.3969/j.issn.1000-3428.2011.05.062

http://www.ecice06.com/EN/Y2011/V37/I5/184
