Influence of Data Normalization Methods on K-Nearest Neighbor Classifier

doi:10.3969/j.issn.1000-3428.2010.22.063

Computer Engineering, 2010, Vol. 36, Issue (22): 175-177. doi: 10.3969/j.issn.1000-3428.2010.22.063

• Networks and Communications •

CAI Wei-ling1,2, CHEN Dong-xia1,2

(1. College of Computer Science and Technology, Nanjing Normal University, Nanjing 210097, China; 2. Jiangsu Research Center of Information Security & Confidential Engineering, Nanjing 210097, China)

Online: 2010-11-20 Published: 2010-11-18

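About the authors: CAI Wei-ling (born 1982), female, lecturer, Ph.D.; main research interests: pattern recognition and machine learning. CHEN Dong-xia, lecturer, M.S.
Supported by:
Natural Science Research Foundation of Jiangsu Higher Education Institutions (09KJB520007); Scientific Research Start-up Foundation of Nanjing Normal University (2009101XGQ0066); Aeronautical Science Foundation of China (20090152001); Jiangsu Province Industry-University-Research Prospective Joint Research Foundation (BY2009100)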

Abstract

Abstract: This paper examines the influence of three data normalization methods on the performance of the K-Nearest Neighbor (KNN) classifier. Simulation results on 12 real-life benchmark datasets and 1 artificial dataset show that, on most datasets, data normalization improves the recognition rate of the KNN classifier. Motivated by these results, the paper explores why data normalization works and presents a rule, based on the distribution characteristics of the data, for deciding when a normalization method should be applied to a dataset.

Key words: K-Nearest Neighbor (KNN) classifier, data normalization method, Euclidean distance

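Abstract (Chinese version, translated): This paper discusses the influence of three methods, min-max normalization, z-score normalization, and decimal scaling normalization, on the performance of the K-nearest neighbor classifier, and compares them experimentally on 12 standard real-world UCI datasets and 1 artificial dataset. The experimental results show that normalization improves the recognition rate of the K-nearest neighbor classifier on most datasets. Based on these results, the paper investigates the underlying reason why data normalization improves classifier performance and gives a general criterion for deciding whether to apply data normalization according to the numerical distribution of the data attributes.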
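The three methods named in the abstract can be illustrated with a small sketch. This page gives no formulas, so the code below assumes the usual textbook definitions of min-max, z-score, and decimal scaling normalization; it is not taken from the paper. The function names, the five-point toy dataset, and the choice of k = 1 are all hypothetical. The example only shows how a feature with a large numeric range can dominate the Euclidean distance used by KNN, and how rescaling changes which neighbor is nearest.

```python
import math

def min_max(column):
    """Rescale values linearly to [0, 1] (assumed textbook definition)."""
    lo, hi = min(column), max(column)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in column]

def z_score(column):
    """Subtract the mean and divide by the population standard deviation."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std if std else 0.0 for v in column]

def decimal_scaling(column):
    """Divide by the smallest power of 10 that makes every |value| < 1."""
    peak = max(abs(v) for v in column)
    j = 0
    while peak / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in column]

def normalize(rows, method):
    """Apply a normalization method to each feature column of the rows."""
    columns = [method(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]

def knn_predict(train_x, train_y, query, k=1):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

# Hypothetical toy data: feature 1 separates the classes but spans under 1 unit;
# feature 2 is uninformative but spans thousands of units.
rows = [(0.1, 1000), (0.2, 5000), (0.9, 1200), (0.8, 5200), (0.85, 4900)]
labels = ["a", "a", "b", "b"]  # the last row is the unlabeled query

def classify(rows, labels, method=None):
    # Simplification: statistics are computed over training points and the
    # query together; a real pipeline would fit them on training data only.
    data = normalize(rows, method) if method else [list(r) for r in rows]
    return knn_predict(data[:-1], labels, data[-1], k=1)

for name, method in [("raw", None), ("min-max", min_max),
                     ("z-score", z_score), ("decimal scaling", decimal_scaling)]:
    print(name, "->", classify(rows, labels, method))
```

On this toy data the raw features assign the query to class "a", because the distance is dominated by the second feature, while each of the three normalized versions assigns it to class "b". This is only an illustration of the mechanism; it says nothing about the paper's 13 datasets or its reported results.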

CLC Number: TP391

CAI Wei-Ling, CHEN Dong-Xia. Influence of Data Normalization Methods on K-Nearest Neighbor Classifier[J]. Computer Engineering, 2010, 36(22): 175-177.

URL: http://www.ecice06.com/EN/10.3969/j.issn.1000-3428.2010.22.063

http://www.ecice06.com/EN/Y2010/V36/I22/175
