Text Classification Algorithm Based on Neighborhood Component Analysis

doi:10.3969/j.issn.1000-3428.2012.15.038

Computer Engineering ›› 2012, Vol. 38 ›› Issue (15): 139-141. doi: 10.3969/j.issn.1000-3428.2012.15.038

• Networks and Communications •

Text Classification Algorithm Based on Neighborhood Component Analysis

LIU Cong-shan, LI Xiang-bao, YANG Yu-pu

(Key Laboratory of System Control and Information Processing, Ministry of Education, Department of Automation, Shanghai Jiaotong University, Shanghai 200240, China)

Received:2011-09-29 Online:2012-08-05 Published:2012-08-05

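About the authors: LIU Cong-shan (born 1986), male, master's student, main research interest: machine learning; LI Xiang-bao, doctoral student; YANG Yu-pu, professor, Ph.D.
Supported by:
National "863" Program of China project "Key Technologies for Cloud Manufacturing Service Platforms" (2011AA040605)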

Abstract

Abstract: This paper proposes K-NCA, a classification algorithm based on Neighborhood Component Analysis (NCA). NCA is used to learn a Mahalanobis distance metric from the training samples and to reduce their dimensionality. The algorithm then defines a class imbalance factor and applies the K Nearest Neighbor (KNN) idea to estimate the class-conditional probability of each test sample, and the class label is assigned according to this probability. A text classifier is built to implement the algorithm. Experimental results show that K-NCA improves the accuracy of text classification.

Key words: Neighborhood Component Analysis (NCA), distance metric learning, dimension reduction, K Nearest Neighbor (KNN), text classification
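
The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The NCA objective follows the standard stochastic-neighbor definition, the projection matrix `A` is assumed to have been learned already, and the inverse-class-frequency weight is only a stand-in for the paper's class imbalance factor, whose exact form is not given on this page. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def transform(A, x):
    """Project x with the linear map A (each row of A is one output dimension)."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def sq_dist(u, v):
    """Squared Euclidean distance; after projection by A this is the
    Mahalanobis-type distance (x - y)^T A^T A (x - y)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def nca_objective(A, X, y):
    """Standard NCA objective: expected number of training points classified
    correctly when point i picks neighbor j with p_ij proportional to
    exp(-||A x_i - A x_j||^2), j != i."""
    Z = [transform(A, x) for x in X]
    total = 0.0
    for i, zi in enumerate(Z):
        w = [0.0 if j == i else math.exp(-sq_dist(zi, zj))
             for j, zj in enumerate(Z)]
        s = sum(w)
        if s > 0:
            total += sum(wj for wj, yj in zip(w, y) if yj == y[i]) / s
    return total


def knn_class_probs(A, X, y, x_test, k=3):
    """Class probability estimate from the k nearest neighbors in the
    projected space. The inverse-frequency weight is an ASSUMED imbalance
    factor, not the definition used in the paper."""
    counts = Counter(y)
    z = transform(A, x_test)
    nearest = sorted((sq_dist(z, transform(A, x)), label)
                     for x, label in zip(X, y))[:k]
    d_min = nearest[0][0]  # subtract the smallest distance to avoid underflow
    scores = defaultdict(float)
    for d, label in nearest:
        scores[label] += math.exp(-(d - d_min)) * len(y) / counts[label]
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


# Toy data: three samples of class "a", two of class "b" (imbalanced).
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0]]
y = ["a", "a", "a", "b", "b"]
A = [[0.5, 0.5]]  # hypothetical learned map, reducing 2 dimensions to 1

objective = nca_objective(A, X, y)
probs = knn_class_probs(A, X, y, [5.0, 5.5], k=3)
predicted = max(probs, key=probs.get)
```

In a full system, `A` would be obtained by maximizing `nca_objective` with a gradient-based optimizer, and the rows of `X` would be document feature vectors such as term weights.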

CLC Number: TP18

LIU Cong-Shan, LI Xiang-Bao, YANG Yu-Pu. Text Classification Algorithm Based on Neighborhood Component Analysis[J]. Computer Engineering, 2012, 38(15): 139-141.

URL: http://www.ecice06.com/EN/10.3969/j.issn.1000-3428.2012.15.038

http://www.ecice06.com/EN/Y2012/V38/I15/139
