Computer Engineering ›› 2022, Vol. 48 ›› Issue (11): 111-119. doi: 10.19678/j.issn.1000-3428.0062962

• Artificial Intelligence and Pattern Recognition •

Multi-Scale Hypersphere Convolutional Neural Network Learning Based on k-Nearest Neighbor

LIU Ziwei, LUO Xi, LI Ke, CHEN Fuqiang

  1. College of Smart City, Beijing Union University, Beijing 100101, China
  • Received: 2021-10-14 Revised: 2021-11-18 Published: 2021-11-25
  • About the authors: LIU Ziwei (born 1996), male, master's student, main research interests: machine learning and knowledge graphs; LUO Xi, lecturer, Ph.D.; LI Ke (corresponding author), professor, Ph.D.; CHEN Fuqiang, master's student.
  • Funding:
    National Natural Science Foundation of China (61972040); Beijing Union University Internal Research Projects (ZK50201911, ZB10202004).

Abstract: Deep learning models such as the Convolutional Neural Network (CNN) are designed mainly for uniformly sampled, homogeneous Euclidean-space data such as images and speech, and are generally ill-suited to the heterogeneous, non-uniformly and sparsely sampled structured data that is common in industry and similar fields. For prediction tasks on such heterogeneous, non-uniform, sparsely sampled structured datasets, a hypersphere CNN learning model based on the k-Nearest Neighbor (kNN) algorithm and CNN is proposed. kNN preprocessing establishes the structural relationship of each sample in the high-dimensional attribute space, and the labels of the samples in each sample's neighborhood are used as its attributes to reconstruct the sample set, which transforms the attribute set from heterogeneous to homogeneous. A suitably designed CNN convolution window then extracts and exploits the label distribution of the samples in each sample's neighborhood space to predict unknown samples. Experiments are conducted under different neighborhood scales, with soft and hard labels, and under confusion and non-confusion conditions. The results show that the model reaches a prediction accuracy of 98.04%. Compared with the FC-CNN, CNN, kNN, and Radar-CNN algorithms, its accuracy improves by 0.28% to 1.66% and its recall by 4.78% to 31.92%.

Key words: Convolutional Neural Network (CNN), k-Nearest Neighbor (kNN) algorithm, hypersphere convolution, structured data, deep learning
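
The reconstruction step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `knn_label_features` and `sliding_window_mean`, the choice of Euclidean distance, the fixed averaging kernel (a stand-in for convolution weights that a CNN would learn), and the toy data are all assumptions made for illustration. The sketch replaces each query sample's heterogeneous attributes with the labels of its k nearest training samples, ordered from nearest to farthest, and then slides a window along that ordered label sequence.

```python
import numpy as np


def knn_label_features(X_train, y_train, X_query, k):
    """Return, for each query row, the labels of its k nearest training
    samples, ordered from nearest to farthest (Euclidean distance).

    Hypothetical helper: the paper's exact preprocessing is not given
    in the abstract.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    diff = X_query[:, None, :] - X_train[None, :, :]
    dist2 = np.sum(diff ** 2, axis=2)
    # Indices of the k nearest training samples per query; a stable sort
    # keeps tied distances in training-set order.
    order = np.argsort(dist2, axis=1, kind="stable")[:, :k]
    return y_train[order]


def sliding_window_mean(features, width):
    """Slide a 1-D averaging window along each row ('valid' positions only).

    Assumption: a fixed mean kernel stands in for learned CNN weights.
    """
    kernel = np.full(width, 1.0 / width)
    return np.array([np.convolve(row, kernel, mode="valid") for row in features])


# Toy data (assumed): two clusters on a line with binary labels.
X_train = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_query = np.array([[1.2], [9.0]])

feats = knn_label_features(X_train, y_train, X_query, k=4)
print(feats)                          # neighbor labels, nearest first
print(sliding_window_mean(feats, 2))  # local label averages along the ordering
```

Because the neighbors are ordered by distance, each window position covers a shell of increasing radius around the query, which is one plausible reading of how neighborhood scale enters the model. When the features are built for the training samples themselves, each sample would normally be excluded from its own neighbor list.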

CLC Number: