Computer Engineering (计算机工程)

• Artificial Intelligence and Recognition Technology

Dynamic Adaptive Pooling Algorithm Based on POWER8

JING Weipeng, ZHANG Xingge

  1. (College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China)
  • Received: 2015-12-16 Published: 2016-05-15 Released online: 2016-05-13
  • About the authors: JING Weipeng (born 1979), male, associate professor, Ph.D.; main research interests: speech recognition and cloud computing. ZHANG Xingge, master's student.
  • Funding:
    Natural Science Foundation of Heilongjiang Province (ZD201403); Special Fund for Scientific Research in the Public Welfare Industry (201504307).

Abstract: To address the low efficiency of key speech feature extraction in the pooling layer of current Convolutional Neural Network (CNN) models, this paper proposes a Dynamic Adaptive Pooling (DA-Pooling) algorithm based on the POWER8 architecture. A CNN model is implemented in the deep learning tool Caffe. Mel-scale filter bank coefficients that have passed through the convolutional layer are taken as input, feature data of locally adjacent speech are extracted, and the Spearman correlation coefficient is calculated to determine how strongly the data are correlated. Pooling algorithms are then assigned dynamically, according to feature weights, to speech data with different correlations, which improves the pooling layer's adaptability to such data. DA-Pooling exploits the efficient floating-point arithmetic and multi-threaded parallel computing of POWER8 to improve the processing efficiency for massive speech data. Experimental results show that, compared with existing mainstream pooling algorithms, DA-Pooling improves the recognition accuracy of key speech data and ensures the stability of speech recognition in the CNN.

Key words: Convolutional Neural Network (CNN), POWER8 architecture, pooling algorithm, Caffe deep learning tool, speech feature extraction, data correlation
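
The abstract describes the idea only at a high level, so the following Python sketch is an illustration under stated assumptions, not the authors' implementation. It computes the Spearman correlation between adjacent feature frames and then picks a pooling rule per window. The function names (`average_ranks`, `spearman`, `da_pool`), the window size, the threshold of 0.5, and the choice between average and max pooling are all hypothetical; the paper's actual weighting scheme and its Caffe/POWER8 code are not given on this page.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of sorted positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: a constant vector is treated as uncorrelated
    return cov / sqrt(var_x * var_y)


def da_pool(frames, window=2, threshold=0.5):
    """Pool non-overlapping windows of adjacent frames (window must be >= 2).

    Hypothetical selection rule: when adjacent frames in a window are strongly
    rank-correlated (mean |rho| >= threshold), average them; otherwise keep
    the per-coefficient maximum.
    """
    pooled = []
    for start in range(0, len(frames) - window + 1, window):
        block = frames[start:start + window]
        weight = mean(
            abs(spearman(block[i], block[i + 1])) for i in range(window - 1)
        )
        columns = list(zip(*block))  # one tuple per coefficient index
        if weight >= threshold:
            pooled.append([mean(c) for c in columns])
        else:
            pooled.append([max(c) for c in columns])
    return pooled


if __name__ == "__main__":
    frames = [
        [1.0, 2.0, 3.0, 4.0],
        [2.0, 4.0, 6.0, 8.0],  # same ordering as the frame above: rho = 1
        [1.0, 2.0, 3.0, 4.0],
        [3.0, 1.0, 4.0, 2.0],  # unrelated ordering: rho = 0
    ]
    print(da_pool(frames))
```

In this toy input the first window is averaged because its two frames are perfectly rank-correlated, while the second window falls back to max pooling. A real implementation would run inside the CNN's pooling layer and could process windows in parallel, which is where the multi-threading mentioned in the abstract would apply.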

CLC Number: