Computer Engineering (计算机工程) ›› 2018, Vol. 44 ›› Issue (9): 256-262. doi: 10.19678/j.issn.1000-3428.0047850

• Multimedia Technology and Application •

Optimization of AVS-P10 Open-loop Mode Selection Algorithm Based on Neural Network

CUI Baihui (崔佰会) 1, GAO Ge (高戈) 1, JIANG Lin (姜林) 1,2

  1. National Engineering Research Center for Multimedia Software, Wuhan University, Wuhan 430072, China; 2. Software College, East China University of Technology, Nanchang 330013, China
  • Received: 2017-07-06 Online: 2018-09-15 Published: 2018-09-15
  • About the authors: CUI Baihui (born 1991), male, master's student, main research interest: audio signal coding; GAO Ge, associate professor; JIANG Lin, associate professor and Ph.D. candidate.
  • Supported by:

    National Natural Science Foundation of China, "Research on Context-dependent Non-blind Bandwidth Extension Coding for Audio" (61762005).

Abstract:

Existing open-loop mode selection algorithms depend on the accuracy of signal classification, which is low in most cases and leads to poor coding quality in open-loop mode. To address this, an improved open-loop mode selection algorithm based on neural networks is proposed. In the first method, a neural network replaces the decision tree of the original open-loop mode selection and is trained to fit the closed-loop mode selection results, yielding a mode selection classifier. In the second method, which follows the logic of closed-loop mode selection, a neural network predicts the signal-to-noise ratio (SNR) of the input signal under the two coding modes ACELP256 and TVC256, and the predicted SNR replaces the SNR computed by trial encoding. Experimental results show that, compared with the original AVS-P10 open-loop selection method, the two proposed methods improve classification accuracy by 5.96% and 18.07% on speech and by 3.84% and 20.29% on music, and the subjective and objective evaluations of coding quality improve markedly.

Key words: neural network, advanced audio and video coding, mode selection, feature selection, signal classification, Signal-to-Noise Ratio (SNR) estimation
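
The SNR-based selection described in the abstract can be illustrated with a minimal sketch. It assumes that closed-loop selection keeps the mode with the higher SNR, and that one predictor per mode returns an SNR estimate from a frame's feature vector. Only the mode names ACELP256 and TVC256 come from the abstract; the function names, the feature layout, and the toy linear predictors (stand-ins for trained networks) are hypothetical.

```python
from typing import Callable, Dict, Sequence

# Mode names are taken from the abstract; everything else is hypothetical.
MODES = ("ACELP256", "TVC256")

SnrPredictor = Callable[[Sequence[float]], float]


def select_mode(features: Sequence[float],
                predictors: Dict[str, SnrPredictor]) -> str:
    """Return the mode with the highest predicted SNR.

    This mirrors a closed-loop decision, except that each SNR comes from
    a predictor rather than from a trial encoding of the frame.
    """
    predicted = {mode: predictors[mode](features) for mode in MODES}
    # On a tie, max() keeps the first mode in MODES (arbitrary for this sketch).
    return max(MODES, key=lambda mode: predicted[mode])


# Stand-in predictors: fixed linear functions, NOT trained neural networks.
def toy_acelp_snr(features: Sequence[float]) -> float:
    return 10.0 + 2.0 * features[0]


def toy_tvc_snr(features: Sequence[float]) -> float:
    return 10.0 + 2.0 * features[1]


if __name__ == "__main__":
    predictors = {"ACELP256": toy_acelp_snr, "TVC256": toy_tvc_snr}
    print(select_mode([1.0, 0.0], predictors))
    print(select_mode([0.0, 1.0], predictors))
```

In this arrangement the encoder runs only the selected mode, so the cost of the decision is two predictor evaluations per frame instead of two trial encodings.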

CLC Number: