
Computer Engineering


Greedy Adaptive Speech Time Scale Modification Algorithm Based on Pronunciation Mechanism

YANG Yan a, LEI Yingsi a, YUE Hui b

  1. (a. School of Electronic and Information Engineering; b. School of Railway Technology, Lanzhou Jiaotong University, Lanzhou 730070, China)
  • Received: 2014-09-26 Online: 2015-08-15 Published: 2015-08-15

基于发音机制的贪婪自适应语音时长规整算法

杨燕a,雷颖思a,岳辉b   

  1. (兰州交通大学 a.电子与信息工程学院; b.铁道技术学院,兰州 730070)
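  • About the authors: YANG Yan (born 1972), female, associate professor, Ph.D.; main research interests: speech signal processing and digital image processing. LEI Yingsi, master's student. YUE Hui, associate professor.
  • Supported by:
    Natural Science Foundation of the Science and Technology Department of Gansu Province (1310RJZA050); Special Fund for Basic Scientific Research Operating Expenses of Higher Education Institutions of Gansu Province (214138).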

Abstract: The Synchronized Overlap-add (SOLA) algorithm for speech Time Scale Modification (TSM) ignores a natural property of real speech: different types of speech segments respond differently to a change in speaking rate. It applies the same scaling factor to every segment, so the output speech is distorted when the scaling ratio is large. To address this problem, a greedy adaptive algorithm is proposed. The algorithm applies different scaling factors to different types of speech segments and adjusts those factors dynamically, which further corrects the deficiency in the overall scaling ratio. Comparison simulations in Matlab on speech from the TIMIT corpus show that the algorithm improves the naturalness of the synthesized speech relative to existing algorithms such as Waveform Similarity Overlap-add (WSOLA) and Time Domain Pitch Synchronous Overlap-add (TD-PSOLA), and that it reduces the deviation of the scaled duration from its target.

Key words: speech Time Scale Modification (TSM), modification factor, Synchronized Overlap-add (SOLA) algorithm, adaptive algorithm, greedy adaptive algorithm
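
The abstract describes the idea only at a high level, so the following is a minimal sketch of one way per-segment scaling factors could be chosen while keeping the overall duration close to its target. It is not the authors' implementation. The segment classes, the elasticity values, the factor limits, and the names `CLASS_PARAMS`, `Segment`, `class_factor` and `greedy_factors` are all assumptions made for illustration, and the overlap-add synthesis step itself is left out.

```python
from dataclasses import dataclass

# Hypothetical segment classes: (elasticity, min factor, max factor).
# Every number here is an illustrative assumption, not a value from the paper.
CLASS_PARAMS = {
    "vowel": (1.0, 0.5, 2.0),
    "consonant": (0.3, 0.8, 1.25),
    "pause": (1.5, 0.25, 3.0),
}


@dataclass
class Segment:
    kind: str        # key into CLASS_PARAMS
    duration: float  # seconds, assumed to be positive


def class_factor(kind, alpha):
    """Class-specific factor: elasticity 0 leaves the segment unchanged,
    elasticity 1 follows the global factor alpha exactly."""
    elasticity, lo, hi = CLASS_PARAMS[kind]
    return min(max(1.0 + elasticity * (alpha - 1.0), lo), hi)


def greedy_factors(segments, alpha):
    """Assign one scaling factor per segment in a single left-to-right pass."""
    factors = []
    consumed = 0.0  # input duration processed so far
    produced = 0.0  # output duration assigned so far
    for seg in segments:
        _, lo, hi = CLASS_PARAMS[seg.kind]
        # Shortfall (positive) or surplus (negative) against the global target.
        debt = alpha * consumed - produced
        # Greedy step: repay as much of the accumulated debt as this
        # segment's limits allow, on top of its class-specific factor.
        factor = class_factor(seg.kind, alpha) + debt / seg.duration
        factor = min(max(factor, lo), hi)
        factors.append(factor)
        consumed += seg.duration
        produced += factor * seg.duration
    return factors


if __name__ == "__main__":
    utterance = [
        Segment("vowel", 0.2),
        Segment("consonant", 0.1),
        Segment("pause", 0.3),
        Segment("vowel", 0.4),
    ]
    alpha = 1.5  # slow the utterance down to 1.5x its original duration
    target = alpha * sum(s.duration for s in utterance)

    fixed = [class_factor(s.kind, alpha) for s in utterance]
    adaptive = greedy_factors(utterance, alpha)

    for name, factors in (("class factors only", fixed), ("greedy adaptive", adaptive)):
        total = sum(f * s.duration for f, s in zip(factors, utterance))
        print(f"{name}: total={total:.4f}s, deviation={total - target:+.4f}s")
```

In this sketch each segment starts from its class-specific factor and then absorbs the duration error left by the segments before it, so the running deviation from the global target stays bounded by what a single segment can contribute. Using class factors alone, with no correction, lets that deviation accumulate over the utterance.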

