Greedy Adaptive Speech Time Scale Modification Algorithm Based on Pronunciation Mechanism

doi:10.3969/j.issn.1000-3428.2015.08.039

Abstract

Abstract: The Synchronized Overlap-Add (SOLA) algorithm for speech Time Scale Modification (TSM) applies the same scaling factor to every speech segment, ignoring the fact that in natural speech different kinds of segments change by different amounts when the speaking rate changes. As a result, the output speech is distorted when the scaling ratio is large. To address this problem, a greedy adaptive algorithm is proposed. It first applies different scaling factors to different types of speech segments, which gives an adaptive algorithm; it then adjusts the scaling factors dynamically to correct the resulting deviation in the overall scaling ratio, which gives the greedy adaptive algorithm. Comparison simulations in the Matlab environment on speech from the TIMIT corpus show that the algorithm improves the naturalness of the synthesized speech compared with existing algorithms such as Waveform Similarity Overlap-Add (WSOLA) and Time Domain Pitch Synchronous Overlap-Add (TD-PSOLA), and that its scaled-duration deviation is small.

Key words: speech Time Scale Modification (TSM), modification factor, Synchronized Overlap-Add (SOLA) algorithm, adaptive algorithm, greedy adaptive algorithm
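
The abstract does not give the rule used to choose the per-segment scaling factors, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that the speech has already been split into labeled segments, that each segment class has a hypothetical "elasticity" weight describing how strongly it follows a change in speaking rate (the class names and values in `ELASTICITY` are invented for illustration), and that factors are assigned greedily from left to right while tracking how much output duration is still needed to reach the overall target ratio.

```python
from typing import List, Tuple

# Hypothetical elasticity weights per segment class (illustrative values only,
# not taken from the paper): larger means the class stretches or shrinks more.
ELASTICITY = {"silence": 1.5, "vowel": 1.0, "consonant": 0.4}


def greedy_adaptive_factors(
    segments: List[Tuple[str, float]], target: float
) -> List[float]:
    """Assign one scaling factor per segment, left to right.

    segments: (class label, duration in seconds) pairs, durations > 0.
    target:   desired overall ratio output_duration / input_duration.
    """
    remaining_in = sum(duration for _, duration in segments)
    remaining_out = target * remaining_in
    factors = []
    for index, (label, duration) in enumerate(segments):
        # Ratio still required over the input that has not been processed yet.
        needed = remaining_out / remaining_in
        if index == len(segments) - 1:
            # The last segment absorbs whatever deviation is left (unclamped).
            factor = needed
        else:
            # Move away from 1.0 in proportion to the class elasticity,
            # with a floor so a segment is never compressed to nothing.
            factor = max(0.1, 1.0 + ELASTICITY[label] * (needed - 1.0))
        factors.append(factor)
        remaining_in -= duration
        remaining_out -= factor * duration
    return factors


if __name__ == "__main__":
    example = [("silence", 0.2), ("vowel", 0.3), ("consonant", 0.1),
               ("vowel", 0.3), ("silence", 0.1)]
    for (label, _), factor in zip(example, greedy_adaptive_factors(example, 0.5)):
        print(f"{label:10s} {factor:.3f}")
```

In this sketch consonants stay closer to their original duration than silences do, and recomputing the required ratio at every step keeps the total output duration on target. Each factor would then be passed to an overlap-add routine such as SOLA for the corresponding segment.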


CLC Number: TN912.3

YANG Yan, LEI Yingsi, YUE Hui. Greedy Adaptive Speech Time Scale Modification Algorithm Based on Pronunciation Mechanism[J]. Computer Engineering, doi: 10.3969/j.issn.1000-3428.2015.08.039.



URL: http://www.ecice06.com/EN/10.3969/j.issn.1000-3428.2015.08.039

http://www.ecice06.com/EN/Y2015/V41/I8/212


