
Computer Engineering (计算机工程), 2006, Vol. 32, Issue 21: 3-5, 11. doi: 10.3969/j.issn.1000-3428.2006.21.002

• Doctoral Dissertation •

Corpus Analysis Based on Decision Tree

CUI Dandan (崔丹丹), CAI Lianhong (蔡莲红)

  1. (Key Laboratory of Pervasive Computing, Ministry of Education, Department of Computer Science and Technology, Tsinghua University, Beijing 100084)
  • Received: 1900-01-01 Revised: 1900-01-01 Online: 2006-11-05 Published: 2006-11-05

Abstract: This paper uses the CART decision tree algorithm to cluster the syllables of the TH-CoSS corpus by their prosodic parameters and analyzes the distribution of context features, namely their appearance rates and average levels. To evaluate how strongly each context feature influences the prosody of speech, an importance function for the weight of influence is designed. The results are a useful reference for both the text script design of speech corpora and the weight setting of unit-selection parameters in TTS systems.

Key words: context features, prosodic parameters, TH-CoSS, classification and regression trees (CART), weight of influence
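
The two distribution statistics named in the abstract can be illustrated with a minimal sketch. The abstract does not define them, so the code below assumes that the appearance rate of a context feature is the share of internal tree nodes that split on it, and that its average level is the mean depth of those nodes with the root at level 1. The toy tree, the feature names `tone` and `position_in_word`, and the function `feature_stats` are all hypothetical and are not taken from the paper; the paper's importance function is not reproduced here.

```python
from collections import defaultdict

# Hypothetical toy tree (not from the paper): each internal node is a dict
# naming the context feature it splits on; leaves are represented by None.
toy_tree = {
    "feature": "tone",
    "left": {"feature": "position_in_word", "left": None, "right": None},
    "right": {
        "feature": "tone",
        "left": {"feature": "position_in_word", "left": None, "right": None},
        "right": None,
    },
}


def feature_stats(tree):
    """Return {feature: (appearance_rate, average_level)} for a tree.

    Assumed definitions: appearance_rate is the fraction of internal nodes
    that split on the feature; average_level is the mean depth of those
    nodes, counting the root as level 1.
    """
    levels = defaultdict(list)
    stack = [(tree, 1)]
    while stack:
        node, level = stack.pop()
        if node is None:  # leaf: no split to record
            continue
        levels[node["feature"]].append(level)
        stack.append((node["left"], level + 1))
        stack.append((node["right"], level + 1))
    total = sum(len(v) for v in levels.values())
    return {f: (len(v) / total, sum(v) / len(v)) for f, v in levels.items()}


stats = feature_stats(toy_tree)
for name, (rate, avg_level) in sorted(stats.items()):
    print(f"{name}: appearance rate {rate:.2f}, average level {avg_level:.2f}")
```

Under these assumed definitions, a feature that is used often and near the root would be a natural candidate for a high influence weight; how the paper actually combines the two quantities is not stated in the abstract.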
