= More Experiments on F0 Incorporation Training =
Date: 2009-01-17
iterative result
= F0 Incorporated with iterative training 2 =
Date: 2009-01-05
A 10 Hz interval is used in each iteration.
The result is evaluated by measuring the RMSE of several pitch values.
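The RMSE measure mentioned above can be sketched as follows. This is only a minimal illustration: the function name `pitch_rmse` and the example values are hypothetical, and the log does not say which pitch values were actually compared.

```python
import math


def pitch_rmse(reference_hz, predicted_hz):
    """Root-mean-square error (in Hz) between two equal-length lists of pitch values.

    Hypothetical helper: the log does not specify which pitch values are compared.
    """
    if len(reference_hz) != len(predicted_hz):
        raise ValueError("both sequences must have the same length")
    if not reference_hz:
        raise ValueError("at least one pitch value is required")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference_hz, predicted_hz)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values for illustration only, not experimental data.
error = pitch_rmse([100.0, 200.0], [110.0, 190.0])
```

A lower RMSE means the generated pitch values lie closer to the reference values.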
iterative result
= Prosody Analysis =
Date: 2008-12-22
Analysis of energy, duration and pitch (maximum/minimum/average/range) at 3 levels:
utterance, prosodic phrase and syllable.
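As a rough sketch of the pitch part of such an analysis, the four statistics can be computed per segment at each level. The assumptions here are not from the log: F0 is given as one value per frame with 0 marking unvoiced frames, and segment boundaries are given as frame index pairs. The names `pitch_stats` and `analyze_levels` are hypothetical.

```python
def pitch_stats(f0_hz):
    """Max/min/mean/range of the voiced frames (F0 > 0); None if no frame is voiced."""
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        return None
    highest, lowest = max(voiced), min(voiced)
    return {
        "max": highest,
        "min": lowest,
        "mean": sum(voiced) / len(voiced),
        "range": highest - lowest,
    }


def analyze_levels(f0_hz, segments):
    """Compute pitch statistics for every segment of every level.

    `segments` maps a level name to a list of (start_frame, end_frame) pairs,
    with end_frame exclusive (an assumed layout).
    """
    return {
        level: [pitch_stats(f0_hz[start:end]) for start, end in spans]
        for level, spans in segments.items()
    }


# Made-up contour and boundaries for illustration only.
contour = [0, 100, 120, 0, 200, 180]
levels = {"utterance": [(0, 6)], "syllable": [(0, 3), (3, 6)]}
stats = analyze_levels(contour, levels)
```

Energy and duration could be summarized per segment in the same way.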
prosody analysis
= F0 Incorporated with iterative training =
Date: 2008-12-08
Iterative training incorporating F0; the interval is gradually reduced to improve accuracy.
iterative result
= F0 Incorporated/Modification Result =
Date: 2008-11-24
5 features derived from F0 were used in training for 2 emotions.
F0 modification at synthesis time was also tested.
f0inc/modification result
= Emotion HTS Result =
Date: 2008-11-17
Data
4 emotions: angry, happy, sad and surprise.
For each emotion, 220 utterances were recorded; the first 210 are used as training data and the remaining 10 for testing.
Labels from 10 non-emotion utterances (text only) are also used for testing.
The non-emotion utterances are also merged into 4 kinds of contexts to form 4 emotional text sets, and labels generated from these sets are used for testing as well.
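The per-emotion split described above (first 210 utterances for training, the rest for testing) can be sketched as below. The utterance IDs and the function name `split_utterances` are hypothetical; the actual file naming is not given in the log.

```python
EMOTIONS = ["angry", "happy", "sad", "surprise"]


def split_utterances(utterance_ids, n_train=210):
    """Return (train, test): the first n_train items and the remainder, in order."""
    return utterance_ids[:n_train], utterance_ids[n_train:]


# Hypothetical IDs such as "angry_001"; the real naming scheme is unknown.
splits = {}
for emotion in EMOTIONS:
    ids = [f"{emotion}_{i:03d}" for i in range(1, 221)]
    splits[emotion] = split_utterances(ids)
```

Keeping the recording order, rather than shuffling, matches the "first 210" wording of the description.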
Methods
3 models are used:
- Emotion dependent HTS model.
- Adaptation from an emotion-independent model with maximum a posteriori (MAP) estimation.
- Adaptive training.
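To give an idea of what MAP adaptation does, the sketch below applies the standard MAP update to the mean of a single scalar Gaussian. It is a simplified illustration, not the HTS implementation: the function name `map_adapt_mean`, the weight `tau` and the numbers are assumptions made for the example.

```python
def map_adapt_mean(prior_mean, samples, tau=10.0):
    """MAP estimate of a Gaussian mean given a prior mean and adaptation samples.

    `tau` (an assumed hyperparameter) controls how strongly the prior is trusted:
    with few samples the result stays near prior_mean, with many samples it
    approaches the sample average.
    """
    if tau <= 0:
        raise ValueError("tau must be positive")
    return (tau * prior_mean + sum(samples)) / (tau + len(samples))


# Made-up numbers: an emotion-independent mean of 100 adapted with 10 samples at 200.
adapted = map_adapt_mean(100.0, [200.0] * 10, tau=10.0)
```

With `tau` equal to the number of samples, the adapted mean lands halfway between the prior mean and the sample average.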
Samples
adaptation result
= Mandarin HTS Result =
Date: 2008-11-05
Data
3000 utterances from Tsinghua HCSI Corpus.
Samples
synthesis result