Metric | Existing data | Phoneme + note length + pitch (heuristic) | Positional information + phoneme | 04_static_trj |
---|---|---|---|---|
Mel-cepstrum distortion (dB), lower is better | 5.43778 | 5.53042 | 5.50459 | 5.38162 ★ |
GV distance, lower is better | 0.46402 ★ | 0.57736 | 0.60858 | 0.48721 |
F0 RMSE (cent), lower is better | 74.19561 | 74.00517 | 69.39025 ★ | 72.32802 |
F0 correlation, higher is better | 0.97226 | 0.97221 | 0.97552 ★ | 0.97217 |
Total voiced/unvoiced error (%), lower is better | 2.14051 | 2.35665 | 2.13791 | 2.10536 ★ |
★ indicates the best performance for each metric. All experiments used the same test set (10 utterances, 76804 frames).
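
The table does not say how each metric is computed. As a rough illustration only, the sketch below uses commonly used definitions of mel-cepstral distortion, F0 RMSE in cents, and voiced/unvoiced error; these definitions, the function names, and the convention that an F0 of 0 marks an unvoiced frame are all assumptions and may differ from the evaluation script that produced the numbers above.

```python
import math

def mel_cepstral_distortion(ref, gen):
    """Mean MCD in dB over frames (assumed definition).

    ref, gen: lists of frames, each a list of mel-cepstral coefficients.
    The 0th (energy) coefficient is excluded, as is common practice.
    """
    scale = 10.0 / math.log(10.0)
    per_frame = []
    for r, g in zip(ref, gen):
        sq = sum((a - b) ** 2 for a, b in zip(r[1:], g[1:]))
        per_frame.append(scale * math.sqrt(2.0 * sq))
    return sum(per_frame) / len(per_frame)

def f0_rmse_cents(ref_f0, gen_f0):
    """RMSE in cents over frames voiced in both tracks (F0 > 0 assumed voiced)."""
    cents_sq = [
        (1200.0 * math.log2(g / r)) ** 2
        for r, g in zip(ref_f0, gen_f0)
        if r > 0 and g > 0
    ]
    return math.sqrt(sum(cents_sq) / len(cents_sq))

def vuv_error_percent(ref_f0, gen_f0):
    """Percentage of frames whose voiced/unvoiced decision differs."""
    mismatches = sum((r > 0) != (g > 0) for r, g in zip(ref_f0, gen_f0))
    return 100.0 * mismatches / len(ref_f0)
```

For example, a generated F0 track that is exactly one octave above the reference on every voiced frame would give an F0 RMSE of 1200 cents under this definition.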