International conference and workshop

2019

  • Wei-Ning Hsu, Yu Zhang, Ron Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang, Hierarchical generative modeling for controllable speech synthesis, Proc. ICLR, 2019 (accepted).
  • Heiga Zen, Rob Clark, Ron J. Weiss, Viet Dang, Ye Jia, Yonghui Wu, Yu Zhang, Zhifeng Chen, LibriTTS: A Corpus Derived from LibriSpeech for Text-to-Speech, Proc. Interspeech, 2019 (submitted).

2018

  • Antoine Bruguier, Heiga Zen, Arkady Arkhangorodsky, Sequence-to-sequence neural network model with 2D attention for learning Japanese pitch accents, Proc. Interspeech, 2018.

2017

  • Aäron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis Carlos Cobo Rus, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alexander Graves, Helen King, Thomas Walters, Dan Belov, Demis Hassabis, Parallel WaveNet: Fast high-fidelity speech synthesis, Proc. ICML, 2018 filepaper

2016

  • Keiichi Tokuda, Heiga Zen, Directly modeling voiced and unvoiced components in speech waveforms by neural networks, Proc. ICASSP, pp. 5640--5644, Shanghai, P.R. China, April 2016 filepaper
  • Heiga Zen, Yannis Agiomyrgiannakis, Niels Egberts, Fergus Henderson, Przemysław Szczepaniak, Fast, compact, and high quality LSTM-RNN based statistical parametric speech synthesizers for mobile devices, Proc. Interspeech (accepted), San Francisco, CA, U.S.A., September 2016 filepaper
  • Bo Li, Heiga Zen, Multi-language multi-speaker acoustic modeling for LSTM-RNN based statistical parametric speech synthesis, Proc. Interspeech (accepted), San Francisco, CA, U.S.A., September 2016 filepaper
  • Hideki Kawahara, Yannis Agiomyrgiannakis, Heiga Zen, Using instantaneous frequency and aperiodicity detection to estimate F0 for high-quality speech synthesis, Proc. ISCA SSW9 (submitted), Sunnyvale, CA, U.S.A., September 2016 filepaper

2015

  • Heiga Zen, Hasim Sak, Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis, Proc. ICASSP, pp.4470--4474, Brisbane, Australia, April 2015 filepaper
  • Keiichi Tokuda, Heiga Zen, Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis, Proc. ICASSP, pp.4215--4219, Brisbane, Australia, April 2015 filepaper
  • Heiga Zen, Acoustic modeling in statistical parametric speech synthesis - From HMM to LSTM-RNN, Proc. MLSLP, Aizu-Wakamatsu, Japan, September 2015. filepaper

2014

  • Heiga Zen, Andrew Senior, Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis, Proc. ICASSP, pp.3872--3876, Florence, Italy, May 2014. filepaper

2013

  • Heiga Zen, Andrew Senior, Mike Schuster, Statistical parametric speech synthesis using deep neural networks, Proc. of ICASSP, pp.7962--7966, Vancouver, Canada, May 2013. filepaper

2012

  • Cassia Valentini-Botinhao, Ranniery Maia, Junichi Yamagishi, Simon King, Heiga Zen, Cepstral analysis based on the Glimpse proportion measure for improving the intelligibility of HMM-based synthetic speech in noise, Proc. of ICASSP, pp.3997--4000, Kyoto, Japan, March 2012.
  • Vincent Wan, Javier Latorre, K. K. Chin, Langzhou Chen, Mark J. F. Gales, Heiga Zen, Kate Knill, Masami Akamine, Combining multiple high quality corpora for improving HMM-TTS, Proc. of Interspeech, pp.1135--1138, Portland, OR, U.S.A., September 2012.

2011

  • Heiga Zen, Mark J. F. Gales, Decision tree-based context clustering based on cross validation and hierarchical priors, Proc. of ICASSP, pp.4560-4563, Prague, Czech Republic, May 2011. filepaper fileslide
  • Matt Shannon, Heiga Zen, William Byrne, The effect of using normalized models in statistical speech synthesis, Proc. of Interspeech, pp.121--124, Florence, Italy, August 2011. filelink
  • Ranniery Maia, Heiga Zen, Kate Knill, Mark J. F. Gales, Sabine Buchholz, Multipulse sequences for residual signal modeling, Proc. of Interspeech, pp.1833--1836, Florence, Italy, August 2011.
  • Ling-Hui Chen, Yoshihiko Nankaku, Heiga Zen, Keiichi Tokuda, Zhen-Hua Ling, Li-Rong Dai, Estimation of window coefficients for dynamic feature extraction for HMM based speech synthesis, Proc. of Interspeech, pp.1801--1804, Florence, Italy, August 2011.
  • Nicholas Pilkington, Heiga Zen, Mark J. F. Gales, Gaussian process experts for voice conversion, Proc. of Interspeech, pp.2761--2764, Florence, Italy, August 2011. filepaper fileposter

2010

  • Heiga Zen, Mark J. F. Gales, Yoshihiko Nankaku, Keiichi Tokuda, Statistical parametric speech synthesis based on product of experts, Proc. of ICASSP2010, pp.4242-4245, Dallas, TX, U.S.A., March 2010. filepaper fileslide
  • Ranniery Maia, Heiga Zen, Mark J. F. Gales, Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters, Proc. of ISCA SSW7, pp.88-93, Kyoto, Japan, Sept. 2010.
  • Heiga Zen, Norbert Braunschweiler, Sabine Buchholz, Kate Knill, Sacha Krstulovic, Javier Latorre, HMM-based polyglot speech synthesis by speaker and language adaptive training, Proc. of ISCA SSW7, pp.186-191, Kyoto, Japan, Sept. 2010. filepaper fileslide
  • Heiga Zen, Speaker and language adaptive training for HMM-based polyglot speech synthesis, Proc. of Interspeech2010, pp.410-413, Makuhari, Japan, Sept. 2010. filepaper fileslide
  • Kai Yu, Heiga Zen, Francois Mairesse, Steve Young, Context adaptive training with factorized decision trees for HMM-based speech synthesis, Proc. of Interspeech2010, pp.414-417, Makuhari, Japan, Sept. 2010. filelink
  • Nicholas Pilkington, Heiga Zen, An implementation of decision tree-based context clustering on graphics processing units, Proc. of Interspeech2010, pp.833-836, Makuhari, Japan, Sept. 2010. filepaper fileposter
  • Javier Latorre, Mark J.F. Gales, Heiga Zen, Training a parametric-based log F0 model with the minimum generation error criterion, Proc. of Interspeech2010, pp.2174-2177, Makuhari, Japan, Sept. 2010.

2009

  • Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda, Stereo-based stochastic noise compensation based on trajectory GMMs, Proc. of ICASSP2009, pp.4577-4580, Taipei, Taiwan, April 2009. filepaper fileposter
  • Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Takashi Masuko, Keiichi Tokuda, A Bayesian approach to HMM-based speech synthesis, Proc. of ICASSP2009, pp.4029-4033, Taipei, Taiwan, April 2009. filelink
  • Heiga Zen, Norbert Braunschweiler, Context-dependent additive log F0 model for HMM-based speech synthesis, Proc. of Interspeech2009, pp.2091-2094, Brighton, UK, Sept. 2009. filepaper fileposter
  • Keiichiro Oura, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda, Tying covariance matrices to reduce the footprint of HMM-based speech synthesis systems, Proc. of Interspeech2009, pp.1759-1762, Brighton, UK, Sept. 2009. filelink
  • Heiga Zen, Keiichiro Oura, Takashi Nose, Junichi Yamagishi, Shinji Sako, Tomoki Toda, Takashi Masuko, Alan W. Black, Keiichi Tokuda, Recent development of the HMM-based speech synthesis system (HTS), Proc. APSIPA ASC 2009, Sapporo, Japan, Oct. 2009 (accepted).

2008

  • Yi-Jian Wu, Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda, Minimum generation error criterion considering global/local variance for HMM-based speech synthesis, Proc. of ICASSP2008, pp.4621-4624, Las Vegas, NV, Mar. 2008.
  • Junichi Yamagishi, Takashi Nose, Heiga Zen, Tomoki Toda, Keiichi Tokuda, Performance evaluation of the speaker-independent HMM-based speech synthesis system "HTS-2007" for the Blizzard Challenge 2007, Proc. of ICASSP2008, pp.3957-3960, Las Vegas, NV, Mar. 2008. filelink
  • Yoshihiko Nankaku, Kazuhiro Nakamura, Heiga Zen, Keiichi Tokuda, Acoustic modeling with contextual additive structure for HMM-based speech recognition, Proc. of ICASSP2008, pp.4469-4472, Las Vegas, NV, Mar. 2008.
  • Junichi Yamagishi, Heiga Zen, Yi-Jian Wu, Tomoki Toda, Keiichi Tokuda, HTS-2008: Yet another evaluation of speaker adaptive HMM-based speech synthesis system, Proc. of Blizzard Challenge Workshop 2008, Brisbane, Australia, Sept. 2008. filelink
  • Sayaka Shiota, Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda, Acoustic modeling based on model structure annealing for speech recognition, Proc. of Interspeech2008, pp.932-935, Brisbane, Australia, Sept. 2008.
  • Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda, Bayesian context clustering using cross valid prior distribution for HMM-Based speech recognition, Proc. of Interspeech2008, pp.936-939, Brisbane, Australia, Sept. 2008. filelink
  • Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda, Probabilistic feature mapping based on trajectory HMMs, Proc. of Interspeech2008, pp.1068-1071, Brisbane, Australia, Sept. 2008. filepaper fileposter
  • Simon King, Keiichi Tokuda, Heiga Zen, Junichi Yamagishi, Unsupervised adaptation for HMM-based speech synthesis, Proc. of Interspeech2008, pp.1869-1872, Brisbane, Australia, Sept. 2008.
  • Zhi-Peng Yu, Yi-Jian Wu, Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda, Analysis of stream-dependent tying structure for HMM-based speech synthesis, Proc. of ICSP2008, Beijing, P.R. China, Oct. 2008.

2007

  • Alan W. Black, Heiga Zen, Keiichi Tokuda, Statistical parametric speech synthesis, Proc. of ICASSP2007, pp.1229-1232, Honolulu, Hawaii, Apr. 2007. filepaper fileslide
  • Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda, Model-space MLLR for trajectory HMMs, Proc. of Interspeech2007, pp.2065-2068, Antwerp, Belgium, Aug. 2007. filepaper fileposter
  • Ranniery Maia, Tomoki Toda, Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda, A trainable excitation model for HMM-based speech synthesis, Proc. of Interspeech2007, pp.1909-1912, Antwerp, Belgium, Aug. 2007. filelink
  • Heiga Zen, Takashi Nose, Junichi Yamagishi, Shinji Sako, Takashi Masuko, Alan W. Black, Keiichi Tokuda, The HMM-based speech synthesis system version 2.0, Proc. of ISCA SSW6, pp.294-299, Bonn, Germany, Aug. 2007. filepaper fileslide (21 MBytes)
  • Ranniery Maia, Tomoki Toda, Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda, An excitation model for HMM-based speech synthesis based on residual modeling, Proc. of ISCA SSW6, pp.131-136, Bonn, Germany, Aug. 2007. filelink
  • Junichi Yamagishi, Takao Kobayashi, Steve Renals, Simon King, Heiga Zen, Tomoki Toda, Keiichi Tokuda, Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV, Proc. of ISCA SSW6, pp.125-130, Bonn, Germany, Aug. 2007. filelink
  • Junichi Yamagishi, Heiga Zen, Tomoki Toda, Keiichi Tokuda, Speaker-Independent HMM-based Speech Synthesis System - HTS-2007 System for the Blizzard Challenge 2007, Proc. of BLZ3-2007, paper 008, Bonn, Germany, Aug. 2007. filelink

2006

  • Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda, Tadashi Kitamura, Estimating trajectory HMM parameters using Monte Carlo EM with Gibbs sampler, Proc. of ICASSP2006, pp.1173-1176, Toulouse, France, May 2006. filepaper fileposter
  • Keiichiro Oura, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda, Hidden semi-Markov model based speech recognition system using weighted finite-state transducer, Proc. of ICASSP2006, pp.33-34, Toulouse, France, May 2006. filelink
  • Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda, Tadashi Kitamura, Speaker adaptation of trajectory HMMs using feature-space MLLR, Proc. of Interspeech2006 (ICSLP), pp.2274-2277, Pittsburgh, PA, U.S.A., Sept. 2006. filepaper fileposter
  • Keijiro Saino, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda, HMM-based singing voice synthesis system, Proc. of Interspeech2006 (ICSLP), pp.1141-1144, Pittsburgh, PA, U.S.A., Sept. 2006. filelink
  • Heiga Zen, Tomoki Toda, Keiichi Tokuda, The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006, Proc. of BLZ2-2006, Pittsburgh, PA, U.S.A., Sept. 2006. filepaper fileslide filelink
  • Kei Hashimoto, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, Keiichi Tokuda, Hyperparameter estimation for speech recognition based on variational Bayesian approach, Proc. ASA & ASJ Joint Meeting, pp.3042, Honolulu, Hawaii, Dec. 2006. filelink

2005

  • Amaro Lima, Heiga Zen, Yoshihiko Nankaku, Keiichi Tokuda, Tadashi Kitamura, Sparse KPCA for feature extraction in speech recognition, Proc. of ICASSP2005, vol.I, pp.353-356, Philadelphia, PA, U.S.A., Mar. 2005. filelink
  • Heiga Zen, Tomoki Toda, An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005, Proc. of Interspeech2005 (Eurospeech), pp.93-96, Lisbon, Sept. 2005. filepaper fileslide
  • Wael Hamza, Raimo Bakis, Zhang-Wei Shuang, Heiga Zen, On building a concatenative speech synthesis system from the Blizzard Challenge speech databases, Proc. of Interspeech2005 (Eurospeech), pp.97-100, Lisbon, Sept. 2005. filelink

2004

  • Heiga Zen, Keiichi Tokuda, Tadashi Kitamura, A Viterbi algorithm for a trajectory model derived from HMM with explicit relationship between static and dynamic features, Proc. of ICASSP 2004, pp.837-840, Montreal, May 2004. filepaper fileposter
  • Heiga Zen, Keiichi Tokuda, Tadashi Kitamura, An introduction of trajectory model into HMM-based speech synthesis, Proc. of 5th ISCA Speech Synthesis Workshop, Pittsburgh, June 2004. filepaper fileposter
  • Heiga Zen, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura, Hidden semi-Markov model based speech synthesis, Proc. of ICSLP 2004, vol.II, pp.1397-1400, Jeju, Oct. 2004. filepaper fileslide
  • Yohei Itaya, Heiga Zen, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura, Deterministic annealing EM algorithm in parameter estimation for acoustic model, Proc. of ICSLP 2004, vol.I, pp.433-436, Jeju, Oct. 2004. filepaper
  • Ryosuke Tsuzuki, Heiga Zen, Keiichi Tokuda, Tadashi Kitamura, Murtaza Bulut, Shrikanth S. Narayanan, Constructing emotional speech synthesizers with limited speech database, Proc. of ICSLP 2004, vol.II, pp.1185-1188, Jeju, Oct. 2004. filepaper
  • Keiichi Tokuda, Heiga Zen, Tadashi Kitamura, Reformulating the HMM as a trajectory model, Proc. of Beyond HMM -- Workshop on statistical modeling approach for speech recognition, Kyoto, Dec. 2004. filepaper fileposter

2003

  • Hiroyuki Suzuki, Heiga Zen, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura, Speech recognition using voice-characteristic dependent acoustic model, Proc. of ICASSP 2003, vol.1, pp.740-743, Apr. 2003. filepaper
  • Takahiro Hoshiya, Shinji Sako, Heiga Zen, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura, Improving the performance of HMM-based very low bitrate speech coding, Proc. of ICASSP 2003, vol.1, pp.800-803, Apr. 2003. filepaper
  • Heiga Zen, Keiichi Tokuda, Tadashi Kitamura, Decision tree based simultaneous clustering of phonetic contexts, dimensions, and state positions for acoustic modeling, Proc. of Eurospeech 2003, pp.3189-3192, Sept. 2003. filepaper fileposter
  • Keiichi Tokuda, Heiga Zen, Tadashi Kitamura, Trajectory modeling based on HMMs with the explicit relationship between static and dynamic features, Proc. of Eurospeech 2003, pp.865-868, Sept. 2003. filepaper
  • Ranniery S. Maia, Heiga Zen, Keiichi Tokuda, Tadashi Kitamura, Towards the development of a Brazilian Portuguese text-to-speech system based on HMM, Proc. of Eurospeech 2003, pp.2465-2468, Sept. 2003. filelink
  • Amaro Lima, Heiga Zen, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura, On the use of kernel PCA for feature extraction in speech recognition, Proc. of Eurospeech 2003, pp.2625-2628, Sept. 2003. filelink

2002

  • Heiga Zen, Keiichi Tokuda, Tadashi Kitamura, Decision tree distribution tying based on a dimensional split technique, Proc. of ICSLP 2002, pp.1257-1260, Sept. 2002. filepaper fileslide
  • Keiichi Tokuda, Heiga Zen, Alan W. Black, An HMM-based speech synthesis system applied to English, Proc. of IEEE Speech Synthesis Workshop, Sept. 2002. filelink

Last-modified: 2019-04-19 (Fri) 12:23:30 (215d)