Yukiya Hono
Visiting Assistant Professor at Nagoya Institute of Technology
Profile
Research and Work Experiences
- May 2022 -- Present
- Visiting Assistant Professor at Nagoya Institute of Technology
- Apr. 2022 -- Present
- Full Time Employee at rinna Co., Ltd., Japan
- Apr. 2022 -- Apr. 2022
- Researcher at Nagoya Institute of Technology
- Jan. 2020 -- Feb. 2020
- Visiting Researcher at University of Sheffield, U.K.
- Oct. 2019 -- Dec. 2019
- Visiting Researcher at University of Edinburgh, U.K.
- July 2019 -- Aug. 2019
- Internship at Microsoft Development Co., Ltd., Japan
Education
- Apr. 2019 -- Mar. 2022
- Department of Computer Science, Nagoya Institute of Technology, Japan (PhD)
- Apr. 2017 -- Mar. 2019
- Department of Computer Science, Nagoya Institute of Technology, Japan (Master)
- Apr. 2013 -- Mar. 2017
- Department of Computer Science, Nagoya Institute of Technology, Japan (Bachelor)
Awards
- 5 Sep. 2024
- The 3rd Yoshida Prize Speech Synthesis Researcher Award, Acoustical Society of Japan
- 9 Dec. 2023
- IEEE Nagoya Section Young Researcher Award 2023
- 27 Dec. 2022
- 2022 IEEE Signal Processing Society Japan Student Journal Paper Award
- 15 Sep. 2022
- The 52nd Awaya Prize Young Researcher Award, Acoustical Society of Japan
- 26 Mar. 2022
- IEEE Nagoya Section Conference Presentation Award 2022
- 24 Mar. 2022
- The Vice President Award, Nagoya Institute of Technology
- 28 Dec. 2021
- 2021 IEEE Signal Processing Society Japan Student Conference Paper Award
- 23 Mar. 2021
- IEEE Nagoya Section Excellent Student Award 2020
- 11 June 2019
- IEICE Tokai Section Student Award 2018
- 6 Mar. 2019
- The 18th Student Presentation Award, Acoustical Society of Japan
- 18 Dec. 2018
- Presentation Excellence Prize 2018, ASJ Tokai
- 4 Aug. 2018
- Overview Lecture Award, 22nd Tokai Region Midterm Presentation Event for Speech-Related Master's Theses
- 23 Mar. 2017
- DEN'EIKAI Award
Software
- Apr. 2018 -- Present
- HMM/DNN-based Singing Voice Synthesis System (Sinsy)
Research Topics
- Text-to-speech synthesis
- Singing voice synthesis
Society
- Institute of Electrical and Electronics Engineers (IEEE)
- Acoustical Society of Japan (ASJ)
Publications
Journal Paper
-
PeriodNet: A non-autoregressive raw waveform generative model with a structure separating periodic and aperiodic components
Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
IEEE Access, vol. 9, pp. 137599-137612, October, 2021. (DOI: 10.1109/ACCESS.2021.3118033) (IEEE Xplore) (demo page)
-
Sinsy: A Deep Neural Network-Based Singing Voice Synthesis System
Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 2803-2815, August, 2021. (DOI: 10.1109/TASLP.2021.3104165) (IEEE SPS Japan Student Journal Paper Award) (IEEE Xplore) (demo page)
International Conference
-
PSLM: Parallel Generation of Text and Speech with LLMs for Low-Latency Spoken Dialogue Systems
Kentaro Mitsui, Koh Mitsuda, Toshiaki Wakatsuki, Yukiya Hono, and Kei Sawada
Findings of the Association for Computational Linguistics: EMNLP 2024. (accepted) (arXiv preprint)
-
Integrating Pre-Trained Speech and Language Models for End-to-End Speech Recognition
Yukiya Hono, Koh Mitsuda, Tianyu Zhao, Kentaro Mitsui, Toshiaki Wakatsuki, and Kei Sawada
Findings of the Association for Computational Linguistics: ACL 2024, pp. 13289-13305, Bangkok, Thailand, August 2024. (ACL Anthology)
-
Release of Pre-Trained Models for the Japanese Language
Kei Sawada, Tianyu Zhao, Makoto Shing, Kentaro Mitsui, Akio Kaga, Yukiya Hono, Toshiaki Wakatsuki, and Koh Mitsuda
2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 13898-13905, Torino, Italia, May, 2024. (ACL Anthology)
-
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model
Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 12782-12786, Seoul, Korea, April, 2024. (IEEE Xplore)
-
UniFLG: Unified Facial Landmark Generator from Text or Speech
Kentaro Mitsui, Yukiya Hono, and Kei Sawada
Interspeech 2023, pp. 5501-5505, Dublin, Ireland, September, 2023. (ISCA Archive)
-
Singing Voice Synthesis Based on a Musical Note Position-Aware Attention Mechanism
Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, June, 2023. (IEEE Xplore)
-
Embedding a Differentiable Mel-cepstral Synthesis Filter to a Neural Speech Synthesis System
Takenori Yoshimura, Shinji Takaki, Kazuhiro Nakamura, Keiichiro Oura, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, June, 2023. (IEEE Xplore)
-
End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue
Kentaro Mitsui, Tianyu Zhao, Kei Sawada, Yukiya Hono, Yoshihiko Nankaku, and Keiichi Tokuda
Interspeech 2022, pp. 2328-2332, Incheon, Korea, September, 2022. (ISCA Archive)
-
PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components
Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6049-6053, Toronto, Ontario, Canada, June, 2021. (IEEE SPS Japan Student Conference Paper Award) (IEEE Xplore)
-
Hierarchical Multi-Grained Generative Model for Expressive Speech Synthesis
Yukiya Hono, Kazuna Tsuboi, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Interspeech 2020, pp. 3441-3445, Shanghai, China, October, 2020. (ISCA Archive)
-
Singing voice synthesis based on generative adversarial networks
Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6955-6959, Brighton, UK, May, 2019.
-
Singing Voice Conversion Using Posted Waveform Data on Music Social Media
Koki Senda, Yukiya Hono, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1913-1917, Honolulu, Hawaii, November, 2018.
-
Recent Development of the DNN-based Singing Voice Synthesis System -- Sinsy
Yukiya Hono, Shumma Murata, Kazuhiro Nakamura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1003-1009, Honolulu, Hawaii, November, 2018.
Technical Report
-
Singing voice synthesis based on a frame-driven attention mechanism considering vocal timing deviation
Miku Nishihara, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
Technical Report of IEICE, vol. 122, no. 339, pp. 19-24, Okinawa, Japan, February, 2023.
-
A comparison of neural vocoders in singing voice synthesis
Sota Wada, Yukiya Hono, Shinji Takaki, Keiichiro Oura, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
Technical Report of IEICE, vol. 119, no. 321, pp. 85-90, Tokyo, Japan, December, 2019.
Domestic Conference
-
Consideration on periodic excitation signals in source-filter type neural vocoders
Hikaru Aohara, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2024 Spring Meeting, pp. 813-816, Tokyo, Japan, March, 2024. (Student presentation award)
-
Generating Spoken Dialogue from Text for Natural Conversations Between AI Agents
Kentaro Mitsui, Yukiya Hono, and Kei Sawada
Acoustical Society of Japan 2024 Spring Meeting, pp. 1327-1330, Tokyo, Japan, March, 2024.
-
End-to-end speech recognition by integrating self-supervised speech and language model
Yukiya Hono, Koh Mitsuda, Tianyu Zhao, Kentaro Mitsui, Toshiaki Wakatsuki, and Kei Sawada
Acoustical Society of Japan 2024 Spring Meeting, pp. 1323-1326, Tokyo, Japan, March, 2024.
-
Japanese pre-trained models using self-supervised learning and their applications to speech recognition and synthesis
Kei Sawada, Yukiya Hono, and Kentaro Mitsui
Acoustical Society of Japan 2024 Spring Meeting, pp. 1319-1320, Tokyo, Japan, March, 2024. (Invited talk)
-
A neural vocoder training method using a pitch extractor for fundamental frequency controllability
Shion Fukuda, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2023 Autumn Meeting, pp. 1065-1068, Aichi, Japan, September, 2023.
-
Neural vocoder based on disentangled representation learning to control fundamental frequency
Suzuka Sato, Takato Fujimoto, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2023 Autumn Meeting, pp. 1061-1064, Aichi, Japan, September, 2023.
-
PeriodGrad: A neural vocoder based on a diffusion probabilistic model with fundamental frequency controllability
Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2023 Autumn Meeting, pp. 1045-1048, Aichi, Japan, September, 2023.
-
A study on vocal timing modeling for sequence-to-sequence singing voice synthesis
Miku Nishihara, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2022 Autumn Meeting, pp. 1359-1362, Hokkaido, Japan, September, 2022.
-
Text-to-speech synthesis based on the extraction and prediction of latent speaking style representation using spontaneous dialogue
Kentaro Mitsui, Tianyu Zhao, Kei Sawada, Yukiya Hono, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2022 Autumn Meeting, pp. 1593-1596, Hokkaido, Japan, September, 2022.
-
A study on musical note position-aware attention mechanism for sequence-to-sequence singing voice synthesis
Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2022 Autumn Meeting, pp. 1589-1592, Hokkaido, Japan, September, 2022.
-
Embedding a differentiable mel-cepstral synthesis filter to an end-to-end speech synthesis system
Takenori Yoshimura, Shinji Takaki, Kazuhiro Nakamura, Keiichiro Oura, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2022 Autumn Meeting, pp. 1585-1588, Hokkaido, Japan, September, 2022.
-
Neural vocoder training considering the aperiodic measure
Yukiya Hono, Shinji Takaki, Kei Hashimoto, Kazuhiro Nakamura, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2022 Spring Meeting, pp. 973-976, Japan, March, 2022. (Awaya Prize Young Researcher Award)
-
Sequence-to-sequence singing voice synthesis considering vocal timing fluctuation
Yukiya Hono, Taisei Kato, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2021 Autumn Meeting, pp. 911-914, Japan, September, 2021.
-
Automatic pitch correction of tuneless singing for DNN-based singing voice synthesis
Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2021 Autumn Meeting, pp. 907-910, Japan, September, 2021.
-
An investigation of modeling speech waveform by neural vocoder based on periodic/aperiodic decomposition
Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2021 Spring Meeting, pp. 861-864, Japan, March, 2021.
-
Expressive speech synthesis using hierarchical multi-grained generative model
Yukiya Hono, Kazuna Tsuboi, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2020 Autumn Meeting, pp. 791-794, Japan, September, 2020.
-
An investigation of modeling periodic and aperiodic components in speech vocoder based on deep neural networks
Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2020 Autumn Meeting, pp. 759-760, Japan, September, 2020.
-
Statistical parametric speech synthesis based on acoustic parameter prediction using cascade model architecture
Kentaro Mitsui, Yukiya Hono, Kazuna Tsuboi, and Kei Sawada
Acoustical Society of Japan 2020 Spring Meeting, pp. 1107-1108, Saitama, Japan, March, 2020.
-
A study on singing voice synthesis with attention mechanism using musical score time information
Shumma Murata, Takato Fujimoto, Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2019 Autumn Meeting, pp. 943-944, Shiga, Japan, September, 2019.
-
AI Singer Rinna: a singing voice synthesis system using user's singing voice or musical score
Kei Sawada, Kazuna Tsuboi, Xianchao Wu, Zhan Chen, Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2019 Spring Meeting, pp. 1041-1044, Tokyo, Japan, March, 2019.
-
Singing voice synthesis using generative adversarial networks
Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2019 Spring Meeting, pp. 1039-1040, Tokyo, Japan, March, 2019.
-
A DNN-based singing voice synthesis system -- Sinsy
Yukiya Hono, Shumma Murata, Kazuhiro Nakamura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2018 Autumn Meeting, pp. 1099-1102, Oita, Japan, September, 2018. (Student presentation award)
-
Singing voice synthesis based on neural network using a structure of a hidden semi-Markov model
Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda
Acoustical Society of Japan 2018 Spring Meeting, pp. 247-248, Saitama, Japan, March, 2018.
-
Singing voice conversion using post data in music SNS
Yukiya Hono, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda, Daisuke Kondo, and Daisuke Ishikawa
Acoustical Society of Japan 2017 Autumn Meeting, pp. 209-210, Ehime, Japan, September, 2017.
Thesis
-
Acoustic and waveform modeling for singing voice synthesis based on deep neural networks
Yukiya Hono
Doctoral Thesis, Nagoya Institute of Technology, February, 2022.
-
Development of a singing voice synthesis system based on deep neural networks
Yukiya Hono
Master's Thesis, Nagoya Institute of Technology, February, 2019.
-
Singing voice synthesis based on deep neural networks using generative model structures
Yukiya Hono
Graduation Thesis, Nagoya Institute of Technology, February, 2017.
Preprint
-
Towards human-like spoken dialogue generation between AI agents from written dialogue
Kentaro Mitsui, Yukiya Hono, and Kei Sawada
arXiv preprint arXiv:2310.01088, October, 2023. (arXiv)
-
Singing voice synthesis based on frame-level sequence-to-sequence models considering vocal timing deviation
Miku Nishihara, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda
arXiv preprint arXiv:2301.02262, January, 2023. (arXiv)
Contact
The 5th floor of Building No.4, Tokuda, Nankaku and Hashimoto Laboratory
Nagoya Institute of Technology, Gokiso-Cho, Showa-Ku, Nagoya, 466-8555 Japan
E-mail: hono [at] nitech.ac.jp