Yukiya Hono

Visiting Assistant Professor at Nagoya Institute of Technology


Profile

Research and Work Experiences

May 2022 -- Present: Visiting Assistant Professor at Nagoya Institute of Technology
Apr. 2022 -- Dec. 2025: Researcher at Nagoya Institute of Technology
Apr. 2022 -- Oct. 2025: Full Time Employee at rinna Co., Ltd., Japan
Jan. 2020 -- Feb. 2020: Visiting Researcher at University of Sheffield, U.K.
Oct. 2019 -- Dec. 2019: Visiting Researcher at University of Edinburgh, U.K.
Jul. 2019 -- Aug. 2019: Internship at Microsoft Development Co., Ltd., Japan

Education

Apr. 2019 -- Mar. 2022: Department of Computer Science, Nagoya Institute of Technology, Japan (PhD)
Apr. 2017 -- Mar. 2019: Department of Computer Science, Nagoya Institute of Technology, Japan (Master)
Apr. 2013 -- Mar. 2017: Department of Computer Science, Nagoya Institute of Technology, Japan (Bachelor)

Awards

18 Mar. 2025: The 20th Itakura Prize Innovative Young Researcher Award, Acoustical Society of Japan
5 Sep. 2024: The 3rd Yoshida Prize Speech Synthesis Researcher Award, Acoustical Society of Japan
9 Dec. 2023: IEEE Nagoya Section Young Researcher Award 2023
27 Dec. 2022: 2022 IEEE Signal Processing Society Japan Student Journal Paper Award
15 Sep. 2022: The 52nd Awaya Prize Young Researcher Award, Acoustical Society of Japan
26 Mar. 2022: IEEE Nagoya Section Conference Presentation Award 2022
24 Mar. 2022: The Vice President Award, Nagoya Institute of Technology
28 Dec. 2021: 2021 IEEE Signal Processing Society Japan Student Conference Paper Award
23 Mar. 2021: IEEE Nagoya Section Excellent Student Award 2020
11 Jun. 2019: IEICE Tokai Section Student Award 2018
06 Mar. 2019: The 18th Student Presentation Award, Acoustical Society of Japan
18 Dec. 2018: 2018 the presentation excellence prize from ASJ-Tokai
04 Aug. 2018: Overview Lecture Award, 22nd Tokai region's speech related master's graduation thesis midterm presentation event
23 Mar. 2017: DEN'EIKAI Award

Software

Apr. 2018 -- Present: DNN/HMM-Based Singing Voice Synthesis System (Sinsy)

Research Topics

Text-to-speech synthesis
Singing voice synthesis

Society

Institute of Electrical and Electronics Engineers (IEEE)
Acoustical Society of Japan (ASJ)

Publications

Journal Paper

PeriodNet: A non-autoregressive raw waveform generative model with a structure separating periodic and aperiodic components

Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

IEEE Access, vol. 9, pp. 137599-137612, October, 2021. (DOI: 10.1109/ACCESS.2021.3118033) (IEEE Xplore) (demo page)
Sinsy: A Deep Neural Network-Based Singing Voice Synthesis System

Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 2803-2815, August, 2021. (DOI: 10.1109/TASLP.2021.3104165) (IEEE SPS Japan Student Journal Paper Award) (IEEE Xplore) (demo page)

International Conference

PeriodCodec: A Pitch-Controllable Neural Audio Codec Using Periodic Signals for Singing Voice Synthesis

Masato Takagi, Miku Nishihara, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

Interspeech 2025, pp. 4913-4917, Rotterdam, Netherlands, August, 2025. (ISCA Archive)
PSLM: Parallel Generation of Text and Speech with LLMs for Low-Latency Spoken Dialogue Systems

Kentaro Mitsui, Koh Mitsuda, Toshiaki Wakatsuki, Yukiya Hono, and Kei Sawada

Findings of the Association for Computational Linguistics EMNLP 2024. (accepted) (arXiv preprint)
Integrating Pre-Trained Speech and Language Models for End-to-End Speech Recognition

Yukiya Hono, Koh Mitsuda, Tianyu Zhao, Kentaro Mitsui, Toshiaki Wakatsuki, and Kei Sawada

Findings of the Association for Computational Linguistics ACL 2024, pp. 13289-13305, Bangkok, Thailand, August 2024. (ACL Anthology)
Release of Pre-Trained Models for the Japanese Language

Kei Sawada, Tianyu Zhao, Makoto Shing, Kentaro Mitsui, Akio Kaga, Yukiya Hono, Toshiaki Wakatsuki, and Koh Mitsuda

2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pp. 13898-13905, Torino, Italia, May, 2024. (ACL Anthology)
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model

Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 12782-12786, Seoul, Korea, April, 2024. (IEEE Xplore)
UniFLG: Unified Facial Landmark Generator from Text or Speech

Kentaro Mitsui, Yukiya Hono, and Kei Sawada

Interspeech 2023, pp. 5501-5505, Dublin, Ireland, September, 2023. (ISCA Archive)
Singing Voice Synthesis Based on a Musical Note Position-Aware Attention Mechanism

Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, June, 2023. (IEEE Xplore)
Embedding a Differentiable Mel-cepstral Synthesis Filter to a Neural Speech Synthesis System

Takenori Yoshimura, Shinji Takaki, Kazuhiro Nakamura, Keiichiro Oura, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, June, 2023. (IEEE Xplore)
End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue

Kentaro Mitsui, Tianyu Zhao, Kei Sawada, Yukiya Hono, Yoshihiko Nankaku, and Keiichi Tokuda

Interspeech 2022, pp. 2328-2332, Incheon, Korea, September, 2022. (ISCA Archive)
PeriodNet: A non-autoregressive waveform generation model with a structure separating periodic and aperiodic components

Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6049-6053, Toronto, Ontario, Canada, June, 2021. (IEEE SPS Japan Student Conference Paper Award) (IEEE Xplore)
Hierarchical Multi-Grained Generative Model for Expressive Speech Synthesis

Yukiya Hono, Kazuna Tsuboi, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Interspeech 2020, pp. 3441-3445, Shanghai, China, October, 2020. (ISCA Archive)
Singing voice synthesis based on generative adversarial networks

Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 6955-6959, Brighton, UK, May, 2019.
Singing Voice Conversion Using Posted Waveform Data on Music Social Media

Koki Senda, Yukiya Hono, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1913-1917, Honolulu, Hawaii, November, 2018.
Recent Development of the DNN-based Singing Voice Synthesis System -- Sinsy

Yukiya Hono, Shumma Murata, Kazuhiro Nakamura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 1003-1009, Honolulu, Hawaii, November, 2018.

Technical Report

Singing voice synthesis based on a frame-driven attention mechanism considering vocal timing deviation

Miku Nishihara, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

Technical Report of IEICE, vol. 122, no. 339, pp. 19-24, Okinawa, Japan, February, 2023.
A comparison of neural vocoders in singing voice synthesis

Sota Wada, Yukiya Hono, Shinji Takaki, Keiichiro Oura, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

Technical Report of IEICE, vol. 119, no. 321, pp. 85-90, Tokyo, Japan, December, 2019.

Domestic Conference

A neural audio codec training method for singing voice synthesis

Masato Takagi, Miku Nishihara, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2025 Spring Meeting, pp. 947-950, Saitama, Japan, March, 2025.
Frame-level neural vocoder utilizing phase information of periodic signals

Motohiro Kunda, Takato Fujimoto, Yukiya Hono, Takenori Yoshimura, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2025 Spring Meeting, pp. 905-908, Saitama, Japan, March, 2025.
Consideration on periodic excitation signals in source-filter type neural vocoders

Hikaru Aohara, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2024 Spring Meeting, pp. 813-816, Tokyo, Japan, March, 2024. (Student presentation award)
Generating Spoken Dialogue from Text for Natural Conversations Between AI Agents

Kentaro Mitsui, Yukiya Hono, and Kei Sawada

Acoustical Society of Japan 2024 Spring Meeting, pp. 1327-1330, Tokyo, Japan, March, 2024.
End-to-end speech recognition by integrating self-supervised speech and language model

Yukiya Hono, Koh Mitsuda, Tianyu Zhao, Kentaro Mitsui, Toshiaki Wakatsuki, and Kei Sawada

Acoustical Society of Japan 2024 Spring Meeting, pp. 1323-1326, Tokyo, Japan, March, 2024.
Japanese pre-trained models using self-supervised learning and their applications to speech recognition and synthesis

Kei Sawada, Yukiya Hono, and Kentaro Mitsui

Acoustical Society of Japan 2024 Spring Meeting, pp. 1319-1320, Tokyo, Japan, March, 2024. (Invited talk)
A neural vocoder training method using a pitch extractor for fundamental frequency controllability

Shion Fukuda, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2023 Autumn Meeting, pp. 1065-1068, Aichi, Japan, September, 2023.
Neural vocoder based on disentangled representation learning to control fundamental frequency

Suzuka Sato, Takato Fujimoto, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2023 Autumn Meeting, pp. 1061-1064, Aichi, Japan, September, 2023.
PeriodGrad: A neural vocoder based on a diffusion probabilistic model with fundamental frequency controllability

Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2023 Autumn Meeting, pp. 1045-1048, Aichi, Japan, September, 2023.
A study on vocal timing modeling for sequence-to-sequence singing voice synthesis

Miku Nishihara, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2022 Autumn Meeting, pp. 1359-1362, Hokkaido, Japan, September, 2022.
Text-to-speech synthesis based on the extraction and prediction of latent speaking style representation using spontaneous dialogue

Kentaro Mitsui, Tianyu Zhao, Kei Sawada, Yukiya Hono, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2022 Autumn Meeting, pp. 1593-1596, Hokkaido, Japan, September, 2022.
A study on musical note position-aware attention mechanism for sequence-to-sequence singing voice synthesis

Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2022 Autumn Meeting, pp. 1589-1592, Hokkaido, Japan, September, 2022.
Embedding a differentiable mel-cepstral synthesis filter to an end-to-end speech synthesis system

Takenori Yoshimura, Shinji Takaki, Kazuhiro Nakamura, Keiichiro Oura, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2022 Autumn Meeting, pp. 1585-1588, Hokkaido, Japan, September, 2022.
Neural vocoder training considering the aperiodic measure

Yukiya Hono, Shinji Takaki, Kei Hashimoto, Kazuhiro Nakamura, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2022 Spring Meeting, pp. 973-976, Japan, March, 2022. (Awaya Prize Young Researcher Award)
Sequence-to-sequence singing voice synthesis considering vocal timing fluctuation

Yukiya Hono, Taisei Kato, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2021 Autumn Meeting, pp. 911-914, Japan, September, 2021.
Automatic pitch correction of tuneless singing for DNN-based singing voice synthesis

Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2021 Autumn Meeting, pp. 907-910, Japan, September, 2021.
An investigation of modeling speech waveform by neural vocoder based on periodic/aperiodic decomposition

Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2021 Spring Meeting, pp. 861-864, Japan, March, 2021.
Expressive speech synthesis using hierarchical multi-grained generative model

Yukiya Hono, Kazuna Tsuboi, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2020 Autumn Meeting, pp. 791-794, Japan, September, 2020.
An investigation of modeling periodic and aperiodic components in speech vocoder based on deep neural networks

Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2020 Autumn Meeting, pp. 759-760, Japan, September, 2020.
Statistical parametric speech synthesis based on acoustic parameter prediction using cascade model architecture

Kentaro Mitsui, Yukiya Hono, Kazuna Tsuboi, and Kei Sawada

Acoustical Society of Japan 2020 Spring Meeting, pp. 1107-1108, Saitama, Japan, March, 2020.
A study on singing voice synthesis with attention mechanism using musical score time information

Shumma Murata, Takato Fujimoto, Yukiya Hono, Shinji Takaki, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2019 Autumn Meeting, pp. 943-944, Shiga, Japan, September, 2019.
AI Singer Rinna: a singing voice synthesis system using user's singing voice or musical score

Kei Sawada, Kazuna Tsuboi, Xianchao Wu, Zhan Chen, Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2019 Spring Meeting, pp. 1041-1044, Tokyo, Japan, March, 2019.
Singing voice synthesis using generative adversarial networks

Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2019 Spring Meeting, pp. 1039-1040, Tokyo, Japan, March, 2019.
A DNN-based singing voice synthesis system -- Sinsy

Yukiya Hono, Shumma Murata, Kazuhiro Nakamura, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2018 Autumn Meeting, pp. 1099-1102, Oita, Japan, September, 2018. (Student presentation award)
Singing voice synthesis based on neural network using a structure of a hidden semi-Markov model

Yukiya Hono, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, and Keiichi Tokuda

Acoustical Society of Japan 2018 Spring Meeting, pp. 247-248, Saitama, Japan, March, 2018.
Singing voice conversion using post data in music SNS

Yukiya Hono, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda, Daisuke Kondo, and Daisuke Ishikawa

Acoustical Society of Japan 2017 Autumn Meeting, pp. 209-210, Ehime, Japan, September, 2017.

Thesis

Acoustic and waveform modeling for singing voice synthesis based on deep neural networks

Yukiya Hono

Doctoral Dissertation, Nagoya Institute of Technology, February, 2022.
Development of a singing voice synthesis system based on deep neural networks

Yukiya Hono

Master's Thesis, Nagoya Institute of Technology, February, 2019.
Singing voice synthesis based on deep neural networks using generative model structures

Yukiya Hono

Graduation Thesis, Nagoya Institute of Technology, February, 2017.

Preprint

Towards human-like spoken dialogue generation between AI agents from written dialogue

Kentaro Mitsui, Yukiya Hono, and Kei Sawada

arXiv preprint arXiv:2310.01088, October, 2023. (arXiv)
Singing voice synthesis based on frame-level sequence-to-sequence models considering vocal timing deviation

Miku Nishihara, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, and Keiichi Tokuda

arXiv preprint arXiv:2301.02262, January, 2023. (arXiv)