SOFTWARE

At the Tokuda, Nankaku & Hashimoto laboratory, software is developed and released to the public to promote speech and image research. It is used for research at various organizations and companies.

Software

HMM/DNN Speech Synthesis System Toolkit: HTS

HTS is fundamental speech synthesis software adopted by many research laboratories and companies (Microsoft, IBM, etc.).

Speech Signal Processing Toolkit: SPTK

SPTK is a suite of command-line tools for signal processing and data processing, such as acoustic analysis of speech.
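As a usage sketch, SPTK tools are designed to be chained with Unix pipes. The example below extracts mel-cepstral coefficients from raw audio; the file name and analysis settings are illustrative assumptions, not part of the toolkit itself.

```shell
# A minimal SPTK sketch: mel-cepstral analysis of 16 kHz, 16-bit
# little-endian raw audio (assumed to be in data.raw).
# Frame length 400 samples (25 ms), frame shift 80 samples (5 ms).
x2x +sf < data.raw |              # convert 16-bit short samples to float
    frame -l 400 -p 80 |          # split into overlapping frames
    window -l 400 -L 512 |        # apply a window, zero-padded to 512 points
    mcep -l 512 -m 24 -a 0.42 \
    > data.mcep                   # 24th-order mel-cepstral coefficients
```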

Speech Synthesis Engine: hts_engine

hts_engine is software that synthesizes speech using models trained with HTS. It is released to the public under the BSD license.
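As a usage sketch, the hts_engine command-line front end takes a full-context label file and an HTS voice file and writes a waveform; the file names below are placeholders.

```shell
# Synthesize speech from a full-context label file with a trained HTS voice.
# "voice.htsvoice" and "sentence.lab" are hypothetical file names.
hts_engine -m voice.htsvoice -ow synthesized.wav sentence.lab
```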

Japanese Text-to-Speech System: Open JTalk

Open JTalk is a Japanese text-to-speech system that uses hts_engine as its speech synthesis module. It is released to the public under the BSD license.
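As a usage sketch, Open JTalk reads Japanese text and writes a WAV file given a dictionary directory and an HTS voice. The dictionary and voice paths below are typical install locations and may differ on your system.

```shell
# Synthesize Japanese text into a WAV file (paths are assumptions).
echo "こんにちは、世界。" | open_jtalk \
    -x /var/lib/mecab/dic/open-jtalk/naist-jdic \
    -m /usr/share/hts-voice/nitech-jp-atr503-m001.htsvoice \
    -ow hello.wav
```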

HMM/DNN-based Singing Voice Synthesis System: Sinsy

Sinsy is an HMM/DNN-based singing voice synthesis system.
You can generate a singing voice sample by uploading a musical score to its demo site.

Voice Interaction System Building Toolkit: MMDAgent

MMDAgent is an open-source toolkit for building voice interaction systems.
It enables users to talk with on-screen characters (MMD models) by combining speech synthesis, speech recognition, a speech-learning function, 3D rendering, lip-sync technology, and more.

Anthropomorphic Spoken Dialogue Agent: Galatea

This is an open-source, license-free software toolkit for building anthropomorphic spoken dialogue agents. It is the product of a project in which speech, language, and image researchers from more than ten universities in Japan participated. HTS and Julius, both developed in this laboratory, are used for the speech waveform generation module and the speech recognition module, respectively.

System

The Voice Interactive Digital Signage system "Mei-chan" (Japanese page)

The Voice Interactive Digital Signage system "Mei-chan" is installed at the front gate of Nagoya Institute of Technology. Please try talking to it when you visit Nagoya Institute of Technology.

Database

Multi-Modal Speech Database for Research: M2TINIT (Japanese page)

M2TINIT is a multi-modal database in which Japanese speech and lip motion video were recorded simultaneously. It was developed and released to the public by the Takao Kobayashi laboratory (Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology) and the Kitamura & Tokuda laboratory (Department of Computer Science, Nagoya Institute of Technology; currently the Tokuda, Lee & Nankaku laboratory) to promote multi-modal speech research. It has been used in research on topics such as joint generation of speech and lip motion video and bimodal speech recognition.
