* Software [#z8265057]

> In the Tokuda, Nankaku & Hashimoto laboratory, software for promoting speech and image research is developed and released to the public. It is used for research at various organizations and companies.

** Software [#od9a694f]

*** [[HMM/DNN Speech Synthesis System Toolkit: HTS>http://hts.sp.nitech.ac.jp/]] [#u24178e6]

>[[&ref(01.jpg,center,nolink);>http://hts.sp.nitech.ac.jp/]]

> HTS is basic software for speech synthesis adopted by many research laboratories and companies ([[Microsoft>http://www.microsoft.com/en/us/default.aspx]], [[IBM>http://www.ibm.com/us/]], etc.).

*** [[Open-Source Large Vocabulary Continuous Speech Recognition Engine: Julius>http://julius.sourceforge.jp/en_index.php?q=index-en.html]] [#rfd724b6]

>[[&ref(02.jpg,center,nolink);>http://julius.sourceforge.jp/en_index.php?q=index-en.html]]

> Julius is an open-source, high-performance, general-purpose large vocabulary continuous speech recognition engine for research and development of speech recognition systems.~
It can recognize continuous speech with a vocabulary of tens of thousands of words in real time on an ordinary PC.~
It is highly versatile: by swapping modules such as the pronunciation dictionary, language model, and acoustic model, it can be applied to a wide variety of uses.~
Its functionality is also provided as a library, so it can be embedded into applications.

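As a rough sketch of how Julius is typically run from the command line (the configuration file and file names below are placeholders, not files shipped with Julius):

```shell
# Recognize live speech from the microphone; main.jconf is a placeholder
# configuration file bundling the acoustic model, language model, and
# pronunciation dictionary.
julius -C main.jconf -input mic

# Recognize pre-recorded WAV files listed in wavlist.txt instead of live audio.
julius -C main.jconf -input rawfile -filelist wavlist.txt
```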

*** [[Speech Signal Processing Toolkit: SPTK>http://sp-tk.sourceforge.net/]] [#pe110015]

>[[&ref(03.jpg,center,nolink);>http://sp-tk.sourceforge.net/]]

> SPTK is a toolkit of command-line programs for signal processing and data processing in acoustic analysis.
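The SPTK programs are small filters designed to be chained through UNIX pipes; as an illustrative sketch, a mel-cepstral analysis pipeline over raw 16-bit speech might look like this (file names and analysis parameters are examples, assuming 16 kHz sampling):

```shell
# x2x converts 16-bit integers to floats; frame cuts 25 ms frames every
# 5 ms; window applies a window padded to the FFT length; mcep extracts
# 24th-order mel-cepstral coefficients (alpha = 0.42 suits 16 kHz speech).
x2x +sf speech.raw |
    frame -l 400 -p 80 |
    window -l 400 -L 512 |
    mcep -l 512 -m 24 -a 0.42 > speech.mcep
```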

*** [[Speech Synthesis Engine: hts_engine>http://hts-engine.sourceforge.net/]] [#xf0cd5dc]

>[[&ref(07.jpg,center,nolink);>http://hts-engine.sourceforge.net/]]

> hts_engine is software that synthesizes speech using models trained with HTS.~
It is released under the BSD license.
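A minimal command-line use might look like the following, assuming a trained voice file and a full-context label file (both file names are placeholders):

```shell
# Synthesize a waveform from HTS full-context labels using a trained
# voice: -m gives the voice file, -ow the output WAV file.
hts_engine -m voice.htsvoice -ow output.wav input.lab
```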

*** [[Japanese Text-to-Speech System: Open JTalk>http://open-jtalk.sourceforge.net/]] [#zef408c8]

>[[&ref(08.jpg,center,nolink);>http://open-jtalk.sourceforge.net/]]

> Open JTalk is a Japanese text-to-speech system.~
hts_engine is used as its speech synthesis module.~
It is released under the BSD license.
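For illustration, a typical invocation might look like this (the dictionary and voice paths are placeholders that depend on your installation):

```shell
# Synthesize Japanese text read from standard input; -x gives the
# dictionary directory, -m the HTS voice file, -ow the output waveform.
echo "こんにちは" | open_jtalk \
    -x /usr/local/dic \
    -m nitech_jp_atr503_m001.htsvoice \
    -ow output.wav
```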

*** [[HMM/DNN-based Singing Voice Synthesis System: Sinsy>http://www.sinsy.jp/]] [#t107cfpe]

>[[&ref(10.jpg,center,nolink);>http://www.sinsy.jp/]]

>Sinsy is an HMM/DNN-based singing voice synthesis system.~
You can generate a singing voice sample by uploading a musical score.

*** [[Voice Interaction System Building Toolkit: MMDAgent>http://www.mmdagent.jp/]] [#q1d1e4d6]

>[[&ref(09.jpg,center,nolink);>http://www.mmdagent.jp/]]

>MMDAgent is an open-source toolkit for building voice interaction systems.~
It lets users talk with on-screen characters (MMD models) by combining speech synthesis, speech recognition, speech-learning functions, 3D rendering, lip-sync technology, and more.

*** [[Anthropomorphic Spoken Dialogue Agent: Galatea>http://hil.t.u-tokyo.ac.jp/~galatea/index.html]] [#q9d1e4d6]

>[[&ref(04.jpg,center,nolink);>http://hil.t.u-tokyo.ac.jp/~galatea/index.html]]

>This is an open-source, license-free software toolkit for building anthropomorphic spoken dialogue agents. It is the product of a project in which speech, language, and image researchers from more than ten universities in Japan participated. HTS and Julius, both developed in this laboratory, are used as its speech waveform generation module and speech recognition module, respectively.

** System [#da2a6c4a]

*** [[The Voice Interactive Digital Signage System "Mei-chan" (Japanese page)>http://mei.web.nitech.ac.jp/]] [#w69b479a]

>[[&ref(11.jpg,center,nolink);>http://mei.web.nitech.ac.jp/]]

> The voice interactive digital signage system "Mei-chan" is located at the front gate of [[Nagoya Institute of Technology>http://eng.nitech.ac.jp/]].~
Please try talking to her when you visit Nagoya Institute of Technology.

** Database [#g698609d]

*** [[Multi-Modal Speech Database for Research: M2TINIT (Japanese page)>http://m2tinit.sp.nitech.ac.jp/]] [#qcbbb711]

>[[&ref(05.jpg,center,nolink);>http://m2tinit.sp.nitech.ac.jp/]]

> M2TINIT is a multi-modal database in which Japanese speech and lip motion video were recorded simultaneously. It was developed and released to the public to promote multi-modal speech research by the [[Takao Kobayashi laboratory>http://www.kbys.ip.titech.ac.jp/]] ([[Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology>http://www.igs.titech.ac.jp/index_English.html]]) and the Kitamura & Tokuda laboratory ([[Department of Computer Science, Nagoya Institute of Technology>http://eng.nitech.ac.jp/faculty_day05.html]]; currently the [[Tokuda & Lee & Nankaku laboratory>http://www.sp.nitech.ac.jp/index.php?HOME]]). It has been used in research on the generation of speech and lip motion and on bimodal speech recognition.




