speech.h

Definitions for speech input/output processing.

#include <sent/adin.h>


Macro definitions

#define MAXSEQNUM   150

Maximum number of words in an input.

#define MAXSPEECHLEN   320000

Maximum length of an input in samples.

#define OUTPROB_CACHE_PERIOD   100

Expansion period in frames for output probability cache.

#define period2freq(A)   (10000000.0 / (float)(A))

Macro to convert smpPeriod (100nsec unit) to frequency (Hz).

#define freq2period(A)   (10000000.0 / (float)(A))

Macro to convert sampling frequency (Hz) to smpPeriod (100nsec unit).
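Both macros use the same expression because one second is 10,000,000 units of 100 ns, so the conversion is its own inverse. The following is a minimal sketch of a round trip; the macros are copied from the listing above, while `show_conversion` is a hypothetical helper added only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Macros as listed above: both directions use the same expression,
   because 1 second = 10,000,000 units of 100 ns. */
#define period2freq(A) (10000000.0 / (float)(A))
#define freq2period(A) (10000000.0 / (float)(A))

/* Hypothetical helper: print both conversions for one sampling rate. */
static void show_conversion(int sfreq_hz)
{
    assert(sfreq_hz > 0);
    double period = freq2period(sfreq_hz);   /* Hz -> 100 ns units */
    double freq   = period2freq(period);     /* 100 ns units -> Hz */
    printf("%d Hz -> smpPeriod %.1f -> %.1f Hz\n", sfreq_hz, period, freq);
}
```

For example, a 16000 Hz sampling rate corresponds to an smpPeriod of 625, and converting 625 back yields 16000 Hz.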

Functions

int wrsamp (int fd, SP16 *buf, int len)

FILE * wrwav_open (char *filename, int sfreq)

Open/create a WAVE file and write header.

boolean wrwav_data (FILE *fp, SP16 *buf, int len)

boolean wrwav_close (FILE *fp)

Close the file.

int strip_zero (SP16 a[], int len)

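Description

Definitions for speech input/output processing.

Author: Akinobu LEE

Date: Sat Feb 12 11:16:41 2005

This file contains miscellaneous definitions related to speech input and output, including the limits on input length per utterance.

For definitions related to input sources, see adin.h; for MFCC feature extraction, see mfcc.h; for feature parameters, see htk_param.h.

Revision: 1.1.1.1

Definition in file speech.h.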

Macro definitions

#define MAXSEQNUM 150

Maximum number of words in an input.
This value limits the number of words in one utterance input. If the number of words exceeds this value, Julius reports an error, so set it large enough.
Defined at line 49 of speech.h.
Referenced by bt_current_max(), cpy_node(), print_1pass_result(), wb_init(), and wchmm_fbs().

#define MAXSPEECHLEN 320000

Maximum length of an input in samples.
This value limits the length of speech input in one utterance. If an input exceeds this length, Julius stops the input at that point and recognizes it, discarding the rest until the end of speech (a long silence) arrives.
The default value is 320000, which allows an input of at most 20 seconds at 16 kHz sampling. Setting a smaller value reduces memory usage.
Defined at line 64 of speech.h.
Referenced by adin_cut(), adin_cut_callback_store_buffer(), adin_store_buffer(), adin_thread_create(), adin_thread_process(), and RealTimeInit().
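The relation between MAXSPEECHLEN, sampling rate and memory can be checked with a little arithmetic. The sketch below assumes that each sample occupies 2 bytes (a 16-bit sample type, which the name SP16 suggests but this page does not state); the helper names `max_input_seconds` and `max_buffer_bytes` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAXSPEECHLEN 320000   /* value documented above */

/* Longest input, in seconds, that fits in MAXSPEECHLEN samples. */
static double max_input_seconds(int sfreq_hz)
{
    assert(sfreq_hz > 0);
    return (double)MAXSPEECHLEN / (double)sfreq_hz;
}

/* Bytes needed for one full buffer, ASSUMING 2-byte (16-bit) samples. */
static size_t max_buffer_bytes(void)
{
    return (size_t)MAXSPEECHLEN * 2u;
}
```

With the default value, 320000 / 16000 gives the 20 seconds quoted above, and one full buffer takes 640000 bytes under the 16-bit assumption.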

#define OUTPROB_CACHE_PERIOD 100

Expansion period in frames for output probability cache.
During recognition, the 1st pass stores the output probabilities of all HMM states for every incoming input frame, to speed up the re-computation of acoustic likelihoods in the 2nd pass. In live input mode, this output probability cache is re-allocated dynamically as the input becomes longer.
This value specifies the re-allocation period in frames. The probability cache is expanded each time the input advances by this number of frames.
A smaller value may improve memory efficiency, but too small a value may add memory re-allocation overhead and slow down recognition.
Defined at line 83 of speech.h.
Referenced by outprob_cache_init().
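The period-based growth described above can be illustrated with a small sketch. This is not the code of outprob_cache_init(); the `FrameCache` type and `cache_reserve` function are hypothetical, and only the idea of growing the cache in whole periods of OUTPROB_CACHE_PERIOD frames is taken from the description.

```c
#include <assert.h>
#include <stdlib.h>

#define OUTPROB_CACHE_PERIOD 100   /* value documented above */

/* Hypothetical cache: one row of state scores per input frame. */
typedef struct {
    float **rows;      /* rows[t] would hold the scores for frame t */
    int allocated;     /* number of frame slots currently allocated */
} FrameCache;

/* Make sure slot `frame` exists, growing by whole periods when needed.
   Returns 0 on success, -1 if memory could not be allocated. */
static int cache_reserve(FrameCache *c, int frame)
{
    assert(c != NULL && frame >= 0);
    if (frame < c->allocated) return 0;          /* already large enough */

    /* Round the required size up to the next multiple of the period. */
    int want = (frame / OUTPROB_CACHE_PERIOD + 1) * OUTPROB_CACHE_PERIOD;
    float **p = realloc(c->rows, (size_t)want * sizeof *p);
    if (p == NULL) return -1;                    /* old block is still valid */

    for (int t = c->allocated; t < want; t++) p[t] = NULL;  /* new slots empty */
    c->rows = p;
    c->allocated = want;
    return 0;
}
```

With a period of 100, one re-allocation happens per 100 frames of input; a period of 1 would re-allocate on every frame, which is the overhead the description warns about.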

Functions

int wrsamp (int fd, SP16 *buf, int len)

Write waveform data in big endian to a file descriptor.

Parameters:

fd [in] file descriptor

buf [in] array of speech data

len [in] length of the above array

Returns:
number of bytes written, or -1 on error.

Defined at line 36 of wrsamp.c.
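To make "big endian" concrete, the sketch below packs 16-bit samples into a byte buffer with the most significant byte first, independent of the host byte order. It is an illustration only, not the body of wrsamp(); the function name `pack_big_endian` is hypothetical, and `int16_t` stands in for SP16 on the assumption that SP16 is a signed 16-bit type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack 16-bit samples into bytes, most significant byte first.
   `out` must have room for 2 * len bytes. Returns the byte count. */
static size_t pack_big_endian(const int16_t *buf, int len, unsigned char *out)
{
    assert(buf != NULL && out != NULL && len >= 0);
    for (int i = 0; i < len; i++) {
        uint16_t u = (uint16_t)buf[i];                /* keep the bit pattern */
        out[2 * i]     = (unsigned char)(u >> 8);     /* high byte first */
        out[2 * i + 1] = (unsigned char)(u & 0xFF);   /* then low byte */
    }
    return (size_t)len * 2u;
}
```

The packed bytes could then be handed to a POSIX write() call on the file descriptor; the sample 0x1234 is emitted as the bytes 0x12, 0x34.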

FILE * wrwav_open (char *filename, int sfreq)

Open/create a WAVE file and write header.
Open or create a new WAV file and prepare it for later data writing. The frame length written here is a dummy value and will be overwritten when the file is closed by wrwav_close().

Parameters:

filename [in] file name

sfreq [in] sampling frequency of the data you are going to write

Returns:
the file pointer.

Defined at line 73 of wrwav.c.
Referenced by record_sample_open().

boolean wrwav_data (FILE *fp, SP16 *buf, int len)

Write speech samples.

Parameters:

fp [in] file pointer

buf [in] speech data to be written

len [in] length of the above data

Returns:
actual number of written samples according to the source comment; note that the declared return type is boolean.

Defined at line 126 of wrwav.c.
Referenced by record_sample_write().

boolean wrwav_close (FILE *fp)

Close the file.
The frame length in the header is overwritten with the actual value before the file is closed.

Parameters:

fp [in] file pointer to close, previously opened by wrwav_open().

Returns:
TRUE on success, FALSE on failure.

Defined at line 146 of wrwav.c.
Referenced by record_sample_close().
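The open / write / close sequence, with a dummy length in the header that is patched on close, can be sketched as follows. This is not the Julius implementation: the `sketch_wav_*` functions and the `put_le*` helpers are hypothetical stand-ins, and the sketch assumes the canonical 44-byte header for mono 16-bit PCM, with `int16_t` standing in for SP16.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* RIFF stores integers in little-endian order. */
static void put_le32(FILE *fp, uint32_t v)
{
    for (int i = 0; i < 4; i++) fputc((int)((v >> (8 * i)) & 0xFF), fp);
}
static void put_le16(FILE *fp, uint16_t v)
{
    fputc(v & 0xFF, fp);
    fputc((v >> 8) & 0xFF, fp);
}

/* Stand-in for wrwav_open(): 44-byte header with both sizes set to 0. */
static FILE *sketch_wav_open(const char *filename, int sfreq)
{
    assert(filename != NULL && sfreq > 0);
    FILE *fp = fopen(filename, "wb");
    if (fp == NULL) return NULL;
    fwrite("RIFF", 1, 4, fp); put_le32(fp, 0);        /* dummy, patched on close */
    fwrite("WAVEfmt ", 1, 8, fp); put_le32(fp, 16);   /* fmt chunk size */
    put_le16(fp, 1); put_le16(fp, 1);                 /* PCM, 1 channel */
    put_le32(fp, (uint32_t)sfreq);                    /* sample rate */
    put_le32(fp, (uint32_t)sfreq * 2u);               /* bytes per second */
    put_le16(fp, 2); put_le16(fp, 16);                /* block align, bits */
    fwrite("data", 1, 4, fp); put_le32(fp, 0);        /* dummy, patched on close */
    return fp;
}

/* Stand-in for wrwav_data(): returns 1 on success, 0 on failure. */
static int sketch_wav_data(FILE *fp, const int16_t *buf, int len)
{
    for (int i = 0; i < len; i++) put_le16(fp, (uint16_t)buf[i]);
    return !ferror(fp);
}

/* Stand-in for wrwav_close(): overwrite both size fields, then close. */
static int sketch_wav_close(FILE *fp)
{
    long end = ftell(fp);
    if (end < 44) { fclose(fp); return 0; }
    uint32_t data_bytes = (uint32_t)(end - 44);
    if (fseek(fp, 4, SEEK_SET) != 0) { fclose(fp); return 0; }
    put_le32(fp, 36u + data_bytes);                   /* RIFF chunk size */
    if (fseek(fp, 40, SEEK_SET) != 0) { fclose(fp); return 0; }
    put_le32(fp, data_bytes);                         /* data chunk size */
    int ok = !ferror(fp);
    return (fclose(fp) == 0) && ok;
}
```

Writing the sizes last means the caller does not need to know the total length in advance, which suits recording of live input whose duration is unknown when the file is opened.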

int strip_zero (SP16 a[], int len)

Strip zero samples from speech data.

Parameters:

a [I/O] speech data

len [in] length of the above data

Returns:
new length after stripping.

Defined at line 40 of strip.c.
Referenced by adin_cut().

Generated for Julius on Tue Mar 28 16:05:32 2006 by Doxygen 1.4.2.

