|static void||init_param ()|
|int||RealTimePipeLine (SP16 *Speech, int nowlen)|
|Main function of on-the-fly 1st pass decoding. |
|HTK_Param *||RealTimeParam (LOGPROB *backmax)|
|void||RealTimeCMNUpdate (HTK_Param *param)|
|static HTK_Param *||param = NULL|
|Computed MFCC parameter vectors. |
|static float *||bf|
|Work space for FFT. |
|static DeltaBuf *||db|
|Work space for delta MFCC cycle buffer. |
|static DeltaBuf *||ab|
|Work space for accel MFCC cycle buffer. |
|static VECT *||tmpmfcc|
|Work space to hold a temporary MFCC vector. |
|Maximum allowed input frame length. |
|Last processed frame. |
|TRUE if last pass was a segmented input. |
|Frame pointer of current base MFCC. |
|Frame pointer where all MFCC computation has been done. |
|static SP16 *||window|
|Window buffer for MFCC calculation. |
|Buffer length of window. |
|Currently left samples in window. |
The basic recognition procedure of Julius in main_recognition_loop() is as follows:
In on-the-fly decoding, steps 1 to 3 above are performed in parallel. This is implemented with a simple scheme that processes the small captured speech fragments one at a time as they arrive:
The actual procedure is as follows. RealTimePipeLine() is passed to adin_go() as a callback. adin_go() then watches the input and, once speech input starts, calls RealTimePipeLine() for every captured input fragment. RealTimePipeLine() computes the feature vectors of the given fragment, advances the 1st pass processing over them, and returns to the capture function. The current state is kept until the next call so that inter-frame processing (computing delta coefficients, etc.) can be performed.
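The following is a minimal sketch of this callback scheme, not the code in realtime-1stpass.c. It assumes a fixed window size and frame shift and shows only how leftover samples can be carried in static state from one call to the next. All names here (on_fragment, process_frame, window_buf, FRAME_SIZE, FRAME_SHIFT) are hypothetical.

```c
#include <string.h>

/* All names and sizes below are hypothetical, chosen for illustration. */
#define FRAME_SIZE  400   /* samples per analysis window (assumed) */
#define FRAME_SHIFT 160   /* samples between window starts (assumed) */

typedef short sample_t;   /* stand-in for a 16-bit sample type */

static sample_t window_buf[FRAME_SIZE]; /* samples carried across calls */
static int window_len = 0;              /* valid samples in window_buf */
static int frames_done = 0;             /* frames processed so far */

/* Placeholder for the per-frame work: feature extraction followed by
 * one step of 1st pass decoding would go here. */
static void process_frame(const sample_t *frame)
{
    (void)frame;
    frames_done++;
}

/* Callback invoked once per captured fragment.
 * Returns 0 to tell the capture loop to continue. */
static int on_fragment(const sample_t *speech, int len)
{
    int pos = 0;

    while (pos < len) {
        /* Fill the window with as many new samples as fit. */
        int n = FRAME_SIZE - window_len;
        if (n > len - pos) n = len - pos;
        memcpy(window_buf + window_len, speech + pos,
               (size_t)n * sizeof(sample_t));
        window_len += n;
        pos += n;

        /* Not enough samples yet: keep them for the next call. */
        if (window_len < FRAME_SIZE) break;

        process_frame(window_buf);

        /* Keep the overlapping tail as the start of the next window. */
        memmove(window_buf, window_buf + FRAME_SHIFT,
                (size_t)(FRAME_SIZE - FRAME_SHIFT) * sizeof(sample_t));
        window_len = FRAME_SIZE - FRAME_SHIFT;
    }
    return 0;
}
```

Because the window buffer and its fill level are static, the capture routine can deliver fragments of any length; a frame is processed as soon as enough samples have accumulated, regardless of where the fragment boundaries fall.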
Note about CMN: With acoustic models trained with CMN, Julius applies CMN to the input. For file input, the mean of the whole sentence is computed and subtracted. In on-the-fly decoding, CMN was performed using the cepstral mean of the last 5 seconds of input (excluding rejected inputs). That was the behavior in versions earlier than 3.5; version 3.5.1 now applies MAP-CMN in on-the-fly decoding, using the cepstrum of the last 5 seconds as the initial mean. The initial cepstral mean at startup can be given with the option "-cmnload", and updates of the initial cepstral mean at each input can be disabled with "-cmnnoupdate". The latter option is useful for always using a static global cepstral mean as the initial mean for each input.
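As an illustration of the idea behind MAP-CMN, the sketch below uses one common formulation of a MAP mean estimate: the initial mean is treated as a prior worth a fixed number of frames and is blended with the running sum of the current input. This is an assumption for illustration only; the exact formula and weight used by Julius are not stated here, and all names (MapCmn, mapcmn_start, mapcmn_apply, VECLEN) are hypothetical.

```c
#define VECLEN 3  /* cepstral dimension, kept small for illustration */

/* Hypothetical state for a MAP-style running cepstral mean. */
typedef struct {
    float init_mean[VECLEN]; /* prior mean, e.g. from previous input */
    float sum[VECLEN];       /* running sum of cepstra in this input */
    int   nframes;           /* frames seen so far in this input */
    float weight;            /* prior weight, in frames (assumed) */
} MapCmn;

/* Reset the state at the start of an input. */
static void mapcmn_start(MapCmn *c, const float *init_mean, float weight)
{
    for (int i = 0; i < VECLEN; i++) {
        c->init_mean[i] = init_mean[i];
        c->sum[i] = 0.0f;
    }
    c->nframes = 0;
    c->weight = weight;
}

/* Update the estimate with one frame and subtract it in place:
 *   mean = (weight * init_mean + sum of frames) / (weight + nframes) */
static void mapcmn_apply(MapCmn *c, float *vec)
{
    c->nframes++;
    for (int i = 0; i < VECLEN; i++) {
        c->sum[i] += vec[i];
        float mean = (c->weight * c->init_mean[i] + c->sum[i])
                   / (c->weight + (float)c->nframes);
        vec[i] -= mean;
    }
}
```

With a large weight the estimate stays close to the initial mean at the start of an input and moves toward the mean of the current input as more frames arrive. Keeping the initial mean fixed across inputs corresponds to the use case described for "-cmnnoupdate".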
The primary functions in this file are:
Definition in file realtime-1stpass.c.
|static void init_param||(||)||
|int RealTimePipeLine||(||SP16 *||Speech,||int||nowlen||)|
Main function of on-the-fly 1st pass decoding.
This function is called each time new speech samples arrive, as a callback from the A/D-in routine. When speech input begins, the captured speech is passed to this function one sample segment at a time. This continues until the A/D-in routine detects the end of speech or the input stream reaches its end.
This function performs feature vector extraction and beam decoding as 1st pass recognition simultaneously, in a frame-wise manner.
|Speech||[in] pointer to the speech sample segments|
|nowlen||[in] length of above|
Referenced by main_recognition_loop().
|void RealTimeCMNUpdate||(||HTK_Param *||param||)|