libjulius/src/ngram_decode.c File Reference

N-gram based word prediction for the 2nd pass.

#include <julius/julius.h>

Functions

static int compare_nw (NEXTWORD **a, NEXTWORD **b)
 qsort callback function to sort next word candidates by their word ID.
static NEXTWORD * search_nw (NEXTWORD **nw, WORD_ID w, int num)
 Find a word from list of next word candidates.
static LOGPROB ngram_forw2back (NGRAM_INFO *ngram, WORD_ID *w, int wlen)
 Compute backward N-gram score from forward N-gram.
static int pick_backtrellis_words (RecogProcess *r, NEXTWORD **nw, int oldnum, NODE *hypo, short t)
 Extract next word candidates from word trellis.
static int get_backtrellis_words (RecogProcess *r, NEXTWORD **nw, NODE *hypo, short tm, short t_end)
 Determine next word candidates from the word trellis.
static int limit_nw (NEXTWORD **nw, NODE *hypo, int num, WORD_INFO *winfo)
 Remove non-expansion words from the list.
int ngram_firstwords (NEXTWORD **nw, int peseqlen, int maxnw, RecogProcess *r)
 Get initial word hypotheses at the beginning.
int ngram_nextwords (NODE *hypo, NEXTWORD **nw, int maxnw, RecogProcess *r)
 Return the list of next word candidates.
boolean ngram_acceptable (NODE *hypo, RecogProcess *r)
 Acceptance check.


Detailed Description

N-gram based word prediction for the 2nd pass.

These functions return next word candidates in the 2nd recognition pass of Julius, i.e. N-gram based stack decoding.

Given a partial sentence hypothesis, the code first estimates the beginning frame of the hypothesis based on the word trellis. The words found in the word trellis around the estimated frame are then extracted and returned with their N-gram probabilities.

In Julius, ngram_firstwords(), ngram_nextwords() and ngram_acceptable() are called from the main search function wchmm_fbs(). In Julian, the corresponding functions in dfa_decode.c are used instead.

Author:
Akinobu Lee
Date:
Fri Jul 8 14:57:51 2005
Revision:
1.1.1.1

Definition in file ngram_decode.c.


Function Documentation

static int compare_nw ( NEXTWORD **  a,
NEXTWORD **  b 
) [static]

qsort callback function to sort next word candidates by their word ID.

Parameters:
a [in] element 1
b [in] element 2
Returns:
1 if the word ID of a is greater than that of b, -1 if it is smaller, 0 if they are equal.

Definition at line 69 of file ngram_decode.c.

Referenced by get_backtrellis_words().
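
The comparison can be sketched as follows. This is a minimal illustration, not the Julius source: SketchWordId, SketchNextWord and the sketch_ functions are hypothetical stand-ins for WORD_ID, NEXTWORD and the real routines, whose actual definitions live in the Julius headers. Because the list holds pointers to candidates, the callback receives pointers to those pointers.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for WORD_ID and NEXTWORD (assumed layout). */
typedef unsigned short SketchWordId;
typedef struct {
    SketchWordId id;   /* word ID */
    float lscore;      /* language score, unused in this sketch */
} SketchNextWord;

/* qsort callback: orders candidates by ascending word ID.
   Each array element is a pointer, so a and b point to pointers. */
static int sketch_compare_nw(const void *a, const void *b)
{
    const SketchNextWord *x = *(const SketchNextWord *const *)a;
    const SketchNextWord *y = *(const SketchNextWord *const *)b;
    if (x->id > y->id) return 1;
    if (x->id < y->id) return -1;
    return 0;
}

/* Sort an array of candidate pointers in place. */
static void sketch_sort_nw(SketchNextWord **nw, int num)
{
    qsort(nw, (size_t)num, sizeof(SketchNextWord *), sketch_compare_nw);
}
```

Sorting by word ID is what makes a later ordered lookup of a given word in the list possible.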

static NEXTWORD* search_nw ( NEXTWORD **  nw,
WORD_ID  w,
int  num 
) [static]

Find a word from list of next word candidates.

Parameters:
nw [in] list of next word candidates
w [in] word id to search for
num [in] length of nw
Returns:
the pointer to the NEXTWORD data if found, or NULL if not found.

Definition at line 99 of file ngram_decode.c.

Referenced by pick_backtrellis_words().
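
A lookup of this kind might be written as a binary search, assuming the list has already been sorted by word ID (as the compare_nw() callback suggests). The types and the function name below are hypothetical stand-ins, and the search strategy is an assumption rather than a statement about the actual implementation.

```c
#include <stddef.h>

/* Hypothetical stand-ins for WORD_ID and NEXTWORD (assumed layout). */
typedef unsigned short SketchWordId;
typedef struct {
    SketchWordId id;
    float lscore;
} SketchNextWord;

/* Binary search over a list sorted by ascending word ID.
   Returns the matching entry, or NULL if w is not in the list. */
static SketchNextWord *sketch_search_nw(SketchNextWord **nw,
                                        SketchWordId w, int num)
{
    int left = 0;
    int right = num - 1;
    while (left <= right) {
        int mid = left + (right - left) / 2;
        if (nw[mid]->id == w) return nw[mid];
        if (nw[mid]->id < w) left = mid + 1;
        else right = mid - 1;
    }
    return NULL;
}
```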

static LOGPROB ngram_forw2back ( NGRAM_INFO *  ngram,
WORD_ID *  w,
int  wlen 
) [static]

Compute backward N-gram score from forward N-gram.

Parameters:
ngram [in] N-gram data structure
w [in] word sequence
wlen [in] length of w
Returns:
the backward probability of the word w[0].

Definition at line 139 of file ngram_decode.c.

Referenced by pick_backtrellis_words().
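
For intuition only, the bigram case of such a conversion follows from Bayes' rule: the backward probability of a word given its successor equals the forward probability of the successor given the word, times the unigram probability of the word, divided by the unigram probability of the successor. The sketch below works on log probabilities; the function name and the bigram-only scope are assumptions, and it does not reproduce how ngram_forw2back() handles longer contexts or back-off.

```c
/* Bayes' rule in the log domain:
   log P(a | b) = log P(b | a) + log P(a) - log P(b)
   where a is the earlier word and b is the word that follows it. */
static float sketch_bayes_flip(float log_b_given_a, float log_a, float log_b)
{
    return log_b_given_a + log_a - log_b;
}
```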

static int pick_backtrellis_words ( RecogProcess *  r,
NEXTWORD **  nw,
int  oldnum,
NODE *  hypo,
short  t 
) [static]

Extract next word candidates from word trellis.

This function extracts the list of trellis words whose word end has survived in the word trellis at the specified frame. Their N-gram probabilities are then computed, and the words are added to the current list of next word candidates.

Parameters:
r [in] recognition process instance
nw [in] list of next word candidates (new words will be appended at oldnum)
oldnum [in] number of words already stored in nw
hypo [in] the source sentence hypothesis
t [in] specified frame
Returns:
the total number of words currently stored in nw.

Local variables documented in the source: the last two non-transparent words, the number of non-transparent words found (at most 2), and the number of transparent words skipped.

Definition at line 192 of file ngram_decode.c.

Referenced by get_backtrellis_words().
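
The context collection hinted at by those local variables can be sketched as below: walk back from the most recently expanded word, skip transparent words, and keep up to two non-transparent words as N-gram context. The flat integer sequence, the is_transparent table and all names are hypothetical simplifications of the NODE and WORD_INFO structures.

```c
#include <stdbool.h>

/* Collect up to two non-transparent context words, starting from the
   last word of the hypothesis (seq[seqnum - 1]) and moving backwards.
   is_transparent is indexed by word ID.
   Returns the number of context words stored in cnword (0, 1 or 2);
   *skipped receives the number of transparent words passed over. */
static int sketch_collect_context(const int *seq, int seqnum,
                                  const bool *is_transparent,
                                  int cnword[2], int *skipped)
{
    int found = 0;
    *skipped = 0;
    for (int i = seqnum - 1; i >= 0 && found < 2; i--) {
        if (is_transparent[seq[i]]) {
            (*skipped)++;
        } else {
            cnword[found++] = seq[i];
        }
    }
    return found;
}
```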

static int get_backtrellis_words ( RecogProcess *  r,
NEXTWORD **  nw,
NODE *  hypo,
short  tm,
short  t_end 
) [static]

Determine next word candidates from the word trellis.

This function builds a list of next word candidates by looking up the word trellis at the specified frame, with a margin of lookup_range frames on each side. If the same word exists in several nearby frames, only the occurrence nearest to the specified frame is chosen.

Parameters:
r [in] recognition process instance
nw [out] pointer to hold the extracted words as list of next word candidates
hypo [in] partial sentence hypothesis from which the words will be expanded
tm [in] center time frame to look up the words
t_end [in] right frame boundary for the lookup.
Returns:
the number of next word candidates stored in nw.

Definition at line 334 of file ngram_decode.c.

Referenced by ngram_nextwords().
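
The nearest-frame rule can be illustrated by scanning outward from the center frame, so that the first occurrence of a word is also the nearest one. This is a sketch under stated assumptions: SketchFrame and SketchCand are hypothetical simplifications of the word trellis and of NEXTWORD, the earlier frame is arbitrarily visited first when two frames are equally distant, and the linear duplicate check stands in for whatever lookup the real code uses.

```c
/* Hypothetical per-frame view of the word trellis: the IDs of the
   words whose word end survived at that frame. */
typedef struct {
    const int *ids;
    int n;
} SketchFrame;

/* Hypothetical candidate record: word ID plus the frame it came from. */
typedef struct {
    int id;
    int frame;
} SketchCand;

/* Return the index of id in c[0..num-1], or -1 if absent. */
static int sketch_find(const SketchCand *c, int num, int id)
{
    for (int i = 0; i < num; i++) {
        if (c[i].id == id) return i;
    }
    return -1;
}

/* Gather words from frames tm - range .. tm + range, clipped to
   [0, t_end], visiting frames in order of increasing distance from tm.
   A word already stored is skipped, so each word keeps its nearest
   occurrence. Returns the number of candidates written to out. */
static int sketch_lookup(const SketchFrame *trellis, int tm, int t_end,
                         int range, SketchCand *out, int maxnum)
{
    int num = 0;
    for (int d = 0; d <= range; d++) {
        int frames[2] = { tm - d, tm + d };
        int nframes = (d == 0) ? 1 : 2;
        for (int s = 0; s < nframes; s++) {
            int t = frames[s];
            if (t < 0 || t > t_end) continue;
            for (int k = 0; k < trellis[t].n && num < maxnum; k++) {
                int id = trellis[t].ids[k];
                if (sketch_find(out, num, id) < 0) {
                    out[num].id = id;
                    out[num].frame = t;
                    num++;
                }
            }
        }
    }
    return num;
}
```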

static int limit_nw ( NEXTWORD **  nw,
NODE *  hypo,
int  num,
WORD_INFO *  winfo 
) [static]

Remove non-expansion words from the list.

Remove words in the nextword list which should not be expanded.

Parameters:
nw [i/o] list of next word candidates (will be shrunk by removing some words)
hypo [in] partial sentence hypothesis from which the words will be expanded
num [in] current number of next words in nw
winfo [in] word dictionary
Returns:
the new number of words in nw

Definition at line 427 of file ngram_decode.c.

Referenced by ngram_nextwords().
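
The in-place shrinking of the list can be sketched as a compaction pass. The rule that decides which words must not be expanded is not stated here, so the sketch takes it as a caller-supplied predicate; SketchNextWord, SketchKeepFn and sketch_keep_even are hypothetical, and the even-ID rule exists purely for demonstration.

```c
/* Hypothetical stand-in for NEXTWORD. */
typedef struct {
    int id;
} SketchNextWord;

/* Predicate: nonzero if the word may be expanded. */
typedef int (*SketchKeepFn)(int word_id);

/* Demonstration predicate only: keep words with an even ID. */
static int sketch_keep_even(int word_id)
{
    return word_id % 2 == 0;
}

/* Move the kept entries to the front of nw, preserving their order.
   Entries are swapped rather than overwritten so that every pointer
   in a pre-allocated list stays reachable in the tail.
   Returns the new number of candidates. */
static int sketch_limit_nw(SketchNextWord **nw, int num, SketchKeepFn keep)
{
    int dst = 0;
    for (int src = 0; src < num; src++) {
        if (keep(nw[src]->id)) {
            SketchNextWord *tmp = nw[dst];
            nw[dst] = nw[src];
            nw[src] = tmp;
            dst++;
        }
    }
    return dst;
}
```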

int ngram_firstwords ( NEXTWORD **  nw,
int  peseqlen,
int  maxnw,
RecogProcess *  r 
)

Get initial word hypotheses at the beginning.

In N-gram based recognition, the initial hypothesis is fixed to the tail silence word. The exception is short-pause segmentation mode, in which the initial hypothesis is chosen from the words that survived at the last input frame of the first pass.

Parameters:
nw [out] pointer to hold the initial word candidates
peseqlen [in] input frame length
maxnw [in] maximum number of words that can be stored in nw
r [in] recognition process instance
Returns:
the number of words extracted and stored in nw.

Definition at line 495 of file ngram_decode.c.

int ngram_nextwords ( NODE *  hypo,
NEXTWORD **  nw,
int  maxnw,
RecogProcess *  r 
)

Return the list of next word candidates.

Given a partial sentence hypothesis "hypo", this function returns the list of next word candidates. Specifically, it extracts from the word trellis the words whose word-end node has survived near "hypo->estimated_next_t", the estimated beginning frame of the last word, and stores them in "nw" with their N-gram probabilities.

Parameters:
hypo [in] source partial sentence hypothesis
nw [out] pointer to store the list of next word candidates (should be already allocated)
maxnw [in] maximum number of words that can be stored to nw
r [in] recognition process instance
Returns:
the number of extracted next word candidates in nw.

Definition at line 562 of file ngram_decode.c.

boolean ngram_acceptable ( NODE *  hypo,
RecogProcess *  r 
)

Acceptance check.

Return whether the given partial hypothesis is acceptable as a sentence and can be treated as a final search candidate. In N-gram mode, it checks whether the last word is the beginning-of-sentence silence (silhead).

Parameters:
hypo [in] partial sentence hypothesis to be examined
r [in] recognition process instance
Returns:
TRUE if acceptable as a sentence, or FALSE if not.

Definition at line 612 of file ngram_decode.c.
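
Because the 2nd pass expands hypotheses from the end of the utterance toward its start, the most recently added word is the leftmost one, and the check reduces to comparing it with the sentence-head silence word. A minimal sketch, assuming the hypothesis is a flat array of word IDs in expansion order and that head_silwid (a hypothetical parameter name) holds the ID of silhead:

```c
#include <stdbool.h>

/* A hypothesis is acceptable as a full sentence when its last expanded
   word is the beginning-of-sentence silence. An empty hypothesis is
   never acceptable. */
static bool sketch_acceptable(const int *seq, int seqnum, int head_silwid)
{
    return seqnum > 0 && seq[seqnum - 1] == head_silwid;
}
```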


Generated on Tue Dec 18 16:01:19 2007 for Julius by  doxygen 1.5.4