julius/ngram_decode.c

Predicts next words from N-gram probabilities and the words on the trellis (2nd pass).

#include <julius.h>

Functions

static int compare_nw (NEXTWORD **a, NEXTWORD **b)

static NEXTWORD * search_nw (NEXTWORD **nw, WORD_ID w, int num)

static void set_word_context (WORD_ID *cseq, int n, WORD_INFO *winfo)

static int pick_backtrellis_words (BACKTRELLIS *bt, WORD_INFO *winfo, NGRAM_INFO *ngram, NEXTWORD **nw, int oldnum, NODE *hypo, short t)

Extracts next words from the word trellis.

int get_backtrellis_words (BACKTRELLIS *bt, WORD_INFO *winfo, NGRAM_INFO *ngram, NEXTWORD **nw, NODE *hypo, short tm, short t_end)

Determines the set of next words from the word trellis around the specified frame.

int limit_nw (NEXTWORD **nw, NODE *hypo, int num)

int ngram_firstwords (NEXTWORD **nw, int peseqlen, int maxnw, WORD_INFO *winfo, BACKTRELLIS *bt)

Returns the set of initial word hypotheses.

int ngram_nextwords (NODE *hypo, NEXTWORD **nw, int maxnw, NGRAM_INFO *ngram, WORD_INFO *winfo, BACKTRELLIS *bt)

Returns the set of next-word hypotheses.

boolean ngram_acceptable (NODE *hypo, WORD_INFO *winfo)

Variables

static WORD_ID cnword [2]

Last two non-transparent words

static int cnnum

Num of found non-transparent words (<=2)

static int last_trans

Num of skipped transparent words

Description

Predicts next words from N-gram probabilities and the words on the trellis (2nd pass).

Author: Akinobu Lee

Date: Fri Jul 8 14:57:51 2005

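In the N-gram based stack decoding of Julius (2nd pass), the functions in this file determine the set of words that can be connected next.

They predict the start frame of the given source hypothesis and return the set of words on the word trellis whose end nodes lie around that predicted frame, together with their N-gram occurrence probabilities.

In Julius, ngram_firstwords(), ngram_nextwords(), and ngram_acceptable() are each called from wchmm_fbs(), the main function of the 2nd pass. In Julian, the functions in dfa_decode.c are used instead.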

Revision: 1.4

Definition in file ngram_decode.c.

Function Documentation

static int compare_nw	(	NEXTWORD **	a,
		NEXTWORD **	b
	)			`[static]`

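qsort callback function that sorts next-word candidates in ascending order of word ID.

Parameters:

	a	[in] element 1
	b	[in] element 2

Returns: 1 if the word ID of a is greater than that of b, -1 if it is smaller, and 0 if they are equal.

Definition at line 71 of file ngram_decode.c.

Referenced by get_backtrellis_words().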
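
The comparator described above could look roughly like the following. This is a minimal sketch, not the code in ngram_decode.c: `NEXTWORD` is reduced to a single `id` field, `WORD_ID` is assumed to be an unsigned integer type, and `compare_nw_sketch` and `sort_nextwords` are hypothetical names. The sketch uses the `const void *` parameters that standard `qsort` expects, whereas the documented signature takes `NEXTWORD **`.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for the Julius types (assumptions for illustration). */
typedef unsigned short WORD_ID;
typedef struct {
  WORD_ID id;   /* word ID of the next-word candidate */
} NEXTWORD;

/* qsort callback: orders NEXTWORD pointers by ascending word ID. */
static int compare_nw_sketch(const void *a, const void *b)
{
  const NEXTWORD *x = *(NEXTWORD * const *)a;
  const NEXTWORD *y = *(NEXTWORD * const *)b;
  if (x->id > y->id) return 1;
  if (x->id < y->id) return -1;
  return 0;
}

/* Hypothetical helper: sort an array of NEXTWORD pointers in place. */
static void sort_nextwords(NEXTWORD **nw, int num)
{
  assert(nw != NULL && num >= 0);
  qsort(nw, (size_t)num, sizeof(NEXTWORD *), compare_nw_sketch);
}
```

Calling `sort_nextwords(nw, num)` on a candidate list leaves the pointers ordered by word ID, which is what makes a later lookup by ID cheap.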

static NEXTWORD* search_nw	(	NEXTWORD **	nw,
		WORD_ID	w,
		int	num
	)			`[static]`

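Searches the next-word candidate list for the specified word.

Parameters:

	nw	[in] next-word candidate list
	w	[in] ID of the word to search for
	num	[in] length of the next-word candidate list

Returns: a pointer to the matching next-word candidate structure if found, or NULL if not found.

Definition at line 101 of file ngram_decode.c.

Referenced by pick_backtrellis_words().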
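
One plausible way to implement this lookup is a binary search. The page itself only says that the list is searched, so the binary search is an assumption made here because compare_nw() sorts candidates by word ID. The types are simplified stand-ins and `search_nw_sketch` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the Julius types (assumptions for illustration). */
typedef unsigned short WORD_ID;
typedef struct {
  WORD_ID id;   /* word ID of the next-word candidate */
} NEXTWORD;

/* Binary search for word ID w in nw[0..num-1].
   Assumes the list is already sorted by ascending word ID. */
static NEXTWORD *search_nw_sketch(NEXTWORD **nw, WORD_ID w, int num)
{
  int left = 0;
  int right = num - 1;

  assert(nw != NULL || num <= 0);
  while (left <= right) {
    int mid = left + (right - left) / 2;
    if (nw[mid]->id == w) return nw[mid];
    if (nw[mid]->id < w) left = mid + 1;
    else right = mid - 1;
  }
  return NULL;   /* not found */
}
```

If the list were not kept sorted, a linear scan would be needed instead; the return contract (pointer or NULL) stays the same either way.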

static void set_word_context	(	WORD_ID *	cseq,
		int	n,
		WORD_INFO *	winfo
	)			`[static]`

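Extracts the last two non-transparent words from a word sequence and stores them in cnword.

Parameters:

	cseq	[in] word sequence
	n	[in] length of cseq
	winfo	[in] word information structure

Definition at line 147 of file ngram_decode.c.

Referenced by pick_backtrellis_words().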
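
The context extraction can be sketched as a backward scan over the word sequence. This is an illustration under stated assumptions: `WORD_INFO` is reduced to a per-word transparency flag array, the result is returned in a hypothetical `WordContext` struct rather than in the file-static variables cnword, cnnum, and last_trans, and last_trans is assumed to count only the transparent words skipped before the first non-transparent word is found.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned short WORD_ID;

/* Simplified stand-in for WORD_INFO (assumption): transparency flags only. */
typedef struct {
  const unsigned char *is_transparent;  /* nonzero if the word is transparent */
} WORD_INFO;

/* Hypothetical container mirroring cnword, cnnum and last_trans. */
typedef struct {
  WORD_ID cnword[2];  /* last two non-transparent words, most recent first */
  int cnnum;          /* number of non-transparent words found (<= 2) */
  int last_trans;     /* transparent words skipped before the first hit */
} WordContext;

static WordContext set_word_context_sketch(const WORD_ID *cseq, int n,
                                           const WORD_INFO *winfo)
{
  WordContext ctx = { {0, 0}, 0, 0 };
  int i;

  assert(n <= 0 || (cseq != NULL && winfo != NULL));
  /* Walk from the last word toward the first. */
  for (i = n - 1; i >= 0; i--) {
    if (!winfo->is_transparent[cseq[i]]) {
      ctx.cnword[ctx.cnnum++] = cseq[i];
      if (ctx.cnnum >= 2) break;      /* two context words are enough */
    } else if (ctx.cnnum == 0) {
      ctx.last_trans++;               /* skipped before any context word */
    }
  }
  return ctx;
}
```

Skipping transparent words this way lets the N-gram history be formed from the nearest words that actually carry language-model context.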

static int pick_backtrellis_words	(	BACKTRELLIS *	bt,
		WORD_INFO *	winfo,
		NGRAM_INFO *	ngram,
		NEXTWORD **	nw,
		int	oldnum,
		NODE *	hypo,
		short	t
	)			`[static]`

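Extracts next words from the word trellis.

Extracts the list of trellis words whose end nodes lie on the specified frame of the word trellis and computes their N-gram connection probabilities as next words. The list is appended to the next-word information structure and returned.

Parameters:

	bt	[in] word trellis structure
	winfo	[in] word dictionary structure
	ngram	[in] N-gram structure
	nw	[i/o] next-word candidate list (extracted results are appended from index oldnum onward)
	oldnum	[in] number of next words already stored in nw
	hypo	[in] source sentence hypothesis
	t	[in] specified frame

Returns: the total number of next words in nw after the extracted list has been appended.

Definition at line 205 of file ngram_decode.c.

Referenced by get_backtrellis_words().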

int get_backtrellis_words	(	BACKTRELLIS *	bt,
		WORD_INFO *	winfo,
		NGRAM_INFO *	ngram,
		NEXTWORD **	nw,
		NODE *	hypo,
		short	tm,
		short	t_end
	)

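Determines the set of next words from the word trellis around the specified frame.

Collects the words on the trellis whose end nodes lie within lookup_range frames before or after the specified frame and builds the next-word structures. If the same word appears more than once within that range, the trellis word nearest to the specified frame is selected.

Parameters:

	bt	[in] word trellis structure
	winfo	[in] word dictionary structure
	ngram	[in] word N-gram structure
	nw	[out] pointer to the structure that stores the set of next words
	hypo	[in] source partial sentence hypothesis
	tm	[in] specified frame at the center of the word search
	t_end	[in] right edge of the frames to search

Returns: the number of next-word candidates stored in nw.

Definition at line 309 of file ngram_decode.c.

Referenced by ngram_nextwords().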
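
The "nearest trellis word wins" rule can be illustrated with a small stand-alone sketch. It does not use the real BACKTRELLIS layout: `TrellisEntry` is a hypothetical flattened record, `collect_near_words` is a hypothetical name, and treating both the range and t_end as inclusive bounds is an assumption.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned short WORD_ID;

/* Hypothetical flattened trellis entry: a word and the frame where it ends. */
typedef struct {
  WORD_ID wid;
  short endtime;
} TrellisEntry;

/* Collect distinct words ending within +/- range frames of tm, ignoring
   frames below 0 or beyond t_end. For a word seen more than once, keep the
   entry nearest to tm. Returns the number of entries stored in out. */
static int collect_near_words(const TrellisEntry *tre, int n, short tm,
                              short t_end, int range,
                              TrellisEntry *out, int maxout)
{
  int num = 0;
  int i, j;

  assert(tre != NULL && out != NULL);
  for (i = 0; i < n; i++) {
    int t = tre[i].endtime;
    if (t < 0 || t > t_end || abs(t - tm) > range) continue;
    /* Look for the same word among the entries collected so far. */
    for (j = 0; j < num; j++) {
      if (out[j].wid == tre[i].wid) break;
    }
    if (j < num) {
      /* Duplicate word: replace only if this one is closer to tm. */
      if (abs(t - tm) < abs(out[j].endtime - tm)) out[j] = tre[i];
    } else if (num < maxout) {
      out[num++] = tre[i];
    }
  }
  return num;
}
```

An implementation that scans frames outward from tm would reach the nearest entry first and could simply skip later duplicates; the sketch compares distances explicitly so that it works on input in any order.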

int limit_nw	(	NEXTWORD **	nw,
		NODE *	hypo,
		int	num
	)

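Removes from the list the words that are excluded from expansion by constraints.

Parameters:

	nw	[i/o] set of next words (words in the set that cannot be expanded are removed)
	hypo	[in] source partial sentence hypothesis
	num	[in] number of words currently stored in nw

Returns: the number of next words remaining in nw after removal.

Definition at line 391 of file ngram_decode.c.

Referenced by ngram_nextwords().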

int ngram_firstwords	(	NEXTWORD **	nw,
		int	peseqlen,
		int	maxnw,
		WORD_INFO *	winfo,
		BACKTRELLIS *	bt
	)

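Returns the set of initial word hypotheses.

In N-gram based search, the initial hypothesis is fixed to the tail silence word. With short-pause segmentation, however, it is the word with the highest likelihood among the words whose end nodes survived at the final frame in the 1st pass.

Parameters:

	nw	[out] next-word candidate list (receives the initial word hypotheses)
	peseqlen	[in] input length in frames
	maxnw	[in] maximum number of words that nw can hold
	winfo	[in] word information structure
	bt	[in] word trellis structure

Returns: the number of word candidates stored in nw.

Definition at line 462 of file ngram_decode.c.

Referenced by wchmm_fbs().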

int ngram_nextwords	(	NODE *	hypo,
		NEXTWORD **	nw,
		int	maxnw,
		NGRAM_INFO *	ngram,
		WORD_INFO *	winfo,
		BACKTRELLIS *	bt
	)

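Returns the set of next-word hypotheses.

Returns the set of words that can be connected next to the given partial sentence hypothesis. In practice, it takes the words on the trellis word set bt (the result of the 1st pass) that lie around hypo->estimated_next_t, the estimated start frame of the last word of the source partial sentence hypothesis, computes their N-gram connection probabilities, and returns them. The extracted next-word hypotheses are stored in nw, which must have been allocated in advance with room for maxnw entries.

Parameters:

	hypo	[in] source sentence hypothesis
	nw	[out] pointer to the area that stores the next-word candidate list
	maxnw	[in] maximum length of nw
	ngram	[in] N-gram information structure
	winfo	[in] dictionary information structure
	bt	[in] word trellis structure

Returns: the number of next-word hypotheses extracted and stored in nw.

Definition at line 531 of file ngram_decode.c.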

boolean ngram_acceptable	(	NODE *	hypo,
		WORD_INFO *	winfo
	)

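Returns whether the given partial sentence hypothesis is acceptable as a sentence (that is, whether the search can end). With N-gram, the hypothesis is accepted if its last word is the silence word that corresponds to the beginning of the sentence (silhead).

Parameters:

	hypo	[in] partial sentence hypothesis
	winfo	[in] word dictionary information

Returns: TRUE if the hypothesis is acceptable as a sentence, FALSE otherwise.

Definition at line 584 of file ngram_decode.c.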
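
The acceptance test amounts to comparing the last word of the hypothesis with the head silence word. The sketch below assumes simplified structures: `NODE` holds only a word sequence and its length, `WORD_INFO` holds only the ID of the head silence word, and the field names `seq`, `seqnum`, and `head_silwid`, the `MAXSEQ_SKETCH` limit, and the function name are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAXSEQ_SKETCH 16   /* hypothetical upper bound on hypothesis length */

typedef unsigned short WORD_ID;

/* Simplified stand-ins for the Julius structures (assumptions). */
typedef struct {
  WORD_ID head_silwid;          /* ID of the sentence-head silence word */
} WORD_INFO;

typedef struct {
  WORD_ID seq[MAXSEQ_SKETCH];   /* words of the hypothesis in search order */
  int seqnum;                   /* number of words in seq */
} NODE;

/* Returns 1 if the most recently added word is the head silence word. */
static int ngram_acceptable_sketch(const NODE *hypo, const WORD_INFO *winfo)
{
  assert(hypo != NULL && winfo != NULL);
  return hypo->seqnum > 0
      && hypo->seq[hypo->seqnum - 1] == winfo->head_silwid;
}
```

Because the hypothesis starts from the tail silence word, reaching the head silence word means it now spans the whole utterance.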

Generated for Julius on Tue Dec 26 16:19:48 2006.

1.5.0

