
vocabulary.h File Reference

Word dictionary for recognition.

#include <sent/stddefs.h>
#include <sent/htk_hmm.h>


Defines

#define BEGIN_WORD_DEFAULT   "<s>"
 Default word string of beginning-of-sentence word.
#define END_WORD_DEFAULT   "</s>"
 Default word string of end-of-sentence word.
#define MAXWSTEP   4000
 Memory allocation step in number of words when loading a word dictionary.

Functions

WORD_INFO * word_info_new ()
void word_info_free (WORD_INFO *winfo)
void winfo_init (WORD_INFO *winfo)
void winfo_expand (WORD_INFO *winfo)
boolean init_voca (WORD_INFO *winfo, char *filename, HTK_HMM_INFO *hmminfo, boolean, boolean)
boolean voca_load_htkdict (FILE *, WORD_INFO *, HTK_HMM_INFO *, boolean)
boolean voca_load_htkdict_fd (int, WORD_INFO *, HTK_HMM_INFO *, boolean)
boolean voca_load_htkdict_sd (int, WORD_INFO *, HTK_HMM_INFO *, boolean)
boolean voca_append_htkdict (char *entry, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, boolean ignore_tri_conv)
void voca_append (WORD_INFO *dstinfo, WORD_INFO *srcinfo, int coffset, int woffset)
boolean voca_load_htkdict_line (char *buf, int vnum, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, boolean do_conv, boolean *ok_flag)
boolean voca_mono2tri (WORD_INFO *winfo, HTK_HMM_INFO *hmminfo)
 Convert all words in the word dictionary to word-internal triphones.
WORD_ID voca_lookup_wid (char *, WORD_INFO *)
WORD_ID * new_str2wordseq (WORD_INFO *, char *, int *)
char * cycle_triphone (char *p)
char * cycle_triphone_flush ()
void print_voca_info (WORD_INFO *)
void put_voca (WORD_INFO *winfo, WORD_ID wid)
void put_voca_err (WORD_INFO *winfo, WORD_ID wid)
void make_base_phone (HTK_HMM_INFO *hmminfo, WORD_INFO *winfo)
 Build basephone information.
void print_phone_info (HTK_HMM_INFO *hmminfo)
void print_all_basephone_detail (HMM_basephone *base)
void print_all_basephone_name (HMM_basephone *base)
void test_interword_triphone (HTK_HMM_INFO *hmminfo, WORD_INFO *winfo)


Detailed Description

Word dictionary for recognition.

Author:
Akinobu LEE
Date:
Sat Feb 12 12:38:13 2005
This file defines the data structure for the word dictionary used in recognition. For each word it stores the word string, the output string, the phoneme sequence, and the transparency. The beginning-of-sentence and end-of-sentence words, guessed from the runtime environment, are also stored here.

Note that the N-gram vocabulary is stored in NGRAM_INFO and can differ from this word dictionary. The wton[] member of WORD_INFO provides the reference from each dictionary word to the N-gram vocabulary. When used with a DFA, wton[] instead holds the category number to which each word belongs.

Revision
1.1.1.1

Definition in file vocabulary.h.


Function Documentation

WORD_INFO * word_info_new ()
 

Allocate a new word dictionary structure.

Returns:
pointer to the newly allocated WORD_INFO.

Definition at line 34 of file voca_malloc.c.

Referenced by initialize_dict().

void word_info_free (WORD_INFO *winfo)
 

Free all information in the WORD_INFO.

Parameters:
winfo [i/o] word dictionary data to be freed.

Definition at line 49 of file voca_malloc.c.

void winfo_init (WORD_INFO *winfo)
 

Initialize a new word dictionary structure.

Parameters:
winfo [i/o] word dictionary to be initialized.

Definition at line 78 of file voca_malloc.c.

Referenced by voca_load_htkdict(), voca_load_htkdict_fd(), and voca_load_htkdict_sd().

void winfo_expand (WORD_INFO *winfo)
 

Expand the word dictionary.

Parameters:
winfo [i/o] word dictionary to be expanded.

Definition at line 108 of file voca_malloc.c.

Referenced by voca_append(), voca_append_htkdict(), voca_load_htkdict(), voca_load_htkdict_fd(), and voca_load_htkdict_sd().
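
The following is a minimal sketch of step-wise growth, the pattern suggested by MAXWSTEP and winfo_expand(). It is not the code in voca_malloc.c: ToyWordInfo, its member names, toy_expand() and toy_add_word() are hypothetical stand-ins, and the real WORD_INFO holds several per-word arrays rather than one.

```c
#include <assert.h>
#include <stdlib.h>

#define MAXWSTEP 4000   /* allocation step in words, as defined in vocabulary.h */

/* Hypothetical stand-in for WORD_INFO; member names are assumptions. */
typedef struct {
    int  maxnum;   /* number of words currently allocated */
    int  num;      /* number of words currently stored */
    int *wlen;     /* one example per-word array */
} ToyWordInfo;

/* Grow the per-word storage by one allocation step.
 * Returns 0 on success, -1 if memory could not be obtained. */
static int toy_expand(ToyWordInfo *w)
{
    assert(w != NULL);
    int newmax = w->maxnum + MAXWSTEP;
    int *p = realloc(w->wlen, sizeof(int) * (size_t)newmax);
    if (p == NULL) return -1;
    w->wlen = p;
    w->maxnum = newmax;
    return 0;
}

/* Store one word, expanding first when the storage is full. */
static int toy_add_word(ToyWordInfo *w, int len)
{
    assert(w != NULL);
    if (w->num >= w->maxnum && toy_expand(w) != 0) return -1;
    w->wlen[w->num++] = len;
    return 0;
}
```

Growing by a fixed step keeps the number of reallocations small while loading a large dictionary, at the cost of some unused slots at the end.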

boolean init_voca (WORD_INFO *winfo, char *filename, HTK_HMM_INFO *hmminfo, boolean not_conv_tri, boolean force_dict)
 

Load and initialize a word dictionary.

Parameters:
winfo [out] pointer to a word dictionary data to store the read data
filename [in] file name of the word dictionary to read
hmminfo [in] HMM definition data, needed for triphone conversion.
not_conv_tri [in] TRUE to skip conversion from monophone to triphone.
force_dict [in] TRUE to ignore erroneous words in the dictionary.
Returns:
TRUE on success, FALSE on failure.

Definition at line 40 of file init_voca.c.

Referenced by initialize_dict().

boolean voca_load_htkdict (FILE *fp, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, boolean ignore_tri_conv)
 

Top-level function to read a word dictionary via a file pointer.

Parameters:
fp [in] file pointer
winfo [out] pointer to word dictionary to store the read data.
hmminfo [in] HTK HMM definition data. If NULL, phonemes are ignored.
ignore_tri_conv [in] TRUE to skip triphone conversion.
Returns:
TRUE on success, FALSE if any word caused an error.

Definition at line 229 of file voca_load_htkdict.c.

Referenced by init_voca().

boolean voca_load_htkdict_fd (int fd, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, boolean ignore_tri_conv)
 

Top-level function to read a word dictionary via a file descriptor.

Parameters:
fd [in] file descriptor
winfo [out] pointer to word dictionary to store the read data.
hmminfo [in] HTK HMM definition data. If NULL, phonemes are ignored.
ignore_tri_conv [in] TRUE to skip triphone conversion.
Returns:
TRUE on success, FALSE if any word caused an error.

Definition at line 269 of file voca_load_htkdict.c.

boolean voca_load_htkdict_sd (int sd, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, boolean ignore_tri_conv)
 

Top-level function to read a word dictionary via a socket descriptor.

Parameters:
sd [in] socket descriptor
winfo [out] pointer to word dictionary to store the read data.
hmminfo [in] HTK HMM definition data. If NULL, phonemes are ignored.
ignore_tri_conv [in] TRUE to skip triphone conversion.
Returns:
TRUE on success, FALSE if any word caused an error.

Definition at line 308 of file voca_load_htkdict.c.

boolean voca_append_htkdict (char *entry, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, boolean ignore_tri_conv)
 

Append a single entry to the existing word dictionary.

Parameters:
entry [in] dictionary entry string to be appended.
winfo [out] pointer to word dictionary to append the data.
hmminfo [in] HTK HMM definition data. If NULL, phonemes are ignored.
ignore_tri_conv [in] TRUE to skip triphone conversion.
Returns:
TRUE on success, FALSE if any word caused an error.

Definition at line 347 of file voca_load_htkdict.c.

Referenced by initialize_dict().

void voca_append (WORD_INFO *dstinfo, WORD_INFO *srcinfo, int coffset, int woffset)
 

Append one word dictionary to another, for handling multiple grammars. Assumes that both word dictionaries use the same HMM definition.

Parameters:
dstinfo [i/o] destination word dictionary
srcinfo [in] word dictionary to be appended to dstinfo
coffset [in] category id offset in dstinfo where the new data should be stored
woffset [in] word id offset in dstinfo where the new data should be stored

Definition at line 640 of file voca_load_htkdict.c.

boolean voca_load_htkdict_line (char *buf, int vnum, WORD_INFO *winfo, HTK_HMM_INFO *hmminfo, boolean do_conv, boolean *ok_flag)
 

Sub-function to add a dictionary entry line to the word dictionary.

Parameters:
buf [i/o] buffer to hold the input string, will be modified in this function
vnum [in] current number of words in winfo
winfo [out] pointer to word dictionary to append the data.
hmminfo [in] HTK HMM definition data. If NULL, phonemes are ignored.
do_conv [in] TRUE to perform triphone conversion.
ok_flag [out] will be set to FALSE if an error occurred for this input.
Returns:
FALSE if buf is "DICEND", otherwise TRUE.

Definition at line 384 of file voca_load_htkdict.c.

Referenced by voca_append_htkdict(), voca_load_htkdict(), voca_load_htkdict_fd(), and voca_load_htkdict_sd().
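
The return value and ok_flag carry different signals: the return value says whether to keep reading, while ok_flag reports whether this particular line was valid. A caller loop could look roughly like the sketch below. The functions toy_load_line() and toy_load() and the validity check (a line needs at least one space) are hypothetical, chosen only to make the control flow testable.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy line handler mirroring the documented contract:
 * returns false on the terminator "DICEND", true otherwise;
 * sets *ok_flag to false when the line is rejected. */
static bool toy_load_line(const char *buf, bool *ok_flag)
{
    assert(buf != NULL && ok_flag != NULL);
    if (strcmp(buf, "DICEND") == 0) return false;
    /* Hypothetical check: an entry needs a word and at least one phone. */
    if (strchr(buf, ' ') == NULL) *ok_flag = false;
    return true;
}

/* Feed lines until the terminator; count accepted lines and
 * remember in *all_ok whether any line was rejected. */
static int toy_load(const char *const *lines, int n, bool *all_ok)
{
    assert(lines != NULL && all_ok != NULL);
    int accepted = 0;
    *all_ok = true;
    for (int i = 0; i < n; i++) {
        bool ok = true;
        if (!toy_load_line(lines[i], &ok)) break;   /* "DICEND": stop reading */
        if (ok) accepted++;
        else    *all_ok = false;                    /* keep going, note the error */
    }
    return accepted;
}
```

Continuing after a bad line lets a loader report every erroneous entry in one pass instead of stopping at the first.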

boolean voca_mono2tri (WORD_INFO *winfo, HTK_HMM_INFO *hmminfo)
 

Convert all words in the word dictionary to word-internal triphones.

Normally, triphone conversion is performed directly while the dictionary file is read. This function is only for converting a dictionary after it has been loaded.

Parameters:
winfo [i/o] word dictionary information
hmminfo [in] HTK HMM definition
Returns:
TRUE on success, FALSE on failure.

Definition at line 599 of file voca_load_htkdict.c.

Referenced by final_fusion().

WORD_ID voca_lookup_wid (char *keyword, WORD_INFO *winfo)
 

Look up a word in the dictionary by its string.

Parameters:
keyword [in] keyword to search
winfo [in] word dictionary
Returns:
the word id if found, or WORD_INVALID if not found.

Definition at line 42 of file voca_lookup.c.

Referenced by initialize_dict(), and new_str2wordseq().

WORD_ID * new_str2wordseq (WORD_INFO *winfo, char *s, int *len_return)
 

Convert a string of space-separated words to an array of word ids.

Parameters:
winfo [in] word dictionary
s [in] string of space-separated word strings
len_return [out] number of found words
Returns:
pointer to a newly allocated word list.

Definition at line 116 of file voca_lookup.c.
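
A minimal sketch of the conversion follows: split on spaces, look each token up, and collect the ids in a newly allocated array. The types and functions (ToyWordId, TOY_WORD_INVALID, toy_lookup(), toy_str2wordseq()) are toy stand-ins, the vocabulary is a plain string array instead of WORD_INFO, and returning NULL when a word is unknown is a choice made for this sketch, not documented behaviour of new_str2wordseq().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned short ToyWordId;               /* stand-in for WORD_ID */
#define TOY_WORD_INVALID ((ToyWordId)0xffff)    /* stand-in for WORD_INVALID */

/* Linear search of a toy vocabulary; returns the index or TOY_WORD_INVALID. */
static ToyWordId toy_lookup(const char *key, const char *const *vocab, int n)
{
    for (int i = 0; i < n; i++) {
        if (strcmp(key, vocab[i]) == 0) return (ToyWordId)i;
    }
    return TOY_WORD_INVALID;
}

/* Convert "w1 w2 ..." to a malloc'ed id array; the caller frees it.
 * Returns NULL on an unknown word or allocation failure (sketch policy). */
static ToyWordId *toy_str2wordseq(const char *s, const char *const *vocab,
                                  int n, int *len_return)
{
    assert(s != NULL && len_return != NULL);
    *len_return = 0;
    size_t slen = strlen(s);
    char *copy = malloc(slen + 1);            /* strtok modifies its input */
    /* A string of length L holds at most L/2 + 1 space-separated words. */
    ToyWordId *ids = malloc(sizeof(ToyWordId) * (slen / 2 + 1));
    if (copy == NULL || ids == NULL) { free(copy); free(ids); return NULL; }
    memcpy(copy, s, slen + 1);

    int count = 0;
    for (char *tok = strtok(copy, " "); tok != NULL; tok = strtok(NULL, " ")) {
        ToyWordId id = toy_lookup(tok, vocab, n);
        if (id == TOY_WORD_INVALID) { free(copy); free(ids); return NULL; }
        ids[count++] = id;
    }
    free(copy);
    *len_return = count;
    return ids;
}
```

As the "new_" prefix of the real function suggests, the returned list is newly allocated, so the caller is responsible for releasing it.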

char * cycle_triphone (char *p)
 

Return the triphone name composed from the phones given in the last three calls.

Parameters:
p [in] next phone string
Returns:
the composed triphone name, or NULL on end.

Definition at line 79 of file voca_load_htkdict.c.

Referenced by cycle_triphone_flush(), new_str2phseq(), voca_load_htkdict_line(), and voca_mono2tri().

char * cycle_triphone_flush ()
 

Flush the triphone buffer and return the last biphone.

Returns:
the composed last bi-phone name.

Definition at line 125 of file voca_load_htkdict.c.

Referenced by new_str2phseq(), voca_load_htkdict_line(), and voca_mono2tri().
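
To illustrate what word-internal triphone names look like, the sketch below composes them from a whole phone array using the HTK-style "left-center+right" notation. It is a stateless illustration, not the buffered cycle_triphone() and cycle_triphone_flush() pair: toy_triphone_name() and TOY_NAME_LEN are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_NAME_LEN 64   /* buffer size for one composed name (sketch only) */

/* Compose the context-dependent name for phones[i] within one word.
 * Word edges have no context on that side, so the first phone yields
 * "center+right", the last yields "left-center", and a one-phone word
 * stays a monophone. */
static void toy_triphone_name(const char *const *phones, int n, int i,
                              char *out, size_t outlen)
{
    assert(phones != NULL && out != NULL && i >= 0 && i < n);
    if (n == 1) {
        snprintf(out, outlen, "%s", phones[i]);
    } else if (i == 0) {
        snprintf(out, outlen, "%s+%s", phones[0], phones[1]);
    } else if (i == n - 1) {
        snprintf(out, outlen, "%s-%s", phones[i - 1], phones[i]);
    } else {
        snprintf(out, outlen, "%s-%s+%s", phones[i - 1], phones[i], phones[i + 1]);
    }
}
```

For the phone sequence k, a, t this gives "k+a", "k-a+t" and "a-t". The final biphone corresponds to what a flush step has to produce once no further phone arrives.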

void print_voca_info (WORD_INFO *winfo)
 

Output overall word dictionary information to stdout.

Parameters:
winfo [in] word dictionary

Definition at line 33 of file voca_util.c.

Referenced by print_info().

void put_voca (WORD_INFO *winfo, WORD_ID wid)
 

Output information about a word in the dictionary to stdout.

Parameters:
winfo [in] word dictionary
wid [in] word id to be output

Definition at line 80 of file voca_util.c.

Referenced by hmm_check(), make_dfa_voca_ref(), print_info(), and wchmm_add_word().

void put_voca_err (WORD_INFO *winfo, WORD_ID wid)
 

Output information about a word in the dictionary to stderr.

Parameters:
winfo [in] word dictionary
wid [in] word id to be output

Definition at line 113 of file voca_util.c.

void make_base_phone (HTK_HMM_INFO *hmminfo, WORD_INFO *winfo)
 

Build basephone information.

Extract the base phones from the HMM definition, mark whether each appears at a word head or a word tail, and count them.

Parameters:
hmminfo [i/o] HMM definition information, basephone list will be added.
winfo [in] word dictionary information

Definition at line 380 of file chkhmmlist.c.

Referenced by hmm_check().

void print_phone_info (HTK_HMM_INFO *hmminfo)
 

Output general information concerning phone mapping in HMM definition.

Parameters:
hmminfo [in] HMM definition data.

Definition at line 394 of file chkhmmlist.c.

Referenced by hmm_check().

void print_all_basephone_detail (HMM_basephone *base)
 

Output all basephone information to stdout.

Parameters:
base [in] pointer to the top basephone data holder.

Definition at line 105 of file chkhmmlist.c.

Referenced by hmm_check().

void print_all_basephone_name (HMM_basephone *base)
 

Output all basephone names to stdout.

Parameters:
base [in] pointer to the top basephone data holder.

Definition at line 115 of file chkhmmlist.c.

Referenced by hmm_check().

void test_interword_triphone (HTK_HMM_INFO *hmminfo, WORD_INFO *winfo)
 

Top-level function to check whether all possible triphones in the given word dictionary actually exist in the logical HMM.

Parameters:
hmminfo [in] HMM definition information, with basephone list.
winfo [in] word dictionary information

Definition at line 339 of file chkhmmlist.c.

Referenced by hmm_check().


Generated on Tue Mar 28 16:02:44 2006 for Julius by  doxygen 1.4.2