#include <ngram2.h>
Variables | |
int | n |
N-gram order (e.g. 3 for 3-gram) | |
int | dir |
direction (either DIR_LR or DIR_RL) | |
boolean | from_bin |
TRUE if source was bingram, otherwise ARPA | |
boolean | bigram_index_reversed |
TRUE if read from an old (<=3.5.3) bingram, in which case the 2-gram tuple index is reversed (DIR_LR) relative to the RL 3-gram. | |
WORD_ID | max_word_num |
N-gram vocabulary size | |
char ** | wname |
List of word strings. | |
PATNODE * | root |
Root of the index tree used to look up an N-gram word ID by its name | |
WORD_ID | unk_id |
Word ID of unknown word. | |
int | unk_num |
Number of dictionary words that are not in this N-gram vocabulary | |
LOGPROB | unk_num_log |
Log10 value of unk_num, used for calculating probability of unknown words | |
boolean | isopen |
TRUE if the dictionary has unknown words, i.e. words that do not appear in this N-gram | |
NGRAM_TUPLE_INFO | d [MAX_N] |
Main body of N-gram info | |
LOGPROB * | bo_wt_1 |
back-off weights for 2-gram on 1st pass | |
LOGPROB * | p_2 |
2-gram prob for the 1st pass | |
LOGPROB(* | bigram_prob )(struct __ngram_info__ *, WORD_ID, WORD_ID) |
Pointer to a function that computes bigram probability on the 1st pass. See bi_prob_func_set() for details |
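The bigram_prob member is a function pointer, so the 1st-pass bigram routine is chosen once and then called through the structure. The following is only a sketch of that pattern. It does not reproduce what bi_prob_func_set() does: the reduced structure `ngram_sketch`, the `reversed` flag, the two dummy routines, the setter `sketch_set_bigram_func()` and the concrete typedefs for LOGPROB and WORD_ID are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef float LOGPROB;           /* assumption: log10 probability type */
typedef unsigned short WORD_ID;  /* assumption: integer word ID type */

/* Reduced, hypothetical stand-in for the structure documented above. */
struct ngram_sketch {
    int reversed;  /* hypothetical flag that drives the choice of routine */
    LOGPROB (*bigram_prob)(struct ngram_sketch *, WORD_ID, WORD_ID);
};

/* Two hypothetical routines sharing one signature; they return dummy values. */
static LOGPROB prob_direct(struct ngram_sketch *ng, WORD_ID w1, WORD_ID w2)
{
    (void)ng; (void)w1; (void)w2;
    return -1.0f;
}

static LOGPROB prob_swapped(struct ngram_sketch *ng, WORD_ID w1, WORD_ID w2)
{
    (void)ng; (void)w1; (void)w2;
    return -2.0f;
}

/* Hypothetical setter: stores the routine that matches the flag. */
static void sketch_set_bigram_func(struct ngram_sketch *ng)
{
    assert(ng != NULL);
    ng->bigram_prob = ng->reversed ? prob_swapped : prob_direct;
}
```

After the setter has run, callers only write `ng->bigram_prob(ng, w1, w2)` and need not know which routine was selected.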
Bigrams and trigrams are stored as sequential lists. Entries are grouped by shared context, and each context ((N-1)-gram) entry refers to its group by the ID of the group's first entry and the number of entries in it.
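The grouped layout can be pictured with a small sketch. It assumes a simplified structure, `tuple_list_sketch`, that is not the real NGRAM_TUPLE_INFO: each context holds a start index and a count into one flat array of entries. Sorting the words inside each group and using binary search are further assumptions of this example, as are the names `find_entry` and `NOT_FOUND` and the concrete typedefs.

```c
#include <assert.h>
#include <stddef.h>

typedef float LOGPROB;           /* assumption: log10 probability type */
typedef unsigned short WORD_ID;  /* assumption: integer word ID type */

#define NOT_FOUND (-1)

/* Hypothetical flat storage for one N-gram order. */
struct tuple_list_sketch {
    const int *begin;      /* per context: index of the group's first entry */
    const int *count;      /* per context: number of entries in the group */
    const WORD_ID *word;   /* last word of each entry, sorted within a group */
    const LOGPROB *prob;   /* log10 probability of each entry */
};

/* Return the index of the entry (context, w), or NOT_FOUND. */
static int find_entry(const struct tuple_list_sketch *t, int context, WORD_ID w)
{
    int lo, hi;

    assert(t != NULL);
    lo = t->begin[context];
    hi = lo + t->count[context] - 1;   /* an empty group gives hi < lo */
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (t->word[mid] == w)
            return mid;
        if (t->word[mid] < w)
            lo = mid + 1;
        else
            hi = mid - 1;
    }
    return NOT_FOUND;
}
```

Because every group is contiguous, a context needs only two integers to describe all of its continuations, and the returned index can be used directly to read `prob`.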