libsent/src/ngram/ngram_read_arpa.c File Reference

Read ARPA format N-gram files.

#include <sent/stddefs.h>
#include <sent/ngram2.h>

Functions

static int get_total_info (FILE *fp, int num[])
 Read the number of N-gram entries, used when reading the first LR 2-gram file.
static boolean set_unigram (FILE *fp, NGRAM_INFO *ndata)
 Read word/class entry names and 1-gram data from LR 2-gram file.
static boolean add_unigram (FILE *fp, NGRAM_INFO *ndata)
 Read 1-gram data from RL 3-gram file.
static boolean add_bigram (FILE *fp, NGRAM_INFO *ndata)
 Read reverse 2-gram data from an RL 3-gram file, and store the RL 2-gram probabilities and the back-off values for the RL 3-gram in the corresponding LR 2-gram data.
static boolean set_ngram (FILE *fp, NGRAM_INFO *ndata, int n)
 Read n-gram data for a given N from ARPA n-gram file.
boolean ngram_read_arpa (FILE *fp, NGRAM_INFO *ndata, boolean addition)
 Read in one ARPA N-gram file.

Variables

static char buf [800]
 Local buffer for reading.
static char pbuf [800]
 Local buffer for error string.


Detailed Description

Read ARPA format N-gram files.

When N-gram data is given in ARPA format, both a 2-gram file and a reverse 3-gram file should be specified.

See also:
ngram2.h
Author:
Akinobu LEE
Date:
Wed Feb 16 16:52:24 2005
Revision:
1.1.1.1

Definition in file ngram_read_arpa.c.


Function Documentation

static int get_total_info ( FILE *  fp,
int  num[] 
) [static]

Read the number of N-gram entries, used when reading the first LR 2-gram file.

Parameters:
fp [in] file pointer
num [out] buffer that receives the entry counts

Definition at line 51 of file ngram_read_arpa.c.

Referenced by ngram_read_arpa().
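
As an illustration of what reading the entry counts involves, the sketch below scans the header of an ARPA file, which by convention lists one `ngram N=COUNT` line per order after a `\data\` marker. This is not the Julius implementation: the name `read_arpa_counts` and the limit `MAX_ORDER` are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ORDER 3  /* hypothetical limit for this sketch */

/* Hypothetical helper, not the Julius code: skip forward to the "\data\"
 * marker, then read "ngram N=COUNT" lines until the next line that starts
 * with a backslash (that line is consumed). Blank lines are ignored.
 * num[N-1] receives the count for order N.
 * Returns the highest N seen, or -1 if "\data\" was never found. */
static int read_arpa_counts(FILE *fp, int num[])
{
    char line[800];
    int max_n = 0;
    int n, count, i;

    assert(fp != NULL && num != NULL);
    for (i = 0; i < MAX_ORDER; i++) num[i] = 0;

    /* Skip any comment text before the \data\ marker. */
    for (;;) {
        if (fgets(line, sizeof line, fp) == NULL) return -1;
        if (strncmp(line, "\\data\\", 6) == 0) break;
    }

    while (fgets(line, sizeof line, fp) != NULL) {
        if (sscanf(line, "ngram %d=%d", &n, &count) == 2) {
            if (n >= 1 && n <= MAX_ORDER) {
                num[n - 1] = count;
                if (n > max_n) max_n = n;
            }
        } else if (line[0] == '\\') {
            break;  /* next section marker, e.g. \1-grams: */
        }
    }
    return max_n;
}
```

A caller would typically use the returned order and the counts to size its tables before reading the `\1-grams:` section.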

static boolean set_unigram ( FILE *  fp,
NGRAM_INFO *  ndata 
) [static]

Read word/class entry names and 1-gram data from LR 2-gram file.

Parameters:
fp [in] file pointer
ndata [out] N-gram data structure that receives the read data

Definition at line 91 of file ngram_read_arpa.c.

Referenced by ngram_read_arpa().

static boolean add_unigram ( FILE *  fp,
NGRAM_INFO *  ndata 
) [static]

Read 1-gram data from RL 3-gram file.

Only the back-off weights are stored.

Parameters:
fp [in] file pointer
ndata [out] N-gram to store the read data.

Definition at line 188 of file ngram_read_arpa.c.

Referenced by ngram_read_arpa().

static boolean add_bigram ( FILE *  fp,
NGRAM_INFO *  ndata 
) [static]

Read reverse 2-gram data from an RL 3-gram file, and store the RL 2-gram probabilities and the back-off values for the RL 3-gram in the corresponding LR 2-gram data.

Parameters:
fp [in] file pointer
ndata [i/o] N-gram data structure that is updated with the read data

Definition at line 253 of file ngram_read_arpa.c.

Referenced by ngram_read_arpa().

static boolean set_ngram ( FILE *  fp,
NGRAM_INFO *  ndata,
int  n 
) [static]

Read n-gram data for a given N (n >= 2) from an ARPA n-gram file.

Parameters:
fp [in] file pointer
ndata [out] N-gram data structure that receives the read data
n [in] order N of the N-gram data to read (n >= 2)

Definition at line 322 of file ngram_read_arpa.c.

Referenced by ngram_read_arpa().
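
To make the per-entry work concrete, the following sketch parses a single entry line of an ARPA N-gram section, which conventionally holds a log probability, N words, and an optional back-off weight separated by whitespace. It is a minimal illustration under that assumption, not the code used by set_ngram(); `ArpaEntry`, `parse_arpa_entry` and `MAX_N` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_N 3  /* hypothetical maximum order for this sketch */

/* Hypothetical record for one parsed line. The word pointers refer into
 * the caller's line buffer, which is modified in place. */
typedef struct {
    float prob;                /* log10 probability */
    const char *words[MAX_N];  /* the n words of the entry */
    float bo;                  /* back-off weight, 0 if absent */
    int has_bo;                /* 1 if a back-off weight was present */
} ArpaEntry;

/* Parse "prob w1 ... wn [backoff]". Returns 1 on success, 0 if malformed. */
static int parse_arpa_entry(char *line, int n, ArpaEntry *e)
{
    static const char delim[] = " \t\r\n";
    char *tok, *end;
    int i;

    assert(line != NULL && e != NULL && n >= 1 && n <= MAX_N);

    tok = strtok(line, delim);
    if (tok == NULL) return 0;
    e->prob = (float)strtod(tok, &end);
    if (end == tok || *end != '\0') return 0;  /* not a number */

    for (i = 0; i < n; i++) {
        tok = strtok(NULL, delim);
        if (tok == NULL) return 0;             /* too few words */
        e->words[i] = tok;
    }

    tok = strtok(NULL, delim);                 /* optional back-off weight */
    e->has_bo = (tok != NULL);
    e->bo = 0.0f;
    if (tok != NULL) {
        e->bo = (float)strtod(tok, &end);
        if (end == tok || *end != '\0') return 0;
    }
    return 1;
}
```

Because `strtok` keeps internal state, this sketch is not reentrant; a real reader would more likely carry its own cursor through the line.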

boolean ngram_read_arpa ( FILE *  fp,
NGRAM_INFO *  ndata,
boolean  addition 
)

Read in one ARPA N-gram file.

Supported combinations are LR 2-gram, RL 3-gram and LR 3-gram.

Parameters:
fp [in] file pointer
ndata [out] N-gram data to store the read data
addition [in] TRUE when reading an additional 2-gram
Returns:
TRUE on success, FALSE on failure.

Definition at line 514 of file ngram_read_arpa.c.

Referenced by init_ngram_arpa(), and init_ngram_arpa_additional().
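
Reading one ARPA file means walking its sections in order, and a reader needs some way to recognize the `\N-grams:` markers and the closing `\end\` line. The sketch below shows one possible classifier for such lines. It is an assumption-laden illustration, not the logic of ngram_read_arpa(); `arpa_section_order` is a hypothetical name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: classify one line of an ARPA file.
 * Returns N (>= 1) for a "\N-grams:" marker, 0 for "\end\",
 * and -1 for any other line (header text, entries, "\data\"). */
static int arpa_section_order(const char *line)
{
    int n;
    char colon;

    assert(line != NULL);
    if (strncmp(line, "\\end\\", 5) == 0) return 0;
    if (sscanf(line, "\\%d-grams%c", &n, &colon) == 2
        && colon == ':' && n >= 1) {
        return n;
    }
    return -1;
}
```

A reader loop could call this on every line and switch to the matching per-order routine whenever the result is positive.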


Generated on Tue Dec 18 16:01:40 2007 for Julius by doxygen 1.5.4