libsent/src/adin/adin-cut.c File Reference

Read in speech waveform and detect speech segment.

#include <sent/stddefs.h>
#include <sent/speech.h>
#include <sent/adin.h>
#include <pthread.h>

Defines

#define TMP_FIX_200602
 Enable some fixes relating to adinnet+module.

Functions

static void adin_thread_create ()
 create and start A/D-in and detection thread
void adin_setup_func (int(*cad_read)(SP16 *, int), boolean(*cad_pause)(), boolean(*cad_resume)(), boolean use_cut_def, boolean need_thread)
void adin_setup_param (int silence_cut, boolean strip_zero, int cthres, int czc, int head_margin, int tail_margin, int sample_freq, boolean ignore_speech, boolean need_zeromean)
boolean query_segment_on ()
boolean query_thread_on ()
void adin_reset_zmean ()
static void adin_purge (int from)
static int adin_cut (int(*ad_process)(SP16 *, int), int(*ad_check)())
 Main A/D-in function.
static int adin_store_buffer (SP16 *now, int len)
void adin_thread_input_main (void *dummy)
static int adin_thread_process (int(*ad_process)(SP16 *, int), int(*ad_check)())
 Main function of processing triggered samples at main thread.
int adin_go (int(*ad_process)(SP16 *, int), int(*ad_check)())
 Top function to start input processing.

Variables

Variables of zero-cross parameters and buffer sizes
static int c_length = 5000
 Computed length of the cycle buffer for zero-cross counting; actually equal to the head margin length.
static int c_offset = 0
 Static data DC offset (obsolete, should be 0).
static int wstep = DEFAULT_WSTEP
 Data fragment size.
static int thres
 Input Level threshold (0-32767).
static int noise_zerocross
 Computed threshold of zerocross num in the cycle buffer.
static int nc_max
 Computed number of fragments for tail margin.
Variables for delayed tail silence processing
static SP16 * swapbuf
 Buffer for re-triggering in tail margin.
static int sbsize
static int sblen
 Size and current length of swapbuf.
static int rest_tail
 Samples not processed yet in swap buffer.
Work area for device configurations for local use
static boolean(*) ad_resume ()
 Function pointer to (re)start input.
static boolean(*) ad_pause ()
 Function pointer to stop input.
static int(*) ad_read (SP16 *, int)
 Function pointer to read in input samples.
static boolean adin_cut_on
 TRUE if input is segmented by silence.
static boolean silence_cut_default
 Device-dependent default value of adin_cut_on.
static boolean strip_flag
 TRUE if invalid zero samples are skipped.
static boolean enable_thread = FALSE
 TRUE if input device needs threading.
static boolean ignore_speech_while_recog = TRUE
 TRUE if speech input between calls is ignored while waiting for the recognition process.
static boolean need_zmean
 TRUE if zero-mean subtraction is performed.
Variables related to POSIX threading
static pthread_t adin_thread
 Thread information.
static pthread_mutex_t mutex
 Lock primitive.
static SP16 * speech
 Unprocessed samples recorded by A/D-in thread.
static int speechlen
 Current length of speech.
static boolean transfer_online = FALSE
 Semaphore to start/stop recognition.
static boolean adinthread_buffer_overflowed = FALSE
 Set to TRUE if speech has overflowed.
Input data buffer
static SP16 * buffer = NULL
 Temporary buffer to hold input samples.
static int bpmax
 Maximum length of buffer.
static int bp
 Current point to store the next data.
static int current_len
 Current length of stored samples.
static SP16 * cbuf
 Buffer for flushing cycle buffer just after detecting trigger.


Detailed Description

Read in speech waveform and detect speech segment.

Author:
Akinobu LEE
Date:
Sat Feb 12 13:20:53 2005
This file contains functions to get speech waveform from an audio device and detect speech segment.

Speech detection is based on a level threshold and a zero-cross count. The number of zero crossings is counted for each incoming speech fragment. If the count rises above the specified threshold, the fragment is treated as the beginning of speech input (trigger on). If the count falls below the threshold, the fragment is treated as the end of speech input (trigger off). In actual detection, margins are applied at the beginning and ending points and are treated as the head and tail silence parts. DC offset normalization is also performed if configured.
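The trigger decision described above can be illustrated with a minimal sketch. It is not the code in adin-cut.c: the helper names count_zerocross() and is_triggered() are hypothetical, SP16 is assumed to be a 16-bit signed sample, and the rule that a crossing only counts when the signal swings beyond the level threshold on both sides is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef short SP16;  /* assumption: 16-bit signed sample */

/* Count level-gated zero crossings in one fragment (hypothetical helper).
   A crossing is counted when the signal moves from above +thres to
   below -thres or vice versa; samples inside the band are ignored. */
static int count_zerocross(const SP16 *buf, size_t len, int thres)
{
    int sign = 0;   /* last sign seen outside the band: +1, -1, or 0 for none */
    int count = 0;

    assert(thres >= 0);
    for (size_t i = 0; i < len; i++) {
        int cur = 0;
        if (buf[i] > thres) cur = 1;
        else if (buf[i] < -thres) cur = -1;
        if (cur != 0) {
            if (sign != 0 && cur != sign) count++;
            sign = cur;
        }
    }
    return count;
}

/* Trigger decision for one fragment: on when the count exceeds zc_thres. */
static int is_triggered(const SP16 *buf, size_t len, int thres, int zc_thres)
{
    return count_zerocross(buf, len, thres) > zc_thres;
}
```

A fragment whose samples stay inside the band between -thres and +thres yields a count of zero, so low-level noise alone does not raise the trigger in this sketch.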

The triggered input speech data should be processed concurrently with detection for real-time recognition. For this purpose, once the beginning of speech input has been detected, the subsequent triggered input fragments (samples of a certain period in live input, or the buffer size in file input) are passed sequentially to a callback function. The callback function is specified by the caller, typically to store the recorded speech or to feed it into a frame-synchronous recognition process.
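As an illustration of the callback mechanism, the following sketch shows a callback with the int (*)(SP16 *, int) shape that stores each fragment it receives. The names record_fragment(), feed(), recorded and REC_MAX are hypothetical. The return value 0 for "continue" matches what this page states for adin_store_buffer(); returning 1 to stop when the buffer is full is an assumption of the sketch.

```c
#include <assert.h>
#include <string.h>

typedef short SP16;  /* assumption: 16-bit signed sample */

#define REC_MAX 16000  /* hypothetical capacity: one second at 16 kHz */

static SP16 recorded[REC_MAX];
static int recorded_len = 0;

/* Callback with the int (*)(SP16 *, int) shape used for ad_process.
   Returns 0 to ask for more input; the value 1 for "stop" is an
   assumption of this sketch. */
static int record_fragment(SP16 *frag, int len)
{
    assert(frag != NULL && len >= 0);
    if (recorded_len + len > REC_MAX) return 1;
    memcpy(recorded + recorded_len, frag, (size_t)len * sizeof(SP16));
    recorded_len += len;
    return 0;
}

/* Stand-in driver that passes fragments of at most `step` samples to the
   callback in order, stopping early if the callback returns non-zero. */
static int feed(SP16 *samples, int total, int step,
                int (*ad_process)(SP16 *, int))
{
    assert(step > 0);
    for (int pos = 0; pos < total; pos += step) {
        int n = (total - pos < step) ? total - pos : step;
        int r = ad_process(samples + pos, n);
        if (r != 0) return r;
    }
    return 0;
}
```

Passing the callback as a function pointer keeps the detection loop independent of what is done with the samples, which is why the same driver could serve both recording and recognition.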

When the source is a live input such as a microphone, the device buffer will overflow if the processing callback is slow, and some input fragments may be lost. To prevent this, the A/D-in part together with speech detection runs as an independent thread if pthread functions are supported. The A/D-in and detection thread cooperates with the original main thread through the buffer speech.

adin_setup_func() is used to switch audio input by specifying device-dependent read/pause/resume functions, and should be called first. adin_setup_param() should be called after adin_setup_func() to set the parameters for speech detection. adin_go() is the top-level function called from outside to perform the actual input processing. adin_cut() is the main function that reads audio input and detects speech segments.

See also:
adin.c
Revision
1.6

Definition in file adin-cut.c.


Function Documentation

static void adin_thread_create (  )  [static]

create and start A/D-in and detection thread

Start a new A/D-in thread, and also initialize the buffer speech.

Definition at line 957 of file adin-cut.c.

Referenced by adin_setup_param().

void adin_setup_func ( int(*)(SP16 *, int)  cad_read,
boolean(*)()  cad_pause,
boolean(*)()  cad_resume,
boolean  use_cut_def,
boolean  need_thread 
)

Store the given device-dependent functions and configuration values to local work area. This function will be called from adin_select() via adin_register_func().

Parameters:
cad_read  [in] function to read input samples
cad_pause  [in] function to stop input
cad_resume  [in] function to (re-)start input
use_cut_def  [in] TRUE if the device needs speech segment detection by default
need_thread  [in] TRUE if the device is live input and needs threading

Definition at line 179 of file adin-cut.c.

Referenced by adin_register_func().

void adin_setup_param ( int  silence_cut,
boolean  strip_zero,
int  cthres,
int  czc,
int  head_margin,
int  tail_margin,
int  sample_freq,
boolean  ignore_speech,
boolean  need_zeromean 
)

Set up silence detection parameters (should be called after adin_select()). If pthread is used, the A/D-in and detection thread is started at the end of this function.

Parameters:
silence_cut [in] whether to perform silence cutting. 0=force off, 1=force on, 2=keep device-specific default
strip_zero [in] TRUE to enable stripping of zero samples
cthres [in] input level threshold (0-32767)
czc [in] zero-cross count threshold per second
head_margin [in] head margin length in msec
tail_margin [in] tail margin length in msec
sample_freq [in] sampling frequency, used only to compute other variables
ignore_speech [in] TRUE to ignore speech input between calls, while waiting for the recognition process
need_zeromean [in] TRUE to perform zero-mean subtraction

Definition at line 216 of file adin-cut.c.

Referenced by adin_initialize().
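The parameters above are given in msec and per-second units, while the file-level variables c_length, noise_zerocross and nc_max are described as computed values. One plausible conversion is sketched below. The formulas, the CutParams struct and the helper names are assumptions for illustration, not taken from adin-cut.c. The sketch also shows how the three-valued silence_cut argument could be resolved against the device default.

```c
#include <assert.h>

/* Hypothetical container for the derived values; field names follow the
   static variables listed on this page. */
typedef struct {
    int c_length;         /* cycle buffer length in samples (head margin) */
    int noise_zerocross;  /* zero-cross threshold scaled to the cycle buffer */
    int nc_max;           /* tail margin expressed in fragments of wstep */
} CutParams;

/* Convert user-facing units into sample and fragment counts (assumed formulas). */
static CutParams derive_params(int czc, int head_margin_msec,
                               int tail_margin_msec, int sample_freq, int wstep)
{
    CutParams p;

    assert(sample_freq > 0 && wstep > 0);
    p.c_length = (int)((long)sample_freq * head_margin_msec / 1000);
    p.noise_zerocross = (int)((long)czc * p.c_length / sample_freq);
    p.nc_max = (int)((long)sample_freq * tail_margin_msec / 1000 / wstep);
    return p;
}

/* Resolve silence_cut: 0 = force off, 1 = force on, 2 = device default. */
static int resolve_silence_cut(int silence_cut, int device_default)
{
    switch (silence_cut) {
    case 0:  return 0;
    case 1:  return 1;
    default: return device_default;
    }
}
```

With a 16 kHz input, a 300 msec head margin, a 400 msec tail margin, czc of 60 and a fragment size of 1000 samples, these formulas give a 4800-sample cycle buffer, a threshold of 18 crossings within it, and a tail margin of 6 fragments.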

boolean query_segment_on (  ) 

Query function to check whether the input speech detection is on or off.

Returns:
TRUE if on, FALSE if off.

Definition at line 254 of file adin-cut.c.

boolean query_thread_on (  ) 

Query function to check whether the input threading is on or off.

Returns:
TRUE if on, FALSE if off.

Definition at line 265 of file adin-cut.c.

void adin_reset_zmean (  ) 

Reset zero mean data to re-estimate zero mean at the next input.

Definition at line 275 of file adin-cut.c.

Referenced by adin_begin(), and adin_standby().

static void adin_purge ( int  from  )  [static]

Purge samples already processed from the temporary buffer (buffer).

Parameters:
from [in] Purge samples in range [0..from-1].

Definition at line 324 of file adin-cut.c.

Referenced by adin_cut().
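Purging the range [0..from-1] amounts to shifting the unprocessed remainder to the head of the buffer. A minimal sketch, assuming a flat array and an explicit count of stored samples (the real function works on the file-level variables instead, and purge_samples() is a hypothetical name):

```c
#include <assert.h>
#include <string.h>

typedef short SP16;  /* assumption: 16-bit signed sample */

/* Drop samples [0..from-1] and move the remainder to the head of the buffer.
   Returns the new number of stored samples. */
static int purge_samples(SP16 *buf, int stored, int from)
{
    assert(from >= 0 && from <= stored);
    /* memmove is used because source and destination overlap */
    memmove(buf, buf + from, (size_t)(stored - from) * sizeof(SP16));
    return stored - from;
}
```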

static int adin_cut ( int(*)(SP16 *, int)  ad_process,
int(*)()  ad_check 
) [static]

Main A/D-in function.

In threaded mode, this function runs detached and loops forever in the A/D-in thread, storing triggered samples in speech and reporting the status to the processing thread via transfer_online. The processing thread, called from adin_go(), polls the length of speech and transfer_online and processes any stored samples.

In non-threaded mode, this function will be called directly from adin_go(), and triggered samples are immediately processed within here.

In module mode, the function argument ad_check should be specified to poll for incoming commands from the client during recognition.

Returns:
-1 on error, 0 on end of stream, >0 when paused by external process.

Parameters:
ad_process  function to process the triggered samples
ad_check  function periodically called while input processing

Definition at line 351 of file adin-cut.c.

Referenced by adin_go(), and adin_thread_input_main().

static int adin_store_buffer ( SP16 *  now,
int  len 
) [static]

Callback for storing triggered samples to speech in A/D-in thread.

Parameters:
now [in] triggered fragment
len [in] length of above
Returns:
always 0, to tell caller to continue recording.

Definition at line 920 of file adin-cut.c.

Referenced by adin_thread_input_main().
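The storing side of the shared buffer can be sketched as follows. This is an illustration under assumptions, not the actual implementation: the fixed capacity SPEECH_MAX, the name store_fragment(), and the policy of dropping a fragment that does not fit are all hypothetical. The variable names mutex, speech and speechlen follow the ones listed on this page.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

typedef short SP16;  /* assumption: 16-bit signed sample */

#define SPEECH_MAX 8  /* hypothetical capacity, kept tiny for illustration */

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static SP16 speech[SPEECH_MAX];
static int speechlen = 0;
static int overflowed = 0;  /* stands in for adinthread_buffer_overflowed */

/* Append a triggered fragment to the shared buffer under the lock.
   Always returns 0 so that the caller keeps recording. */
static int store_fragment(SP16 *now, int len)
{
    assert(now != NULL && len >= 0);
    pthread_mutex_lock(&mutex);
    if (speechlen + len > SPEECH_MAX) {
        overflowed = 1;  /* assumed policy: drop the fragment and flag it */
    } else {
        memcpy(speech + speechlen, now, (size_t)len * sizeof(SP16));
        speechlen += len;
    }
    pthread_mutex_unlock(&mutex);
    return 0;
}
```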

void adin_thread_input_main ( void *  dummy  ) 

A/D-in thread main function: simply calls adin_cut() with the storing function adin_store_buffer().

Parameters:
dummy [in] dummy argument, not used.

Definition at line 947 of file adin-cut.c.

Referenced by adin_thread_create().

static int adin_thread_process ( int(*)(SP16 *, int)  ad_process,
int(*)()  ad_check 
) [static]

Main function of processing triggered samples at main thread.

Wait for the new samples to be stored in speech by A/D-in thread, and if found, process them.

Parameters:
ad_process [in] function to process the recorded fragments
ad_check [in] function to be called periodically for checking incoming user command in module mode.
Returns:
-2 when input terminated by result of the ad_check function, -1 on error, 0 on end of stream, >0 if successfully segmented.

Definition at line 997 of file adin-cut.c.

Referenced by adin_go().
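The consuming side in the main thread can be sketched in the same spirit. The helper drain_speech(), the sample callback count_samples() and the fixed capacity are hypothetical, and copying the samples out so that the callback runs without holding the lock is a design choice of this sketch, not a statement about adin-cut.c.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

typedef short SP16;  /* assumption: 16-bit signed sample */

#define SPEECH_MAX 8  /* hypothetical capacity, kept tiny for illustration */

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static SP16 speech[SPEECH_MAX];
static int speechlen = 0;

static int total_seen = 0;

/* Example callback: only counts the samples it is given. */
static int count_samples(SP16 *frag, int len)
{
    (void)frag;
    total_seen += len;
    return 0;
}

/* Copy out whatever has been stored and reset the shared length.
   The lock is held only for the copy, so the callback runs unlocked. */
static int drain_speech(int (*ad_process)(SP16 *, int))
{
    SP16 local[SPEECH_MAX];
    int n;

    assert(ad_process != NULL);
    pthread_mutex_lock(&mutex);
    n = speechlen;
    memcpy(local, speech, (size_t)n * sizeof(SP16));
    speechlen = 0;
    pthread_mutex_unlock(&mutex);

    if (n == 0) return 0;  /* nothing new yet; a caller would poll again */
    return ad_process(local, n);
}
```

Keeping the critical section short matters here because the A/D-in thread needs the same lock to store new fragments, and a slow callback would otherwise stall recording.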

int adin_go ( int(*)(SP16 *, int)  ad_process,
int(*)()  ad_check 
)

Top function to start input processing.

If threading mode is enabled, this function simply enters adin_thread_process() to process triggered samples detected by the separately running A/D-in thread.

If threading mode is unavailable or disabled, because of either device requirements or OS capability, this function simply calls adin_cut() to detect speech segments from the input device and process them concurrently within one process.

Parameters:
ad_process [in] function to process the recorded fragments
ad_check [in] function to be called periodically for checking incoming user command in module mode.
Returns:
the same as adin_thread_process() in threaded mode, or the same as adin_cut() in non-threaded mode.

Definition at line 1126 of file adin-cut.c.

Referenced by main_recognition_loop().


Variable Documentation

boolean transfer_online = FALSE [static]

Semaphore to start/stop recognition.

If TRUE, the A/D-in thread stores incoming samples to speech and the main thread detects and processes them. If FALSE, the A/D-in thread still reads input and checks the trigger as in the TRUE case, but does not store the samples to speech.

Definition at line 300 of file adin-cut.c.


Generated on Tue Dec 26 12:54:22 2006 for Julian by  doxygen 1.5.0