The overall objective of this research effort is to establish confidence measures as a feasible means of handling the uncertainty about the correctness of the results produced by speech recognition systems, and to apply them as an effective way to validate or reject those results.
As stated in section 1, correctness can be assessed at several levels of correspondence, the same levels at which speech communication itself can be understood. Since this work approaches the problem from the signal processing point of view, its main objective is the orthographic transcription of speech, and the correspondence question is therefore addressed between utterances and their transcriptions.
One of the goals of this research work is to develop a procedure, as general as possible, for building confidence measures while minimizing the need for task-related knowledge. Minimizing does not mean neglecting: such a system is expected to include some knowledge about the particular environment in which it will operate. This knowledge can be obtained before the system is put into operation (e.g. by including language model information in the design of the confidence measure generator) or through on-line adaptation of the original general-purpose system to a derived task-oriented one, using a portion of the task-specific database. The aim of this project is to avoid fully reconfiguring the whole system every time the application changes. The confidence measure generator is therefore expected to be built on an application- and speaker-independent recognition system whose acoustic models use sets of sub-lexical units that can be combined and that can deal with large vocabularies. The confidence measures generated are expected to be of this same fashion (as in ). To cope with large vocabularies, the generator should also be able to handle out-of-vocabulary utterances and noises (such as those considered by keyword spotting techniques); keyword spotting capability should therefore not be excluded from the designed system.
In addition, because the same system will be exposed to different environments, robustness must be taken into account. Our aim is to build a robust system able to cope with different acoustic and environmental conditions, so robust speech recognition techniques will also be considered in this research work.
Based on the previous work by Eide et al. and by Chase, this work will address the search for the main causes of speech misrecognition. Since the word error rate provides no insight into the factors responsible for recognition errors, we consider it useful to look for the differences between correctly and incorrectly recognized sequences, so that these differences can serve as discriminant features.
The application of confidence measures to the validation of recognition results is another relevant goal of this work. It is not enough to express how reliable the results are; it is also important to develop a strategy for rejecting potential recognition errors. Moreover, an efficient verification step may lead to error correction. The impact of effective confidence measures on the utterance verification process will be considered in this work.
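As an illustration only, the simplest form such a rejection strategy could take is a threshold on the confidence value attached to each recognized word. The sketch below is not the method proposed in this work: the `Hypothesis` type, the `verify` function, the assumption that confidences lie in [0, 1], and the threshold value are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hypothesis:
    """A recognized word with its confidence (hypothetical structure)."""
    word: str
    confidence: float  # assumed to lie in [0, 1]; higher means more likely correct


def verify(hyps: List[Hypothesis], threshold: float = 0.5) -> List[Tuple[str, bool]]:
    """Accept a word if its confidence reaches the threshold, reject it otherwise."""
    return [(h.word, h.confidence >= threshold) for h in hyps]


# Usage: tag two recognized words as accepted (True) or rejected (False).
hyps = [Hypothesis("call", 0.92), Hypothesis("home", 0.31)]
decisions = verify(hyps, threshold=0.5)
```

In practice the threshold would be tuned on held-out data to balance accepting recognition errors against rejecting correct results.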
The research objectives stated above should lead to the construction of an application-independent and robust labeling system capable of tagging recognition results with a measure of confidence in their correctness. The resulting confidence measures should form the core of an effective, application-independent and robust utterance verifier.