Chair of Computer Science - Human Language Technology and Pattern Recognition, RWTH Aachen
"The Statistical Approach to Speech Recognition and Natural Language Processing"
The last 25 years have seen dramatic progress in statistical methods for recognizing speech signals and for translating spoken and written language. This lecture gives an overview of the underlying statistical methods. In particular, the lecture will focus on the remarkable fact that, for these tasks and for similar tasks such as handwriting recognition, the statistical approach makes use of the same four principles:
1) Bayes decision rule for minimum error rate; 2) probabilistic models, e.g., hidden Markov models or conditional random fields, for handling strings of observations (such as acoustic vectors for speech recognition and written words for language translation); 3) training criteria and algorithms for estimating the free model parameters from large amounts of data; 4) the search (or generation) process that produces the recognition or translation result.
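As a rough illustration of how principles 1, 2 and 4 interact, the following minimal sketch decodes a toy hidden Markov model with the Viterbi algorithm: the model assigns a joint probability to each state sequence and observation string, and the search returns the sequence that maximizes it, which is the choice Bayes decision rule prescribes for minimizing the sequence error rate. The states, observation symbols, probabilities, and the function name `viterbi` are all invented for this example and are not taken from the lecture; training (principle 3) is skipped by fixing the parameters by hand.

```python
import math

# Toy HMM. Every state, symbol, and probability here is a made-up assumption
# chosen only to keep the example small.
STATES = ("A", "B")
START = {"A": 0.6, "B": 0.4}
TRANS = {"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}}
EMIT = {
    "A": {"x": 0.5, "y": 0.4, "z": 0.1},
    "B": {"x": 0.1, "y": 0.3, "z": 0.6},
}


def viterbi(observations):
    """Return (best_state_sequence, log_joint_probability).

    Assumes a non-empty observation sequence over the symbols in EMIT.
    """
    # scores[s]: best log-probability of any partial path that ends in state s.
    scores = {
        s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for s in STATES:
            # Best predecessor of s under the current partial scores.
            prev = max(STATES, key=lambda p: scores[p] + math.log(TRANS[p][s]))
            new_scores[s] = (
                scores[prev] + math.log(TRANS[prev][s]) + math.log(EMIT[s][obs])
            )
            pointers[s] = prev
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final state to recover the full sequence.
    last = max(STATES, key=lambda s: scores[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, scores[last]


if __name__ == "__main__":
    best_path, log_prob = viterbi(["x", "y", "z"])
    print(best_path, math.exp(log_prob))
```

Working in log-probabilities avoids numerical underflow on long observation strings; real recognizers and translation systems replace this exhaustive dynamic programming with pruned (beam) search over far larger models.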
Most of these methods were originally designed for speech recognition. However, it has turned out that, with suitable modifications, the same concepts carry over to language translation and other tasks in natural language processing. This lecture will summarize the achievements and the open problems in this field.