Speech Processing

Course content for a class on speech processing

Speech Processing (E4.14)
Mike Brookes
20 lectures in the Spring Term

The human vocal and auditory systems. Characteristics of speech signals: phonemes, prosody, IPA notation. Lossless tube model of speech production. Time and frequency domain representations of speech; window characteristics and time/frequency resolution tradeoffs. Properties of digital filters: mean log response, resonance gain and bandwidth relations, bandwidth expansion transformation, all-pass filter characteristics. Autocorrelation and covariance linear prediction of speech; optimality criteria in time and frequency domains; alternate LPC parametrisation. Speech coding: PCM, ADPCM, CELP. Speech synthesis: language processing, prosody, diphone and formant synthesis; time domain pitch and speech modification. Speech recognition: hidden Markov models and associated recognition and training algorithms. Language modelling. Large vocabulary recognition. Acoustic preprocessing for speech recognition.

Lecture List
Overview of Course
Sound Waves in a Tube
Time-Frequency Representation
Characteristics of Filters
Autocorrelation Linear Prediction and Spectral Whitening
Covariance LPC and LPC Parameter Sets
Cepstral Coefficients and Line Spectrum Frequencies
Speech Coding using Uniform and Non-uniform Quantisation
Speech Coding using Adaptive Differential PCM
Code-excited Linear Prediction
Phonetics: Vowels, Consonants and Prosody
Speech Synthesis: Words to Phonemes
Speech Synthesis: Phonemes to Sounds
Introduction to Speech Recognition
Hidden Markov Models and Viterbi Recognition
Hidden Markov Model Training
Continuous Speech Recognition
Language Modelling
Input Processing

