Course content for a class on speech processing
Speech Processing (E4.14)
Mike Brookes
20 lectures in the Spring Term
Syllabus
The human vocal and auditory systems. Characteristics of speech signals: phonemes, prosody, IPA notation. Lossless tube model of speech production. Time and frequency domain representations of speech; window characteristics and time/frequency resolution tradeoffs. Properties of digital filters: mean log response, resonance gain and bandwidth relations, bandwidth expansion transformation, all-pass filter characteristics. Autocorrelation and covariance linear prediction of speech; optimality criteria in time and frequency domains; alternate LPC parametrisation. Speech coding: PCM, ADPCM, CELP. Speech synthesis: language processing, prosody, diphone and formant synthesis; time domain pitch and speech modification. Speech recognition: hidden Markov models and associated recognition and training algorithms. Language modelling. Large vocabulary recognition. Acoustic preprocessing for speech recognition.
Lecture List
Overview of Course
Sound Waves in a Tube
Time-Frequency Representation
Characteristics of Filters
Autocorrelation Linear Prediction and Spectral Whitening
Covariance LPC and LPC Parameter Sets
Cepstral Coefficients and Line Spectrum Frequencies
Speech Coding using uniform and non-uniform quantisation
Speech Coding using Adaptive Differential PCM
Code-excited Linear Prediction
Phonetics: Vowels, Consonants and Prosody
Speech Synthesis: Words to Phonemes
Speech Synthesis: Phonemes to Sounds
Introduction to Speech Recognition
Hidden Markov Models and Viterbi Recognition
Hidden Markov Model Training
Continuous Speech Recognition
Language Modelling
Input Processing
Source: http://www.ee.ic.ac.uk/hp/staff/dmb/courses/speech/speech.htm