Cognitive: Familiarity with short-term signal processing. Exposure to speech science fundamentals. Deep understanding of linear prediction analysis. Thorough grasp of feature extraction techniques as the front end of speech recognition. Treatment of speech recognition as a pattern recognition problem.
Skills: Building the foundations for further studies in speech synthesis, speech recognition, and language technology. Developing analytical and programming skills. Critical review of the whole undergraduate syllabus from a speech recognition and language technology perspective. Programming an isolated-word speech recognition system.
Course Content (Syllabus)
Anatomy and physiology of the speech production system. Phonetics and phonology. Modeling speech production. Short-term processing of speech. Linear prediction analysis. Cepstral analysis. Dynamic time warping.
Phonetics, Speech Modeling, Short-term Speech Processing, Linear Prediction Analysis, Cepstral Analysis, Dynamic Time Warping
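As an illustration of how the last topic, dynamic time warping, connects to the isolated-word recognition goal stated above, the following is a minimal Python sketch and not the course's prescribed implementation. It assumes Euclidean distance between frames, the basic symmetric step pattern (horizontal, vertical, diagonal), and no slope constraints or path-length normalization. The names `dtw_distance`, `recognize`, and `templates` are hypothetical and chosen only for this example.

```python
import numpy as np


def _as_frames(a):
    """Convert input to a 2-D float array with one feature vector per row."""
    a = np.asarray(a, dtype=float)
    return a.reshape(-1, 1) if a.ndim == 1 else a


def dtw_distance(x, y):
    """Return the accumulated DTW cost between two feature sequences.

    x and y are sequences of frames (for example, cepstral vectors);
    they may differ in length but must share the feature dimension.
    """
    x, y = _as_frames(x), _as_frames(y)
    n, m = len(x), len(y)
    # Local cost: Euclidean distance between every pair of frames.
    local = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)
    # acc[i, j] = minimum cost of aligning the first i frames of x
    # with the first j frames of y.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            acc[i, j] = local[i - 1, j - 1] + best_prev
    return acc[n, m]


def recognize(features, templates):
    """Return the label of the reference template with the lowest DTW cost.

    templates is a dict mapping a word label to its reference sequence.
    """
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))
```

In an isolated-word recognizer along these lines, each vocabulary word would be stored as one or more reference feature sequences, and an unknown utterance would be assigned the label of the nearest template. Practical systems usually add endpoint detection and some form of path normalization, which this sketch omits.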
a1) Lecture attendance and active participation are compulsory. Absence from at most 30% of the total educational activities is tolerated. Failure to comply with this requirement automatically implies failure in the course.
a2) Homework is assigned to students and includes survey writing and computer-based project implementation. The surveys deal with the various stages of discrete-time speech processing, speech and speaker recognition, and statistical language engineering. The computer-based projects cover modeling of human speech production, speech segmentation, short-time speech processing, linear prediction analysis, homomorphic speech processing, dynamic time warping, and so on. Students are asked to develop demo applications in MATLAB or Python. One survey and one related computer-based project are assigned to each student. Homework assignments are posted on the course web page at http://pileas.csd.auth.gr.
a3) One-hour mid-term and final progress exams are held in the course. During these exams, students are asked to give short answers to 10-15 questions covering the topics taught in the course. The mid-term exam is scheduled for the 8th week, and the final progress exam takes place during the examination period.
The final grade is made up of 10% from the survey assessment, 50% from the computer-project assessment (including the project presentation), and 40% from the written mid-term and final progress exams. Students pass the course if their total grade is greater than or equal to five (5). Details of the grading procedure are announced on the course web page and supersede any prior arrangement.
Course Bibliography (Eudoxus)
Rabiner, L. R., and Schafer, R. W. (Greek translation edited by Α. Πικράκης), «Ψηφιακή Επεξεργασία Φωνής» [Digital Speech Processing], Broken Hill Publishers LTD, Nicosia, Cyprus, 2012.
McClellan J., Schafer R., Yoder M. (Greek translation by Ε. Ζ. Ψαράκης), «Θεμελιώδεις Έννοιες της Επεξεργασίας Σημάτων» [Fundamental Concepts of Signal Processing], Εκδόσεις Άγγελος Γκότσης, Patras, 2006.
Additional bibliography for study
Instructor's notes in electronic form, and slides adapted from the teaching of corresponding courses at universities abroad.
Indicative titles. The full list is available on the course web page:
J. R. Deller, J. G. Proakis, and J. H. L. Hansen, Discrete-Time Processing of Speech Signals. New York, N.Y.: Wiley-IEEE, 1999.
D. Jurafsky and J. H. Martin, Speech and Language Processing, 2/e. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2009.
T. F. Quatieri, Discrete-Time Speech Signal Processing: Principles and Practice. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2002.
X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2001.
L. R. Rabiner and R. W. Schafer, Theory and Applications of Digital Speech Processing. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2011.
T. Dutoit and F. Marques, Applied Signal Processing: A MATLAB-Based Proof of Concept. New York, N.Y.: Springer, 2009 (e-book access via www.lib.auth.gr).