DISCRETE-TIME SPEECH PROCESSING

Course Information
Title: ΨΗΦΙΑΚΗ ΕΠΕΞΕΡΓΑΣΙΑ ΟΜΙΛΙΑΣ / DISCRETE-TIME SPEECH PROCESSING
Code: NDM-07-02
Faculty: Sciences
School: Informatics
Cycle / Level: 1st / Undergraduate, 2nd / Postgraduate
Teaching Period: Winter
Coordinator: Konstantinos (Constantine) Kotropoulos
Common: No
Status: Active
Course ID: 40002974

Programme of Study: PPS-School of Informatics (2019-present)

Registered students: 1
Orientation: General orientation
Attendance Type: Compulsory elective
Semester: 7
Year: 4
ECTS: 5

Class Information
Academic Year: 2016 – 2017
Class Period: Winter
Faculty Instructors
Weekly Hours: 4
Total Hours: 52
Class ID: 600039749
Course Type 2016-2020
  • Background
Course Type 2011-2015
General Foundation
Mode of Delivery
  • Face to face
Digital Course Content
Erasmus
The course is also offered to exchange programme students.
Language of Instruction
  • Greek (Instruction, Examination)
  • English (Instruction, Examination)
Prerequisites
General Prerequisites
Prior exposure to a course on partial differential equations, in addition to an understanding of signals and systems, digital signal processing, stochastic signal processing, and pattern recognition, as well as existing computer programming skills, helps students grasp the concepts and participate successfully in the course.
Learning Outcomes
Cognitive: Familiarity with short-term signal processing. Exposure to speech science fundamentals. Deep understanding of linear prediction analysis. Thorough grasp of feature extraction techniques as the front end of speech recognition. Treatment of speech recognition as a pattern recognition problem.
Skills: Building the foundations for further studies in speech synthesis, speech recognition, and language technology. Strengthening analytical and programming skills. Critical review of the whole undergraduate syllabus from a speech recognition and language technology perspective. Programming an isolated-word speech recognition system.
General Competences
  • Apply knowledge in practice
  • Retrieve, analyse and synthesise data and information, with the use of necessary technologies
  • Adapt to new situations
  • Make decisions
  • Work autonomously
  • Generate new research ideas
  • Advance free, creative and inductive thinking
Course Content (Syllabus)
Anatomy and physiology of the speech production system. Phonetics and phonology. Modeling of speech production. Short-term processing of speech. Linear prediction analysis. Cepstral analysis. Dynamic time warping.
Keywords
Phonetics, Speech Modeling, Short-term Speech Processing, Linear Prediction Analysis, Cepstral Analysis, Dynamic Time Warping
Educational Material Types
  • Notes
  • Slide presentations
  • Book
Use of Information and Communication Technologies
Use of ICT
  • Use of ICT in Course Teaching
  • Use of ICT in Laboratory Teaching
  • Use of ICT in Communication with Students
Description
Slides and MATLAB demos.
Course Organization
Activities: Workload (hours)
Lectures: 60
Reading Assignment: 15
Tutorial: 15
Project: 30
Written assignments: 15
Exams: 15
Total: 150
Student Assessment
Description
a1) Attendance at lectures and active participation in them are compulsory. Absence from at most 30% of the total educational activities is tolerated. Failure to comply with this requirement automatically implies failure in the course.
a2) Homework is assigned to students, including survey writing and computer-based project implementation. The surveys deal with the various stages of discrete-time speech processing, speech and speaker recognition, and statistical language engineering. The computer-based projects aim at modeling human speech production, speech segmentation, short-time speech processing, linear prediction analysis, homomorphic speech processing, dynamic time warping, and so on. Students are requested to develop demo applications in MATLAB or Python. One survey and one related computer-based project are assigned to each student. Homework assignments are posted on the course web page at http://pileas.csd.auth.gr.
a3) One-hour mid-term and final progress exams are organized in the course. During these exams, the students are requested to provide short answers to 10-15 questions covering the topics taught in the course. The mid-term exam is scheduled for the 8th week and the final progress exam takes place during the exam period.
The final grade is composed of 10% from the survey assessment, 50% from the computer-project assessment (including the project presentation), and 40% from the grading of the mid-term and final progress exam papers. Students pass the course if their total grade is greater than or equal to five (5). Details on the grading procedure are announced on the course web page and supersede any prior arrangement.
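The grade weighting described above can be sketched in Python as follows. This is an illustration only, under stated assumptions: the function and constant names are hypothetical, grades are assumed to lie on a 0-10 scale, and the mid-term and final progress exams are assumed to have already been combined into a single exam grade, since the description does not say how the two are combined. The course web page remains the authoritative source for the grading procedure.

```python
# Illustrative sketch only: the names, the 0-10 scale, and the single
# combined exam grade are assumptions, not part of the course description.

# Weights in percent, as stated in the assessment description.
WEIGHTS_PERCENT = {"survey": 10, "project": 50, "exams": 40}

# Students pass if the total grade is greater than or equal to five.
PASS_THRESHOLD = 5.0


def final_grade(survey: float, project: float, exams: float) -> float:
    """Return the weighted total of the three assessment components."""
    components = {"survey": survey, "project": project, "exams": exams}
    weighted_sum = sum(
        WEIGHTS_PERCENT[name] * grade for name, grade in components.items()
    )
    # Divide once at the end to convert percent weights into a grade.
    return weighted_sum / 100


def passes(grade: float) -> bool:
    """Return True if the total grade meets the pass threshold."""
    return grade >= PASS_THRESHOLD


if __name__ == "__main__":
    grade = final_grade(survey=6, project=8, exams=5)
    print(grade, passes(grade))
```

Because the project carries half of the weight, a strong project can offset a weaker exam result, while the survey contributes at most one grade point to the total.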
Student Assessment methods
  • Written Exam with Short Answer Questions (Formative, Summative)
  • Written Assignment (Formative, Summative)
  • Performance / Staging (Formative, Summative)
  • Written Exam with Problem Solving (Formative, Summative)
Bibliography
Course Bibliography (Eudoxus)
Rabiner, L. R., and Schafer, R. W. (Greek translation edited by A. Pikrakis), «Ψηφιακή Επεξεργασία Φωνής» [Digital Speech Processing], Broken Hill Publishers LTD, Nicosia, Cyprus, 2012.
McClellan, J., Schafer, R., and Yoder, M. (Greek translation by E. Z. Psarakis), «Θεμελιώδεις Έννοιες της Επεξεργασίας Σημάτων» [Fundamental Concepts of Signal Processing], Εκδόσεις Άγγελος Γκότσης, Patras, 2006.
Additional bibliography for study
Instructor's notes in electronic form and slides adapted from the teaching of corresponding courses at universities abroad. Indicative titles (the full list is available on the course web page):
J. R. Deller, J. G. Proakis, and J. H. L. Hansen, Discrete-Time Processing of Speech Signals. New York, N.Y.: Wiley-IEEE, 1999.
D. Jurafsky and J. H. Martin, Speech and Language Processing, 2/e. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2009.
T. F. Quatieri, Discrete-Time Speech Signal Processing: Principles and Practice. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2002.
X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2001.
L. R. Rabiner and R. W. Schafer, Theory and Applications of Digital Speech Processing. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2011.
T. Dutoit and F. Marques, Applied Signal Processing: A MATLAB-Based Proof of Concept. New York, N.Y.: Springer, 2009 (e-book access via www.lib.auth.gr).
Last Update
23-09-2016