Course Information
Cycle / Level: 2nd / Postgraduate
Teaching Period: Winter
Coordinator: Konstantinos (Constantine) Kotropoulos
Course ID: 600000892

Programme of Study: Internet and World Wide Web

Registered students: 0
Orientation: KORMOS
Attendance Type: Elective Courses
Semester: 1
Year: 1
ECTS: 7.5

Class Information
Academic Year: 2017 – 2018
Class Period: Winter
Faculty Instructors
Weekly Hours: 3
Class ID
Course Type 2016-2020
  • Scientific Area
Course Type 2011-2015
Specific Foundation / Core
Mode of Delivery
  • Face to face
Digital Course Content
The course is also offered to exchange programme students.
Language of Instruction
  • Greek (Instruction, Examination)
  • English (Instruction, Examination)
General Prerequisites
Prior exposure to undergraduate courses on Artificial Intelligence, Pattern Recognition, and Discrete-Time Speech Processing, as well as existing computer programming skills, helps students grasp the concepts and complete the course successfully.
Learning Outcomes
Cognitive: Exposure to speech science fundamentals. Treating speech recognition as a pattern recognition problem. Systematic transition from deterministic techniques, such as Dynamic Time Warping, to statistical ones (e.g., Hidden Markov Models). Decomposition of text-to-speech (TTS) synthesis into sub-problems that can be solved either by classical Artificial Intelligence techniques (e.g., finite-state automata, finite-state transducers, context-free grammars) or by Digital Signal Processing for converting phonetic transcription into speech. Deep understanding of prosody.
Skills: Building the foundations for a professional career in language technologies. Promoting analytical and programming skills. Acquaintance with platforms such as SONIC, HTK, Sphinx, SRI Language Toolkit, CMU Language Toolkit, and Festival through two large-scale team projects, one in speech recognition and one in TTS.
General Competences
  • Apply knowledge in practice
  • Retrieve, analyse and synthesise data and information, with the use of necessary technologies
  • Adapt to new situations
  • Make decisions
  • Work autonomously
  • Work in teams
  • Generate new research ideas
  • Design and manage projects
  • Be critical and self-critical
  • Advance free, creative and causative thinking
Course Content (Syllabus)
Automatic speech recognition from the point of view of pattern recognition. Dynamic Time Warping. Gaussian mixture models. Hidden Markov models. Statistical language modeling. Estimation of probabilities and language models with maximum entropy techniques. Review of discrete-time speech processing. Text-to-speech (TTS) transcription. Grammars, inference, parsing, and transduction. Natural language processing architectures for TTS synthesis. Morpho-syntactic analysis. Automatic phonetization. Automatic prosody generation. Synthesis strategies.
Keywords: speech recognition, text-to-speech synthesis, natural language processing
Educational Material Types
  • Slide presentations
  • Book
Use of Information and Communication Technologies
Use of ICT
  • Use of ICT in Course Teaching
  • Use of ICT in Laboratory Teaching
  • Use of ICT in Communication with Students
Slides and MATLAB demos.
Course Organization
Reading Assignment: 30 hours, 1 ECTS
Written Assignments: 30 hours, 1 ECTS
Student Assessment
Students are assessed through the assignment, implementation, and demonstration of large-scale computer-based projects and through a written examination. The grade reflects progress during project implementation (50%), course attendance and active participation in class (10%), and performance in the written exam (40%). Homework assignments and deadlines are announced on the course web page at http://pileas.csd.auth.gr. Students pass the course if their total grade is greater than or equal to 5. Details on the grading procedure can be found on the course web page.
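The weighting can be illustrated with a small calculation. The Python sketch below is only an illustration under stated assumptions: it assumes each component is graded on a 0 to 10 scale (the description gives only the weights and the pass mark of 5), and the names final_grade and passes are hypothetical, not part of any course software. The authoritative procedure is the one on the course web page.

```python
# Weights (in percent) taken from the assessment description:
# projects 50%, attendance/participation 10%, written exam 40%.
WEIGHTS = {"project": 50, "participation": 10, "exam": 40}
PASS_MARK = 5.0  # a total grade >= 5 passes


def final_grade(project: float, participation: float, exam: float) -> float:
    """Return the weighted total grade.

    Assumption: every component is on a 0-10 scale.
    """
    scores = {"project": project, "participation": participation, "exam": exam}
    for name, score in scores.items():
        if not 0 <= score <= 10:
            raise ValueError(f"{name} score {score} is outside the assumed 0-10 scale")
    # Integer percent weights keep the arithmetic easy to check by hand.
    return sum(WEIGHTS[name] * score for name, score in scores.items()) / 100


def passes(total: float) -> bool:
    """Apply the pass rule: total grade greater than or equal to 5."""
    return total >= PASS_MARK


if __name__ == "__main__":
    total = final_grade(project=6, participation=8, exam=4)
    print(f"total = {total:.2f}, pass = {passes(total)}")
```

With these hypothetical scores the total is (50·6 + 10·8 + 40·4) / 100 = 5.4, so a strong project grade can offset a written exam below 5 under this reading of the weights.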
Student Assessment methods
  • Written Exam with Short Answer Questions (Formative, Summative)
  • Written Assignment (Formative, Summative)
  • Performance / Staging (Formative, Summative)
Additional bibliography for study
  • X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2001.
  • D. Jurafsky and J. H. Martin, Speech and Language Processing, 2/e. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2009.
  • J. R. Deller, J. G. Proakis, and J. H. L. Hansen, Discrete-Time Processing of Speech Signals. New York, N.Y.: Wiley-IEEE, 1999.
  • T. F. Quatieri, Discrete-Time Speech Signal Processing: Principles and Practice. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2002.
  • L. R. Rabiner and R. W. Schafer, Theory and Applications of Digital Speech Processing. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2011.
  • S. E. Levinson, Mathematical Models for Speech Technology. New York, N.Y.: J. Wiley & Sons, 2005.
  • T. Dutoit, An Introduction to Text-to-Speech Synthesis, 1/e. Dordrecht, The Netherlands: Kluwer Academic Publishers, 1997.
  • F. Jelinek, Statistical Methods for Speech Recognition. Cambridge, MA: The MIT Press, 1999.
  • T. Dutoit and F. Marques, Applied Signal Processing: A MATLAB-Based Proof of Concept. New York, N.Y.: Springer, 2009 (e-book access via www.lib.auth.gr).
Last Update