DISCRETE TIME SPEECH SYNTHESIS - LANGUAGE TECHNOLOGY

Course Information
Title: ΨΗΦΙΑΚΗ ΣΥΝΘΕΣΗ ΟΜΙΛΙΑΣ - ΓΛΩΣΣΙΚΗ ΤΕΧΝΟΛΟΓΙΑ / DISCRETE TIME SPEECH SYNTHESIS - LANGUAGE TECHNOLOGY
Code: DM06
Faculty: Sciences
School: Informatics
Cycle / Level: 2nd / Postgraduate
Teaching Period: Winter
Coordinator: Konstantinos (Constantine) Kotropoulos
Common: No
Status: Active
Course ID: 40002281

Programme of Study: PPS School of Informatics (2014-today)

Registered students: 5
Orientation | Attendance Type | Semester | Year | ECTS
Knowledge, Data and Software Technologies | Elective Courses | 1 | 1 | 7.5
Information and Communication Technologies in Education | Elective Courses | 1 | 1 | 7.5
Digital Media - Computational Intelligence | Elective Courses belonging to the selected specialization | 1 | 1 | 7.5
Networking Systems | Elective Courses | 1 | 1 | 7.5

Programme of Study: PPS of School of Informatics (2013-today)

Registered students: 0
Orientation | Attendance Type | Semester | Year | ECTS
Information Systems | Elective Courses | 2 | 1 | 7.5
Information and Communication Technologies in Education | Elective Courses | 2 | 1 | 7.5
Digital Media | Compulsory Course | 2 | 1 | 7.5
Communication Systems and Technologies | Elective Courses | 2 | 1 | 7.5

Class Information
Academic Year: 2015 – 2016
Class Period: Winter
Faculty Instructors
Weekly Hours: 3
Class ID: 600011291
Type of the Course
  • Scientific Area
Course Category
Specific Foundation / Core
Mode of Delivery
  • Face to face
Erasmus
The course is also offered to exchange programme students.
Language of Instruction
  • Greek (Instruction, Examination)
  • English (Instruction, Examination)
Prerequisites
General Prerequisites
Prior exposure to undergraduate courses on Artificial Intelligence, Pattern Recognition, and Discrete-Time Speech Processing, together with existing computer programming skills, helps students grasp the concepts and participate successfully in the course.
Learning Outcomes
Cognitive: Exposure to the fundamentals of speech science. Treatment of speech recognition as a pattern recognition problem. Systematic transition from deterministic techniques, such as Dynamic Time Warping, to statistical ones (e.g., Hidden Markov Models). Decomposition of text-to-speech (TTS) synthesis into sub-problems that can be solved either by classical Artificial Intelligence techniques (e.g., finite-state automata, finite-state transducers, context-free grammars) or by Digital Signal Processing techniques for converting a phonetic transcription into speech. Deep understanding of prosody.
Skills: Building the foundations for professional work in language technologies. Strengthening analytical and programming skills. Familiarity with platforms such as SONIC, HTK, Sphinx, SRI Language Toolkit, CMU Language Toolkit, and Festival, gained through two large-scale team projects, one in speech recognition and one in TTS.
General Competences
  • Apply knowledge in practice
  • Retrieve, analyse and synthesise data and information, with the use of necessary technologies
  • Adapt to new situations
  • Make decisions
  • Work autonomously
  • Work in teams
  • Generate new research ideas
  • Design and manage projects
  • Be critical and self-critical
  • Advance free, creative and causative thinking
Course Content (Syllabus)
Automatic speech recognition from the point of view of pattern recognition. Dynamic Time Warping. Gaussian mixture models. Hidden Markov models. Statistical language modeling. Estimation of probabilities and language models with maximum entropy techniques. Review of discrete-time speech processing. Text-to-speech (TTS) transcription. Grammars, inference, parsing, and transduction. Natural language processing architectures for TTS synthesis. Morpho-syntactic analysis. Automatic phonetization. Automatic prosody generation. Synthesis strategies.
Keywords
speech recognition, text-to-speech synthesis, natural language processing
Educational Material Types
  • Slide presentations
  • Book
Use of Information and Communication Technologies
Use of ICT
  • Use of ICT in Course Teaching
  • Use of ICT in Laboratory Teaching
  • Use of ICT in Communication with Students
Description
Slides and Demos.
Course Organization
Activities | Workload | ECTS | Individual | Teamwork | Erasmus
Lectures | 75 | 2.5
Reading Assignment | 30 | 1
Project | 75 | 2.5
Written Assignments | 30 | 1
Exams | 15 | 0.5
Total | 225 | 7.5
Student Assessment
Description
Assessment is based on the assignment, implementation, and demonstration of large-scale computer-based projects and on a written examination. Students are assessed on the progress they make during project implementation (50%), on course attendance and active participation in class (10%), and on their performance in the written exams (40%). Homework assignments and deadlines are announced on the course web page at http://pileas.csd.auth.gr. Students pass the course if their total grade is greater than or equal to 5. Details on the grading procedure can be found on the course web page.
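As a rough illustration of the weighting described above, the total grade can be sketched as a weighted sum. This is only a sketch under assumptions: each component is taken to be graded on a 0-10 scale, no rounding is applied, and the names `WEIGHTS`, `total_grade`, and `passes` are hypothetical. The authoritative grading procedure is the one given on the course web page.

```python
# Weights taken from the assessment description: project 50%,
# attendance/participation 10%, written exams 40%.
WEIGHTS = {"project": 0.5, "participation": 0.1, "exam": 0.4}

# Pass mark stated in the description (total grade >= 5).
PASS_MARK = 5.0


def total_grade(project: float, participation: float, exam: float) -> float:
    """Return the weighted total grade.

    Assumption: every component is on a 0-10 scale; the source does not
    state the scale of the individual components.
    """
    components = {
        "project": project,
        "participation": participation,
        "exam": exam,
    }
    for name, value in components.items():
        if not 0 <= value <= 10:
            raise ValueError(f"{name} grade must be between 0 and 10")
    return sum(WEIGHTS[name] * value for name, value in components.items())


def passes(total: float) -> bool:
    """A student passes when the total grade is at least the pass mark."""
    return total >= PASS_MARK


if __name__ == "__main__":
    grade = total_grade(project=6, participation=10, exam=4)
    print(f"total = {grade:.2f}, pass = {passes(grade)}")
```

Under these assumptions, a strong project can offset a weaker exam, since the project carries the largest single weight.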
Student Assessment methods
  • Written Exam with Short Answer Questions (Formative, Summative)
  • Written Assignment (Formative, Summative)
  • Performance / Staging (Formative, Summative)
Bibliography
Additional bibliography for study
Recommended bibliography:
  • X. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2001.
  • D. Jurafsky and J. H. Martin, Speech and Language Processing, 2/e. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2009.
Other indicative titles (the full list is on the course web page):
  • J. R. Deller, J. G. Proakis, and J. H. L. Hansen, Discrete-Time Processing of Speech Signals. New York, N.Y.: Wiley-IEEE, 1999.
  • T. F. Quatieri, Discrete-Time Speech Signal Processing: Principles and Practice. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2002.
  • L. R. Rabiner and R. W. Schafer, Theory and Applications of Digital Speech Processing. Upper Saddle River, N.J.: Pearson Education-Prentice Hall, 2011.
  • S. E. Levinson, Mathematical Models for Speech Technology. New York, N.Y.: J. Wiley & Sons, 2005.
  • T. Dutoit, Introduction to Speech Synthesis, 1/e. Dordrecht, The Netherlands: Kluwer Academic Publishers, 1997.
  • F. Jelinek, Statistical Methods for Speech Recognition. Cambridge, MA: The MIT Press, 1999.
  • T. Dutoit and F. Marques, Applied Signal Processing: A MATLAB-Based Proof of Concept. New York, N.Y.: Springer, 2009 (e-book access via www.lib.auth.gr).
Last Update: 15-02-2016