ASR: Automatic Speech Recognition

Overview

  • There are 18 lectures, taking place in weeks 1-9. Lectures are held on Mondays and Thursdays at 14:10, starting Monday 12 January. Monday lectures will be held in G.159 in Old College (just inside the Law School entrance on the north side of the quad) and Thursday lectures will be in 2.35 in the Edinburgh Futures Institute.
  • Lecture live streaming is available via Media Hopper Replay for students not able to attend in person – the link can be found on Learn under “Course Materials”.
  • Weekly lab sessions start in Week 3.

The course

Automatic Speech Recognition (ASR) is concerned with models, algorithms, and systems for automatically transcribing recorded speech into text. This is a hard problem since recorded speech can be highly variable - we do not necessarily know who the speaker is, where the speech was recorded, or whether there are other acoustic sources (such as noise or competing talkers) in the signal.

Addressing the problem of speech recognition requires some understanding of machine learning, signal processing, and acoustic phonetics. In this course we'll cover the required theoretical background, and how the theory can be transformed into useful speech recognition systems. Lab sessions – and the coursework – will use the open source OpenFst toolkit together with Python to build and run speech recognition systems.
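To give a flavour of the kind of thing the labs involve: OpenFst represents finite-state transducers in a simple AT&T text format (one arc per line: source state, destination state, input label, output label, optional weight; a line with a lone state number marks a final state), which can be compiled with OpenFst's fstcompile tool. A minimal sketch in Python, using a hypothetical one-word lexicon as an example:

```python
# Sketch: generate an OpenFst AT&T text-format description of a tiny
# lexicon FST that maps a phone sequence to a word (hypothetical example,
# not the actual lab code). Each arc line is: src dest ilabel olabel;
# a line containing only a state number marks that state as final.

def lexicon_fst(entries):
    """entries: list of (word, list_of_phones); returns AT&T text lines."""
    lines = []
    state = 1  # state 0 is the start state
    for word, phones in entries:
        prev = 0
        for i, phone in enumerate(phones):
            out = word if i == 0 else "<eps>"  # emit the word on the first arc
            lines.append(f"{prev} {state} {phone} {out}")
            prev = state
            state += 1
        lines.append(str(prev))  # final state at the end of the word
    return lines

print("\n".join(lexicon_fst([("cat", ["k", "ae", "t"])])))
```

The resulting text could then be compiled and manipulated with the OpenFst command-line tools or its Python wrapper, as introduced in the labs.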

People

  • Course organiser: Peter Bell
  • Assistant lecturer: Hao Tang
  • Guest lecturer: Ondrej Klejch
  • Teaching Assistant: George Karakasidis
  • Demonstrators: Emily Gaughan, Yen Meng

Required background

The perfect background for the ASR course would include the Speech Processing course and a machine learning course such as Applied Machine Learning (AML) or the Machine Learning Practical (MLP).

However, because of the way people's degree programmes are structured, not many people who do ASR will have the perfect background! This is fine.

If you've done AML and/or MLP, but not Speech Processing, then you'll require some speech background. A couple of the earlier lectures will include some material that was in Speech Processing, but it is also recommended that you do some background study:

  • Look at the relevant chapters in Jurafsky and Martin: in particular, chapter 7 (Phonetics), especially sections 7.4 and 7.5.
  • Study some of the material covered in the Speech Processing course.

We'll point out useful links as we go through the course.

If you have taken Speech Processing, but not AML or MLP, then you'll require some machine learning background, especially to do with neural networks. There will be a couple of introductory lectures on neural networks, and we'll also point out useful additional background reading when relevant.


This page is maintained by Peter Bell.
 

License
Creative Commons Attribution-NonCommercial-ShareAlike