Master's Thesis Topic

Simple and Offline Speech Recognition through Genetic Programming

There has been much progress in speech recognition in the last decade. However, much of the research focuses on online speech recognition performed in real time, and it relies on relatively complex signal processing algorithms. This thesis investigates whether a much simpler algorithm can be used if we relax the timing requirements and allow longer processing time, i.e. perform the recognition offline.

Success would have many implications for implementing simpler software engineering tools, as well as for a number of other applications.

Steps

The thesis project will involve

  1. reviewing the existing studies and algorithms for (offline) speech recognition,
  2. developing a prototype system (in a dynamic programming language) for offline speech recognition,
  3. evaluating the prototype on short sentences of speech in Apple's m4a format.

We will use a dynamic programming language since recognition speed is not a (main) factor and a dynamic language will speed up prototyping. A basic design of the algorithm is already available; the focus here is on refining, implementing and evaluating it. The input will be in the m4a format since this makes it easier to work with audio collected on mobile devices such as the iPhone and iPad.

Prerequisites

Students interested in this topic should preferably have knowledge of, experience with, or interest in:

  1. speech/audio processing (some),
  2. machine learning (optional),
  3. genetic algorithms/programming (optional),
  4. a dynamic programming language such as Ruby or Python (optional, easy to pick up).

Links / Input