Speech technology

Recognition, understanding and synthesis of human speech, using a range of techniques and focusing on how systems recognise and generate the sounds of language.

We aim to maintain the size of this research area relative to the whole EPSRC portfolio. This strategy recognises the importance of Speech Technology to data science and the development of intelligent interfaces.

By the end of the current Delivery Plan period, we aim to have a research area which:

  • Addresses the challenges identified for this area, encompassing speech modelling, speech recognition, text-to-speech synthesis and spoken dialogue systems
  • Makes a significant contribution to data science in terms of the large-scale processing and understanding of multi-modal data, including text, audio and video
  • Includes research and training that contribute to the development of intelligent dialogue interfaces, which will serve as a means of communication between humans and systems. The increased importance of spoken dialogue systems will strengthen links between Speech Technology and the Artificial Intelligence (AI) Technologies, Natural Language Processing (NLP) and Human-Computer Interaction research areas
  • Continues to support research into assistive technologies (e.g. personalised speech synthesis and recognition of disordered speech) and speech and language therapy technology
  • Includes a greater proportion of early-career researchers, to ensure the area's longer-term health

Researchers have the opportunity to play an important role in delivering the objectives of EPSRC priorities for the Information and Communication Technologies (ICT) Theme (especially Data Enabled Decision Making, People at the Heart of ICT and Future Intelligent Technologies).

To maximise the impact of these contributions, Speech Technology researchers should ensure effective communication with researchers in other contributing areas such as AI Technologies, NLP, Image and Vision Computing, and Human-Computer Interaction.

Highlights:

The UK is home to some of the world's leading Speech Technology researchers, who form a small but strong community. This is evidenced by publications in top journals and at conferences, as identified by the Research Excellence Framework (REF) 2014, and by the publication and maintenance of open-source software and open data used by the international community (Evidence source 1).

Many research challenges have opened up, including speech modelling, speech recognition, text-to-speech synthesis and spoken dialogue systems. The use of deep neural networks (a machine learning technique) has been driving improvements to speech recognition systems. Mobile technologies (including very small devices without touchscreens) will become the main focus for interactive applications, where expressivity and multi-modality will become key requirements (Evidence sources 2, 3).

Research combining Speech Technology with NLP, underpinned by AI Technologies, has grown. UK researchers are well positioned for future growth in this field, given the very strong NLP expertise also present in this country. This provides the UK with a unique capability in an international context.

The UK's strength at the interface of speech and language technologies and AI is evidenced by the very significant investment being made in the UK by major industry players (e.g. Amazon, Google, Facebook, Apple and Bloomberg), who have created or expanded UK-based research facilities and are heavily recruiting UK researchers with PhD or postdoctoral experience in Speech Technology, NLP and AI (Evidence sources 2, 4).

There is also strong demand among small companies for PhD and postdoctoral-level researchers in Speech Technology. Given the increase in industrial recruitment, it is unclear whether enough students are coming through the system to supply both the academic and the industrial pipelines. There is also a danger that UK academic institutions are being depleted of their postdoctoral workforce, putting at risk the future leadership of the area (which is currently very strong in the UK). This may threaten the UK's international standing in Speech Technology (Evidence source 5).

This research area is expected to contribute to all EPSRC Outcomes, and particularly the following Ambitions:

C1: Enable a competitive, data-driven economy

This research area is expected to contribute to the smart tools and analytical techniques needed to generate actionable information from large, diverse audio and multi-modal datasets.

C3: Deliver intelligent technologies and systems

Speech Technology will contribute to the development of smart tools and intelligent technologies that turn data flows into physical action. This will include serving as an interface for communicating between intelligent systems and people using them.

H1: Transform community health and care

Speech Technology will contribute to new models of community-based health and care through robotics and autonomous systems for home-based care and rehabilitation.

  1. Input from the ICT and Digital Economy Strategic Advisory Teams (SATs), the UK Computing Research Committee (CRC) Executive Committee and REF panellists.
  2. ICT REF engagement with international experts on the future of the Speech Technology research area (2015).
  3. CITIA, CITIA Roadmap (2016).
  4. Analysis of EPSRC data.
  5. Community engagement (individual input, group feedback and team visits).

Research area connections

This diagram shows the top 10 connections between Research Areas within the EPSRC research portfolio. The depth of the segment relates to value of grants and the width of the segment relates to the number of grants shared by those two Research Areas.

Maintain

We aim to maintain this area as a proportion of the EPSRC portfolio.

Contact Details

Name: Zoe Brown
Job title: Portfolio Manager
Department: ICT
Organisation: EPSRC
Telephone: 01793 444087