Do machines finally understand us? Advances in Spoken Language Technologies
Spoken language technologies (SLTs) are fundamental to artificial intelligence research. Firmly based on machine learning, advances in SLT over the last decade have inspired many applications and brought some into the real world. Thanks to solid engineering approaches, the technology is now mature in many areas; however, the interface to human understanding remains brittle and unsatisfactory. In this presentation I review the field, how recent changes have fundamentally altered the approaches taken, and what challenges remain.
Thomas Hain is Professor of Computer Science at the University of Sheffield. He holds the degree 'Dipl.-Ing.' in Electrical and Communication Engineering from the University of Technology, Vienna, and a PhD in Information Engineering from Cambridge University (2002). During his undergraduate studies he received a scholarship to conduct part of his studies at RWTH Aachen, Germany. After receiving a first-class degree from the University of Technology Vienna, he worked at Philips Speech Processing, Vienna, which he left as Senior Technologist in 1997 to join the Cambridge University Engineering Department as a PhD student and, unusually, Research Associate at the same time. Shortly before completing his PhD, he was appointed Lecturer at Cambridge University in 2001. Prof. Hain moved to Sheffield University in 2004 to become a member of the Speech and Hearing Research Group (SpandH). After a series of intermediate promotions, he was appointed Full Professor in 2013. Since 2009 he has led the 15-strong subgroup on Machine Intelligence for Natural Interfaces, and in 2016 he took on the role of Head of SpandH, a group which now consists of 9 academics, 12 postdoctoral researchers, and more than 35 PhD students. In 2016 Prof. Hain also became a member of the Machine Learning Research group; he became Director of the Voicebase Centre for Speech and Language Technology in 2018 and Director of the UKRI Centre for Doctoral Training in Speech and Language Technologies and Their Applications in 2019. He has more than 190 publications on machine learning and speech recognition topics (Google citations 12k, h-index 35). In addition to membership of many technical committees, including repeated appointments as area chair at ICASSP, Interspeech, and ICPR, he has served on the organising committees of Interspeech 2009 and IEEE ASRU 2011 and 2013. He was also one of the key organisers and technical chair of Interspeech 2019. He served as Associate
Editor of ACM Transactions on Speech and Language Processing, is currently a member of the editorial board of Computer Speech & Language, is a twice-elected member of the IEEE Speech Technical Committee, and serves on the ISCA Technical Committee. Prof. Hain has been an investigator on more than 20 projects, funded by FP6, FP7, EPSRC, UKRI, DARPA, and industry, with a cumulative research budget of £18.5M (£13.5M as PI). Until recently he served as PI at Sheffield for the prestigious EPSRC programme grant NST (total budget £6.2M, Sheffield £2.2M), and he works on numerous industrial projects (e.g. sponsored by 3M and Google). Current projects include MAUDIE (Innovate UK), Tuto (industry), BiT (University), and the Voicebase Centre for Speech and Language Technology. Prof. Hain is the PI and Director of the recently created UKRI Centre for Doctoral Training in Speech and Language Technologies and Their Applications, an £8M programme to train more than 60 PhD students over 8 years, in collaboration with industry.
This talk is co-sponsored by Silicon Austria Labs, IEEE Austria COM/MTT Joint Chapter, Johannes Kepler University.
For information about the Zoom link and the physical meeting, please contact Hans-Peter Bernhard: h.p.bernhard [at] ieee.org