Combining Speech Attribute Detection and Penalized Logistic Regression for Phoneme Recognition
- Authors: SINISCALCHI, SABATO MARCO
- Publication year: 2012
- Type: Journal article
- OA Link: http://hdl.handle.net/10447/649522
Abstract
Over the past few years, there has been a resurgence of interest in designing high-accuracy automatic speech recognition (ASR) systems due to the key role they can play in many real-world applications, such as voiceprint-based biometric identification, language identification, and call scanning. Improving current state-of-the-art technology is therefore vital for the success of these applications, yet this is not simple with the standard technology based on hidden Markov models (HMMs) trained on short-term spectral features. This paper offers an innovative perspective on how two prominent recent approaches to ASR, namely speech attribute detection and discriminative training, can be combined into a unified framework with beneficial effects on overall speech recognition performance. This goal is achieved by embedding phonetic feature detection into a penalized logistic regression machine (PLRM). The proposed approach is evaluated on both isolated and continuous phoneme recognition tasks. Experimental evidence indicates that the proposed framework achieves state-of-the-art performance in the isolated phoneme recognition task and outperforms current technology in the continuous phoneme recognition task.