SABATO MARCO SINISCALCHI

Detection-Based ASR in the Automatic Speech Attribute Transcription Project

Authors: I. BROMBERG; Q. FU; J. HOU; J. LI; C. MA; B. MATTHEWS; A. D. MORENO; J. MORRIS; S. M. SINISCALCHI; Y. TSAO; Y. WANG
Year of publication: 2007
Type: Conference proceedings contribution published in a volume
OA Link: http://hdl.handle.net/10447/649499

Abstract

We present methods of detector design in the Automatic Speech Attribute Transcription project. This paper details the results of a student-led, cross-site collaboration among Georgia Institute of Technology, The Ohio State University and Rutgers University. It describes and evaluates the detection-based ASR paradigm and discusses phonetic attribute classes, methods of detecting framewise phonetic attributes, and methods of combining attribute detectors for ASR. We use Multi-Layer Perceptrons, Hidden Markov Models and Support Vector Machines to compute confidence scores for several prescribed sets of phonetic attribute classes. We use Conditional Random Fields (CRFs) and knowledge-based rescoring of phone lattices to combine framewise detection scores for continuous phone recognition on the TIMIT database. With CRFs, we achieve a phone accuracy of 70.63% by incorporating all of the attribute detectors discussed in the paper, outperforming the baseline and enhanced HMM systems.