SABATO MARCO SINISCALCHI

Consumer-level multimedia event detection through unsupervised audio signal modeling

  • Authors: B. Byun; I. Kim; S. M. Siniscalchi; C.-H. Lee
  • Publication year: 2012
  • Type: Conference proceedings contribution published in a volume
  • OA Link: http://hdl.handle.net/10447/649503

Abstract

In this work, a novel acoustic characterization approach to the multimedia event detection (MED) task for unconstrained and unstructured consumer-level videos is proposed, based on audio signal modeling. The key idea is to characterize the acoustic space of interest with a set of fundamental acoustic units, around which a set of acoustic segment models (ASMs) is built. A vector space modeling technique is adopted to address MED: an incoming audio signal is first decoded into a sequence of acoustic segments, a feature vector is then generated from co-occurrence statistics of the acoustic units, and the final MED decision is made by a vector space language classifier. Experimental evidence on the TRECVID 2011 MED task demonstrates the viability of the proposed approach. Furthermore, the approach accounts for temporal dependencies better than previously proposed MFCC bag-of-words approaches.
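The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It assumes the ASM decoder has already turned the audio into a sequence of unit labels. The unit labels, the function name `cooccurrence_vector`, and the choice of unigram plus bigram relative frequencies as the co-occurrence statistics are illustrative assumptions, not the authors' exact features.

```python
from collections import Counter
from itertools import product


def cooccurrence_vector(units, vocab):
    """Map a decoded unit sequence to unigram + bigram relative frequencies.

    `units` is a list of acoustic-unit labels (hypothetical decoder output);
    `vocab` is the ordered list of all unit labels.
    """
    unigram_keys = [(u,) for u in vocab]
    bigram_keys = list(product(vocab, repeat=2))

    # Count single units and adjacent unit pairs in one Counter.
    counts = Counter((u,) for u in units)
    counts.update(zip(units, units[1:]))

    # Normalize so recordings of different length are comparable.
    n_uni = max(len(units), 1)
    n_bi = max(len(units) - 1, 1)
    vec = [counts[k] / n_uni for k in unigram_keys]
    vec += [counts[k] / n_bi for k in bigram_keys]
    return vec


vocab = ["a", "b", "c"]          # hypothetical ASM unit inventory
seq1 = ["a", "b", "a", "b"]      # hypothetical decoded sequences
seq2 = ["b", "a", "b", "a"]
v1 = cooccurrence_vector(seq1, vocab)
v2 = cooccurrence_vector(seq2, vocab)
```

Here `seq1` and `seq2` contain the same units in a different order, so their unigram entries (the first three components) are identical while their bigram entries differ. This is the sense in which pair statistics retain ordering information that a pure bag-of-words count discards. The resulting vectors could then be passed to any vector-based classifier.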