header image

Multimodal Fusion using Dynamic Hybrid Models (WACV 2014)

Posted by: | October 23, 2013 | No Comment |

We propose a novel staged hybrid model that exploits the strength of discriminative classifiers along with the representational power of generative models. Our focus is on detecting multimodal events in time varying data sequences. Discriminative classifiers have been shown to achieve higher performances than the corresponding generative likelihood-based classifiers. On the other hand, generative models learn a rich informative space which allows for data generation and joint feature representation that discriminative models lack. We employ a deep temporal generative model for unsupervised learning of a shared representation across multiple modalities with time varying data. The temporal generative model takes into account short term temporal phenomena and allows for filling in missing data by generating data within or across modalities. The hybrid model involves augmenting the temporal generative model with a Conditional Random Field based temporal discriminative model for event detection, classification, and generation, which enables modeling long range temporal dynamics. We evaluate our approach on multiple audio-visual datasets (AVEC, AVLetters, and CUAVE) and demonstrate its superiority compared to the state-of-the-art. Paper

Print Friendly, PDF & Email
under: Main Conference, Publications

Leave a response

Your response: