This paper addresses the problem of fully automated mining of public space video data. A novel Markov Clustering Topic Model (MCTM) is introduced which builds on existing Dynamic Bayesian Network models (e.g. HMMs) and Bayesian topic models (e.g. Latent Dirichlet Allocation), and overcomes their drawbacks on accuracy, robustness and computational efficiency. Specifically, our model profiles complex dynamic scenes by robustly clustering visual events into activities and these activities into global behaviours, and correlates behaviours over time. A collapsed Gibbs sampler is derived for offline learning with unlabeled training data, and significantly, a new approximation to online Bayesian inference is formulated to enable dynamic scene understanding and behaviour mining in new video data online in real-time. The strength of this model is demonstrated by unsupervised learning of dynamic scene models, mining behaviours and detecting salient events in three complex and crowded public scenes.
|Title of host publication||2009 IEEE 12th International Conference on Computer Vision|
|Publisher||Institute of Electrical and Electronics Engineers (IEEE)|
|Number of pages||8|
|Publication status||Published - Sep 2009|