A Chains Model for Localizing Participants of Group Activities in Videos (ICCV 2011)

Posted on August 10, 2011

Given a video, we would like to recognize group activities, localize video parts where these activities occur, and detect actors involved in them. This advances prior work that typically focuses only on video classification. We make a number of contributions. First, we specify a new, mid-level, video feature aimed at summarizing local visual cues into bags of the right detections (BORDs). BORDs seek to identify the right people who participate in a target group activity among many noisy people detections. Second, we formulate a new, generative, chains model of group activities. Inference of the chains model identifies a subset of BORDs in the video that belong to occurrences of the activity, and organizes them in an ensemble of temporal chains. The chains extend over, and thus localize, the time intervals occupied by the activity. We formulate a new MAP inference algorithm that iterates two steps: i) Warps the chains of BORDs in space and time to their expected locations, so the transformed BORDs can better summarize local visual cues; and ii) Maximizes the posterior probability of the chains. We outperform the state of the art on benchmark UT-Human Interaction and Collective Activities datasets, under reasonable running times.

Paper | Poster | Code

under: Publications
