Two of my papers, "Multiobject Tracking as Maximum-Weight Independent Set" and "Probabilistic Event Logic for Interval-Based Event Recognition", have been accepted to CVPR 2011.
Abstract – Multiobject Tracking as Maximum-Weight Independent Set: This paper addresses the problem of simultaneous tracking of multiple targets representing occurrences of distinct object classes in complex scenes. We apply object detectors to every frame, and build a graph of tracklets, defined as pairs of detection responses from every two consecutive frames. The graph helps transitively link the best matching detections that do not violate hard and soft contextual constraints between the resulting tracks. We prove that this data association problem can be formulated as finding the heaviest subset of non-adjacent tracklets in the graph, called the maximum-weight independent set (MWIS). We present a new, polynomial-time MWIS algorithm, and prove that it converges to an optimum. Similarity between object detections, and the contextual constraints between the tracks, used for data association, are learned online from object appearance and motion properties. Long-term occlusions are addressed by iteratively repeating MWIS to hierarchically merge smaller tracks into longer ones. We outperform the state of the art on the benchmark datasets, and show the advantages of simultaneously accounting for soft and hard constraints in multitarget tracking.
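To make the MWIS formulation concrete, here is a toy sketch of selecting the heaviest set of mutually compatible tracklets. It is not the polynomial-time algorithm from the paper: it uses plain exhaustive search, which is only practical for a handful of nodes. The tracklet names, weights, and conflict edges are hypothetical values chosen for illustration.

```python
from itertools import combinations


def max_weight_independent_set(weights, conflicts):
    """Exhaustive MWIS over a small conflict graph (illustration only).

    weights:   dict mapping each tracklet to a non-negative weight
    conflicts: iterable of (u, v) pairs that cannot both be selected
    """
    nodes = sorted(weights)
    conflict_set = {frozenset(edge) for edge in conflicts}
    best_set, best_weight = frozenset(), 0.0
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            # Skip subsets that contain two adjacent (conflicting) tracklets.
            if any(frozenset(pair) in conflict_set
                   for pair in combinations(subset, 2)):
                continue
            weight = sum(weights[n] for n in subset)
            if weight > best_weight:
                best_set, best_weight = frozenset(subset), weight
    return best_set, best_weight


# Hypothetical tracklets: weights stand in for match similarity, and an edge
# means the two tracklets violate a hard constraint if selected together.
weights = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.4}
conflicts = [("a", "b"), ("b", "c"), ("c", "d")]
selected, total = max_weight_independent_set(weights, conflicts)
```

In this example the conflict graph is a chain a-b-c-d, so the heaviest compatible selection is tracklets a and c.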
Abstract – Probabilistic Event Logic for Interval-Based Event Recognition: This paper is about detecting and segmenting interrelated events which occur in challenging videos with motion blur, occlusions, dynamic backgrounds, and missing observations. We argue that holistic reasoning about time intervals of events, and their temporal constraints is critical in such domains to overcome the noise inherent to low-level video representations. For this purpose, our first contribution is the formulation of probabilistic event logic (PEL) for representing temporal constraints among events. A PEL knowledge base consists of confidence-weighted formulas from a temporal event logic, and specifies a joint distribution over the occurrence time intervals of all events. Our second contribution is a MAP inference algorithm for PEL that addresses the scalability issue of reasoning about an enormous number of time intervals and their constraints in a typical video. Specifically, our algorithm leverages the spanning-interval data structure for compactly representing and manipulating entire sets of time intervals without enumerating them. Our experiments on interpreting basketball videos show that PEL inference is able to jointly detect events and identify their time intervals, based on noisy input from primitive-event detectors.
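The idea of manipulating whole sets of time intervals without enumerating them can be sketched as follows. This is a simplified illustration, not the data structure used in the paper: it assumes integer frame indices and closed endpoints, and the class and method names are hypothetical. One object stands for every interval whose start falls in one range and whose end falls in another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanningInterval:
    """All intervals [s, e] with start_lo <= s <= start_hi,
    end_lo <= e <= end_hi, and s <= e (simplified, closed endpoints)."""
    start_lo: int
    start_hi: int
    end_lo: int
    end_hi: int

    def is_empty(self):
        # Empty if either range is inverted, or no start can precede an end.
        return (self.start_lo > self.start_hi
                or self.end_lo > self.end_hi
                or self.start_lo > self.end_hi)

    def contains(self, start, end):
        return (start <= end
                and self.start_lo <= start <= self.start_hi
                and self.end_lo <= end <= self.end_hi)

    def intersect(self, other):
        # Set intersection computed on the four bounds only,
        # without listing any individual interval.
        return SpanningInterval(
            max(self.start_lo, other.start_lo),
            min(self.start_hi, other.start_hi),
            max(self.end_lo, other.end_lo),
            min(self.end_hi, other.end_hi),
        )


a = SpanningInterval(0, 10, 5, 20)
b = SpanningInterval(8, 15, 0, 12)
both = a.intersect(b)
disjoint = SpanningInterval(0, 3, 0, 3).intersect(SpanningInterval(5, 9, 5, 9))
```

Here `both` describes every interval that belongs to both `a` and `b` using four numbers, however many individual intervals that set contains.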
The code will be made available soon.

