header image

Archive for Journals

We present a novel approach to computational modeling of social interactions based on modeling of essential social interaction predicates (ESIPs) such as joint attention and entrainment. Based on sound social psychological theory and methodology, we collect a new “Tower Game” dataset consisting of audio-visual capture of dyadic interactions labeled with the ESIPs. We expect this dataset to provide a new avenue for research in computational social interaction modeling. We propose a novel joint Discriminative Conditional Restricted Boltzmann Machine (DCRBM) model that combines a discriminative component with the generative power of CRBMs. Such a combination enables us to uncover actionable constituents of the ESIPs in two steps. First, we train the DCRBM model on the labeled data and get accurate (76%-49% across various ESIPs) detection of the predicates. Second, we exploit the generative capability of DCRBMs to activate the trained model so as to generate the lower-level data corresponding to the specific ESIP that closely matches the actual training data (with mean square error 0.01-0.1 for generating 100 frames). We are thus able to decompose the ESIPs into their constituent actionable behaviors. Such a purely computational determination of how to establish an ESIP such as engagement is unprecedented. Preprint

under: Journals, Publications

This paper presents an approach to estimating the 2.1D sketch from monocular, low-level visual cues. We use a low-level segmenter to partition the image into regions, and, then, estimate their 2.1D sketch, subject to figure-ground and similarity constraints between neighboring regions. The 2.1D sketch assigns a depth ordering to image regions which are expected to correspond to objects and surfaces in the scene. This is cast as a constrained convex optimization problem, and solved within the optimization transfer framework. The optimization objective takes into account the curvature and convexity of parts of region boundaries, appearance, and spatial layout properties of regions. Our new optimization transfer algorithm admits a closed-form expression of the duality gap, and thus allows explicit computation of the achieved accuracy. The algorithm is efficient with quadratic complexity in the number of constraints between image regions. Quantitative and qualitative results on challenging, real-world images of Berkeley segmentation, Geometric Context, and Stanford Make3D datasets demonstrate our high accuracy, efficiency, and robustness. Preprint  Supplement  Code

under: Journals, Publications