Browsing by Author "Davis, Larry S."
Now showing 1 - 9 of 9
Item: Design and Implementation of the University of Maryland Keck Laboratory for the Analysis of Visual Movement (2002-02-08)
Laboratory for the Analysis of Visual Movement; Cutler, Ross G.; Duraiswami, Ramani; Qian, J. Hench; Davis, Larry S.

The Keck Laboratory for the Analysis of Visual Movement is a state-of-the-art multi-perspective imaging laboratory recently established at the University of Maryland. In this paper, we describe the design and architecture of the lab, which is currently being used to support many computer vision studies. In particular, we discuss camera synchronization, image resolution analysis, image noise analysis, stereo error analysis, video capture, lighting, and calibration hardware. (Also UMIACS-TR-2002-11)

Item: Human Emotion Recognition from Motion Using a Radial Basis Function Network Architecture (1998-10-15)
Rosenblum, Mark; Yacoob, Yaser; Davis, Larry S.

(Also cross-referenced as CAR-TR-721) In this paper, a radial basis function network architecture is developed that learns the correlation between facial feature motion patterns and human emotions. We describe a hierarchical approach which at the highest level identifies emotions, at the mid level determines motions of facial features, and at the low level recovers motion directions. Individual emotion networks were trained to recognize the "smile" and "surprise" emotions. Each network was trained by viewing a set of sequences of one emotion for many subjects. The trained neural network was then tested for retention, extrapolation, and rejection ability.
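The abstract above does not specify the network in code; as a minimal sketch of the radial-basis-function idea it describes (Gaussian activations over motion-feature prototypes feeding a per-emotion score), with all centers, weights, and the width parameter being illustrative assumptions rather than values from the paper:

```python
import math

def rbf_features(x, centers, sigma=1.0):
    """Gaussian radial basis activations of feature vector x w.r.t. each center."""
    return [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2 * sigma ** 2))
            for c in centers]

def emotion_score(x, centers, weights, sigma=1.0):
    """Linear combination of RBF activations: one scalar score per emotion network."""
    phi = rbf_features(x, centers, sigma)
    return sum(w * p for w, p in zip(weights, phi))

# Hypothetical 2-D motion-direction features: one prototype per motion pattern.
centers = [(0.0, 1.0), (1.0, 0.0)]  # illustrative "smile"-like vs. other pattern
weights = [1.0, -1.0]               # hand-set for illustration, not learned here
score = emotion_score((0.1, 0.9), centers, weights)  # input close to first center
```

In the trained system each emotion would have its own such network, with weights learned from many subjects' sequences.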
Success rates were about 88% for retention, 73% for extrapolation, and 79% for rejection.

Item: Learning to Detect Carried Objects with Minimal Supervision (2012-12-21)
Dondera, Radu; Morariu, Vlad I.; Davis, Larry S.

We propose a learning-based method for detecting carried objects that generates candidate image regions from protrusion, color contrast, and occlusion boundary cues, and uses a classifier to filter out the regions unlikely to be carried objects. The method achieves higher accuracy than the state of the art, which can only detect protrusions from the human shape, and the discriminative model it builds for the silhouette context-based region features generalizes well. To reduce annotation effort, we investigate training the model in a Multiple Instance Learning framework in which the only available supervision is "walk" and "carry" labels associated with intervals of human tracks, i.e., the spatial extent of carried objects is not annotated. We present an extension to the miSVM algorithm that uses knowledge of the fraction of positive instances in positive bags and that scales to training sets of hundreds of thousands of instances.

Item: Multiple Vehicle Detection and Tracking in Hard Real Time (1998-10-15)
Betke, Margrit; Haritaoglu, Esin; Davis, Larry S.

A vision system has been developed that recognizes and tracks multiple vehicles in hard real time from sequences of gray-scale images taken from a moving car. Recognition is accomplished by combining the analysis of single image frames with the analysis of the motion information provided by multiple consecutive image frames. In single image frames, cars are recognized by matching deformable gray-scale templates, by detecting image features such as corners, and by evaluating how these features relate to each other. Cars are also recognized by differencing consecutive image frames and by tracking motion parameters that are typical for cars.
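The frame-differencing cue mentioned just above can be sketched in a few lines; the threshold and the toy 2x3 "frames" below are illustrative assumptions, not the paper's actual values:

```python
def frame_difference(prev, curr, threshold=10):
    """Flag pixels whose absolute intensity change between two consecutive
    gray-scale frames exceeds a (hypothetical) threshold -- a crude motion cue."""
    return [[abs(c - p) > threshold for p, c in zip(prow, crow)]
            for prow, crow in zip(prev, curr)]

prev = [[50, 50, 50],
        [50, 50, 50]]
curr = [[50, 50, 200],   # a bright object has entered the top-right pixel
        [50, 55, 50]]    # small change (5) stays below the threshold
motion = frame_difference(prev, curr)
```

A real system would then track such motion regions over time and test whether their parameters are typical for cars.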
The vision system utilizes the hard real-time operating system Maruti, which guarantees that the timing constraints on the various processes of the vision system are satisfied. The dynamic creation and termination of tracking processes optimizes the amount of computational resources spent and allows fast detection and tracking of multiple cars. Experimental results demonstrate robust, real-time recognition and tracking over thousands of image frames. (Also cross-referenced as UMIACS-TR-96-52)

Item: A One-Threshold Algorithm for Detecting Abandoned Packages Under Severe Occlusions Using a Single Camera (2006-02-13)
Lim, Ser-Nam; Davis, Larry S.

We describe a single-camera system capable of detecting abandoned packages under severe occlusions, which leads to complications on several levels. The first arises when frames containing only background pixels are unavailable for initializing the background model, a problem for which we apply a novel discriminative measure. The proposed measure is essentially the probability of observing a particular pixel value, conditioned on the probability that no motion is detected; the pdf on which the latter is based is estimated as a zero-mean, unimodal Gaussian distribution from the difference values between successive frames. We will show that such a measure is a powerful discriminant even under severe occlusions and can deal robustly with the foreground aperture effect, a problem inherently caused by differencing successive frames. The detection of abandoned packages then proceeds at both the pixel and region levels. At the pixel level, an "abandoned pixel" is detected as a foreground pixel at which no motion is observed. At the region level, abandoned pixels are ascertained in a Markov Random Field (MRF), after which they are clustered.
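The pixel-level rule above (foreground pixel with no observed motion, with the no-motion evidence drawn from a zero-mean Gaussian over successive-frame differences) might be sketched as follows; the sigma and decision threshold are illustrative assumptions, and the paper itself avoids fixed thresholds:

```python
import math

def no_motion_prob(diff, sigma=5.0):
    """Probability-like score that a pixel shows no motion, from a zero-mean
    unimodal Gaussian fit to successive-frame differences (sigma is illustrative)."""
    return math.exp(-diff ** 2 / (2 * sigma ** 2))

def abandoned_pixel(is_foreground, frame_diff, p_min=0.5):
    """Pixel-level rule from the abstract: a foreground pixel with no observed motion."""
    return is_foreground and no_motion_prob(frame_diff) >= p_min

# (foreground?, frame difference): static foreground, moving foreground, background.
flags = [abandoned_pixel(fg, d) for fg, d in [(True, 0.0), (True, 20.0), (False, 0.0)]]
```

In the full system these pixel decisions feed an MRF at the region level before clustering.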
These clusters are finally classified as abandoned packages only if they display temporal persistency in their size, shape, position, and color properties, which is determined using conditional probabilities of these attributes. The algorithm is also carefully designed to avoid any thresholding, which is the pitfall of many vision systems; this significantly improves the robustness of our system. Experimental results from real-life train station sequences demonstrate the robustness and applicability of our algorithm.

Item: Parallel Algorithms for Image Enhancement and Segmentation by Region Growing with an Experimental Study (1998-10-15)
Bader, David A.; JaJa, Joseph; Harwood, David; Davis, Larry S.

This paper presents efficient and portable implementations of a useful image enhancement process, the Symmetric Neighborhood Filter (SNF), and an image segmentation technique that makes use of the SNF and a variant of the conventional connected components algorithm, which we call delta-Connected Components. Our general framework is a single-address-space, distributed-memory programming model. We use efficient techniques for distributing and coalescing data as well as efficient combinations of task and data parallelism. The image segmentation algorithm makes use of an efficient connected components algorithm based on a novel approach to parallel merging. The algorithms have been coded in Split-C and run on a variety of platforms, including the Thinking Machines CM-5, IBM SP-1 and SP-2, Cray Research T3D, Meiko Scientific CS-2, Intel Paragon, and workstation clusters. Our experimental results are consistent with the theoretical analysis (and provide the best known execution times for segmentation, even when compared with machine-specific implementations). Our test data include difficult images from the Landsat Thematic Mapper (TM) satellite. More efficient implementations of Split-C will likely result in even faster execution times.
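The delta-Connected Components idea (neighboring pixels join a region when their intensities differ by at most delta) can be illustrated with a sequential sketch; the paper's version is parallel with a novel merging step, and the image and delta below are made up for illustration:

```python
def delta_connected_components(img, delta):
    """Label 4-connected pixels whose neighboring intensity difference is at
    most delta. Sequential flood-fill sketch of delta-Connected Components."""
    h, w = len(img), len(img[0])
    labels = [[None] * w for _ in range(h)]
    n_labels = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] is not None:
                continue
            labels[sy][sx] = n_labels
            stack = [(sy, sx)]
            while stack:           # grow the region from the seed pixel
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and labels[ny][nx] is None
                            and abs(img[ny][nx] - img[y][x]) <= delta):
                        labels[ny][nx] = n_labels
                        stack.append((ny, nx))
            n_labels += 1
    return labels, n_labels

img = [[10, 11, 90],
       [12, 13, 92]]          # a smooth dark region next to a bright strip
labels, n = delta_connected_components(img, delta=5)
```

The parallel version partitions the image across processors and merges region labels at partition boundaries, which is where the novel merging approach comes in.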
(Also cross-referenced as UMIACS-TR-95-44)

Item: Rendering Localized Spatial Audio in a Virtual Auditory Space (2002-04-04)
Zotkin, Dmitry; Duraiswami, Ramani; Davis, Larry S.

High-quality virtual audio scene rendering is a must for emerging virtual and augmented reality applications, for perceptual user interfaces, and for sonification of data. We describe algorithms for the creation of virtual auditory spaces by rendering cues that arise from anatomical scattering, environmental scattering, and dynamical effects. We use a novel way of personalizing the head-related transfer functions (HRTFs) from a database, based on anatomical measurements. Details of algorithms for HRTF interpolation, room impulse response creation, HRTF selection from a database, and audio scene presentation are presented. Our system runs in real time on an office PC without specialized DSP hardware. (Also UMIACS-TR-2002-28)

Item: Task-Driven Video Collection (2006-01-23)
Lim, Ser-Nam; Mittal, Anurag; Davis, Larry S.

Vision systems are increasingly being deployed to perform complex surveillance tasks. While improved algorithms are being developed to perform these tasks, it is also important that data suitable for these algorithms be acquired - a non-trivial task in a dynamic and crowded scene viewed by multiple PTZ cameras. In this paper, we describe a multi-camera system that collects images and videos of moving objects in such scenes, subject to task constraints. The system constructs "task visibility intervals" that contain information about what can be sensed in future time intervals. Constructing these intervals requires prediction of future object motion and consideration of several factors, such as object occlusion and camera control parameters. Using a plane-sweep algorithm, these atomic intervals can be combined into multi-task intervals, during which a single camera can collect videos suitable for multiple tasks simultaneously.
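The combination step above, forming multi-task intervals from per-task visibility intervals, reduces in its simplest form to a sweep over interval endpoints; the task names and times below are invented, and the paper's plane-sweep handles considerably more structure (occlusion, camera control parameters):

```python
def multi_task_intervals(task_intervals):
    """Combine per-task (start, end) visibility intervals into sub-intervals
    during which one camera could serve two or more tasks at once.
    Simple endpoint sweep, sketching the plane-sweep combination step."""
    events = []
    for task, (start, end) in task_intervals.items():
        events.append((start, 0, task))   # kind 0 = interval opens
        events.append((end, 1, task))     # kind 1 = interval closes
    events.sort()                         # starts sort before ends at equal times
    active, result, last_t = set(), [], None
    for t, kind, task in events:
        if last_t is not None and t > last_t and len(active) >= 2:
            result.append((last_t, t, frozenset(active)))
        if kind == 0:
            active.add(task)
        else:
            active.discard(task)
        last_t = t
    return result

tasks = {"track_A": (0.0, 5.0), "track_B": (2.0, 8.0), "zoom_C": (4.0, 6.0)}
multi = multi_task_intervals(tasks)
```

Each returned triple gives a time window and the set of tasks a single camera could serve within it.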
Although cameras can then be scheduled based on the constructed intervals, finding an optimal schedule is an NP-hard problem. Because of this, and because exact future information is unavailable in a dynamic environment, we propose several methods for fast camera scheduling that yield solutions within a small constant factor of optimal. Experimental results illustrate system capabilities for both real and more complicated simulated scenarios.

Item: Visibility Planning: Predicting Continuous Period of Unobstructed Views (2004-04-19)
Lim, Ser-Nam; Davis, Larry S.; Wan, Yung-Chun (Justin)

To perform surveillance tasks effectively, unobstructed views of objects are required; e.g., unobstructed video of objects is often needed for gait recognition. As a result, we need to determine intervals for video collection during which a desired object is visible w.r.t. a given sensor. In addition, these intervals lie in the future, so that the system can effectively plan and schedule sensors for collecting these videos. We describe an approach to determining these visibility intervals. A Kalman filter is first used to predict the trajectories of the objects. The trajectories are converted to polar coordinate representations w.r.t. a given sensor. Trajectories with the same angular displacement w.r.t. the sensor over time can be found by determining the intersection points of the functions representing these trajectories. Intervals between these intersection points are suitable for video collection. We also address the efficiency of finding these intersection points. An obvious brute-force approach of $O(N^2)$ exists, where $N$ is the number of objects; it suffices when $N$ is small. When $N$ is large, we introduce an optimal segment intersection algorithm of $O(N\log^2 N+I)$, $I$ being the number of intersection points. Finally, we model the prediction errors associated with the Kalman filter using a circular object representation.
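The brute-force $O(N^2)$ alternative mentioned above is easy to sketch if each object's angular displacement w.r.t. the sensor is modeled as a linear function of time, theta(t) = a + b*t; this linear model, the object names, and the horizon are illustrative simplifications, not the paper's formulation:

```python
def pairwise_crossings(trajectories, horizon):
    """Brute-force O(N^2) search for times at which two objects reach the same
    angular position w.r.t. the sensor. Each trajectory is a hypothetical
    linear angle function theta(t) = a + b*t; intervals between the returned
    crossing times are candidates for unobstructed video collection."""
    crossings = []
    items = list(trajectories.items())
    for i in range(len(items)):
        for j in range(i + 1, len(items)):
            (name_i, (a_i, b_i)), (name_j, (a_j, b_j)) = items[i], items[j]
            if b_i != b_j:                    # parallel angle tracks never cross
                t = (a_j - a_i) / (b_i - b_j)  # solve a_i + b_i*t = a_j + b_j*t
                if 0.0 <= t <= horizon:
                    crossings.append((t, name_i, name_j))
    return sorted(crossings)

trajs = {"obj1": (0.0, 1.0), "obj2": (2.0, 0.5), "obj3": (1.0, 1.0)}
cross = pairwise_crossings(trajs, horizon=10.0)
```

The paper's optimal segment intersection algorithm brings this down to $O(N\log^2 N+I)$ for large $N$.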
Experimental results comparing the performance of the brute-force and the optimal segment intersection algorithms are shown. (UMIACS-TR-2004-22)