Electrical & Computer Engineering

Permanent URI for this communityhttp://hdl.handle.net/1903/2234

Browse

Search Results

Now showing 1 - 10 of 61
  • Item
    Content Recognition and Context Modeling for Document Analysis and Retrieval
    (2009) Zhu, Guangyu; Chellappa, Rama; Doermann, David S; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The nature and scope of available documents are changing significantly in many areas of document analysis and retrieval as complex, heterogeneous collections become accessible to virtually everyone via the web. The increasing level of diversity presents a great challenge for document image content categorization, indexing, and retrieval. Meanwhile, the processing of documents with unconstrained layouts and complex formatting often requires effective leveraging of broad contextual knowledge. In this dissertation, we first present a novel approach for document image content categorization, using a lexicon of shape features. Each lexical word corresponds to a scale and rotation invariant local shape feature that is generic enough to be detected repeatably and is segmentation free. A concise, structurally indexed shape lexicon is learned by clustering and partitioning feature types through graph cuts. Our idea finds successful application in several challenging tasks, including content recognition of diverse web images and language identification on documents composed of mixed machine printed text and handwriting. Second, we address two fundamental problems in signature-based document image retrieval. Facing continually increasing volumes of documents, detecting and recognizing unique, evidentiary visual entities (\eg, signatures and logos) provides a practical and reliable supplement to the OCR recognition of printed text. We propose a novel multi-scale framework to detect and segment signatures jointly from document images, based on the structural saliency under a signature production model. We formulate the problem of signature retrieval in the unconstrained setting of geometry-invariant deformable shape matching and demonstrate state-of-the-art performance in signature matching and verification. Third, we present a model-based approach for extracting relevant named entities from unstructured documents. In a wide range of applications that require structured information from diverse, unstructured document images, processing OCR text does not give satisfactory results due to the absence of linguistic context. Our approach enables learning of inference rules collectively based on contextual information from both page layout and text features. Finally, we demonstrate the importance of mining general web user behavior data for improving document ranking and other web search experience. The context of web user activities reveals their preferences and intents, and we emphasize the analysis of individual user sessions for creating aggregate models. We introduce a novel algorithm for estimating web page and web site importance, and discuss its theoretical foundation based on an intentional surfer model. We demonstrate that our approach significantly improves large-scale document retrieval performance.
  • Item
    Localizing the Effects of Link Flooding Attacks in the Internet
    (2009) Lee, Soo Bum; Gligor, Virgil D.; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Malware-contaminated hosts organized as a ``bot network'' can target and flood network links (e.g., routers). Yet, none of the countermeasures to link flooding proposed to date have provided dependable link access (i.e., link access guarantees) for legitimate traffic during such attacks. Network-layer capabilities offer strong protection against link flooding by authorizing individual flows with unforgeable credentials (i.e., capabilities). However, network-layer capabilities are insufficient for dependable link access, for several reasons: (1) the capability-setup channel is vulnerable to flooding attacks that prevent legitimate clients from acquiring capabilities; i.e., Denial of Capability (DoC) attacks, (2) compromised attack sources that have acquired capabilities in a legitimate way can flood the privileged channel reserved for capability carrying packets, and (3) the global effects of flooding attacks are still unavoidable with ``per-flow'' based capabilities. In this dissertation, we present a router-level design that confines the effects of link flooding attacks to specified locales or neighborhoods (e.g., one or more administrative domains of the Internet) based on network-layer capabilities. Our design provides differential guarantees for access to network links that favor packets from uncontaminated domains by attack sources (e.g., bots) and yet do not deny access to packets from contaminated domains. For connection-request packets (i.e., capability requests), differential access guarantees are defined as the probabilistic lower bounds for link access: requests from uncontaminated domains have higher probabilistic lower bounds for link access than those from contaminated domains. For all other packets, differential access guarantees are defined in terms of the the bandwidth allocated to packet flows; i.e., flows of malware-uncontaminated domains receive higher bandwidth guarantees than flows of contaminated ones, and legitimate flows of contaminated domains are guaranteed substantially higher bandwidth than attack flows. Potential side-effects of attack flows (e.g., multiple congested links) are mitigated by a differential routing scheme, whereby flows of malware-uncontaminated domains are routed through less congested paths while those of contaminated domains are routed through the ``pinned'' default paths. We present analytical models for the proposed notions of dependable link access, and evaluate our router design both by comprehensive simulations under different attack scenarios and by comparisons with other flooding-defense schemes.
  • Item
    VISUAL TRACKING AND ILLUMINATION RECOVERY VIA SPARSE REPRESENTATION
    (2009) Mei, Xue; Jacobs, David W; Qu, Gang; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Compressive sensing, or sparse representation, has played a fundamental role in many fields of science. It shows that the signals and images can be reconstructed from far fewer measurements than what is usually considered to be necessary. Sparsity leads to efficient estimation, efficient compression, dimensionality reduction, and efficient modeling. Recently, there has been a growing interest in compressive sensing in computer vision and it has been successfully applied to face recognition, background subtraction, object tracking and other problems. Sparsity can be achieved by solving the compressive sensing problem using L1 minimization. In this dissertation, we present the results of a study of applying sparse representation to illumination recovery, object tracking, and simultaneous tracking and recognition. Illumination recovery, also known as inverse lighting, is the problem of recovering an illumination distribution in a scene from the appearance of objects located in the scene. It is used for Augmented Reality, where the virtual objects match the existing image and cast convincing shadows on the real scene rendered with the recovered illumination. Shadows in a scene are caused by the occlusion of incoming light, and thus contain information about the lighting of the scene. Although shadows have been used in determining the 3D shape of the object that casts shadows onto the scene, few studies have focused on the illumination information provided by the shadows. In this dissertation, we recover the illumination of a scene from a single image with cast shadows given the geometry of the scene. The images with cast shadows can be quite complex and therefore cannot be well approximated by low-dimensional linear subspaces. However, in this study we show that the set of images produced by a Lambertian scene with cast shadows can be efficiently represented by a sparse set of images generated by directional light sources. We first model an image with cast shadows as composed of a diffusive part (without cast shadows) and a residual part that captures cast shadows. Then, we express the problem in an L1-regularized least squares formulation, with nonnegativity constraints (as light has to be nonnegative at any point in space). This sparse representation enjoys an effective and fast solution, thanks to recent advances in compressive sensing. In experiments on both synthetic and real data, our approach performs favorably in comparison to several previously proposed methods. Visual tracking, which consistently infers the motion of a desired target in a video sequence, has been an active and fruitful research topic in computer vision for decades. It has many practical applications such as surveillance, human computer interaction, medical imaging and so on. Many challenges to design a robust tracking algorithm come from the enormous unpredictable variations in the target, such as deformations, fast motion, occlusions, background clutter, and lighting changes. To tackle the challenges posed by tracking, we propose a robust visual tracking method by casting tracking as a sparse approximation problem in a particle filter framework. In this framework, occlusion, noise and other challenging issues are addressed seamlessly through a set of trivial templates. Specifically, to find the tracking target at a new frame, each target candidate is sparsely represented in the space spanned by target templates and trivial templates. The sparsity is achieved by solving an L1-regularized least squares problem. Then the candidate with the smallest projection error is taken as the tracking target. After that, tracking is continued using a Bayesian state inference framework in which a particle filter is used for propagating sample distributions over time. Three additional components further improve the robustness of our approach: 1) a velocity incorporated motion model that helps concentrate the samples on the true target location in the next frame, 2) the nonnegativity constraints that help filter out clutter that is similar to tracked targets in reversed intensity patterns, and 3) a dynamic template update scheme that keeps track of the most representative templates throughout the tracking procedure. We test the proposed approach on many challenging sequences involving heavy occlusions, drastic illumination changes, large scale changes, non-rigid object movement, out-of-plane rotation, and large pose variations. The proposed approach shows excellent performance in comparison with four previously proposed trackers. We also extend the work to simultaneous tracking and recognition in vehicle classification in IR video sequences. We attempt to resolve the uncertainties in tracking and recognition at the same time by introducing a static template set that stores target images in various conditions such as different poses, lighting, and so on. The recognition results at each frame are propagated to produce the final result for the whole video. The tracking result is evaluated at each frame and low confidence in tracking performance initiates a new cycle of tracking and classification. We demonstrate the robustness of the proposed method on vehicle tracking and classification using outdoor IR video sequences.
  • Item
    SECURE, POLICY-BASED, MULTI-RECIPIENT DATA SHARING
    (2009) Bobba, Rakesh B.; Gligor, Virgil D.; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    In distributed systems users often need to share sensitive data with other users based on the latter's ability to satisfy various policies. In many cases the data owner may not even know the identities of the data recipients, but deems it crucial that they are legitimate; i.e., satisfy the policy. Enabling such data sharing over the Internet faces the challenge of (1) securely associating access policies with data and enforcing them, and (2) protecting data as it traverses untrusted proxies and intermediate repositories. Furthermore, it is desirable to achieve properties such as: (1) flexibility of access policies; (2) privacy of sensitive access policies; (3) minimal reliance on trusted third parties; and (4) efficiency of access policy enforcement. Often schemes enabling controlled data sharing need to trade one property for another. In this dissertation, we propose two complimentary policy-based data sharing schemes that achieve different subsets of the above desired properties. In the first part of this dissertation, we focus on CiphertextPolicy Attribute- Based Encryption (CP-ABE) schemes that specify and enforce access policies cryptographically and eliminate trusted mediators. We motivate the need for flexible attribute organization within user keys for efficient support of many practical applications. We then propose Ciphertext-Policy Attribute-Set Based Encryption (CP-ASBE) which is the first CP-ABE scheme to (1) efficiently support naturally occurring compound attributes, (2) support multiple numerical assignments for a given attribute in a single key and (3) provide efficient key management. While the CP-ASBE scheme minimizes reliance on trusted mediators, it can support neither context-based policies nor policy privacy. In the second part of this dissertation, we propose Policy Based Encryption System (PBES), which employs mediated decryption and supports both context-based policies and policy privacy. Finally, we integrate the proposed schemes into practical applications (i.e., CP-ASBE scheme with Attribute-Based Messaging (ABM) and PBES scheme with a conditional data sharing application in the Power Grid) and demonstrate their usefulness in practice.
  • Item
    Robust and Efficient Inference of Scene and Object Motion in Multi-Camera Systems
    (2009) Sankaranarayanan, Aswin C; Chellappa, Rama; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Multi-camera systems have the ability to overcome some of the fundamental limitations of single camera based systems. Having multiple view points of a scene goes a long way in limiting the influence of field of view, occlusion, blur and poor resolution of an individual camera. This dissertation addresses robust and efficient inference of object motion and scene in multi-camera and multi-sensor systems. The first part of the dissertation discusses the role of constraints introduced by projective imaging towards robust inference of multi-camera/sensor based object motion. We discuss the role of the homography and epipolar constraints for fusing object motion perceived by individual cameras. For planar scenes, the homography constraints provide a natural mechanism for data association. For scenes that are not planar, the epipolar constraint provides a weaker multi-view relationship. We use the epipolar constraint for tracking in multi-camera and multi-sensor networks. In particular, we show that the epipolar constraint reduces the dimensionality of the state space of the problem by introducing a ``shared'' state space for the joint tracking problem. This allows for robust tracking even when one of the sensors fail due to poor SNR or occlusion. The second part of the dissertation deals with challenges in the computational aspects of tracking algorithms that are common to such systems. Much of the inference in the multi-camera and multi-sensor networks deal with complex non-linear models corrupted with non-Gaussian noise. Particle filters provide approximate Bayesian inference in such settings. We analyze the computational drawbacks of traditional particle filtering algorithms, and present a method for implementing the particle filter using the Independent Metropolis Hastings sampler, that is highly amenable to pipelined implementations and parallelization. We analyze the implementations of the proposed algorithm, and in particular concentrate on implementations that have minimum processing times. The last part of the dissertation deals with the efficient sensing paradigm of compressing sensing (CS) applied to signals in imaging, such as natural images and reflectance fields. We propose a hybrid signal model on the assumption that most real-world signals exhibit subspace compressibility as well as sparse representations. We show that several real-world visual signals such as images, reflectance fields, videos etc., are better approximated by this hybrid of two models. We derive optimal hybrid linear projections of the signal and show that theoretical guarantees and algorithms designed for CS can be easily extended to hybrid subspace-compressive sensing. Such methods reduce the amount of information sensed by a camera, and help in reducing the so called data deluge problem in large multi-camera systems.
  • Item
    Robust Methods for Visual Tracking and Model Alignment
    (2009) Wu, Hao; Chellappa, Rama; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The ubiquitous presence of cameras and camera networks needs the development of robust visual analytics algorithms. As the building block of many video visual surveillance tasks, a robust visual tracking algorithm plays an important role in achieving the goal of automatic and robust surveillance. In practice, it is critical to know when and where the tracking algorithm fails so that remedial measures can be taken to resume tracking. We propose a novel performance evaluation strategy for tracking systems using a time-reversed Markov chain. We also present a novel bidirectional tracker to achieve better robustness. Instead of looking only forward in the time domain, we incorporate both forward and backward processing of video frames using a time-reversibility constraint. When the objects of interest in surveillance applications have relatively stable structures, the parameterized shape model of objects can be usually built or learned from sample images, which allows us to perform more accurate tracking. We present a machine learning method to learn a scoring function without local extrema to guide the gradient descent/accent algorithm and find the optimal parameters of the shape model. These algorithms greatly improve the robustness of video analysis systems in practice.
  • Item
    Random Codes and Graphs for Secure Communication
    (2009) Anthapadmanabhan, Nagaraj Prasanth; Barg, Alexander; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This dissertation considers two groups of problems related to secure communication. The first line of research is devoted to theoretical problems of copyright protection of digital content. Embedding identification data in the content is a well-developed technique of content protection known under the name of fingerprinting. Schemes that provide such protection are known as fingerprinting codes in the literature. We study limits of the number of users of a fingerprinting system as well as constructions of low-complexity fingerprinting codes that support a large number of users. The second problem that is addressed in the dissertation relates to connectivity analysis of ad hoc wireless networks. One of the basic requirements in such environments is to ensure that none of the nodes are completely isolated from the network. We address the problem of characterizing threshold parameters for node isolation that enable the system designer to choose the power needed for network operation based on the outage probability of links in the network. The methods of this research draw from coding theory, information theory and random graphs. An idea that permeates most results in this dissertation is the application of randomization both in the analysis of fingerprinting and node isolation. The main contributions of this dissertation belong in the area of fingerprinting and are described as follows. We derive new lower and upper bounds on the optimal trade-off between the number of users and the length of the fingerprints required to ensure reliability of the system, which we call fingerprinting capacity. Information-theoretic techniques employed in our proofs of bounds on capacity originate in coding theorems for channels with multiple inputs. Constructions of fingerprinting codes draw on methods of coding theory related to list decoding and code concatenation. We also analyze random graph models for ad hoc networks with link failures and secure sensor networks that employ randomized key distribution. We establish a precise zero-one law for node isolation in the model with link failures for nodes placed on the circle. We further generalize this result to obtain a one-law for secure sensor networks on some surfaces.
  • Item
    SIMULTANEOUS MULTI-VIEW FACE TRACKING AND RECOGNITION IN VIDEO USING PARTICLE FILTERING
    (2009) Seo, Naotoshi; Chellappa, Rama; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Recently, face recognition based on video has gained wide interest especially due to its role in surveillance systems. Video-based recognition has superior advantages over image-based recognition because a video contains image sequences as well as temporal information. However, surveillance videos are generally of low-resolution and contain faces mostly in non-frontal poses. We propose a multi-view, video-based face recognition algorithm using the Bayesian inference framework. This method represents an appearance of each subject by a complex nonlinear appearance manifold expressed as a collection of simpler pose manifolds and the connections, represented by transition probabilities, among them. A Bayesian inference formulation is introduced to utilize the temporal information in the video via the transition probabilities among pose manifolds. The Bayesian inference formulation realizes video-based face recognition by progressively accumulating the recognition confidences in frames. The accumulation step possibly enables to solve face recognition problems in low-resolution videos, and the progressive characteristic is especially useful for a real-time processing. Furthermore, this face recognition framework has another characteristic that does not require processing all frames in a video if enough recognition confidence is accumulated in an intermediate frame. This characteristic gives an advantage over batch methods in terms of a computational efficiency. Furthermore, we propose a simultaneous multi-view face tracking and recognition algorithm. Conventionally, face recognition in a video is performed in tracking-then-recognition scenario that extracts the best facial image patch in the tracking and then recognizes the identity of the facial image. Simultaneous face tracking and recognition works in a different fashion, by handling both tracking and recognition simultaneously. Particle filter is a technique for implementing a Bayesian inference filter by Monte Carlo simulation, which has gained prevalence in the visual tracking literature since the Condensation algorithm was introduced. Since we have proposed a video-based face recognition algorithm based on the Bayesian inference framework, it is easy to integrate the particle filter tracker and our proposed recognition method into one, using the particle filter for both tracking and recognition simultaneously. This simultaneous framework utilizes the temporal information in a video for not only tracking but also recognition by modeling the dynamics of facial poses. Although the time series formulation remains more general, only the facial pose dynamics is utilized for recognition in this thesis.
  • Item
    Modeling Shape, Appearance and Motion for Human Movement Analysis
    (2009) Lin, Zhe; Davis, Larry S; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Shape, Appearance and Motion are the most important cues for analyzing human movements in visual surveillance. Representation of these visual cues should be rich, invariant and discriminative. We present several approaches to model and integrate them for human detection and segmentation, person identification, and action recognition. First, we describe a hierarchical part-template matching approach to simultaneous human detection and segmentation combining local part-based and global shape-based schemes. For learning generic human detectors, a pose-adaptive representation is developed based on a hierarchical tree matching scheme and combined with an support vector machine classifier to perform human/non-human classification. We also formulate multiple occluded human detection using a Bayesian framework and optimize it through an iterative process. We evaluated the approach on several public pedestrian datasets. Second, given regions of interest provided by human detectors, we introduce an approach to iteratively estimates segmentation via a generalized Expectation-Maximization algorithm. The approach incorporates local Markov random field constraints and global pose inferences to propagate beliefs over image space iteratively to determine a coherent segmentation. Additionally, a layered occlusion model and a probabilistic occlusion reasoning scheme are introduced to handle inter-occlusion. The approach is tested on a wide variety of real-life images. Third, we describe an approach to appearance-based person recognition. In learning, we perform discriminative analysis through pairwise coupling of training samples, and estimate a set of normalized invariant profiles by marginalizing likelihood ratio functions which reflect local appearance differences. In recognition, we calculate discriminative information-based distances by a soft voting approach, and combine them with appearance-based distances for nearest neighbor classification. We evaluated the approach on videos of 61 individuals under significant illumination and viewpoint changes. Fourth, we describe a prototype-based approach to action recognition. During training, a set of action prototypes are learned in a joint shape and motion space via $k$-means clustering; During testing, humans are tracked while a frame-to-prototype correspondence is established by nearest neighbor search, and then actions are recognized using dynamic prototype sequence matching. Similarity matrices used for sequence matching are efficiently obtained by look-up table indexing. We experimented the approach on several action datasets.
  • Item
    Multipath Routing Algorithms for Communication Networks: Ant Routing and Optimization Based Approaches
    (2009) Purkayastha, Punyaslok; Baras, John S; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    In this dissertation, we study two algorithms that accomplish multipath routing in communication networks. The first algorithm that we consider belongs to the class of Ant-Based Routing Algorithms (ARA) that have been inspired by experimental observations of ant colonies. It was found that ant colonies are able to `discover' the shorter of two paths to a food source by laying and following `pheromone' trails. ARA algorithms proposed for communication networks employ probe packets called ant packets (analogues of ants) to collect measurements of various quantities (related to routing performance) like path delays. Using these measurements, analogues of pheromone trails are created, which then influence the routing tables. We study an ARA algorithm, proposed earlier by Bean and Costa, consisting of a delay estimation scheme and a routing probability update scheme, that updates routing probabilities based on the delay estimates. We first consider a simple scenario where data traffic entering a source node has to be routed to a destination node, with N available parallel paths between them. An ant stream also arrives at the source and samples path delays en route to the destination. We consider a stochastic model for the arrival processes and packet lengths of the streams, and a queuing model for the link delays. Using stochastic approximation methods, we show that the evolution of the link delay estimates can be closely tracked by a deterministic ODE (Ordinary Differential Equation) system. A study of the equilibrium points of the ODE enables us to obtain the equilibrium routing probabilities and the path delays. We then consider a network case, where multiple input traffic streams arriving at various sources have to be routed to a single destination. For both the N parallel paths network as well as for the general network, the vector of equilibrium routing probabilities satisfies a fixed point equation. We present various supporting simulation results. The second routing algorithm that we consider is based on an optimization approach to the routing problem. We consider a problem where multiple traffic streams entering at various source nodes have to be routed to their destinations via a network of links. We cast the problem in a multicommodity network flow optimization framework. Our cost function, which is a function of the individual link delays, is a measure of congestion in the network. Our approach is to consider the dual optimization problem, and using dual decomposition techniques we provide primal-dual algorithms that converge to the optimal routing solution. A classical interpretation of the Lagrange multipliers (drawing an analogy with electrical networks) is as `potential differences' across the links. The link potential difference can be then thought of as `driving the flow through the link'. Using the relationships between the link potential differences and the flows, we show that our algorithm converges to a loop-free routing solution. We then incorporate in our framework a rate control problem and address a joint rate control and routing problem.