UMD Theses and Dissertations

Permanent URI for this collection: http://hdl.handle.net/1903/3

New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date. This means that there may be up to a 4-month delay before a given thesis/dissertation appears in DRUM.

More information is available at Theses and Dissertations at University of Maryland Libraries.

Search Results

Now showing 1 - 10 of 281
  • Content Recognition and Context Modeling for Document Analysis and Retrieval
    (2009) Zhu, Guangyu; Chellappa, Rama; Doermann, David S; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    The nature and scope of available documents are changing significantly in many areas of document analysis and retrieval as complex, heterogeneous collections become accessible to virtually everyone via the web. The increasing level of diversity presents a great challenge for document image content categorization, indexing, and retrieval. Meanwhile, the processing of documents with unconstrained layouts and complex formatting often requires effective leveraging of broad contextual knowledge. In this dissertation, we first present a novel approach for document image content categorization, using a lexicon of shape features. Each lexical word corresponds to a scale- and rotation-invariant local shape feature that is generic enough to be detected repeatably and is segmentation-free. A concise, structurally indexed shape lexicon is learned by clustering and partitioning feature types through graph cuts. Our idea finds successful application in several challenging tasks, including content recognition of diverse web images and language identification on documents composed of mixed machine-printed text and handwriting. Second, we address two fundamental problems in signature-based document image retrieval. Facing continually increasing volumes of documents, detecting and recognizing unique, evidentiary visual entities (e.g., signatures and logos) provides a practical and reliable supplement to OCR of printed text. We propose a novel multi-scale framework to detect and segment signatures jointly from document images, based on the structural saliency under a signature production model. We formulate the problem of signature retrieval in the unconstrained setting of geometry-invariant deformable shape matching and demonstrate state-of-the-art performance in signature matching and verification. Third, we present a model-based approach for extracting relevant named entities from unstructured documents. In a wide range of applications that require structured information from diverse, unstructured document images, processing OCR text does not give satisfactory results due to the absence of linguistic context. Our approach enables learning of inference rules collectively based on contextual information from both page layout and text features. Finally, we demonstrate the importance of mining general web user behavior data for improving document ranking and other aspects of the web search experience. The context of web user activities reveals their preferences and intents, and we emphasize the analysis of individual user sessions for creating aggregate models. We introduce a novel algorithm for estimating web page and web site importance, and discuss its theoretical foundation based on an intentional surfer model. We demonstrate that our approach significantly improves large-scale document retrieval performance.
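
The last contribution above estimates page importance from observed user behavior. As a rough, hypothetical illustration (not the author's algorithm), the sketch below builds a transition matrix from made-up session click sequences and runs a damped power iteration in the spirit of surfer-style importance models; all names and numbers are invented.

```python
import numpy as np

# Hypothetical browsing sessions: each is an ordered list of visited page ids.
sessions = [[0, 1, 2], [1, 2, 3], [0, 2], [2, 3, 0]]
n_pages = 4

# Count observed page-to-page transitions within sessions.
counts = np.zeros((n_pages, n_pages))
for s in sessions:
    for a, b in zip(s, s[1:]):
        counts[a, b] += 1

# Row-normalize into a transition matrix; pages with no outgoing clicks jump uniformly.
row_sums = counts.sum(axis=1, keepdims=True)
P = np.where(row_sums > 0, counts / np.maximum(row_sums, 1), 1.0 / n_pages)

# Damped power iteration, as in random-surfer-style importance models.
d = 0.85
rank = np.full(n_pages, 1.0 / n_pages)
for _ in range(100):
    rank = (1 - d) / n_pages + d * rank @ P
print(rank)
```
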
  • Localizing the Effects of Link Flooding Attacks in the Internet
    (2009) Lee, Soo Bum; Gligor, Virgil D.; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Malware-contaminated hosts organized as a "bot network" can target and flood network links (e.g., routers). Yet, none of the countermeasures to link flooding proposed to date have provided dependable link access (i.e., link access guarantees) for legitimate traffic during such attacks. Network-layer capabilities offer strong protection against link flooding by authorizing individual flows with unforgeable credentials (i.e., capabilities). However, network-layer capabilities are insufficient for dependable link access, for several reasons: (1) the capability-setup channel is vulnerable to flooding attacks that prevent legitimate clients from acquiring capabilities, i.e., Denial of Capability (DoC) attacks; (2) compromised attack sources that have acquired capabilities in a legitimate way can flood the privileged channel reserved for capability-carrying packets; and (3) the global effects of flooding attacks are still unavoidable with "per-flow" capabilities. In this dissertation, we present a router-level design that confines the effects of link flooding attacks to specified locales or neighborhoods (e.g., one or more administrative domains of the Internet) based on network-layer capabilities. Our design provides differential guarantees for access to network links that favor packets from domains uncontaminated by attack sources (e.g., bots) and yet do not deny access to packets from contaminated domains. For connection-request packets (i.e., capability requests), differential access guarantees are defined as probabilistic lower bounds for link access: requests from uncontaminated domains have higher probabilistic lower bounds for link access than those from contaminated domains. For all other packets, differential access guarantees are defined in terms of the bandwidth allocated to packet flows; i.e., flows of malware-uncontaminated domains receive higher bandwidth guarantees than flows of contaminated ones, and legitimate flows of contaminated domains are guaranteed substantially higher bandwidth than attack flows. Potential side effects of attack flows (e.g., multiple congested links) are mitigated by a differential routing scheme, whereby flows of malware-uncontaminated domains are routed through less congested paths while those of contaminated domains are routed through the "pinned" default paths. We present analytical models for the proposed notions of dependable link access, and evaluate our router design both by comprehensive simulations under different attack scenarios and by comparisons with other flooding-defense schemes.
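
As a toy illustration of the differential access guarantees described above (not the actual router design), the sketch below splits a link's bandwidth by per-domain weights derived from a hypothetical contamination estimate, so uncontaminated domains receive larger shares while contaminated domains are throttled but never completely denied.

```python
# Hypothetical weighted bandwidth split for one congested link.
LINK_CAPACITY_MBPS = 1000.0

# Contamination estimate per domain in [0, 1] (1 = heavily bot-infested); values are made up.
domains = {"clean.example": 0.0, "mixed.example": 0.4, "botnet.example": 0.9}

def allocate(domains, capacity, floor_weight=0.05):
    """Give uncontaminated domains higher shares; keep a small floor weight so
    no domain is denied link access outright."""
    weights = {d: max(1.0 - c, floor_weight) for d, c in domains.items()}
    total = sum(weights.values())
    return {d: capacity * w / total for d, w in weights.items()}

print(allocate(domains, LINK_CAPACITY_MBPS))
```
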
  • Lexical Features for Statistical Machine Translation
    (2009) Devlin, Jacob; Dorr, Bonnie; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    In modern phrasal and hierarchical statistical machine translation systems, two major features model translation: rule translation probabilities and lexical smoothing scores. The rule translation probabilities are computed as maximum likelihood estimates (MLEs) of an entire source (or target) phrase translating to a target (or source) phrase. The lexical smoothing scores are also a likelihood estimate of a source (target) phrase translating to a target (source) phrase, but they are computed using independent word-to-word translation probabilities. Intuitively, it would seem that the lexical smoothing score is a less powerful estimate of translation likelihood due to this independence assumption, but I present the somewhat surprising result that lexical smoothing is far more important to the quality of a state-of-the-art hierarchical SMT system than rule translation probabilities. I posit that this is due to a fundamental data sparsity problem: the average word-to-word translation is seen many more times than the average phrase-to-phrase translation, so the word-to-word translation probabilities (or lexical probabilities) are far better estimated. Motivated by this result, I present a number of novel methods for modifying the lexical probabilities to improve the quality of our MT output. First, I examine two methods of lexical probability biasing, where for each test document, a set of secondary lexical probabilities is extracted and interpolated with the primary lexical probability distribution. Biasing each document with the probabilities extracted from its own first-pass decoding output provides a small but consistent gain of about 0.4 BLEU. Second, I contextualize the lexical probabilities by factoring in additional information such as the previous or next word. The key to the success of this context-dependent lexical smoothing is a backoff model, where our "trust" of a context-dependent probability estimate is directly proportional to how many times it was seen in the training data. In this way, I avoid the estimation problem seen in translation rules, where the amount of context is high but the probability estimation is inaccurate. When using the surrounding words as context, this feature provides a gain of about 0.6 BLEU on Arabic and Chinese. Finally, I describe several types of discriminatively trained lexical features, along with a new optimization procedure called Expected-BLEU optimization. This new optimization procedure is able to robustly estimate weights for thousands of decoding features, which can in effect discriminatively optimize a set of lexical probabilities to maximize BLEU. I also describe two other discriminative feature types, one of which is the part-of-speech analogue to lexical probabilities, and the other of which estimates training corpus weights based on lexical translations. The discriminative features produce a gain of 0.8 BLEU on Arabic and 0.4 BLEU on Chinese.
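
For readers unfamiliar with lexical smoothing, the sketch below shows one standard formulation (lexical weighting over word alignments) rather than the exact feature used in the dissertation: the phrase-pair score is assembled from independent word-to-word probabilities, which is the independence assumption discussed above. All probabilities are made up.

```python
# One common lexical-weighting formulation: for each target word, average the
# word-to-word probabilities of the source words it is aligned to, then take
# the product over target words.
p_word = {("la", "the"): 0.6, ("maison", "house"): 0.7, ("maison", "home"): 0.2}

def lexical_score(src_words, tgt_words, alignment, p_word):
    score = 1.0
    for j, tgt in enumerate(tgt_words):
        aligned = [i for i, jj in alignment if jj == j]
        if not aligned:
            continue  # unaligned target words could instead use a NULL-translation probability
        score *= sum(p_word.get((src_words[i], tgt), 1e-9) for i in aligned) / len(aligned)
    return score

print(lexical_score(["la", "maison"], ["the", "house"], [(0, 0), (1, 1)], p_word))
```
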
  • Combinatorial Problems in Online Advertising
    (2009) Malekian, Azarakhsh; Khuller, Samir; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Electronic commerce, or eCommerce, refers to the process of buying and selling goods and services over the Internet. In fact, the Internet has completely transformed traditional media-based advertising, so much so that billions of dollars of advertising revenue is now flowing to search companies such as Microsoft, Yahoo!, and Google. In addition, the new advertising landscape has opened up the advertising industry to all players, big and small. However, this transformation has led to a host of new problems faced by the search companies as they make decisions about how much to charge for advertisements, whose ads to display to users, and how to maximize their revenue. In this thesis we focus on an entire suite of problems motivated by the central question of "Which advertisement to display to which user?". Targeted advertising happens when a user enters a relevant search query. The ads are usually displayed on the sides of the search result page. Internet advertising also takes place by displaying ads on the side of webpages with relevant content. While large advertisers (e.g., Coca-Cola) pursue brand recognition through advertising, small advertisers are happy with instant revenue as a result of a user following their ad and performing a desired action (e.g., making a purchase). Therefore, small advertisers are often happy to get any ad slot related to their ad, while large advertisers prefer contracts that will guarantee that their ads will be delivered to a sufficient number of desired users. We first focus on two problems that come up in the context of small advertisers. The first problem we consider deals with the allocation of ads to slots, considering the fact that users enter search queries over a period of time and, as a result, the slots become available gradually. We use a greedy method for allocation and show that the online ad allocation problem with a fixed distribution of queries over time can be modeled as maximizing a continuous non-decreasing submodular sequence function, for which we can guarantee a solution with a factor of at least (1 - 1/e) of the optimal. The second problem we consider is the query rewriting problem in the context of keyword advertising. This problem can be posed as a family of graph covering problems to maximize profit. We obtain constant-factor approximation algorithms for these covering problems under two sets of constraints and a realistic notion of ad benefit. We perform experiments on real data and show that our algorithms are capable of outperforming a competitive baseline algorithm in terms of the benefit due to rewrites. We next consider two problems related to premium customers, who need guaranteed delivery of a large number of ads for the purpose of brand recognition and require signing a contract. In this context, we consider the allocation problem with the objective of maximizing either revenue or fairness. The problems considered in this thesis address just a few of the current challenges in eCommerce and Internet advertising. There are many interesting new problems arising in this field as the technology evolves and online connectivity through interactive media and the Internet becomes ubiquitous. We believe that this is one of the areas that will continue to receive greater attention from researchers in the near future.
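
The (1 - 1/e) guarantee mentioned above is the classic bound for greedy maximization of a monotone submodular function. The sketch below shows the greedy template on a toy weighted-coverage objective; it is only an illustration of the guarantee's setting, not the thesis's allocation algorithm.

```python
# Greedy maximization of a monotone submodular function under a cardinality
# constraint; the classic analysis gives a (1 - 1/e) approximation guarantee.
element_weights = {"a": 3.0, "b": 2.0, "c": 1.5, "d": 1.0}
sets = {"S1": {"a", "b"}, "S2": {"b", "c", "d"}, "S3": {"a", "d"}}

def coverage(chosen):
    covered = set().union(*(sets[s] for s in chosen)) if chosen else set()
    return sum(element_weights[e] for e in covered)

def greedy(budget):
    chosen = []
    for _ in range(budget):
        best = max((s for s in sets if s not in chosen),
                   key=lambda s: coverage(chosen + [s]) - coverage(chosen))
        chosen.append(best)
    return chosen

picked = greedy(2)
print(picked, coverage(picked))
```
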
  • VISUAL TRACKING AND ILLUMINATION RECOVERY VIA SPARSE REPRESENTATION
    (2009) Mei, Xue; Jacobs, David W; Qu, Gang; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Compressive sensing, or sparse representation, has played a fundamental role in many fields of science. It shows that signals and images can be reconstructed from far fewer measurements than is usually considered necessary. Sparsity leads to efficient estimation, efficient compression, dimensionality reduction, and efficient modeling. Recently, there has been growing interest in compressive sensing in computer vision, and it has been successfully applied to face recognition, background subtraction, object tracking, and other problems. Sparsity can be achieved by solving the compressive sensing problem using L1 minimization. In this dissertation, we present the results of a study of applying sparse representation to illumination recovery, object tracking, and simultaneous tracking and recognition. Illumination recovery, also known as inverse lighting, is the problem of recovering an illumination distribution in a scene from the appearance of objects located in the scene. It is used in augmented reality, where virtual objects must match the existing image and cast convincing shadows on the real scene rendered with the recovered illumination. Shadows in a scene are caused by the occlusion of incoming light, and thus contain information about the lighting of the scene. Although shadows have been used in determining the 3D shape of the object that casts shadows onto the scene, few studies have focused on the illumination information provided by the shadows. In this dissertation, we recover the illumination of a scene from a single image with cast shadows, given the geometry of the scene. The images with cast shadows can be quite complex and therefore cannot be well approximated by low-dimensional linear subspaces. However, in this study we show that the set of images produced by a Lambertian scene with cast shadows can be efficiently represented by a sparse set of images generated by directional light sources. We first model an image with cast shadows as composed of a diffusive part (without cast shadows) and a residual part that captures cast shadows. Then, we express the problem in an L1-regularized least squares formulation, with nonnegativity constraints (as light has to be nonnegative at any point in space). This sparse representation enjoys an effective and fast solution, thanks to recent advances in compressive sensing. In experiments on both synthetic and real data, our approach performs favorably in comparison to several previously proposed methods. Visual tracking, which consistently infers the motion of a desired target in a video sequence, has been an active and fruitful research topic in computer vision for decades. It has many practical applications such as surveillance, human-computer interaction, and medical imaging. Many of the challenges in designing a robust tracking algorithm come from the enormous, unpredictable variations in the target, such as deformations, fast motion, occlusions, background clutter, and lighting changes. To tackle these challenges, we propose a robust visual tracking method by casting tracking as a sparse approximation problem in a particle filter framework. In this framework, occlusion, noise, and other challenging issues are addressed seamlessly through a set of trivial templates. Specifically, to find the tracking target in a new frame, each target candidate is sparsely represented in the space spanned by target templates and trivial templates. The sparsity is achieved by solving an L1-regularized least squares problem. Then the candidate with the smallest projection error is taken as the tracking target. After that, tracking is continued using a Bayesian state inference framework in which a particle filter is used for propagating sample distributions over time. Three additional components further improve the robustness of our approach: 1) a velocity-incorporated motion model that helps concentrate the samples on the true target location in the next frame, 2) nonnegativity constraints that help filter out clutter that is similar to tracked targets in reversed intensity patterns, and 3) a dynamic template update scheme that keeps track of the most representative templates throughout the tracking procedure. We test the proposed approach on many challenging sequences involving heavy occlusions, drastic illumination changes, large scale changes, non-rigid object movement, out-of-plane rotation, and large pose variations. The proposed approach shows excellent performance in comparison with four previously proposed trackers. We also extend the work to simultaneous tracking and recognition for vehicle classification in IR video sequences. We attempt to resolve the uncertainties in tracking and recognition at the same time by introducing a static template set that stores target images in various conditions, such as different poses and lighting. The recognition results at each frame are propagated to produce the final result for the whole video. The tracking result is evaluated at each frame, and low confidence in tracking performance initiates a new cycle of tracking and classification. We demonstrate the robustness of the proposed method on vehicle tracking and classification using outdoor IR video sequences.
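
The tracking formulation above represents each candidate as a sparse, nonnegative combination of target and trivial templates. Below is a minimal numpy sketch of one way to solve such an L1-regularized, nonnegative least squares problem (projected iterative soft-thresholding); it illustrates the formulation, not the solver used in the dissertation, and the data are synthetic.

```python
import numpy as np

def nonneg_l1_least_squares(A, y, lam=0.01, n_iter=500):
    """Minimize 0.5 * ||A c - y||^2 + lam * ||c||_1 subject to c >= 0,
    via projected iterative soft-thresholding (ISTA)."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    c = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ c - y)
        c = np.maximum(c - (grad + lam) / L, 0.0)   # gradient step, shrink, project to c >= 0
    return c

# Toy example: columns of A play the role of target and trivial templates.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 8))
y = 0.9 * A[:, 1] + 0.05 * rng.standard_normal(20)   # mostly explained by template 1
c = nonneg_l1_least_squares(A, y)
print(np.round(c, 3), "residual:", np.linalg.norm(A @ c - y))
```
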
  • Algorithmic issues in visual object recognition
    (2009) Hussein, Mohamed Elsayed Ahmed; Davis, Larry; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This thesis is divided into two parts covering two aspects of research in the area of visual object recognition. Part I is about human detection in still images. Human detection is a challenging computer vision task due to the wide variability in human visual appearances and body poses. In this part, we present several enhancements to human detection algorithms. First, we present an extension to the integral images framework that allows for constant-time computation of non-uniformly weighted summations over rectangular regions using a bundle of integral images. Such a computational element is commonly used in constructing gradient-based feature descriptors, which are the most successful in shape-based human detection. Second, we introduce deformable features as an alternative to the conventional static features used in classifiers based on boosted ensembles. Deformable features can enhance the accuracy of human detection by adapting to pose changes that can be described as translations of body features. Third, we present a comprehensive evaluation framework for cascade-based human detectors. The presented framework facilitates comparison between cascade-based detection algorithms, provides a confidence measure for the results, and establishes a practical evaluation scenario. Part II explores the possibilities of enhancing the speed of core algorithms used in visual object recognition using the computing capabilities of Graphics Processing Units (GPUs). First, we present an implementation of Graph Cut on GPUs, which achieves up to a 4x speedup compared to a CPU implementation. The Graph Cut algorithm has many applications related to visual object recognition, such as segmentation and 3D point matching. Second, we present an efficient sparse approximation of kernel matrices for GPUs that can significantly speed up kernel-based learning algorithms, which are widely used in object detection and recognition. We present an implementation of the Affinity Propagation clustering algorithm based on this representation, which is about 6 times faster than another GPU implementation based on a conventional sparse matrix representation.
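
A minimal sketch of the plain integral-image idea that the first contribution generalizes: after one pass of cumulative sums, the sum over any axis-aligned rectangle takes four lookups (the thesis extends this to non-uniformly weighted sums using a bundle of integral images).

```python
import numpy as np

def integral_image(img):
    """Cumulative sums along both axes, zero-padded so rect_sum needs no edge cases."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def rect_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom, left:right] in constant time via four lookups."""
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

img = np.arange(16).reshape(4, 4)
ii = integral_image(img)
assert rect_sum(ii, 1, 1, 3, 4) == img[1:3, 1:4].sum()
print(rect_sum(ii, 1, 1, 3, 4))
```
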
  • Combining Static and Dynamic Typing in Ruby
    (2009) Furr, Michael; Foster, Jeffrey S; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    Many popular scripting languages such as Ruby, Python, and Perl are dynamically typed. Dynamic typing provides many advantages, such as terse, flexible code and the ability to use highly dynamic language constructs, such as an eval method that evaluates a string as program text. However, these dynamic features have traditionally obstructed static analyses, leaving the programmer without the benefits of static typing, including early error detection and the documentation provided by type annotations. In this dissertation, we present Diamondback Ruby (DRuby), a tool that blends static and dynamic typing for Ruby. DRuby provides a type language that is rich enough to precisely type Ruby code, without unneeded complexity. DRuby uses static type inference to automatically discover type errors in Ruby programs and provides a type annotation language that serves as verified documentation of a method's behavior. When necessary, these annotations can be checked dynamically using runtime contracts. This allows statically and dynamically checked code to safely coexist, and any runtime errors are properly blamed on dynamic code. To handle dynamic features such as eval, DRuby includes a novel dynamic analysis and transformation that gathers per-application profiles of dynamic feature usage via a program's test suite. Based on these profiles, DRuby transforms the program before applying its type inference algorithm, enforcing type safety for dynamic constructs. By leveraging a program's test suite, our technique gives the programmer an easy-to-understand trade-off: the more dynamic features covered by their tests, the more static checking is achieved. We evaluated DRuby on a benchmark suite of sample Ruby programs. We found that our profile-guided analysis and type inference algorithms worked well, discovering several previously unknown type errors. Furthermore, our results give us insight into the kind of Ruby code programmers "want" to write but that is not easily amenable to traditional static typing. This dissertation shows that it is possible to effectively integrate static typing into Ruby without losing the feel of a dynamic language.
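
To give a flavor of checking a type annotation dynamically with a runtime contract (sketched in Python for brevity; this is not DRuby's mechanism or syntax), a wrapper can verify argument and return types at call time so that a violation is reported at the dynamic call site.

```python
import functools

def contract(arg_types, return_type):
    """Wrap a function so its declared signature is checked at runtime;
    a violation surfaces at the dynamically typed call site."""
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args):
            for a, t in zip(args, arg_types):
                if not isinstance(a, t):
                    raise TypeError(f"{fn.__name__}: expected {t.__name__}, got {type(a).__name__}")
            result = fn(*args)
            if not isinstance(result, return_type):
                raise TypeError(f"{fn.__name__}: bad return type {type(result).__name__}")
            return result
        return wrapper
    return decorate

@contract((int, int), int)
def add(x, y):
    return x + y

print(add(2, 3))   # passes the contract
# add(2, "3")      # would raise TypeError at the call site
```
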
  • Cooperative Particle Swarm Optimization for Combinatorial Problems
    (2009) Lapizco Encinas, Grecia del Carmen; Reggia, James A; Kingsford, Carl L; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    A particularly successful line of research for numerical optimization is the well-known computational paradigm of particle swarm optimization (PSO). In the PSO framework, candidate solutions are represented as particles that have a position and a velocity in a multidimensional search space. The direct representation of a candidate solution as a point that flies through hyperspace (i.e., R^n) seems to strongly predispose the PSO toward continuous optimization. However, while some attempts have been made towards developing PSO algorithms for combinatorial problems, these techniques usually encode candidate solutions as permutations instead of points in search space and rely on additional local search algorithms. In this dissertation, I present extensions to PSO that, by incorporating a cooperative strategy, allow the PSO to solve combinatorial problems. The central hypothesis is that by allowing a set of particles, rather than one single particle, to represent a candidate solution, combinatorial problems can be solved by collectively constructing solutions. The cooperative strategy partitions the problem into components, where each component is optimized by an individual particle. Particles move in continuous space and communicate through a feedback mechanism. This feedback mechanism guides them in the assessment of their individual contribution to the overall solution. Three new PSO-based algorithms are proposed. Shared-space CCPSO and multi-space CCPSO provide two new cooperative strategies to split the combinatorial problem, and both models are tested on proven NP-hard problems. Multimodal CCPSO extends these combinatorial PSO algorithms to efficiently sample the search space in problems with multiple global optima. Shared-space CCPSO was evaluated on an abductive problem-solving task: the construction of a parsimonious set of independent hypotheses in diagnostic problems with direct causal links between disorders and manifestations. Multi-space CCPSO was used to solve a protein structure prediction subproblem, side-chain packing. Both models are evaluated against provably optimal solutions, and the results show that both proposed PSO algorithms are able to find optimal or near-optimal solutions. The exploratory ability of multimodal CCPSO is assessed by evaluating both the quality and diversity of the solutions obtained in a protein sequence design problem, a highly multimodal problem. These results provide evidence that the extended PSO algorithms are capable of dealing with combinatorial problems without having to hybridize the PSO with other local search techniques or sacrifice the concept of particles moving through a continuous search space.
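
For context, the sketch below shows the standard (non-cooperative) PSO update that the cooperative variants build on: each particle's velocity is pulled toward its own best position and the swarm's global best. The cooperative decomposition described in the abstract is layered on top of updates like these; the objective and parameter values here are just common textbook choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def sphere(x):                      # toy objective: minimize the sum of squares
    return np.sum(x ** 2)

dim, n_particles, iters = 5, 20, 200
w, c1, c2 = 0.72, 1.49, 1.49        # common inertia / acceleration settings

pos = rng.uniform(-5, 5, (n_particles, dim))
vel = np.zeros((n_particles, dim))
pbest, pbest_val = pos.copy(), np.apply_along_axis(sphere, 1, pos)
gbest = pbest[pbest_val.argmin()].copy()

for _ in range(iters):
    r1, r2 = rng.random((n_particles, dim)), rng.random((n_particles, dim))
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = pos + vel
    vals = np.apply_along_axis(sphere, 1, pos)
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()

print("best value found:", pbest_val.min())
```
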
  • SECURE, POLICY-BASED, MULTI-RECIPIENT DATA SHARING
    (2009) Bobba, Rakesh B.; Gligor, Virgil D.; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    In distributed systems, users often need to share sensitive data with other users based on the latter's ability to satisfy various policies. In many cases the data owner may not even know the identities of the data recipients, but deems it crucial that they are legitimate, i.e., that they satisfy the policy. Enabling such data sharing over the Internet faces the challenge of (1) securely associating access policies with data and enforcing them, and (2) protecting data as it traverses untrusted proxies and intermediate repositories. Furthermore, it is desirable to achieve properties such as: (1) flexibility of access policies; (2) privacy of sensitive access policies; (3) minimal reliance on trusted third parties; and (4) efficiency of access policy enforcement. Often, schemes enabling controlled data sharing need to trade one property for another. In this dissertation, we propose two complementary policy-based data sharing schemes that achieve different subsets of the above desired properties. In the first part of this dissertation, we focus on Ciphertext-Policy Attribute-Based Encryption (CP-ABE) schemes that specify and enforce access policies cryptographically and eliminate trusted mediators. We motivate the need for flexible attribute organization within user keys for efficient support of many practical applications. We then propose Ciphertext-Policy Attribute-Set Based Encryption (CP-ASBE), which is the first CP-ABE scheme to (1) efficiently support naturally occurring compound attributes, (2) support multiple numerical assignments for a given attribute in a single key, and (3) provide efficient key management. While the CP-ASBE scheme minimizes reliance on trusted mediators, it can support neither context-based policies nor policy privacy. In the second part of this dissertation, we propose the Policy Based Encryption System (PBES), which employs mediated decryption and supports both context-based policies and policy privacy. Finally, we integrate the proposed schemes into practical applications (i.e., the CP-ASBE scheme with Attribute-Based Messaging (ABM) and the PBES scheme with a conditional data sharing application in the Power Grid) and demonstrate their usefulness in practice.
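
As a toy illustration of the policy-satisfaction question underlying attribute-based schemes (not the CP-ASBE or PBES constructions, which enforce policies cryptographically), the snippet below evaluates a simple AND/OR policy tree against a user's attribute set; all attribute names are invented.

```python
# Hypothetical policy tree: leaves are attribute names, internal nodes are
# ("AND", children) or ("OR", children).
policy = ("AND", ["role:engineer", ("OR", ["dept:power", "dept:it"])])

def satisfies(attrs, node):
    if isinstance(node, str):
        return node in attrs
    op, children = node
    results = (satisfies(attrs, c) for c in children)
    return all(results) if op == "AND" else any(results)

print(satisfies({"role:engineer", "dept:power"}, policy))   # True
print(satisfies({"role:engineer", "dept:sales"}, policy))   # False
```
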
  • The Lattice Project: A Multi-model Grid Computing System
    (2009) Bazinet, Adam Lee; Cummings, Michael P; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This thesis presents The Lattice Project, a system that combines multiple models of Grid computing. Grid computing is a paradigm for leveraging multiple distributed computational resources to solve fundamental scientific problems that require large amounts of computation. The system combines the traditional Service model of Grid computing with the Desktop model of Grid computing, and is thus capable of utilizing diverse resources such as institutional desktop computers, dedicated computing clusters, and machines volunteered by the general public to advance science. The production Grid system includes a fully-featured user interface, support for a large number of popular scientific applications, a robust Grid-level scheduler, and novel enhancements such as a Grid-wide file caching scheme. A substantial amount of scientific research has already been completed using The Lattice Project.
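
As a purely illustrative sketch of the kind of decision a Grid-level scheduler combining a Service model and a Desktop (volunteer) model must make, the snippet below routes jobs to a dedicated cluster or to a volunteer pool based on size and deadline. The names, thresholds, and policy are hypothetical and far simpler than the actual Lattice Project scheduler.

```python
# Toy dispatcher over two Grid resource models: dedicated clusters (Service
# model) and volunteer desktops (Desktop model).
jobs = [
    {"name": "phylogeny-batch", "cpu_hours": 5000, "deadline_hours": None},
    {"name": "quick-analysis", "cpu_hours": 2, "deadline_hours": 1},
]

def dispatch(job, cluster_free_cores, volunteer_hosts):
    # Deadline-bound work goes to dedicated clusters when possible; large,
    # loosely coupled batches go to the volunteer pool.
    if job["deadline_hours"] is not None and cluster_free_cores > 0:
        return "service-grid cluster"
    if job["cpu_hours"] > 1000 and volunteer_hosts > 0:
        return "desktop/volunteer pool"
    return "service-grid cluster" if cluster_free_cores > 0 else "wait in queue"

for job in jobs:
    print(job["name"], "->", dispatch(job, cluster_free_cores=64, volunteer_hosts=10000))
```
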