UMD Theses and Dissertations
Permanent URI for this collection: http://hdl.handle.net/1903/3
New submissions to the thesis/dissertation collections are added automatically as they are received from the Graduate School. Currently, the Graduate School deposits all theses and dissertations from a given semester after the official graduation date, so there may be a delay of up to four months before a given thesis or dissertation appears in DRUM.
More information is available at Theses and Dissertations at University of Maryland Libraries.
9 results
Search Results
Item ON SCHEDULING AND COMMUNICATION ISSUES IN DATA CENTERS (2020)
Yang, Sheng; Khuller, Samir; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The proliferation of datacenters to handle the rapidly growing amount of data being managed in the cloud necessitates the design, management, and effective utilization of the thousands of machines that constitute a data center. Many modern big data applications require access to a large number of machines and datasets for training neural nets or for other big data processing. In this thesis, we present research challenges and progress along two fronts. The first challenge addresses the need to schedule communication between machines much more effectively, as several running applications compete for network bandwidth. We address a basic question known as coflow scheduling: optimizing the weighted average completion time of tasks that run across different machines in a datacenter while handling their communication needs. Sometimes, we are forced to distribute a task among multiple datacenters due to cost or legal reasons. For this case, we also study a related model that addresses the communication needs of tasks that process data on multiple data centers, across a wide area network whose bandwidth and network structure may vary widely between different pairs of machines. The second challenge is posed from a cloud user's perspective: since access to resources such as those provided by Amazon AWS can be expensive at scale, cloud computing providers often sell underutilized resources at a significant discount via a spot instance market. However, these instances are not dedicated, and while they offer a cheaper alternative, there is a chance that the user's job will be interrupted to make room for higher-priority tasks. Certain non-critical applications are not significantly impacted by delays due to interruptions, and we develop an initial framework to study some basic scheduling questions under this circumstance. In all of these topics, the problems we study are NP-hard, and our focus is on developing good approximation algorithms. In addition, while we attack these problems from a theoretical perspective, all the algorithms developed in this thesis are practical and efficient and can be easily deployed in practice; some are already deployed.
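The coflow objective mentioned in this abstract generalizes the classical weighted completion time objective. As a point of reference only, the sketch below implements the textbook single-machine rule for that simpler objective (Smith's weighted shortest processing time rule) in Python; it is not the coflow algorithm from the thesis, and the job data are hypothetical.

```python
# Illustrative baseline only: Smith's rule (weighted shortest processing time)
# minimizes the sum of weighted completion times on a single machine.
# It is NOT the coflow-scheduling algorithm from the thesis; coflows couple
# many machine pairs and the general problem is NP-hard.

def smiths_rule(jobs):
    """jobs: list of (name, processing_time, weight); returns (order, objective)."""
    order = sorted(jobs, key=lambda j: j[1] / j[2])  # ascending p/w == descending w/p
    t, objective = 0.0, 0.0
    for name, p, w in order:
        t += p                      # completion time of this job
        objective += w * t          # accumulate weighted completion time
    return [name for name, _, _ in order], objective

# Hypothetical example data
jobs = [("A", 4.0, 1.0), ("B", 1.0, 3.0), ("C", 2.0, 2.0)]
print(smiths_rule(jobs))  # B, C, A with objective 3*1 + 2*3 + 1*7 = 16
```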
Item STRUCTANT: A CONTEXT-AWARE TASK MANAGEMENT FRAMEWORK FOR HETEROGENEOUS COMPUTATIONAL ENVIRONMENTS (2019)
Pachulski, Andrew J; Agrawala, Ashok; Computer Science; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The Internet of Things has produced a plethora of devices, systems, and networks able to produce, transmit, and process data at unprecedented rates. These data can have tremendous value for businesses, organizations, and researchers who wish to better serve an audience or understand a topic. Pipelining is a common technique used to automate the scraping, processing, transport, and analytic steps necessary for collecting and utilizing these data.
Each step in a pipeline may have specific physical, virtual, and organizational processing requirements that dictate when the step can run and which machines can run it. Physical processing requirements may include hardware-specific computing capabilities such as the presence of Graphics Processing Units (GPUs), memory capacity, and specific CPU instruction sets. Virtual processing requirements may include job precedence, machine architecture, and the availability of input datasets, runtime libraries, and executable code. Organizational processing requirements may include encryption standards for data transport and data at rest, physical server security, and monetary budget constraints. Moreover, these processing requirements may have dynamic or temporal properties not known until schedule time.
These processing requirements can greatly impact the ability of organizations to use these data. Despite the popularity of Big Data and cloud computing and the plethora of tools they provide, organizations still face challenges when attempting to adopt these solutions. These challenges include the need to recreate the pipeline, cryptic configuration parameters, and the inability to support rapid deployment and modification for data exploration. Prior work has focused on solutions that apply only to specific steps, platforms, or algorithms in the pipeline, without considering the abundance of information that describes the processing environment and operations.
In this dissertation, we present Structant, a context-aware task management framework and scheduler that helps users manage complex physical, virtual, and organizational processing requirements. Structant models jobs, machines, links, and datasets by storing contextual information for each entity in the Computational Environment. Through inference over this contextual information, Structant creates mappings of jobs to resources that satisfy all relevant processing requirements. As jobs execute, Structant observes their performance and creates runtime estimates for new jobs based on prior execution traces and relevant context selection. Using runtime estimates, Structant can schedule jobs with respect to dynamic and temporal processing requirements.
We present results from three experiments to demonstrate how Structant can aid a user in running both simple and complex pipelines. In our first experiment, we demonstrate how Structant can schedule data collection, processing, and movement with virtual processing requirements to facilitate forward prediction of communities at risk for opioid epidemics. In our second experiment, we demonstrate how Structant can profile operations and obey temporal organizational policies to schedule data movement with fewer preemptions than two naive scheduling algorithms. In our third experiment, we demonstrate how Structant can acquire external contextual information from server room monitors and maintain regulatory compliance of the processing environment by shutting down machines according to a predetermined pipeline.
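To give a flavor of the requirement-driven matching the abstract describes, here is a deliberately minimal sketch that filters candidate machines against a job's stated requirements. The attribute names (gpu, memory_gb, encryption_at_rest) and the matching rule are invented for illustration and are far simpler than Structant's context model and inference.

```python
# Minimal sketch of requirement-based job-to-machine matching.
# Attribute names are hypothetical; Structant's context model is much richer.

def eligible_machines(job_requirements, machines):
    """Return the names of machines whose context satisfies every stated requirement."""
    def satisfies(machine):
        ctx = machine["context"]
        return (ctx.get("gpu", False) >= job_requirements.get("gpu", False)
                and ctx.get("memory_gb", 0) >= job_requirements.get("memory_gb", 0)
                and (not job_requirements.get("encryption_at_rest")
                     or ctx.get("encryption_at_rest", False)))
    return [m["name"] for m in machines if satisfies(m)]

machines = [
    {"name": "gpu-node-1", "context": {"gpu": True, "memory_gb": 64, "encryption_at_rest": True}},
    {"name": "cpu-node-2", "context": {"gpu": False, "memory_gb": 16, "encryption_at_rest": False}},
]
job = {"gpu": True, "memory_gb": 32, "encryption_at_rest": True}
print(eligible_machines(job, machines))  # ['gpu-node-1']
```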
Item Modeling and Mapping of Optimized Schedules for Embedded Signal Processing Systems (2013)
Wu, Hsiang-Huang; Bhattacharyya, Shuvra S.; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The demand for Digital Signal Processing (DSP) in embedded systems has been increasing rapidly due to the proliferation of multimedia- and communication-intensive devices such as pervasive tablets and smartphones. Efficient implementation of embedded DSP systems requires integration of diverse hardware and software components, as well as dynamic workload distribution across heterogeneous computational resources. The former implies increased complexity of application modeling and analysis, but also brings enhanced potential for achieving improved energy consumption, cost, or performance. The latter results from the increased use of dynamic behavior in embedded DSP applications. Furthermore, parallel programming is highly relevant in many embedded DSP areas due to the development and use of Multiprocessor System-on-Chip (MPSoC) technology. The need for efficient cooperation among different devices supporting diverse parallel embedded computations motivates high-level modeling that expresses dynamic signal processing behaviors and supports efficient task scheduling and hardware mapping.
Starting with dynamic modeling, this thesis develops a systematic design methodology that supports functional simulation and hardware mapping of dynamic reconfiguration based on Parameterized Synchronous Dataflow (PSDF) graphs. By building on DIF (the Dataflow Interchange Format), a design language and associated software package for developing and experimenting with dataflow-based design techniques for signal processing systems, we have developed a novel tool for functional simulation of PSDF specifications. This simulation tool allows designers to model applications in PSDF and simulate their functionality, including use of the dynamic parameter reconfiguration capabilities offered by PSDF. With the help of this simulation tool, our design methodology helps to map PSDF specifications into efficient implementations on field-programmable gate arrays (FPGAs). Furthermore, valid schedules can be derived from the PSDF models at runtime to adapt hardware configurations based on changing data characteristics or operational requirements. Under certain conditions, efficient quasi-static schedules can be applied to reduce overhead and enhance predictability in the scheduling process.
Motivated by the fact that scheduling is critical to performance and to efficient use of dynamic reconfiguration, we have focused on a methodology for schedule design, which complements the emphasis on automated schedule construction in the existing literature on dataflow-based design and implementation. In particular, we have proposed a dataflow-based schedule design framework called the dataflow schedule graph (DSG), which provides a graphical framework for schedule construction based on dataflow semantics and can also be used as an intermediate representation target for automated schedule generation. Our approach to applying the DSG in this thesis emphasizes schedule construction as a design process rather than an outcome of the synthesis process, and employs dataflow graphs for representing both application models and the schedules derived from them. By providing a dataflow-integrated framework for unambiguously representing, analyzing, manipulating, and interchanging schedules, the DSG facilitates effective codesign of dataflow-based application models and schedules for execution of these models.
As multicore processors are deployed in an increasing variety of embedded image processing systems, effective utilization of resources such as multiprocessor system-on-chip (MPSoC) devices, and effective handling of implementation concerns such as memory management and I/O, become critical to developing efficient embedded implementations. However, the diversity and complexity of applications and architectures in embedded image processing systems make the mapping of applications onto MPSoCs difficult. We help to address this challenge through a structured design methodology that is built upon the DSG modeling framework. We refer to this methodology as the DEIPS methodology (DSG-based design and implementation of Embedded Image Processing Systems). The DEIPS methodology provides a unified framework for joint consideration of DSG structures and the application graphs from which they are derived, which allows designers to integrate considerations of parallelization and resource constraints together with the application modeling process. We demonstrate the DEIPS methodology through case studies on practical embedded image processing systems.

Item Representation and Scheduling of Scalable Dataflow Graph Topologies (2011)
Wu, Shenpei; Bhattacharyya, Shuvra S; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
In dataflow-based application models, the underlying graph representations often consist of smaller sub-structures that repeat multiple times. In order to enable concise and scalable specification of digital signal processing (DSP) systems, a graphical modeling construct called the "topological pattern" has been introduced in recent work. In this thesis, we present new design capabilities for specifying and working with topological patterns in the dataflow interchange format (DIF) framework, a software tool for model-based design and implementation of signal processing systems. We also present a plug-in to the DIF framework for deriving parameterized schedules, and a code generation module for generating code that implements these schedules. A novel schedule model called the scalable schedule tree (SST) is formulated. The SST model represents an important class of parameterized schedule structures in a form that is intuitive for representation, efficient for code generation, and flexible enough to support powerful forms of adaptation. We demonstrate our methods for topological pattern representation, SST derivation, and associated dataflow graph code generation using a case study centered around an image registration application.

Item Development and Evaluation of Algorithms for Scheduling Two Unrelated Parallel Processors (2007-08-09)
Leber, Dennis D; Herrmann, Jeffrey W; Mechanical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Given a group of tasks and two non-identical processors, each able to complete every task, how should the tasks be assigned so that the group of tasks is completed as quickly as possible? This thesis considers this unrelated parallel machine scheduling problem with the objective of minimizing the completion time of a group of tasks (the makespan), from the perspective of a local printed circuit board manufacturer. An analytical model representing the job-dependent processing time of each manufacturing line is developed, and actual job data supplied by the manufacturer are used for analysis. Two versions of a complete enumeration algorithm that identify the optimal assignment schedule are presented. Several classic assignment heuristics are considered, along with several additional heuristics developed as part of this work. The algorithms are evaluated and their performance compared for jobs built at the local manufacturing site. Finally, a cost-benefit tradeoff for the algorithms considered is presented.
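The two-processor makespan problem in the preceding abstract is small enough to illustrate directly. The sketch below pairs a complete enumeration over all 2^n assignments with a simple greedy heuristic; it only illustrates the problem setting, with hypothetical processing times, and does not reproduce the thesis's analytical processing-time model or heuristics.

```python
from itertools import product

# times[i] = (time on processor 1, time on processor 2) for task i; values are hypothetical.
times = [(3, 5), (4, 2), (6, 6), (2, 7)]

def makespan(assignment, times):
    """assignment[i] is 0 or 1, the processor chosen for task i."""
    loads = [0, 0]
    for t, proc in zip(times, assignment):
        loads[proc] += t[proc]
    return max(loads)

def enumerate_optimal(times):
    """Complete enumeration: feasible only for small task counts (2^n schedules)."""
    return min(product((0, 1), repeat=len(times)), key=lambda a: makespan(a, times))

def greedy(times):
    """Simple heuristic: give each task to the processor that would finish it sooner."""
    loads, assignment = [0, 0], []
    for t0, t1 in times:
        proc = 0 if loads[0] + t0 <= loads[1] + t1 else 1
        assignment.append(proc)
        loads[proc] += (t0, t1)[proc]
    return tuple(assignment)

best = enumerate_optimal(times)
print(best, makespan(best, times))                      # optimal assignment and makespan
print(greedy(times), makespan(greedy(times), times))    # heuristic assignment and makespan
```

Comparing the two printed makespans on small instances is one way to gauge a heuristic's optimality gap before trusting it on larger jobs.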
Item Phased Development of Rail Transit Routes (2007-08-10)
CHENG, WEI-CHEN; Schonfeld, Paul M; Civil Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
This thesis develops a method for optimizing the construction phases of rail transit extension projects with the objective of maximizing net present worth, and examines the economic feasibility of such extension projects under different financial constraints. A simulated annealing algorithm is used for solving this problem. A rail transit project is often divided into several phases due to its huge capital costs. A model is developed to optimize these phases for a simple, one-route rail transit system running from a central business district to a suburban area. The most interesting result indicates that the economic feasibility of links with low demand is affected by the completion time of those links and their demand growth rate after extensions. Sensitivity analyses explore the effects of input parameters on the optimized results. These analyses contribute useful guidelines for transportation planners and decision-makers in determining construction phases for rail transit extension projects.
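Simulated annealing, the search method named in the abstract above, can be sketched generically as follows. The encoding (an ordering of candidate extension links treated as construction phases) and the placeholder net-present-worth function are invented for illustration and are not the thesis's model.

```python
import math
import random

# Generic simulated-annealing skeleton. The decision encoding and the objective
# are hypothetical placeholders, not the thesis's net-present-worth model.

def net_present_worth(order):
    # Placeholder objective: pretend links built earlier contribute more discounted value.
    return sum(value / (1.1 ** position) for position, value in enumerate(order))

def anneal(link_values, iterations=10_000, t_start=10.0, t_end=0.01, seed=0):
    rng = random.Random(seed)
    current = link_values[:]              # initial construction order
    best = current[:]
    for k in range(iterations):
        temp = t_start * (t_end / t_start) ** (k / iterations)   # geometric cooling
        candidate = current[:]
        i, j = rng.sample(range(len(candidate)), 2)              # swap two phases
        candidate[i], candidate[j] = candidate[j], candidate[i]
        delta = net_present_worth(candidate) - net_present_worth(current)
        if delta >= 0 or rng.random() < math.exp(delta / temp):  # maximize NPW
            current = candidate
        if net_present_worth(current) > net_present_worth(best):
            best = current[:]
    return best, net_present_worth(best)

print(anneal([5.0, 1.0, 3.0, 8.0, 2.0]))
```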
Item Dataflow Integration and Simulation Techniques for DSP System Design Tools (2007-04-27)
Hsu, Chia-Jui; Bhattacharyya, Shuvra S.; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
System-level modeling, simulation, and synthesis using dataflow models of computation are widespread in electronic design automation (EDA) tools for digital signal processing (DSP) systems. Over the past few decades, various dataflow models and techniques have been developed for different DSP application domains, and many system design tools incorporate dataflow semantics for different objectives in the design process. In addition, a variety of digital signal processors and other types of embedded processors have been evolving continuously, and many off-the-shelf DSP libraries are optimized for specific processor architectures. To explore these heterogeneous capabilities, we develop a novel framework centered around the dataflow interchange format (DIF) that helps DSP system designers integrate the diversity of dataflow models, techniques, design tools, DSP libraries, and embedded processing platforms. The dataflow interchange format is designed as a standard language for specifying DSP-oriented dataflow graphs, and the DIF framework is developed to achieve the following unique combination of objectives: 1) developing dataflow models and techniques to explore the complex design space for embedded DSP systems; 2) porting DSP designs across various tools, libraries, and embedded processing platforms; and 3) synthesizing software implementations from high-level dataflow-based program specifications.
System simulation using synchronous dataflow (SDF) has been widely adopted in design tools for many years. However, the SDF representations of modern communication and signal processing systems often exhibit large scale, complex topology, and heavily multirate behavior that challenge simulation: simulating such systems using conventional SDF scheduling techniques generally leads to unacceptable simulation time and memory requirements. In this thesis, we develop a simulation-oriented scheduler (SOS) for efficient, joint minimization of scheduling time and memory requirements in conventional single-processor environments. Meanwhile, multi-core processors that provide on-chip, thread-level parallelism are increasingly popular for their potential to deliver high performance. However, current simulation tools gain only minimal performance improvements from them due to their sequential SDF execution semantics. Motivated by the trend towards multi-core processors, we develop a novel multithreaded simulation scheduler (MSS) to pursue simulation runtime speed-up through multithreaded execution of SDF graphs on multi-core processors. Our results from SOS and MSS demonstrate large improvements in simulating real-world wireless communication systems.
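A basic step behind any SDF scheduling or simulation flow, including the schedulers discussed above, is solving the balance equations to obtain the repetitions vector, i.e., how many times each actor fires per schedule iteration. The sketch below computes that vector for a small, hypothetical graph; it is not code from DIF, SOS, or MSS.

```python
from fractions import Fraction
from math import lcm

# Compute an SDF repetitions vector by propagating firing-rate ratios along edges.
# Each edge (src, dst, produced, consumed) must balance: rate[src]*produced == rate[dst]*consumed.
# Assumes a connected graph; the example graph is hypothetical.

def repetitions_vector(actors, edges):
    rate = {actors[0]: Fraction(1)}
    changed = True
    while changed:                      # simple fixed-point propagation
        changed = False
        for src, dst, prod, cons in edges:
            if src in rate and dst not in rate:
                rate[dst] = rate[src] * prod / cons
                changed = True
            elif dst in rate and src not in rate:
                rate[src] = rate[dst] * cons / prod
                changed = True
            elif src in rate and dst in rate:
                if rate[src] * prod != rate[dst] * cons:
                    raise ValueError("inconsistent SDF graph (no valid schedule)")
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(r * scale) for a, r in rate.items()}

# A -> B produces 2 and consumes 3 tokens; B -> C produces 1 and consumes 2.
print(repetitions_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)]))
# {'A': 3, 'B': 2, 'C': 1}
```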
Item A GUIDED SIMULATION METHODOLOGY FOR DYNAMIC PROBABILISTIC RISK ASSESSMENT OF COMPLEX SYSTEMS (2005-04-20)
HU, YUNWEI; MOSLEH, ALI; Mechanical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Probabilistic risk assessment (PRA) is a systematic process of examining how engineered systems work to ensure safety. With the growth in the size of dynamic systems and the complexity of the interactions between hardware, software, and humans, it is extremely difficult to enumerate the risky scenarios with traditional PRA methods. Over the past 15 years, a host of dynamic probabilistic risk assessment (DPRA) methods have been proposed as supplemental tools to traditional PRA for dealing with complex dynamic systems. A new DPRA framework is proposed in this dissertation, employing a new exploration strategy in which the engineering knowledge of the system is explicitly used to guide the simulation toward higher efficiency and accuracy. The engineering knowledge is reflected in the "Planner," which is responsible for generating plans as a high-level map to guide the simulation. A scheduler is responsible for guiding the simulation by controlling the timing and occurrence of the random events. During the simulation, the possible random events are proposed to the scheduler at branch points, and the scheduler decides which events are to be simulated, favoring events with higher value. The value of a proposed event depends on the information gain from exploring that scenario and on the importance factor of the scenario; the information gain is measured by information entropy, and the importance factor is based on engineering judgment. The simulation results are recorded and grouped for later studies, and the planner may "learn" from them and update the plan to guide further simulation. SIMPRA is the software package that implements the new methodology. It provides users with a friendly interface and a rich DPRA library to aid in the construction of the simulation model. The engineering knowledge can be input into the Planner, which generates a plan automatically, and the scheduler guides the simulation according to the plan. The simulation generates many accident event sequences and estimates of the end-state probabilities.

Item Communication-Driven Codesign for Multiprocessor Systems (2004-04-30)
Bambha, Neal Kumar; Bhattacharyya, Shuvra S; Electrical Engineering
Several trends in technology have important implications for embedded systems of the future. One trend is the increasing density and number of transistors that can be placed on a chip. This allows designers to fit more functionality into smaller devices, and to place multiple processing cores on a single chip. Another trend is the increasing emphasis on low-power designs. A third trend is the appearance of bottlenecks in embedded system designs due to the limitations of long electrical interconnects, and the increasing use of optical interconnects to overcome these bottlenecks. These trends lead to rapidly increasing complexity in the design process and the need to develop tools that automate it. This thesis presents techniques and algorithms for developing such tools. Automated techniques are especially important for multiprocessor designs: programming such systems is difficult, and this is one reason why they are not as prevalent today. In this thesis we explore techniques for automating and optimizing the process of mapping applications onto system architectures containing multiple processors. We examine different processor interconnection methods and topologies, and the design implications of different levels of connectivity between the processors. Using optics, it is practical to construct processor interconnections having arbitrary topologies, which can offer advantages over regular interconnection topologies. However, existing scheduling techniques do not work in general for such arbitrarily connected systems. We present an algorithm that can be used to supplement existing scheduling techniques to enable their use with arbitrary interconnection patterns. We use our scheduling techniques to explore the larger problem of synthesizing an optimal interconnection network for a problem or group of problems. We also examine the problem of optimizing synchronization costs in multiprocessor systems and propose new architectures that reduce synchronization costs and permit efficient performance analysis. All of the trends listed above add dimensions to the already vast design space for embedded systems, and optimizations in embedded system design invariably reduce to searching vast design spaces. We describe a new hybrid global/local framework that combines evolutionary algorithms with problem-specific local search and demonstrate that it is more efficient in searching these spaces.
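The hybrid global/local strategy described in the last abstract can be illustrated in miniature: an evolutionary loop in which every offspring is refined by a problem-specific local improvement step. The bit-string encoding and toy objective below are placeholders, not the thesis's design-space formulation.

```python
import random

# Toy hybrid global/local search: a (mu + lambda)-style evolutionary loop in which
# every offspring is refined by greedy bit-flip local search. The objective
# (a weighted one-max) is a placeholder, not an embedded-system cost model.

WEIGHTS = [3, 1, 4, 1, 5, 9, 2, 6]

def fitness(bits):
    return sum(w for w, b in zip(WEIGHTS, bits) if b)

def local_search(bits):
    """Greedy bit-flip improvement until no single flip helps."""
    improved = True
    while improved:
        improved = False
        for i in range(len(bits)):
            flipped = bits[:i] + [1 - bits[i]] + bits[i + 1:]
            if fitness(flipped) > fitness(bits):
                bits, improved = flipped, True
    return bits

def hybrid_search(pop_size=8, generations=20, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in WEIGHTS] for _ in range(pop_size)]
    for _ in range(generations):
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        for _ in range(pop_size):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(WEIGHTS))
            child = a[:cut] + b[cut:]                     # one-point crossover
            if rng.random() < 0.2:                        # occasional mutation
                j = rng.randrange(len(WEIGHTS))
                child[j] = 1 - child[j]
            children.append(local_search(child))          # problem-specific local refinement
        pop = sorted(pop + children, key=fitness, reverse=True)[:pop_size]
    best = max(pop, key=fitness)
    return best, fitness(best)

print(hybrid_search())
```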