A. James Clark School of Engineering
Permanent URI for this community: http://hdl.handle.net/1903/1654
The collections in this community comprise faculty research works, as well as graduate theses and dissertations.
14 results
Search Results
Item Methods and Tools for Real-Time Neural Image Processing (2023) Xie, Jing; Bhattacharyya, Shuvra; Chen, Rong; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
As a rapidly developing form of bioengineering technology, neuromodulation systems involve extracting information from signals that are acquired from the brain and utilizing the information to stimulate brain activity. Neuromodulation has the potential to treat a wide range of neurological diseases and psychiatric conditions, as well as the potential to improve cognitive function. Neuromodulation integrates neural decoding and stimulation. As one of the two core parts of neuromodulation systems, neural decoding subsystems interpret signals acquired through neuroimaging devices. Neuroimaging is a field of neuroscience that uses imaging techniques to study the structure and function of the brain and other central nervous system functions. Extracting information from neuroimaging signals, as is required in neural decoding, involves key challenges due to requirements for real-time, energy-efficient, and accurate processing of the large-scale, high-resolution image data that are characteristic of neuromodulation systems. To address these challenges, we develop new methods and tools for design and implementation of efficient neural image processing systems. Our contributions are organized along three complementary directions. First, we develop a prototype system for real-time neuron detection and activity extraction called the Neuron Detection and Signal Extraction Platform (NDSEP). This highly configurable system processes neural images from video streams in real time or offline, and applies techniques of dataflow modeling to enable extensibility and experimentation with a wide variety of image processing algorithms. Second, we develop a parameter optimization framework to tune the performance of neural image processing systems.
This framework, referred to as the NEural DEcoding COnfiguration (NEDECO) package, automatically optimizes arbitrary collections of parameters in neural image processing systems under customizable constraints. The framework allows system designers to explore alternative neural image processing trade-offs involving execution time and accuracy. NEDECO is also optimized for efficient operation on multicore platforms, which allows for faster execution of the parameter optimization process. Third, we develop a neural network inference engine targeted to mobile devices. The framework can be applied to neural network implementation in many application areas, including neural image processing. The inference engine, called ShaderNN, is the first neural network inference engine that exploits both graphics-centric abstractions (fragment shaders) and compute-centric abstractions (compute shaders). The integration of fragment shaders and compute shaders makes improved use of the parallel computing advantages of GPUs on mobile devices. ShaderNN has favorable performance, especially for parametrically small models.
Item Model-Based Design Optimization and Simulation Techniques for Dynamic, Data-Driven Application Systems (2020) Li, Honglei; Bhattacharyya, Shuvra S; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Dynamic Data Driven Application Systems (DDDAS) are an important class of systems in which computations on data and instrumentation components for acquiring the data are incorporated within a feedback control loop. In DDDAS, the modeling and data-driven adaptation of instrumentation is incorporated as an important aspect of the design process. Due to its potential to enhance capabilities of accurate analysis, dynamic decision making, and scalable simulation, the DDDAS paradigm plays an increasingly important role in innovative systems for a wide variety of applications.
This thesis develops new model-based software design tools to support the design and implementation of DDDAS. The methods are developed in the context of two application domains in which DDDAS principles are highly relevant: multispectral/hyperspectral video processing and wireless-integrated factory automation systems. Recent advances in multispectral and hyperspectral video capture technology, along with system design trade-offs introduced by these advances, present new challenges and opportunities in the area of DDDAS for video analytics. Video analytics plays an important role in a wide variety of defense-, monitoring- and surveillance-related systems for air and ground environments. In this context, multispectral video processing has attracted increased interest in recent years, due in part to technological advances in video capture. Compared with monochromatic video, multispectral video offers better spectral resolution, and different bands of multispectral video streams can enhance video analytics capabilities in different ways. Video processing systems that incorporate multispectral technology involve novel trade-offs among system design complexities such as spectral resolution, equipment cost, and computational efficiency. The design space of multispectral video processing systems is enriched by considering only the required subset of spectral bands to process as a parameter that can be adjusted dynamically based on data characteristics and constraints involving accuracy, communication, and computation. Based on this view of selectively-processed bands from multispectral video data, we introduce in this thesis a novel system design framework for dynamic, data-driven video processing using lightweight dataflow (LD) techniques. Our proposed framework, called LDspectral, applies LD, which is an approach for model-based design of signal and information processing systems. LD facilitates efficient and reliable real-time implementation.
LD is "lightweight" in the sense that it is based on a compact set of application programming interfaces, and can be integrated relatively easily into existing design processes. We develop a framework for adaptively configuring multispectral video processing configurations in LDspectral, and develop a prototype implementation using LD methods that are integrated with OpenCV, which is a popular library of computer vision modules. We demonstrate and evaluate the performance of LDspectral capabilities using a background subtraction application. As compared to a standard video processing pipeline, the capabilities in LDspectral for optimized selection and fusion of spectral bands enhance trade-offs that can be realized between video processing accuracy and computational efficiency. Using the DDDAS paradigm, the elements of sensor measurements, statistical processing, target modeling, and system software are analyzed by frequency bands, video analytics, environmental analysis, and dataflow techniques, respectively. In this thesis, the LDspectral framework is also extended to hyperspectral video, which offers great spectral resolution and has significant potential to enhance the effectiveness of information extraction from image scenes. An important challenge in the development of hyperspectral video systems is managing the high computational load and storage requirements involved in processing the large volumes of data that are acquired by these systems. We also investigate DDDAS-inspired methods in the context of distributed, smart factory systems that are equipped with wireless communication capability. We refer to this class of systems as wireless-integrated factory systems (WIFSs). An important challenge in the development of this class of systems is ensuring reliable, low latency communication under the harsh wireless channel conditions of factory environments.
To support the application of the DDDAS paradigm in WIFSs, we develop a model-based software tool for design space exploration. We refer to this tool as the Wireless-Integrated factory System Evaluator (WISE). WISE supports the rapid simulation-based evaluation of interactions among the placement of factory subsystems, the partitioning of factory subsystems into nodes of a wireless network, the performance of the wireless network, and overall factory system performance. WISE also incorporates a new graphical model called the cyber-physical flow graph, which provides integrated modeling for the flow of physical entities (such as parts that are processed in a factory) and the flow of information. The cyber-physical flow graph also models distributed flows in which information is communicated across multiple network nodes.
Item DESIGN SPACE EXPLORATION FOR SIGNAL PROCESSING SYSTEMS USING LIGHTWEIGHT DATAFLOW GRAPHS (2018) Li, Lin; Bhattacharyya, Shuvra S; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Digital signal processing (DSP) is widely used in many types of devices, including mobile phones, tablets, personal computers, and numerous forms of embedded systems. Implementation of modern DSP applications is very challenging in part due to the complex design spaces that are involved. These design spaces involve many kinds of configurable parameters associated with the signal processing algorithms that are used, as well as different ways of mapping the algorithms onto the targeted platforms. In this thesis, we develop new algorithms, software tools and design methodologies to systematically explore the complex design spaces that are involved in design and implementation of signal processing systems.
To improve the efficiency of design space exploration, we develop and apply compact system level models, which are carefully formulated to concisely capture key properties of signal processing algorithms, target platforms, and algorithm-platform interactions. Throughout the thesis, we develop design methodologies and tools for integrating new compact system level models and design space exploration methods with lightweight dataflow (LWDF) techniques for design and implementation of signal processing systems. LWDF is a previously-introduced approach for integrating new forms of design space exploration and system-level optimization into design processes for DSP systems. LWDF provides a compact set of retargetable application programming interfaces (APIs) that facilitates the integration of dataflow-based models and methods. Dataflow provides an important formal foundation for advanced DSP system design, and the flexible support for dataflow in LWDF facilitates experimentation with and application of novel design methods that are founded in dataflow concepts. Our developed methodologies apply LWDF programming to facilitate their application to different types of platforms and their efficient integration with platform-based tools for hardware/software implementation. Additionally, we introduce novel extensions to LWDF to improve its utility for digital hardware design and adaptive signal processing implementation. To address the aforementioned challenges of design space exploration and system optimization, we present a systematic multiobjective optimization framework for dataflow-based architectures. This framework builds on the methodology of multiobjective evolutionary algorithms and derives key system parameters subject to time-varying and multidimensional constraints on system performance. 
We demonstrate the framework by applying LWDF techniques to develop a dataflow-based architecture that can be dynamically reconfigured to realize strategic configurations in the underlying parameter space based on changing operational requirements. Secondly, we apply Markov decision processes (MDPs) for design space exploration in adaptive embedded signal processing systems. We propose a framework, known as the Hierarchical MDP framework for Compact System-level Modeling (HMCSM), which embraces MDPs to enable autonomous adaptation of embedded signal processing under multidimensional constraints and optimization objectives. The framework integrates automated, MDP-based generation of optimal reconfiguration policies, dataflow-based application modeling, and implementation of embedded control software that carries out the generated reconfiguration policies. Third, we present a new methodology for design and implementation of signal processing systems that are targeted to system-on-chip (SoC) platforms. The methodology is centered on the use of LWDF concepts and methods for applying principles of dataflow design at different layers of abstraction. The development processes integrated in our approach are software implementation, hardware implementation, hardware-software co-design, and optimized application mapping. The proposed methodology facilitates development and integration of signal processing hardware and software modules that involve heterogeneous programming languages and platforms. 
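As a rough illustration of the MDP-based generation of reconfiguration policies mentioned above, the following Python sketch runs plain value iteration on a toy two-state problem. It is not the HMCSM framework from the thesis: the state names (`light`, `heavy`), the action names (`low_power`, `high_perf`), the transition probabilities, and the rewards are all hypothetical values chosen only to show how a policy mapping operating conditions to configurations can be derived.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-6):
    """Solve a small finite MDP by value iteration.

    transitions[state][action] is a list of
    (probability, next_state, reward) tuples.
    Returns (values, policy).
    """
    values = {s: 0.0 for s in transitions}

    def q_value(state, action):
        # Expected discounted return of taking `action` in `state`.
        return sum(p * (r + gamma * values[s2])
                   for p, s2, r in transitions[state][action])

    while True:
        delta = 0.0
        for s in transitions:
            best = max(q_value(s, a) for a in transitions[s])
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break

    # Greedy policy with respect to the converged value function.
    policy = {s: max(transitions[s], key=lambda a: q_value(s, a))
              for s in transitions}
    return values, policy


# Hypothetical workload model: the workload evolves independently of the
# chosen configuration; rewards trade energy savings against missed deadlines.
toy_mdp = {
    "light": {
        "low_power": [(0.8, "light", 2.0), (0.2, "heavy", 2.0)],
        "high_perf": [(0.8, "light", 1.0), (0.2, "heavy", 1.0)],
    },
    "heavy": {
        "low_power": [(0.3, "light", -5.0), (0.7, "heavy", -5.0)],
        "high_perf": [(0.3, "light", 1.0), (0.7, "heavy", 1.0)],
    },
}

if __name__ == "__main__":
    values, policy = value_iteration(toy_mdp)
    print(policy)
```

In an embedded setting, a table like `policy` would typically be computed offline and consulted at run time by control software; how the thesis realizes this step is not specified in the abstract.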
Through three case studies involving complex applications, we demonstrate the effectiveness of the proposed contributions for compact system level design and design space exploration: a digital predistortion (DPD) system, a reconfigurable channelizer for wireless communication, and a deep neural network (DNN) for vehicle classification.
Item Design Tools for Dynamic, Data-Driven, Stream Mining Systems (2015) Sudusinghe, Kishan Palintha; Bhattacharyya, Shuvra S; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The proliferation of sensing devices and cost- and energy-efficient embedded processors has contributed to an increasing interest in adaptive stream mining (ASM) systems. In this class of signal processing systems, knowledge is extracted from data streams in real-time as the data arrives, rather than in a store-now, process-later fashion. The evolution of machine learning methods in many application areas has contributed to demands for efficient and accurate information extraction from streams of data arriving at distributed, mobile, and heterogeneous processing nodes. To enhance accuracy, and meet the stringent constraints in which they must be deployed, it is important for ASM systems to be effective in adapting knowledge extraction approaches and processing configurations based on data characteristics and operational conditions. In this thesis, we address these challenges in design and implementation of ASM systems. We develop systematic methods and supporting design tools for ASM systems that integrate (1) foundations of dataflow modeling for high level signal processing system design, and (2) the paradigm of Dynamic Data-Driven Application Systems (DDDAS). More specifically, the contributions of this thesis can be broadly categorized into three major directions: 1. 
We develop a new design framework that systematically applies dataflow methodologies for high level signal processing system design, and adaptive stream mining based on dynamic topologies of classifiers. In particular, we introduce a new design environment, called the lightweight dataflow for dynamic data driven application systems environment (LiD4E). LiD4E provides formal semantics, rooted in dataflow principles, for design and implementation of a broad class of stream mining topologies. Using this novel application of dataflow methods, LiD4E facilitates the efficient and reliable mapping and adaptation of classifier topologies into implementations on embedded platforms. 2. We introduce new design methods for data-driven digital signal processing (DSP) systems that are targeted to resource- and energy-constrained embedded environments, such as unmanned aerial vehicles (UAVs), mobile communication platforms, and wireless sensor networks. We develop a design and implementation framework for multi-mode, data-driven embedded signal processing systems, where application modes with complementary trade-offs are selected, configured, executed, and switched dynamically, in a data-driven manner. We demonstrate the utility of our proposed new design methods on an energy-constrained, multi-mode face detection application. 3. We introduce new methods for multiobjective, system-level optimization that have been incorporated into the LiD4E design tool described previously. More specifically, we develop new methods for integrated modeling and optimization of real-time stream mining constraints, multidimensional stream mining performance (e.g., precision and recall), and energy efficiency. 
Using a design methodology centered on data-driven control of and coordination between alternative dataflow subsystems for stream mining (classification modes), we develop systematic methods for exploring complex, multidimensional design spaces associated with dynamic stream mining systems, and deriving sets of Pareto-optimal system configurations that can be switched among based on data characteristics and operating constraints.
Item PROFILE- AND INSTRUMENTATION-DRIVEN METHODS FOR EMBEDDED SIGNAL PROCESSING (2015) Chukhman, Ilya; Bhattacharyya, Shuvra; Petrov, Peter; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Modern embedded systems for digital signal processing (DSP) run increasingly sophisticated applications that require expansive performance resources, while simultaneously requiring better power utilization to prolong battery life. Achieving such conflicting objectives requires innovative software/hardware design space exploration spanning a wide array of techniques and technologies that offer trade-offs among performance, cost, power utilization, and overall system design complexity. To save on non-recurring engineering (NRE) costs and in order to meet shorter time-to-market requirements, designers are increasingly using an iterative design cycle and adopting model-based computer-aided design (CAD) tools to facilitate analysis, debugging, profiling, and design optimization. In this dissertation, we present several profile- and instrumentation-based techniques that facilitate design and maintenance of embedded signal processing systems: 1. We propose and develop a novel translation lookaside buffer (TLB) preloading technique. 
This technique, called context-aware TLB preloading (CTP), uses a synergistic relationship between the (1) compiler for application specific analysis of a task's context, and (2) operating system (OS), for run-time introspection of the context and efficient identification of TLB entries for current and future usage. CTP works by (1) identifying application hotspots using compiler-enabled (or manual) profiling, and (2) exploiting well-understood memory access patterns, typical in signal processing applications, to preload the TLB at context switch time. The benefits of CTP in eliminating inter-task TLB interference and preemptively allocating TLB entries during context-switch are demonstrated through extensive experimental results with signal processing kernels. 2. We develop an instrumentation-driven approach to facilitate the conversion of legacy systems, not designed as dataflow-based applications, to dataflow semantics by automatically identifying the behavior of the core actors as instances of well-known dataflow models. This enables the application of powerful dataflow-based analysis and optimization methods to systems to which these methods have previously been unavailable. We introduce a generic method for instrumenting dataflow graphs that can be used to profile and analyze actors, and we use this instrumentation facility to instrument legacy designs being converted and then automatically detect the dataflow models of the core functions. We also present an iterative actor partitioning process that can be used to partition complex actors into simpler entities that are more prone to analysis. We demonstrate the utility of our proposed new instrumentation-driven dataflow approach with several DSP-based case studies. 3. We extend the instrumentation technique discussed in (2) to introduce a novel tool for model-based design validation called dataflow validation framework (DVF). 
DVF addresses the problem of ensuring consistency between (1) dataflow properties that are declared or otherwise assumed as part of dataflow-based application models, and (2) the dataflow behavior that is exhibited by implementations that are derived from the models. The ability of DVF to identify disparities between an application's formal dataflow representation and its implementation is demonstrated through several signal processing application development case studies.
Item Modeling and Mapping of Optimized Schedules for Embedded Signal Processing Systems (2013) Wu, Hsiang-Huang; Bhattacharyya, Shuvra S.; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The demand for Digital Signal Processing (DSP) in embedded systems has been increasing rapidly due to the proliferation of multimedia- and communication-intensive devices such as pervasive tablets and smart phones. Efficient implementation of embedded DSP systems requires integration of diverse hardware and software components, as well as dynamic workload distribution across heterogeneous computational resources. The former implies increased complexity of application modeling and analysis, but also brings enhanced potential for achieving improved energy consumption, cost or performance. The latter results from the increased use of dynamic behavior in embedded DSP applications. Furthermore, parallel programming is highly relevant in many embedded DSP areas due to the development and use of Multiprocessor System-On-Chip (MPSoC) technology. The need for efficient cooperation among different devices supporting diverse parallel embedded computations motivates high-level modeling that expresses dynamic signal processing behaviors and supports efficient task scheduling and hardware mapping.
Starting with dynamic modeling, this thesis develops a systematic design methodology that supports functional simulation and hardware mapping of dynamic reconfiguration based on Parameterized Synchronous Dataflow (PSDF) graphs. By building on the DIF (Dataflow Interchange Format), which is a design language and associated software package for developing and experimenting with dataflow-based design techniques for signal processing systems, we have developed a novel tool for functional simulation of PSDF specifications. This simulation tool allows designers to model applications in PSDF and simulate their functionality, including use of the dynamic parameter reconfiguration capabilities offered by PSDF. With the help of this simulation tool, our design methodology helps to map PSDF specifications into efficient implementations on field programmable gate arrays (FPGAs). Furthermore, valid schedules can be derived from the PSDF models at runtime to adapt hardware configurations based on changing data characteristics or operational requirements. Under certain conditions, efficient quasi-static schedules can be applied to reduce overhead and enhance predictability in the scheduling process. Motivated by the fact that scheduling is critical to performance and to efficient use of dynamic reconfiguration, we have focused on a methodology for schedule design, which complements the emphasis on automated schedule construction in the existing literature on dataflow-based design and implementation. In particular, we have proposed a dataflow-based schedule design framework called the dataflow schedule graph (DSG), which provides a graphical framework for schedule construction based on dataflow semantics, and can also be used as an intermediate representation target for automated schedule generation. Our approach to applying the DSG in this thesis emphasizes schedule construction as a design process rather than an outcome of the synthesis process. 
Our approach employs dataflow graphs for representing both application models and schedules that are derived from them. By providing a dataflow-integrated framework for unambiguously representing, analyzing, manipulating, and interchanging schedules, the DSG facilitates effective codesign of dataflow-based application models and schedules for execution of these models. As multicore processors are deployed in an increasing variety of embedded image processing systems, effective utilization of resources such as multiprocessor system-on-chip (MPSoC) devices, and effective handling of implementation concerns such as memory management and I/O become critical to developing efficient embedded implementations. However, the diversity and complexity of applications and architectures in embedded image processing systems make the mapping of applications onto MPSoCs difficult. We help to address this challenge through a structured design methodology that is built upon the DSG modeling framework. We refer to this methodology as the DEIPS methodology (DSG-based design and implementation of Embedded Image Processing Systems). The DEIPS methodology provides a unified framework for joint consideration of DSG structures and the application graphs from which they are derived, which allows designers to integrate considerations of parallelization and resource constraints together with the application modeling process. We demonstrate the DEIPS methodology through case studies on practical embedded image processing systems.
Item Design and testing methodologies for signal processing systems using DICE (2010) Kedilaya, Soujanya Akirebari; Bhattacharyya, Shuvra S; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
The design and integration of embedded systems in heterogeneous programming environments is still largely done in an ad hoc fashion, making the overall development process more complicated, tedious and error-prone.
In this work, we propose enhancements to existing design flows that utilize model-based design to verify cross-platform correctness of individual actors. The DSPCAD Integrative Command Line Environment (DICE) is a realization of managing these enhancements. We demonstrate this design flow with two case studies. By using DICE's novel test framework on modules of a triggering system in the Large Hadron Collider, we demonstrate how the cross-platform model-based approach, automatic testbench creation and integration of testing in the design process alleviate the rigors of developing such a complex digital system. The second case study is an exploration study into the required precision for eigenvalue decomposition using the Jacobi algorithm. This case study is a demonstration of the use of dataflow modeling in early stage application exploration and the use of DICE in the overall design flow.
Item Representation and Scheduling of Scalable Dataflow Graph Topologies (2011) Wu, Shenpei; Bhattacharyya, Shuvra S; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
In dataflow-based application models, the underlying graph representations often consist of smaller sub-structures that repeat multiple times. In order to enable concise and scalable specification of digital signal processing (DSP) systems, a graphical modeling construct called "topological pattern" has been introduced in recent work. In this thesis, we present new design capabilities for specifying and working with topological patterns in the dataflow interchange format (DIF) framework, which is a software tool for model-based design and implementation of signal processing systems. We also present a plug-in to the DIF framework for deriving parameterized schedules, and a code generation module for generating code that implements these schedules. A novel schedule model called the scalable schedule tree (SST) is formulated.
The SST model represents an important class of parameterized schedule structures in a form that is intuitive for representation, efficient for code generation, and flexible to support powerful forms of adaptation. We demonstrate our methods for topological pattern representation, SST derivation, and associated dataflow graph code generation using a case study centered around an image registration application.
Item MODELING AND OPTIMIZATION TECHNIQUES FOR EFFICIENT IMPLEMENTATION OF PARALLEL EMBEDDED SYSTEMS (2010) GU, RUIRUI; Bhattacharyya, Shuvra S; Levine, William S; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
Embedded systems are becoming more and more important. The products containing embedded systems span from day-to-day household and consumer products, such as digital TVs, mobile phones, and automobiles, to industrial devices and equipment, including, for example, robots, aviation equipment, and high-end military and scientific devices such as aircraft. Previously, because embedded systems were highly limited in computational capability, memory size, and power consumption, much research was dedicated to making the best use of limited system resources. In these works, system performance issues, such as execution time, were traded off with system resources, and resources were carefully scheduled and utilized. With more available computational capability in embedded system devices, and more complicated requirements demanding more intensive computation, the most critical design concerns are changing in some important application domains. In such application areas, researchers are paying more and more attention to improving system execution time, which is also the core topic of our work. Execution time is especially critical to real time systems, in the sense that it is related not only to system performance, but also to system correctness and reliability.
Multi-core devices, which incorporate two or more processors on the same integrated circuits, are becoming increasingly relevant to the design and implementation of embedded systems. In multi-core platforms, carefully managing communication and synchronization among different cores is important to achieve efficient implementations. Two or more processing cores sharing the same system bus and memory bandwidth limit the achievable performance improvements. The ability of multi-core processors to increase application performance depends on the use of multiple concurrent tasks within applications. Therefore, if code is written in a form that facilitates decomposition into concurrent tasks, the multi-core technologies can be exploited more effectively. Dataflow-based languages are suitable for such decomposition into concurrent tasks, particularly in the broad domain of digital signal processing (DSP) applications. Dataflow representations of DSP software have been explored actively since the 1980s. Such representations have proved to be useful in identifying bottlenecks in DSP algorithms, improving the efficiency of the computations, and designing appropriate hardware for implementing the algorithms. Dataflow descriptions have been used in a wide range of DSP application areas, such as multimedia processing, and wireless communications. Among various forms of dataflow modeling, synchronous dataflow (SDF) is geared towards static scheduling of computational modules, which improves system performance and predictability. However, many DSP applications do not fully conform to the restrictions of SDF modeling. More general dataflow models, such as CAL, have been developed to describe dynamically-structured DSP applications. Such generalized models can express dynamically changing functionality, but lose the powerful static scheduling capabilities provided by SDF. 
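To make the static scheduling property of SDF mentioned above more concrete: in an SDF graph each edge has fixed production and consumption rates, so the number of firings of each actor per schedule iteration can be found before run time by solving the balance equations q[src] * produced = q[dst] * consumed for every edge. The Python sketch below computes the smallest positive integer solution for a connected graph. The function name, the edge tuple layout, and the example graph are illustrative assumptions and are not taken from the thesis or from any particular dataflow tool.

```python
from collections import defaultdict, deque
from fractions import Fraction
from functools import reduce
from math import gcd


def repetitions_vector(actors, edges):
    """Solve the SDF balance equations for a connected graph.

    edges: list of (src, dst, produced, consumed) with positive integer rates.
    Returns {actor: firings per iteration}, or None if the rates are
    inconsistent (no periodic schedule with bounded buffers exists).
    """
    neighbors = defaultdict(list)
    for src, dst, produced, consumed in edges:
        # From q[src] * produced == q[dst] * consumed.
        neighbors[src].append((dst, Fraction(produced, consumed)))
        neighbors[dst].append((src, Fraction(consumed, produced)))

    rate = {}
    for start in actors:
        if start in rate:
            continue
        rate[start] = Fraction(1)
        queue = deque([start])
        while queue:
            actor = queue.popleft()
            for other, ratio in neighbors[actor]:
                expected = rate[actor] * ratio
                if other not in rate:
                    rate[other] = expected
                    queue.append(other)
                elif rate[other] != expected:
                    return None  # conflicting rates around a cycle

    # Scale the rational solution to the smallest positive integers.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (r.denominator for r in rate.values()), 1)
    counts = {a: int(r * scale) for a, r in rate.items()}
    common = reduce(gcd, counts.values(), 0)
    return {a: c // common for a, c in counts.items()}


if __name__ == "__main__":
    # A produces 2 tokens per firing, B consumes 3; B produces 1, C consumes 2.
    chain = [("A", "B", 2, 3), ("B", "C", 1, 2)]
    print(repetitions_vector(["A", "B", "C"], chain))  # {'A': 3, 'B': 2, 'C': 1}
```

When the function returns `None`, the rates are inconsistent, which is the situation in which the SDF restrictions rule out a static periodic schedule.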
This thesis explores modeling and optimization techniques for efficient implementation of parallel embedded systems. We propose a dataflow-based framework, which covers modeling, analysis, and optimization, and bridges user-friendly design and efficient implementation. The framework is applied to two kinds of applications: control systems and video processing systems. Model Predictive Control (MPC) has been used in a wide range of application areas including chemical engineering, food processing, automotive engineering, aerospace, and metallurgy. An important limitation on the application of MPC is the difficulty in completing the necessary computations within the sampling interval. Recent trends in computing hardware towards greatly increased parallelism offer a solution to this problem. Our work describes modeling and analysis tools to facilitate implementing MPC algorithms on parallel computers, thereby greatly reducing the time needed to complete the calculations. The use of these tools is illustrated by an application to the critical components of an important class of MPC problems, including the Newton-KKT algorithm, the active set method, and linear system solvers. This thesis also presents an in-depth case study of dataflow-based analysis and exploitation of parallelism in the design and implementation of an MPEG RVC (reconfigurable video coding) decoder. Because dataflow models are effective in exposing concurrency and other important forms of high-level application structure, dataflow techniques are promising for implementing complex DSP applications on multi-core systems and other kinds of parallel processing platforms. Targeting video processing systems, we use the CAL language as a concrete framework for representing and demonstrating dataflow design techniques.
Furthermore, we also analyze our application of the DIF package (TDP), which helps to automatically process regions that are extracted from the original network and that exhibit properties similar to synchronous dataflow (SDF) models. Detection of SDF-like regions is an important step for applying static scheduling techniques within a dynamic dataflow framework. Segmenting a system into SDF-like regions also allows us to explore cross-actor concurrency that results from dynamic dependencies among different regions. Using SDF-like region detection as a preprocessing step to software synthesis generally provides an efficient way of mapping tasks to multi-core systems, and improves the system performance of video processing applications on multi-core platforms. Finally, the automation from system design to efficient implementation helps our dataflow-based modeling and optimization techniques extend into a wide range of embedded applications.
Item Dataflow Integration and Simulation Techniques for DSP System Design Tools (2007-04-27) Hsu, Chia-Jui; Bhattacharyya, Shuvra S.; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
System-level modeling, simulation, and synthesis using dataflow models of computation are widespread in electronic design automation (EDA) tools for digital signal processing (DSP) systems. Over the past few decades, various dataflow models and techniques have been developed for different DSP application domains, and many system design tools incorporate dataflow semantics for different objectives in the design process. In addition, a variety of digital signal processors and other types of embedded processors have been evolving continuously, and many off-the-shelf DSP libraries are optimized for specific processor architectures.
To explore their heterogeneous capabilities, we develop a novel framework that centers around the dataflow interchange format (DIF) for helping DSP system designers to integrate the diversity of dataflow models, techniques, design tools, DSP libraries, and embedded processing platforms. The dataflow interchange format is designed as a standard language for specifying DSP-oriented dataflow graphs, and the DIF framework is developed to achieve the following unique combination of objectives: 1) developing dataflow models and techniques to explore the complex design space for embedded DSP systems; 2) porting DSP designs across various tools, libraries, and embedded processing platforms; and 3) synthesizing software implementations from high-level dataflow-based program specifications. System simulation using synchronous dataflow (SDF) has been widely adopted in design tools for many years. However, for modern communication and signal processing systems, SDF representations often involve large-scale, complex topologies and heavily multirate behavior that challenge simulation: simulating such systems using conventional SDF scheduling techniques generally leads to unacceptable simulation time and memory requirements. In this thesis, we develop a simulation-oriented scheduler (SOS) for efficient, joint minimization of scheduling time and memory requirements in conventional single-processor environments. Multi-core processors that provide on-chip, thread-level parallelism are increasingly popular because of their potential for high performance. However, current simulation tools gain only minimal performance improvements due to their sequential SDF execution semantics. Motivated by the trend towards multi-core processors, we develop a novel multithreaded simulation scheduler (MSS) to pursue simulation runtime speed-up through multithreaded execution of SDF graphs on multi-core processors.
Our results from SOS and MSS demonstrate large improvements in simulating real-world wireless communication systems.
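The memory pressure that heavily multirate SDF graphs place on simulation, as described in this abstract, can be illustrated on a single edge. The sketch below is an illustration only and does not implement SOS or MSS: it compares the peak buffer occupancy of a "flat" schedule, in which the source completes all of its firings before the sink starts, with a simple interleaved schedule that fires the sink as soon as enough tokens are available. The function name `edge_buffer_bounds` and the 147:160 rate pair (a common sample-rate-conversion ratio) are assumptions chosen for this example.

```python
from math import gcd


def edge_buffer_bounds(produce, consume):
    """Compare buffer needs of two schedules for one SDF edge over one period.

    Returns (total_firings, flat_peak, interleaved_peak), where the peaks are
    the maximum number of tokens held on the edge at any time.
    """
    g = gcd(produce, consume)
    q_src, q_dst = consume // g, produce // g  # minimal balanced firing counts

    # Flat schedule: every source firing happens before any sink firing.
    flat_peak = q_src * produce

    # Interleaved schedule: fire the sink whenever it has enough input tokens.
    tokens = peak = 0
    src_left, dst_left = q_src, q_dst
    while src_left or dst_left:
        if tokens >= consume and dst_left:
            tokens -= consume
            dst_left -= 1
        else:
            tokens += produce
            src_left -= 1
            peak = max(peak, tokens)
    return q_src + q_dst, flat_peak, peak


if __name__ == "__main__":
    firings, flat_peak, interleaved_peak = edge_buffer_bounds(147, 160)
    print(firings, flat_peak, interleaved_peak)
```

With rates of 147 and 160, one period takes 307 firings and the flat schedule buffers 23,520 tokens on this one edge, while the interleaved schedule never holds more than 306. The trade-off is that the interleaved schedule is longer to describe and to compute, which is why scheduling time and memory must be considered jointly in large multirate graphs.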