A. James Clark School of Engineering

Permanent URI for this community: http://hdl.handle.net/1903/1654

The collections in this community comprise faculty research works, as well as graduate theses and dissertations.

Search Results

  • Speculative Data Distribution in Shared Memory Multiprocessors
    (2008-04-16) Leventhal, Sean; Franklin, Manoj; Electrical Engineering; Digital Repository at the University of Maryland; University of Maryland (College Park, Md.)
    This work explores the possibility of using speculation at the directories in a cache-coherent non-uniform memory access multiprocessor architecture to improve performance by forwarding data to their destinations before requests are sent. It improves on previous consumer prediction techniques, showing how to construct a predictor that can trade off accuracy against coverage. This dissertation then explores the correct time to perform consumer prediction, and shows how a directory protocol can incorporate such a scheme. The consumer prediction enhanced protocol that is developed is able to reduce the runtime of a set of scientific benchmarks by 10%-20%, without substantially reducing the runtime of other benchmarks; specifically, those benchmarks feature simple phased behavior and regularly distribute data to more than two processors. This work then explores the interaction of consumer prediction with two other forms of prediction: migratory prediction and last touch prediction. It demonstrates a mechanism by which migratory prediction can be implemented using only the storage elements already present in a consumer predictor. Combining this migratory predictor with a consumer predictor produces greater speedups than either achieves individually. Finally, the signatures of the last touch predictor can be applied to improve the performance of consumer prediction.
  • Communication-Driven Codesign for Multiprocessor Systems
    (2004-04-30) Bambha, Neal Kumar; Bhattacharyya, Shuvra S; Electrical Engineering
    Several trends in technology have important implications for embedded systems of the future. One trend is the increasing density and number of transistors that can be placed on a chip. This allows designers to fit more functionality into smaller devices, and to place multiple processing cores on a single chip. Another trend is the increasing emphasis on low-power designs. A third trend is the appearance of bottlenecks in embedded system designs due to the limitations of long electrical interconnects, and the increasing use of optical interconnects to overcome these bottlenecks. These trends lead to rapidly increasing complexity in the design process, and to the need for tools that automate the process. This thesis presents techniques and algorithms for developing such tools. Automated techniques are especially important for multiprocessor designs. Programming such systems is difficult, which is one reason they are not more prevalent today. In this thesis we explore techniques for automating and optimizing the process of mapping applications onto system architectures containing multiple processors. We examine different processor interconnection methods and topologies, and the design implications of different levels of connectivity between the processors. Using optics, it is practical to construct processor interconnections having arbitrary topologies. This can offer advantages over regular interconnection topologies. However, existing scheduling techniques do not work in general for such arbitrarily connected systems. We present an algorithm that can be used to supplement existing scheduling techniques to enable their use with arbitrary interconnection patterns. We use our scheduling techniques to explore the larger problem of synthesizing an optimal interconnection network for a problem or group of problems.
    We examine the problem of optimizing synchronization costs in multiprocessor systems, and propose new architectures that reduce synchronization costs and permit efficient performance analysis. All the trends listed above combine to add dimensions to the already vast design space for embedded systems. Optimizations in embedded system design invariably reduce to searching vast design spaces. We describe a new hybrid global/local framework that combines evolutionary algorithms with problem-specific local search, and demonstrate that it is more efficient at searching these spaces.
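
The second abstract's idea of pairing an evolutionary algorithm with problem-specific local search can be illustrated with a small sketch. This is not the thesis's framework: the toy objective (the makespan of a task-to-processor assignment), the function names `cost`, `local_search`, and `hybrid_search`, and all parameter values are assumptions chosen only to show the general global/local pattern.

```python
import random


def cost(assign, loads, n_procs):
    """Toy objective (an assumption): the largest total load on any processor."""
    totals = [0] * n_procs
    for task, proc in enumerate(assign):
        totals[proc] += loads[task]
    return max(totals)


def local_search(assign, loads, n_procs):
    """Local step: move single tasks between processors while the cost drops."""
    best = list(assign)
    best_cost = cost(best, loads, n_procs)
    improved = True
    while improved:
        improved = False
        for task in range(len(best)):
            keep = best[task]  # best processor found so far for this task
            for proc in range(n_procs):
                if proc == keep:
                    continue
                best[task] = proc
                c = cost(best, loads, n_procs)
                if c < best_cost:
                    best_cost, keep, improved = c, proc, True
            best[task] = keep
    return best, best_cost


def hybrid_search(loads, n_procs, pop_size=12, generations=30, seed=0):
    """Global step: evolve a population, refining every individual locally."""
    rng = random.Random(seed)
    n = len(loads)  # assumes at least two tasks so a crossover point exists
    pop = [
        local_search([rng.randrange(n_procs) for _ in range(n)], loads, n_procs)
        for _ in range(pop_size)
    ]
    for _generation in range(generations):
        pop.sort(key=lambda pair: pair[1])
        parents = pop[: pop_size // 2]  # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            (a, _ca), (b, _cb) = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(n)] = rng.randrange(n_procs)  # mutation
            children.append(local_search(child, loads, n_procs))
        pop = parents + children
    return min(pop, key=lambda pair: pair[1])


if __name__ == "__main__":
    assignment, makespan = hybrid_search([7, 5, 4, 3, 3, 2], n_procs=3)
    print(assignment, makespan)
```

In this sketch the evolutionary loop supplies global exploration through crossover and mutation, while `local_search` polishes each candidate before it re-enters the population. A real codesign flow would replace the toy objective with a model of the target architecture and schedule.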