Last Level Cache (LLC) Performance of Data Mining Workloads On a CMP — Case Study of Parallel Bioinformatics Workloads
With the continuing growth in the amount of genetic data, members of the bioinformatics community are developing a variety of data-mining applications to understand the data and discover meaningful information. These ...
DDR2 and Low Latency Variants
This paper describes a performance examination of the DDR2 DRAM architecture and the proposed cache-enhanced variants. These preliminary studies are based upon ongoing collaboration between the authors and the Joint ...
Extended Split-Issue: Enabling Flexibility in the Hardware Implementation of NUAL VLIW DSPs
VLIW architecture based DSPs have become widespread due to the combined benefits of simple hardware and compiler-extracted instruction-level parallelism. However, the VLIW instruction set architecture and its hardware ...
Software-Managed Address Translation
In this paper we explore software-managed address translation. The purpose of the study is to specify the memory management design for a high clock-rate PowerPC implementation in which a simple design is a prerequisite ...
In-Line Interrupt Handling and Lock-Up Free Translation Lookaside Buffers (TLBs)
The effects of the general-purpose precise interrupt mechanisms in use for the past few decades have received very little attention. When modern out-of-order processors handle interrupts precisely, they typically begin by ...
Virtual Memory in Contemporary Microprocessors
This survey of six commercial memory-management designs describes how each processor architecture supports the common features of virtual memory: address space protection, shared memory, and large address spaces.
Hardware/Software Co-Design of I/O Interfacing Hardware and Real-Time Device Drivers for Embedded Systems
We have conceptualized a hardware-software codesign strategy for creating I/O interfacing hardware and real-time operating system device drivers for microcontrollers, enabling hardware-independent access to I/O devices ...
Segmented Addressing Solves the Virtual Cache Synonym Problem
If one is interested solely in processor speed, one must use virtually indexed caches. The traditional purported weakness of virtual caches is their inability to support shared memory. Many implementations of shared ...
Uniprocessor Virtual Memory Without TLBs
We present a feasibility study for performing virtual address translation without specialized translation hardware. Removing address translation hardware and instead managing address translation in software has the potential ...
Using Virtual Load/Store Queues (VLSQs) to Reduce the Negative Effects of Reordered Memory Instructions
The use of large instruction windows coupled with aggressive out-of-order and prefetching capabilities has provided significant improvements in processor performance. In this paper, we quantify the effects of increased ...