VLIW Instruction Scheduling for Reduced Code Size

Haga, Steve Wayne

dc.contributor.advisor	Barua, Rajeev K	en_US
dc.contributor.author	Haga, Steve Wayne	en_US
dc.contributor.department	Electrical Engineering	en_US
dc.contributor.publisher	Digital Repository at the University of Maryland	en_US
dc.contributor.publisher	University of Maryland (College Park, Md.)	en_US
dc.date.accessioned	2006-02-04T06:43:59Z
dc.date.available	2006-02-04T06:43:59Z
dc.date.issued	2005-12-05	en_US
dc.description.abstract	Code size is important to the cost of embedded systems. Although VLIW architectures are popular for embedded systems, they impose constraints on instruction placement that make it difficult to find a compact schedule. Existing VLIW instruction scheduling methods primarily target run-time rather than code size. The usual approach has two components. First, methods such as trace scheduling provide a mechanism to correctly move instructions across basic blocks. Second, the instructions within a trace are scheduled, perhaps moving instructions across blocks. Because run-time is the only consideration, this approach increases code size by inserting compensation code. Methods such as superblocking increase the size even further by duplicating code. We present a compiler method for instruction scheduling that, for the first time, uses the power of across-block scheduling methods such as trace scheduling to reduce code size as well as run-time. For a certain class of VLIWs, we show that trace scheduling, previously synonymous with increased code size, can in fact reduce it. Our within-trace scheduler uses a cost-model-driven backtracking approach. Starting with an optimal, exponential-time algorithm, branch-and-bound techniques and non-optimal heuristics reduce the compile time to within a factor of 2 of the original, on average. The code size for our benchmarks is reduced by 16.3% versus the best existing across-block scheduler, while being within 0.8% of its run-time, on a 6-wide VLIW. For a 3-wide VLIW, code size improves by 14.7%, with the same 0.8% run-time cost. Thus, the code size improvements are fairly stable across VLIW widths. We further explore the impact of our techniques on machines with predication support or small I-cache sizes. In the process, we present a novel predication analysis of general applicability. If predication is present, the code size improvement rises to 16.6%.
In addition, for machines with small I-caches, the reduced code size of our approach tends to yield better cache hit rates. We find that, although this effect is modest, the performance improvement more than offsets the run-time costs of our method. Therefore, on machines with small I-caches, our code size improvements are achievable at no run-time cost.	en_US
dc.format.extent	1330613 bytes
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/1903/3077
dc.language.iso	en_US
dc.subject.pqcontrolled	Engineering, Electronics and Electrical	en_US
dc.subject.pqcontrolled	Computer Science	en_US
dc.subject.pquncontrolled	VLIW	en_US
dc.subject.pquncontrolled	instruction scheduling	en_US
dc.subject.pquncontrolled	code size	en_US
dc.subject.pquncontrolled	code compression	en_US
dc.subject.pquncontrolled	NOPs	en_US
dc.title	VLIW Instruction Scheduling for Reduced Code Size	en_US
dc.type	Dissertation	en_US

Files

Original bundle

Name: umi-umd-2885.pdf
Size: 1.27 MB
Format: Adobe Portable Document Format

Collections

UMD Theses and Dissertations
Electrical & Computer Engineering Theses and Dissertations