Hardware Design, Prototyping and Studies of the Explicit Multi-Threading (XMT) Paradigm

dc.contributor.advisorVishkin, Uzien_US
dc.contributor.authorWen, Xingzhien_US
dc.contributor.departmentElectrical Engineeringen_US
dc.contributor.publisherDigital Repository at the University of Marylanden_US
dc.contributor.publisherUniversity of Maryland (College Park, Md.)en_US
dc.date.accessioned2008-10-11T05:52:10Z
dc.date.available2008-10-11T05:52:10Z
dc.date.issued2008-08-06en_US
dc.description.abstractWith the end of exponential performance improvements in sequential computers, parallel computers, dubbed "chip multiprocessor", "multicore", or "manycore", has been introduced. Unfortunately, programming current parallel computers tends to be far more difficult than programming sequential computers. The Parallel Random Access Model (PRAM) is known to be an easy-to-program parallel computer model and has been widely used by theorists to develop parallel algorithms because it abstracts away architecture details and allows algorithm designers to focus on critical issues. The eXplicit Multi-Threading (XMT) PRAM-On-Chip project seeks to build an easy-to-program on-chip parallel processor by supporting a PRAM-like programming (performance) model. This dissertation focuses on the design, study of the micro-architecture of the XMT processor as well as performance optimization. The main contributions are:(1) Presented a scalable micro-architecture of the XMT based on high level description of the architecture. (2) Designed a synthesizable Verilog HDL (hardware design language) description of XMT, which lead to the first commitment to the silicon of the XMT processor, a 75 MHz XMT FPGA computer. With the same design, we expect to see the first XMT ASIC processor using IBM 90nm technology. (3) Proposed and implemented some architecture upgrades to the XMT: (i)value broadcasting, (ii)hardware/software co-managed prefetch buffers and (iii) hardware/software co-managed read-only buffers. (4) Quantitatively studied the performance of XMT using non-trivial application kernels with the 75 MHz XMT FPGA computer, in addition, the performance of a 800MHz XMT processor is projected. (5) The choice of not having local private caches in the XMT architecture is studied by comparing current architecture with an alternative one that includes conventional coherent private caches.en_US
dc.format.extent1434961 bytes
dc.format.mimetypeapplication/pdf
dc.identifier.urihttp://hdl.handle.net/1903/8588
dc.language.isoen_US
dc.subject.pqcontrolledComputer Scienceen_US
dc.subject.pqcontrolledEngineering, Electronics and Electricalen_US
dc.subject.pquncontrolledParallel algorithmsen_US
dc.subject.pquncontrolledPRAMen_US
dc.subject.pquncontrolledOn-chip parallel processoren_US
dc.subject.pquncontrolledEase-of-programmingen_US
dc.subject.pquncontrolledExplicit multi-threadingen_US
dc.subject.pquncontrolledXMTen_US
dc.titleHardware Design, Prototyping and Studies of the Explicit Multi-Threading (XMT) Paradigmen_US
dc.typeDissertationen_US

Files

Original bundle

Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
umi-umd-5697.pdf
Size:
1.37 MB
Format:
Adobe Portable Document Format