Browsing by Author "Balkan, Aydin O."

Item
ARBITRATE-AND-MOVE PRIMITIVES FOR HIGH THROUGHPUT ON-CHIP INTERCONNECTION NETWORKS (IEEE, 2004-05)
Balkan, Aydin O.; Vishkin, U.; Qu, Gang
An n-leaf pipelined balanced binary tree is used to arbitrate the order and movement of data from n input ports to one output port. A novel arbitrate-and-move primitive circuit for every node of the tree is presented. It is based on a concept of reduced synchrony that benefits from attractive features of both asynchronous and synchronous designs. The design objective of the pipelined binary tree is to provide a key building block in a high-throughput mesh-of-trees interconnection network for the Explicit Multi Threading (XMT) architecture, a recently introduced parallel computation framework. The proposed reduced synchrony circuit was compared with asynchronous and synchronous designs of arbitrate-and-move primitives. Simulations with 0.18 µm technology show that, compared to an asynchronous design, the proposed reduced synchrony implementation achieves a higher throughput, up to 2 Giga-Requests per second on an 8-leaf binary tree. Our circuit also consumes less power than the synchronous design, and requires less silicon area than both the synchronous and asynchronous designs.

Item
Mesh-of-Trees and Alternative Interconnection Networks for Single Chip Parallel Processing (Extended Abstract) (2006-06)
Balkan, Aydin O.; Qu, Gang; Vishkin, Uzi
Many applications have stimulated the recent surge of interest in single-chip parallel processing. In such machines, it is crucial to implement a high-throughput, low-latency interconnection network to connect the on-chip components, especially the processing units and the memory units. In this paper, we propose a new mesh of trees (MoT) implementation of the interconnection network and evaluate it relative to metrics such as wire area, register count, total switch delay, maximum throughput, latency-throughput relation and delay effects of long wires.
We show that on-chip interconnection networks can facilitate higher bandwidth between processors and shared first-level cache than previously considered possible. This has a significant impact on chip multiprocessing. MoT is also compared, both analytically and experimentally, to some other traditional network topologies, such as hypercube, butterfly, fat trees and butterfly fat trees. When we evaluate a 64-terminal MoT network at 65 nm technology, concrete results show that MoT provides higher throughput and lower latency, especially when the input traffic (or the on-chip parallelism) is high, at the cost of larger area. A recurring problem in networking and communication is that of achieving good sustained throughput, in contrast to just high theoretical peak performance that does not materialize for typical workloads. Our quantitative results demonstrate a clear advantage of the proposed MoT network in the context of single-chip parallel processing.

Item
Programmer's Manual for XMTC Language, XMTC Compiler and XMT Simulator (2006-06)
Balkan, Aydin O.; Vishkin, Uzi
Explicit Multi-Threading (XMT) is a computing framework developed at the University of Maryland as part of a PRAM-on-chip vision (http://www.umiacs.umd.edu/users/vishkin/XMT). Much in the same way that performance programming of standard computers relies on the C language, XMT performance programming is done using an extension of C called XMTC. This manual presents the second generation of the XMTC programming paradigm. It is intended for application programmers who are new to XMTC. In the first part of this technical report (UMIACS-TR 2005-45 Part 1 of 2), we define and describe key concepts, list the limitations and restrictions, and give examples. The second part (UMIACS-TR 2005-45 Part 2 of 2) is a brief tutorial that demonstrates the basic programming concepts of the XMTC language with examples and exercises.