Exploiting Multi-Loop Parallelism on Heterogeneous Microprocessors

Zuzak, Michael; Yeung, Donald

dc.contributor.author	Zuzak, Michael
dc.contributor.author	Yeung, Donald
dc.date.accessioned	2016-11-14T02:55:46Z
dc.date.available	2016-11-14T02:55:46Z
dc.date.issued	2016-11-10
dc.description.abstract	Heterogeneous microprocessors integrate CPUs and GPUs on the same chip, providing fast CPU-GPU communication and enabling cores to compute on data "in place." These advantages will permit integrated GPUs to exploit a smaller unit of parallelism. But one challenge will be exposing sufficient parallelism to keep all of the on-chip compute resources fully utilized. In this paper, we argue that integrated CPU-GPU chips should exploit parallelism from multiple loops simultaneously. One example of this is nested parallelism, in which one or more inner SIMD loops are nested underneath a parallel outer (non-SIMD) loop. By scheduling the parallel outer loop on multiple CPU cores, multiple dynamic instances of the inner SIMD loops can be scheduled on the GPU cores. This boosts GPU utilization and parallelizes the non-SIMD code. Our preliminary results show exploiting such multi-loop parallelism provides a 3.12x performance gain over exploiting parallelism from individual loops one at a time.	en_US
dc.identifier	https://doi.org/10.13016/M2FR55
dc.identifier.uri	http://hdl.handle.net/1903/18886
dc.language.iso	en_US	en_US
dc.relation.ispartofseries	UMIACS;UMIACS-TR-2016-01
dc.relation.ispartofseries	UM Computer Science Department;CS-TR-5052
dc.title	Exploiting Multi-Loop Parallelism on Heterogeneous Microprocessors	en_US
dc.type	Technical Report	en_US
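
The nested parallelism described in the abstract can be illustrated with a minimal sketch. It is not taken from the report: the function names (`inner_simd_kernel`, `outer_iteration`, `run_multi_loop`), the per-iteration scale factor, and the worker count are all hypothetical. Worker threads stand in for CPU cores running the parallel outer (non-SIMD) loop, and a plain Python loop stands in for the inner SIMD loop that would run on the GPU cores. The sketch shows only the scheduling structure; Python threads will not reproduce the performance behavior reported.

```python
from concurrent.futures import ThreadPoolExecutor


def inner_simd_kernel(row, scale):
    """Inner loop: the same operation applied to every element.

    Assumption: on an integrated CPU-GPU chip this would be launched on the
    GPU cores; a plain Python loop stands in for that kernel here.
    """
    return [scale * x for x in row]


def outer_iteration(i, matrix):
    """One iteration of the parallel outer (non-SIMD) loop."""
    # Non-SIMD work: per-iteration control flow picks a parameter
    # (hypothetical rule, chosen only for illustration).
    scale = 2 if i % 2 == 0 else 3
    # Each outer iteration launches its own dynamic instance of the inner loop.
    return inner_simd_kernel(matrix[i], scale)


def run_multi_loop(matrix, num_cpu_workers=4):
    """Run outer iterations on several workers so that multiple inner-loop
    instances can be in flight at the same time."""
    with ThreadPoolExecutor(max_workers=num_cpu_workers) as pool:
        # pool.map preserves the order of the outer iterations.
        return list(pool.map(lambda i: outer_iteration(i, matrix),
                             range(len(matrix))))


if __name__ == "__main__":
    data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(run_multi_loop(data))
```

Exploiting the loops one at a time would correspond to calling `outer_iteration` sequentially, so that only a single inner-loop instance exists at any moment.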

Files

Original bundle

Name: UMIACS-TR-2016-01.pdf
Size: 185.64 KB
Format: Adobe Portable Document Format

Collections

Technical Reports from UMIACS