Data-centric Performance Measurement and Mapping for Highly Parallel Programming Models

dc.contributor.advisor: Hollingsworth, Jeffrey K. [en_US]
dc.contributor.author: Zhang, Hui [en_US]
dc.contributor.department: Electrical Engineering [en_US]
dc.contributor.publisher: Digital Repository at the University of Maryland [en_US]
dc.contributor.publisher: University of Maryland (College Park, Md.) [en_US]
dc.date.accessioned: 2019-02-01T06:38:41Z
dc.date.available: 2019-02-01T06:38:41Z
dc.date.issued: 2018 [en_US]
dc.description.abstract: Modern supercomputers have complex features: many hardware threads, deep memory hierarchies, and many co-processors/accelerators. Designing programs that use those hardware features productively and effectively is crucial to achieving the best performance. Several highly parallel programming models in active development allow programmers to write efficient code for those architectures. Performance profiling is an important technique during development for achieving the best performance. In this dissertation, I propose a new performance measurement and mapping technique that associates performance data with program variables instead of code blocks. To validate the applicability of this data-centric profiling idea, I designed and implemented profilers for PGAS and CUDA. For PGAS, I developed ChplBlamer, which supports both single-node and multi-node Chapel programs. The tool also provides new features such as data-centric identification of inter-node load imbalance. For CUDA, I developed CUDABlamer for GPU-accelerated applications. CUDABlamer also attributes performance data to program variables, a feature not found in any previous CUDA profiler. Directed by the insights from these tools, I optimized several widely studied benchmarks and improved performance by up to 4x for Chapel programs and up to 47x for CUDA kernels. [en_US]
dc.identifier: https://doi.org/10.13016/6ajr-llrt
dc.identifier.uri: http://hdl.handle.net/1903/21638
dc.language.iso: en [en_US]
dc.subject.pqcontrolled: Computer engineering [en_US]
dc.subject.pqcontrolled: Computer science [en_US]
dc.subject.pquncontrolled: Code Optimization [en_US]
dc.subject.pquncontrolled: GPU Performance Analysis [en_US]
dc.subject.pquncontrolled: High Performance Computing [en_US]
dc.subject.pquncontrolled: Parallel Programs [en_US]
dc.subject.pquncontrolled: Performance Profiling [en_US]
dc.subject.pquncontrolled: PGAS [en_US]
dc.title: Data-centric Performance Measurement and Mapping for Highly Parallel Programming Models [en_US]
dc.type: Dissertation [en_US]

Files

Original bundle
Name: Zhang_umd_0117E_19478.pdf
Size: 2.59 MB
Format: Adobe Portable Document Format