Compiler Optimizations for Irregular Memory Access Patterns in the PGAS Programming Model
dc.contributor.advisor | Sussman, Alan | en_US |
dc.contributor.author | Rolinger, Thomas Blaine | en_US |
dc.contributor.department | Computer Science | en_US |
dc.contributor.publisher | Digital Repository at the University of Maryland | en_US |
dc.contributor.publisher | University of Maryland (College Park, Md.) | en_US |
dc.date.accessioned | 2023-10-06T05:42:50Z | |
dc.date.available | 2023-10-06T05:42:50Z | |
dc.date.issued | 2023 | en_US |
dc.description.abstract | Applications that operate on large, sparse graphs and matrices exhibit fine-grain, irregular memory access patterns, leading to both performance and productivity challenges on today's distributed-memory systems. The Partitioned Global Address Space (PGAS) model attempts to address these challenges by combining the memory of physically distributed nodes into a logical global address space, simplifying how programmers perform communication in their applications. However, while the PGAS model can provide high developer productivity, the performance issues that arise from irregular memory accesses are still present. This dissertation aims to bridge the gap between high productivity and high performance for irregular applications in the PGAS programming model. To achieve that goal, I designed and implemented COPPER, a framework that performs Compiler Optimizations for Productivity and PERformance. COPPER automatically performs static analysis to identify irregular memory access patterns to distributed data within parallel loops, and then applies code transformations to perform optimizations at runtime. These optimizations perform small message aggregation, adaptive prefetching and selective data replication. Furthermore, they are applied without requiring user intervention, thereby improving performance and developer productivity. I demonstrate the capabilities of COPPER by implementing it within the Chapel parallel programming language and conducting performance evaluations across a variety of irregular workloads and hardware platforms. These evaluations show that COPPER can achieve runtime speed-ups of 1.08 -- 87x for small message aggregation, 0.78 -- 3.2x for adaptive prefetching and 1.2 -- 444x for selective data replication. | en_US |
dc.identifier | https://doi.org/10.13016/dspace/ulgc-v3wl | |
dc.identifier.uri | http://hdl.handle.net/1903/30764 | |
dc.language.iso | en | en_US |
dc.subject.pqcontrolled | Computer science | en_US |
dc.subject.pquncontrolled | compiler optimizations | en_US |
dc.subject.pquncontrolled | distributed-memory | en_US |
dc.subject.pquncontrolled | irregular memory accesses | en_US |
dc.title | Compiler Optimizations for Irregular Memory Access Patterns in the PGAS Programming Model | en_US |
dc.type | Dissertation | en_US |
Files
Original bundle
- Name: Rolinger_umd_0117E_23483.pdf
- Size: 3.42 MB
- Format: Adobe Portable Document Format