Hybrid-PGAS Memory Hierarchy for Next Generation HPC Systems
Abstract
Demands on computational performance, power efficiency, data transfer, resource capacity, and resilience for next-generation high performance computing (HPC) systems present a host of new challenges. There is a growing disparity between computational performance and network and storage device throughput, as well as among the energy costs of computational, memory, and communication operations. Chapel is a powerful, high-level, parallel, partitioned global address space (PGAS) language designed to streamline development by reducing code complexity; it uses a shared memory model to handle large, distributed memory systems. I extended Chapel's capabilities by adding support for persistent memory, with intrinsic and programmatic features for HPC systems. In my approach, I explored the efficacy of persistent memory in a hybrid-PGAS environment through latency-hiding analysis via cache monitoring, identification and mitigation of performance bottlenecks via data-centric analysis, and hardware profiling to assess performance costs versus benefits and the energy footprint. To manage persistency and ensure resiliency, I developed a transaction system with ACID properties that supports hybrid-PGAS virtual addressing and a distributed checkpoint and recovery system.