Scalable Fast Multipole Methods on Heterogeneous Architecture

dc.contributor.advisorDuraiswami, Ramanien_US
dc.contributor.advisorGumerov, Nail A.en_US
dc.contributor.authorHu, Qien_US
dc.contributor.departmentComputer Scienceen_US
dc.contributor.publisherDigital Repository at the University of Marylanden_US
dc.contributor.publisherUniversity of Maryland (College Park, Md.)en_US
dc.date.accessioned2013-10-02T05:31:46Z
dc.date.available2013-10-02T05:31:46Z
dc.date.issued2013en_US
dc.description.abstractThe N-body problem appears in many computational physics simulations. At each time step the computation involves an all-pairs sum whose complexity is quadratic, followed by an update of particle positions. This cost means that it is not practical to solve such dynamic N-body problems on large scale. To improve this situation, we use both algorithmic and hardware approaches. Our algorithmic approach is to use the Fast Multipole Method (FMM), which is a divide-and-conquer algorithm that performs a fast N-body sum using a spatial decomposition and is often used in a time-stepping or iterative loop, to reduce such quadratic complexity to linear with guaranteed accuracy. Our hardware approach is to use heterogeneous clusters, which comprised of nodes that contain multi-core CPUs tightly coupled with accelerators, such as graphics processors unit (GPU) as our underline parallel processing hardware, on which efficient implementations require highly non-trivial re-designed algorithms. In this dissertation, we fundamentally reconsider the FMM algorithms on heterogeneous architectures to achieve a significant improvement over recent/previous implementations in literature and to make the algorithm ready for use as a workhorse simulation tool for both time-dependent vortex flow problems and for boundary element methods. Our major contributions include: 1. Novel FMM data structures using parallel construction algorithms for dynamic problems. 2. A fast hetegenenous FMM algorithm for both single and multiple computing nodes. 3. An efficient inter-node communication management using fast parallel data structures. 4. A scalable FMM algorithm using novel Helmholz decomposition for Vortex Methods (VM). The proposed algorithms can handle non-uniform distributions with irregular partition shapes to achieve workload balance and their MPI-CUDA implementations are highly tuned up and demonstrate the state of the art performances.en_US
dc.identifier.urihttp://hdl.handle.net/1903/14454
dc.subject.pqcontrolledComputer scienceen_US
dc.subject.pquncontrolledDynamic Simulationen_US
dc.subject.pquncontrolledFast Multipole Methoden_US
dc.subject.pquncontrolledGPGPUen_US
dc.subject.pquncontrolledHeterogeneous Architectureen_US
dc.subject.pquncontrolledN-bodyen_US
dc.subject.pquncontrolledVortex Methoden_US
dc.titleScalable Fast Multipole Methods on Heterogeneous Architectureen_US
dc.typeDissertationen_US

Files

Original bundle
Now showing 1 - 1 of 1
Loading...
Thumbnail Image
Name:
Hu_umd_0117E_14381.pdf
Size:
4.52 MB
Format:
Adobe Portable Document Format