Scalable Fast Multipole Methods on Heterogeneous Architecture
dc.contributor.advisor | Duraiswami, Ramani | en_US |
dc.contributor.advisor | Gumerov, Nail A. | en_US |
dc.contributor.author | Hu, Qi | en_US |
dc.contributor.department | Computer Science | en_US |
dc.contributor.publisher | Digital Repository at the University of Maryland | en_US |
dc.contributor.publisher | University of Maryland (College Park, Md.) | en_US |
dc.date.accessioned | 2013-10-02T05:31:46Z | |
dc.date.available | 2013-10-02T05:31:46Z | |
dc.date.issued | 2013 | en_US |
dc.description.abstract | The N-body problem appears in many computational physics simulations. At each time step the computation involves an all-pairs sum whose complexity is quadratic, followed by an update of particle positions. This cost means that it is not practical to solve such dynamic N-body problems on large scale. To improve this situation, we use both algorithmic and hardware approaches. Our algorithmic approach is to use the Fast Multipole Method (FMM), which is a divide-and-conquer algorithm that performs a fast N-body sum using a spatial decomposition and is often used in a time-stepping or iterative loop, to reduce such quadratic complexity to linear with guaranteed accuracy. Our hardware approach is to use heterogeneous clusters, which comprised of nodes that contain multi-core CPUs tightly coupled with accelerators, such as graphics processors unit (GPU) as our underline parallel processing hardware, on which efficient implementations require highly non-trivial re-designed algorithms. In this dissertation, we fundamentally reconsider the FMM algorithms on heterogeneous architectures to achieve a significant improvement over recent/previous implementations in literature and to make the algorithm ready for use as a workhorse simulation tool for both time-dependent vortex flow problems and for boundary element methods. Our major contributions include: 1. Novel FMM data structures using parallel construction algorithms for dynamic problems. 2. A fast hetegenenous FMM algorithm for both single and multiple computing nodes. 3. An efficient inter-node communication management using fast parallel data structures. 4. A scalable FMM algorithm using novel Helmholz decomposition for Vortex Methods (VM). The proposed algorithms can handle non-uniform distributions with irregular partition shapes to achieve workload balance and their MPI-CUDA implementations are highly tuned up and demonstrate the state of the art performances. | en_US |
dc.identifier.uri | http://hdl.handle.net/1903/14454 | |
dc.subject.pqcontrolled | Computer science | en_US |
dc.subject.pquncontrolled | Dynamic Simulation | en_US |
dc.subject.pquncontrolled | Fast Multipole Method | en_US |
dc.subject.pquncontrolled | GPGPU | en_US |
dc.subject.pquncontrolled | Heterogeneous Architecture | en_US |
dc.subject.pquncontrolled | N-body | en_US |
dc.subject.pquncontrolled | Vortex Method | en_US |
dc.title | Scalable Fast Multipole Methods on Heterogeneous Architecture | en_US |
dc.type | Dissertation | en_US |
Files
Original bundle
1 - 1 of 1