Document Type
Article
Publication Date
2016
DOI
10.1016/j.proeng.2016.11.018
Publication Title
Procedia Engineering
Volume
163
Issue
Supplement C
Pages
59-71
Conference Name
25th International Meshing Roundtable
Abstract
In this paper, we present a scalable three-dimensional hybrid MPI+Threads parallel Delaunay image-to-mesh conversion algorithm. A nested master-worker communication model for parallel mesh generation is implemented that simultaneously exploits process-level and thread-level parallelization: inter-node communication uses MPI, and inter-core communication inside one node uses threads. To overlap communication (task requests and data movement) with computation (parallel mesh refinement), the inter-node MPI communication and the intra-node local mesh refinement are separated. The master thread, which initializes the MPI environment, is in charge of the inter-node MPI communication, while the worker threads of each process are responsible only for the local mesh refinement within the node. We conducted a set of experiments to test the performance of the algorithm on Turing, a distributed-memory cluster at the Old Dominion University High Performance Computing Center, and observed that the granularity of the coarse-level data decomposition, which affects the coarse-level concurrency, has a significant influence on the performance of the algorithm. With a proper value of granularity, the algorithm shows strong performance potential and scales to 30 distributed-memory compute nodes with 20 cores each (the maximum number of nodes available to us in the experiments).
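The division of labor described in the abstract can be illustrated with a minimal, single-process sketch. It is not the authors' implementation: there is no MPI and no Delaunay refinement here. A shared queue stands in for inter-node task delivery, and the names `decompose`, `refine`, and `run_node` are hypothetical placeholders. Because of Python's global interpreter lock, the threads show only the structure of the model, not real speedup.

```python
import queue
import threading


def decompose(num_voxels, granularity):
    """Split range(num_voxels) into contiguous blocks of at most `granularity` voxels.

    Hypothetical stand-in for the coarse-level data decomposition.
    """
    return [range(start, min(start + granularity, num_voxels))
            for start in range(0, num_voxels, granularity)]


def refine(block):
    """Placeholder for local mesh refinement: just reports the block size."""
    return len(block)


def run_node(blocks, num_workers=4):
    """Simulate one node: a master thread hands out tasks, workers refine them."""
    tasks = queue.Queue()
    results = {}
    lock = threading.Lock()

    def master():
        # Assumption: in a real hybrid code this thread would own all MPI calls.
        for task_id, block in enumerate(blocks):
            tasks.put((task_id, block))
        for _ in range(num_workers):
            tasks.put(None)  # one shutdown sentinel per worker

    def worker():
        # Workers never communicate off-node; they only refine local data.
        while True:
            item = tasks.get()
            if item is None:
                break
            task_id, block = item
            value = refine(block)
            with lock:
                results[task_id] = value

    threads = [threading.Thread(target=master)]
    threads += [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [results[i] for i in range(len(blocks))]
```

In this sketch, a smaller `granularity` produces more, smaller tasks (more coarse-level concurrency but more hand-offs), while a larger value produces fewer, larger tasks. For example, `run_node(decompose(10, 4), num_workers=2)` processes three blocks of sizes 4, 4, and 2.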
Original Publication Citation
Feng, D., Chernikov, A. N., & Chrisochoides, N. P. (2016). A hybrid parallel Delaunay image-to-mesh conversion algorithm scalable on distributed-memory clusters. Procedia Engineering, 163(Supplement C), 59-71. doi: https://doi.org/10.1016/j.proeng.2016.11.018
Comments
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).