TY - GEN
T1 - Implementation and evaluation of 3D finite element method application for CUDA
AU - Ohshima, Satoshi
AU - Hayashi, Masae
AU - Katagiri, Takahiro
AU - Nakajima, Kengo
PY - 2013/9/5
Y1 - 2013/9/5
N2 - This paper describes a fast implementation of a FEM application on a GPU. We implemented our own FEM application and obtained performance improvements in two of its components: Matrix Assembly and Sparse Matrix Solver. Moreover, we found that accelerating the Boundary Condition Setting component on the GPU and omitting the CPU-GPU data transfer between Matrix Assembly and Sparse Matrix Solver reduces execution time slightly further. As a result, the execution time of the entire FEM application was shortened from 44.65 sec on a CPU alone (Nehalem architecture, 4 cores, OpenMP) to 17.52 sec on a CPU with a GPU (Tesla C2050).
AB - This paper describes a fast implementation of a FEM application on a GPU. We implemented our own FEM application and obtained performance improvements in two of its components: Matrix Assembly and Sparse Matrix Solver. Moreover, we found that accelerating the Boundary Condition Setting component on the GPU and omitting the CPU-GPU data transfer between Matrix Assembly and Sparse Matrix Solver reduces execution time slightly further. As a result, the execution time of the entire FEM application was shortened from 44.65 sec on a CPU alone (Nehalem architecture, 4 cores, OpenMP) to 17.52 sec on a CPU with a GPU (Tesla C2050).
UR - http://www.scopus.com/inward/record.url?scp=84883279123&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84883279123&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-38718-0_16
DO - 10.1007/978-3-642-38718-0_16
M3 - Conference contribution
AN - SCOPUS:84883279123
SN - 9783642387173
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 140
EP - 148
BT - High Performance Computing for Computational Science, VECPAR 2012 - 10th International Conference, Revised Selected Papers
T2 - 10th International Conference on High Performance Computing for Computational Science, VECPAR 2012
Y2 - 17 July 2012 through 20 July 2012
ER -