Implementation and evaluation of 3D finite element method application for CUDA

Satoshi Ohshima, Masae Hayashi, Takahiro Katagiri, Kengo Nakajima

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Citations (Scopus)

Abstract

This paper describes a fast implementation of a FEM application on a GPU. We implemented our own FEM application and obtained a performance improvement in two of its components: Matrix Assembly and Sparse Matrix Solver. Moreover, we found that accelerating the Boundary Condition Setting component on the GPU and omitting the CPU-GPU data transfer between Matrix Assembly and Sparse Matrix Solver reduce the execution time slightly further. As a result, the execution time of the entire FEM application was shortened from 44.65 sec on a CPU alone (Nehalem architecture, 4 cores, OpenMP) to 17.52 sec on a CPU with a GPU (Tesla C2050).
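The two timings in the abstract imply a speedup of roughly 2.5x, or about a 61% cut in wall-clock time. The short Python sketch below only reproduces that arithmetic; the function and constant names are illustrative and do not come from the paper.

```python
# Whole-application timings quoted in the abstract, in seconds.
CPU_ONLY_SEC = 44.65  # Nehalem architecture, 4 cores, OpenMP
CPU_GPU_SEC = 17.52   # CPU with a Tesla C2050 GPU


def speedup(baseline_sec: float, accelerated_sec: float) -> float:
    """Return how many times faster the accelerated run is."""
    return baseline_sec / accelerated_sec


def time_reduction_pct(baseline_sec: float, accelerated_sec: float) -> float:
    """Return the percentage of baseline time that was saved."""
    return 100.0 * (1.0 - accelerated_sec / baseline_sec)


if __name__ == "__main__":
    print(f"speedup: {speedup(CPU_ONLY_SEC, CPU_GPU_SEC):.2f}x")
    print(f"time saved: {time_reduction_pct(CPU_ONLY_SEC, CPU_GPU_SEC):.1f}%")
```

These figures describe the end-to-end application only; the abstract does not give per-component timings, so nothing finer-grained can be derived from it.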

Original language: English
Title of host publication: High Performance Computing for Computational Science, VECPAR 2012 - 10th International Conference, Revised Selected Papers
Pages: 140-148
Number of pages: 9
DOI: https://doi.org/10.1007/978-3-642-38718-0_16
Publication status: Published - Sep 5 2013
Event: 10th International Conference on High Performance Computing for Computational Science, VECPAR 2012 - Kobe, Japan
Duration: Jul 17 2012 to Jul 20 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7851 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 10th International Conference on High Performance Computing for Computational Science, VECPAR 2012
Country: Japan
City: Kobe
Period: 7/17/12 to 7/20/12

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Ohshima, S., Hayashi, M., Katagiri, T., & Nakajima, K. (2013). Implementation and evaluation of 3D finite element method application for CUDA. In High Performance Computing for Computational Science, VECPAR 2012 - 10th International Conference, Revised Selected Papers (pp. 140-148). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7851 LNCS). https://doi.org/10.1007/978-3-642-38718-0_16