Optimization of hierarchical matrix computation on GPU

Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Citations (Scopus)

Abstract

The demand for dense matrix computation in large-scale and complex simulations is increasing; however, the memory capacity of current computer systems is insufficient for such simulations. The hierarchical matrix method (H-matrices) is attracting attention as a computational method that can reduce the memory requirements of dense matrix computations. However, computation with H-matrices is more complex than with dense and sparse matrices; thus, accelerating H-matrix computation is required. We focus on H-matrix-vector multiplication (HMVM) on a single NVIDIA Tesla P100 GPU. We implement five GPU kernels and compare their execution times with those of OpenMP implementations on several processors (Broadwell-EP, Skylake-SP, and Knights Landing). The results show that, although HMVM can be computed as many small GEMV kernels, merging these kernels into a single GPU kernel was the most effective implementation. Moreover, the performance of BATCHED BLAS in the MAGMA library was comparable to that of the manually tuned GPU kernel.
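To make the structure of HMVM concrete, the following is a minimal pure-Python sketch, not the authors' GPU implementation. It assumes a hypothetical block layout in which each block is either stored densely or as a low-rank product U · Vt; the names `hmvm`, `gemv_acc`, and the dictionary keys are illustrative only.

```python
def gemv_acc(A, x, y, row0=0):
    """Accumulate y[row0 + i] += sum_j A[i][j] * x[j] for a dense block A."""
    for i, row in enumerate(A):
        y[row0 + i] += sum(a * b for a, b in zip(row, x))


def hmvm(n_rows, blocks, x):
    """Multiply a block-partitioned matrix by x, one small GEMV per factor.

    Each block is a dict (hypothetical layout) with its top-left corner
    (row0, col0) and either a dense matrix "A" or low-rank factors
    "U" (m x k) and "Vt" (k x n), so that the block is approximated by U @ Vt.
    """
    y = [0.0] * n_rows
    for blk in blocks:
        r0, c0 = blk["row0"], blk["col0"]
        if blk["kind"] == "dense":
            A = blk["A"]
            gemv_acc(A, x[c0:c0 + len(A[0])], y, r0)
        else:
            U, Vt = blk["U"], blk["Vt"]
            # Two small GEMVs: t = Vt @ x_sub, then y_sub += U @ t.
            t = [0.0] * len(Vt)
            gemv_acc(Vt, x[c0:c0 + len(Vt[0])], t)
            gemv_acc(U, t, y, r0)
    return y


# Toy 4 x 4 example: dense diagonal blocks, rank-1 off-diagonal blocks.
blocks = [
    {"kind": "dense", "row0": 0, "col0": 0, "A": [[2, 1], [1, 2]]},
    {"kind": "dense", "row0": 2, "col0": 2, "A": [[3, 0], [0, 3]]},
    {"kind": "lowrank", "row0": 0, "col0": 2, "U": [[1], [2]], "Vt": [[1, 1]]},
    {"kind": "lowrank", "row0": 2, "col0": 0, "U": [[1], [1]], "Vt": [[1, 0]]},
]
y = hmvm(4, blocks, [1, 2, 3, 4])
```

Every block contributes one or two small, independent matrix-vector products. On a GPU, launching each of them separately would incur per-launch overhead, which is consistent with the abstract's finding that merging them into a single kernel (or using a batched BLAS routine) was most effective.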

Original language: English
Title of host publication: Supercomputing Frontiers - 4th Asian Conference, SCFA 2018, Proceedings
Editors: Rio Yokota, Weigang Wu
Publisher: Springer Verlag
Pages: 274-292
Number of pages: 19
ISBN (Print): 9783319699523
DOIs: https://doi.org/10.1007/978-3-319-69953-0_16
Publication status: Published - Jan 1 2018
Event: 4th Asian Conference on Supercomputing Frontiers, SCFA 2018 - Singapore, Singapore
Duration: Mar 26 2018 to Mar 29 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10776 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 4th Asian Conference on Supercomputing Frontiers, SCFA 2018
Country: Singapore
City: Singapore
Period: 3/26/18 to 3/29/18

All Science Journal Classification (ASJC) codes

  • Theoretical Computer Science
  • Computer Science (all)

  • Cite this

    Ohshima, S., Yamazaki, I., Ida, A., & Yokota, R. (2018). Optimization of hierarchical matrix computation on GPU. In R. Yokota, & W. Wu (Eds.), Supercomputing Frontiers - 4th Asian Conference, SCFA 2018, Proceedings (pp. 274-292). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 10776 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-69953-0_16