Optimization of hierarchical matrix computation on GPU

Satoshi Ohshima, Ichitaro Yamazaki, Akihiro Ida, Rio Yokota

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

    3 Citations (Scopus)

    Abstract

    The demand for dense matrix computation in large-scale and complex simulations is increasing; however, the memory capacity of current computer systems is insufficient for such simulations. The hierarchical matrix method (H-matrices) is attracting attention as a computational method that can reduce the memory requirements of dense matrix computations. However, computation with H-matrices is more complex than that with dense and sparse matrices; thus, accelerating H-matrix computation is required. We focus on H-matrix-vector multiplication (HMVM) on a single NVIDIA Tesla P100 GPU. We implement five GPU kernels and compare their execution times with those of OpenMP implementations on various processors (the Broadwell-EP, Skylake-SP, and Knights Landing). The results show that, although an HMVM kernel can be computed as many small GEMV kernels, merging such kernels into a single GPU kernel was the most effective implementation. Moreover, the performance of BATCHED BLAS in the MAGMA library was comparable to that of the manually tuned GPU kernel.
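
As a rough illustration of what HMVM involves, the following pure-Python sketch multiplies a toy hierarchical matrix by a vector. It is not the authors' implementation: the block layout (a flat list of dictionaries), the field names, and the helper functions `gemv`, `gemv_t`, and `hmvm` are all assumptions made for this example. Each block is either dense or stored in low-rank form as U V^T, so every block contributes one or two small GEMV operations, and a single loop visits all blocks in one pass.

```python
# Illustrative sketch only: a toy H-matrix-vector product (HMVM).
# The block layout and all names below are hypothetical, not the paper's code.

def gemv(a, x):
    """Dense matrix-vector product a @ x for a list-of-rows matrix."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

def gemv_t(a, x):
    """Transposed matrix-vector product a^T @ x."""
    cols = len(a[0])
    return [sum(a[i][j] * x[i] for i in range(len(a))) for j in range(cols)]

def hmvm(blocks, x, n_rows):
    """Accumulate y = A x, where A is given as a flat list of blocks."""
    y = [0.0] * n_rows
    for blk in blocks:
        r0, c0 = blk["row"], blk["col"]
        if blk["kind"] == "dense":
            a = blk["a"]
            part = gemv(a, x[c0:c0 + len(a[0])])
        else:
            # Low-rank block A_blk = U V^T: two small GEMVs instead of one large one.
            u, v = blk["u"], blk["v"]
            t = gemv_t(v, x[c0:c0 + len(v)])  # t = V^T x, length k (the rank)
            part = gemv(u, t)                 # U t, length = block rows
        for i, val in enumerate(part):
            y[r0 + i] += val
    return y

# A 4x4 example: dense 2x2 diagonal blocks, rank-1 off-diagonal blocks.
blocks = [
    {"kind": "dense", "row": 0, "col": 0, "a": [[2, 1], [1, 2]]},
    {"kind": "dense", "row": 2, "col": 2, "a": [[3, 0], [0, 3]]},
    {"kind": "lowrank", "row": 0, "col": 2, "u": [[1], [2]], "v": [[1], [1]]},
    {"kind": "lowrank", "row": 2, "col": 0, "u": [[1], [0]], "v": [[0], [1]]},
]
x = [1, 2, 3, 4]
print(hmvm(blocks, x, 4))  # [11.0, 19.0, 11.0, 12.0]
```

On a GPU, the loop body would correspond to the many small GEMV operations mentioned in the abstract. Handling all blocks inside one kernel, rather than launching one kernel per block, is loosely analogous to the single pass over `blocks` here.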

    Original language: English
    Title of host publication: Supercomputing Frontiers - 4th Asian Conference, SCFA 2018, Proceedings
    Editors: Rio Yokota, Weigang Wu
    Publisher: Springer Verlag
    Pages: 274-292
    Number of pages: 19
    ISBN (Print): 9783319699523
    DOI
    Publication status: Published - 2018
    Event: 4th Asian Conference on Supercomputing Frontiers, SCFA 2018 - Singapore, Singapore
    Duration: March 26, 2018 to March 29, 2018

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 10776 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 4th Asian Conference on Supercomputing Frontiers, SCFA 2018
    Country/Territory: Singapore
    City: Singapore
    Period: 3/26/18 to 3/29/18

    All Science Journal Classification (ASJC) codes

    • Theoretical Computer Science
    • General Computer Science
