A Scalable Parallel Partition Tridiagonal Solver for Many-Core and Low B/F Processors

Tatsuya Mitsuda, Kenji Ono

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Tridiagonal systems are among the most fundamental computations in science, engineering, and mathematics. One solver for such systems is Tree Partitioning Reduction (TPR), a divide-and-conquer method that splits a large-scale linear system into parts and computes those parts in parallel, in threads that each use their own local memory. Herein, we propose an improved TPR algorithm with a parallel cyclic reduction flavor, which reduces the number of algorithm steps by approximately half while simultaneously increasing arithmetic intensity and cache reusability. A performance evaluation on an Intel Skylake-SP microprocessor showed a high L1 cache hit ratio, and our solver was as much as 31 times faster on 32 threads for 262144 equations. On an NVIDIA Tesla P100 GPU, our method processed 10 MRow/s more than TPR and cuSPARSE.
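
For readers unfamiliar with the "parallel cyclic reduction flavor" mentioned in the abstract, the following is a minimal sketch of textbook parallel cyclic reduction (PCR) on its own. It is not the authors' improved TPR algorithm, and the function name `pcr_solve` is hypothetical. It assumes a system a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] with a[0] = 0 and c[n-1] = 0, and a well-conditioned (for example, diagonally dominant) matrix, since no pivoting is done.

```python
def pcr_solve(a, b, c, d):
    """Solve a tridiagonal system with textbook parallel cyclic reduction.

    Illustrative sketch only (hypothetical helper, not the paper's solver).
    a: sub-diagonal (a[0] must be 0), b: main diagonal,
    c: super-diagonal (c[-1] must be 0), d: right-hand side.
    """
    n = len(b)
    a, b, c, d = list(a), list(b), list(c), list(d)
    stride = 1
    while stride < n:
        na, nb, nc, nd = [0.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
        # Every row i is updated independently of the others, which is
        # what makes this step parallel on real hardware; here it is a
        # plain serial loop.
        for i in range(n):
            lo, hi = i - stride, i + stride
            # Factors that eliminate the couplings to rows lo and hi.
            alpha = -a[i] / b[lo] if lo >= 0 else 0.0
            gamma = -c[i] / b[hi] if hi < n else 0.0
            nb[i] = b[i]
            nd[i] = d[i]
            if lo >= 0:
                na[i] = alpha * a[lo]
                nb[i] += alpha * c[lo]
                nd[i] += alpha * d[lo]
            if hi < n:
                nc[i] = gamma * c[hi]
                nb[i] += gamma * a[hi]
                nd[i] += gamma * d[hi]
        a, b, c, d = na, nb, nc, nd
        stride *= 2  # each step doubles the distance between coupled rows
    # Once stride >= n, every row is decoupled: b[i] * x[i] = d[i].
    return [d[i] / b[i] for i in range(n)]
```

Each pass doubles the distance between coupled unknowns, so the loop runs about log2(n) times, and all n rows can be updated concurrently within a pass. For example, `pcr_solve([0, 1, 1, 1, 1], [4] * 5, [1, 1, 1, 1, 0], [6, 12, 18, 24, 24])` should recover the solution 1, 2, 3, 4, 5 up to floating-point rounding.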

Original language: English
Title of host publication: Proceedings - 2022 IEEE 36th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 860-869
Number of pages: 10
ISBN (Electronic): 9781665497473
DOI
Publication status: Published - 2022
Event: 36th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2022 - Virtual, Online, France
Duration: 30 May 2022 to 3 Jun 2022

Publication series

Name: Proceedings - 2022 IEEE 36th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2022

Conference

Conference: 36th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2022
Country/Territory: France
City: Virtual, Online
Period: 5/30/22 to 6/3/22

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems
  • Software
  • Control and Optimization
