Effect of reordering internal messages in MPI broadcast according to the load imbalance

Takesi Soga, Takeshi Nanri, Motoyoshi Kurokawa, Kazuaki Murakami

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    To achieve higher scalability of parallel programs on large-scale parallel computers, reducing the time spent in collective communication is one of the most important issues. This paper introduces a dynamic optimization method that adjusts the implementation of the Broadcast operation, one of the most widely used collective communications. Although there have been many attempts to speed up this operation, they assume that all ranks start the operation at the same time. In real executions, however, the start times can differ because of load imbalance among ranks. This paper first argues that this difference can increase the cost of the operation. Then, to avoid this problem, it introduces an optimization method that adjusts the order of point-to-point messages within Broadcast operations. The method uses the wait time of each rank in the operation to determine the state of load imbalance. Experimental results show that this optimization method can reduce the time for the operation. They also show that the effect of the optimization depends on the size of the data to be broadcast and the degree of load imbalance.
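
The reordering idea in the abstract can be illustrated with a toy model. The sketch below is not the authors' implementation: it assumes a flat broadcast in which the root sends one point-to-point message after another, each taking a fixed `cost`, and it assumes the wait times measured in one broadcast are a good predictor for the next. The function names `simulate_flat_bcast` and `reorder_by_wait`, the arrival times, and the cost model are all hypothetical.

```python
def simulate_flat_bcast(order, arrival, root_start=0.0, cost=1.0):
    """Toy model of a flat broadcast (an assumption, not the paper's algorithm).

    The root sends to the ranks in `order`, one message after another,
    each taking `cost` time units. A rank finishes when it has both
    entered the broadcast (`arrival[rank]`) and received its message.
    Returns {rank: wait time}, where wait = finish time - arrival time.
    """
    waits = {}
    t = root_start
    for rank in order:
        t += cost                          # message for this rank is delivered at t
        finish = max(arrival[rank], t)     # late ranks find the message already there
        waits[rank] = finish - arrival[rank]
    return waits


def reorder_by_wait(waits):
    """Serve the ranks that waited longest (i.e. arrived earliest) first."""
    return sorted(waits, key=lambda r: waits[r], reverse=True)


# Hypothetical load imbalance: rank 1 enters the broadcast late, rank 2 early.
arrival = {1: 3.0, 2: 0.0, 3: 1.5}

baseline = simulate_flat_bcast([1, 2, 3], arrival)   # default rank order
new_order = reorder_by_wait(baseline)                # adapt to observed waits
tuned = simulate_flat_bcast(new_order, arrival)      # same imbalance, new order

print("baseline total wait:", sum(baseline.values()))
print("reordered send order:", new_order)
print("tuned total wait:", sum(tuned.values()))
```

In this model, sending first to ranks that are already waiting shortens their idle time, while a rank that arrives late loses nothing by being served last. A tree-based broadcast would need a more detailed model, since a late intermediate rank also delays its whole subtree.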

    Original language: English
    Title of host publication: Innovative Architecture for Future Generation High-Performance Processors and Systems, IWIA 2008
    Pages: 11-16
    Number of pages: 6
    Publication status: Published - Dec 1 2008
    Event: 2008 International Workshop on Innovative Architecture for Future Generation High Performance Processors and Systems, IWIA'08 - Hilo, HI, United States
    Duration: Jan 21 2008 to Jan 23 2008

    Publication series

    Name: Proceedings of the Innovative Architecture for Future Generation High-Performance Processors and Systems

    Other

    Other: 2008 International Workshop on Innovative Architecture for Future Generation High Performance Processors and Systems, IWIA'08
    Country/Territory: United States
    City: Hilo, HI
    Period: 1/21/08 to 1/23/08

    All Science Journal Classification (ASJC) codes

    • Hardware and Architecture
