Reducing the time spent in collective communication is one of the most important issues in achieving higher scalability of parallel programs on large-scale parallel computers. This paper introduces a dynamic optimization method that adjusts the implementation of the Broadcast operation, one of the most widely used collective communications. Although there have been many attempts to speed up this operation, they assume that every rank starts the operation at the same time. In real executions, however, the start times can differ because of load imbalance among ranks. This paper first argues that this difference can increase the cost of the operation. Then, to avoid this problem, it introduces an optimization method that adjusts the order of point-to-point messages in Broadcast operations. The method uses the wait time of each rank at the operation to determine the state of load imbalance. Experimental results show that this optimization method can reduce the time for the operation. They also show that the effect of the optimization depends on the size of the data to be broadcast and on the amount of load imbalance.
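
As an illustration of why the order of point-to-point messages matters under load imbalance, the following is a minimal sketch of a toy cost model, not the implementation evaluated in this paper. It assumes a binomial-tree broadcast, a fixed per-message cost `t_msg`, and blocking transfers that cannot start before the receiver has arrived. The arrival times are given directly here, whereas the method described above infers the imbalance from measured wait times. The function names `broadcast_finish_times` and `arrival_sorted_order` are hypothetical.

```python
def broadcast_finish_times(order, arrival, t_msg):
    """Simulate a binomial-tree broadcast over the ranks listed in `order`.

    order[0] is the root. In each step, tree position i (which already
    holds the data) sends to position i + step, with step = 1, 2, 4, ...
    Assumption: a transfer starts only when the sender is free and the
    receiver has arrived, and then takes `t_msg` time units.
    Returns the time at which each rank holds the data, indexed by rank.
    """
    n = len(order)
    free = [0.0] * n               # time at which each tree position can send next
    got = [0.0] * n                # time at which each rank holds the data
    free[0] = arrival[order[0]]    # the root holds the data once it arrives
    got[order[0]] = free[0]
    step = 1
    while step < n:
        for i in range(step):
            j = i + step
            if j >= n:
                break
            # The sender blocks until the receiver has entered the broadcast.
            start = max(free[i], arrival[order[j]])
            done = start + t_msg
            free[i] = done         # sender is busy until the transfer completes
            free[j] = done         # receiver can forward the data from now on
            got[order[j]] = done
        step *= 2
    return got


def arrival_sorted_order(root, arrival):
    """Place the root first, then the other ranks from earliest to latest arrival."""
    others = [r for r in range(len(arrival)) if r != root]
    return [root] + sorted(others, key=lambda r: arrival[r])


# Hypothetical scenario: four ranks, rank 1 arrives late because of load imbalance.
arrival = [0.0, 10.0, 0.0, 0.0]
t_msg = 1.0

default_times = broadcast_finish_times([0, 1, 2, 3], arrival, t_msg)
adjusted_times = broadcast_finish_times(arrival_sorted_order(0, arrival), arrival, t_msg)

print("default order :", default_times, "finish =", max(default_times))
print("adjusted order:", adjusted_times, "finish =", max(adjusted_times))
```

Under these assumptions, the default rank order completes at time 12.0 because the root blocks on the late rank before it can serve the others, while the arrival-sorted order completes at 11.0 and delivers the data to the two early ranks at times 1.0 and 2.0. The model ignores message size and network topology, so it only indicates the direction of the effect.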