June 17-21, 2012

Hamburg, Germany

Contribution Details

Name: The Design of Ultra Scalable MPI Collective Communication on the K Computer
Time: Monday, June 18, 2012
4:00 PM - 4:30 PM
Room: Hall C2.2
CCH - Congress Center Hamburg
Speakers: Tomoya Adachi, Fujitsu
Abstract: This paper proposes a design for ultra-scalable MPI collective communication on the K computer, which consists of more than 80,000 compute nodes and is the world's first system to exceed 10 PFLOPS. The nodes are connected by the Tofu interconnect, which uses a six-dimensional mesh/torus topology. Existing MPI libraries, however, perform poorly on such a direct-network system because they assume a typical cluster environment. We therefore design collective algorithms optimized for the K computer.
In designing the algorithms, we prioritize collision-freeness for long messages and low latency for short messages. The long-message algorithms use multiple RDMA network interfaces and are built from neighbor communication, in order to achieve high bandwidth and avoid message collisions. The short-message algorithms, on the other hand, are designed to reduce software overhead, which grows with the number of relaying nodes. Evaluation on up to 55,296 nodes of the K computer shows that the new implementation outperforms the existing one for long messages by a factor of 4 to 11. It also shows that the short-message algorithms complement the long-message ones.
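
To illustrate what a collective built purely from neighbor communication can look like, the following is a minimal Python sketch of a textbook ring allgather on a one-dimensional torus. It is not the algorithm from the paper, and it does not model the Tofu interconnect or RDMA; the function name `ring_allgather` and the in-memory simulation are assumptions made for illustration only. In every step each node sends exactly one block to its right neighbor, so each link carries a single message per step and no two messages contend for the same link.

```python
def ring_allgather(blocks):
    """Simulate a ring allgather on a 1-D torus with len(blocks) nodes.

    blocks[i] is the data initially owned by node i.
    Returns (gathered, steps), where gathered[i][j] is block j as held
    by node i after the collective, and steps is the number of rounds.
    """
    n = len(blocks)
    # gathered[i][j] is filled in once node i has received block j.
    gathered = [[None] * n for _ in range(n)]
    for i in range(n):
        gathered[i][i] = blocks[i]

    steps = 0
    for step in range(n - 1):
        # Node i forwards the block it obtained most recently, which
        # originated at rank (i - step) mod n, to its right neighbor only.
        sends = []
        for i in range(n):
            src = (i - step) % n
            sends.append(((i + 1) % n, src, gathered[i][src]))
        # Deliver all messages of this round at once (one per link).
        for dst, src, data in sends:
            gathered[dst][src] = data
        steps += 1
    return gathered, steps


if __name__ == "__main__":
    result, rounds = ring_allgather(["a", "b", "c", "d"])
    print(rounds)
    print(result[0])
```

The ring takes n - 1 rounds, which is acceptable when bandwidth dominates (long messages) but adds per-hop overhead for short messages. This is the kind of trade-off the abstract describes between the long-message and short-message designs.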