TY - GEN
T1 - A Case for Uni-directional Network Topologies in Large-Scale Clusters
AU - Koibuchi, Michihiro
AU - Totoki, Tomohiro
AU - Matsutani, Hiroki
AU - Amano, Hideharu
AU - Chaix, Fabien
AU - Fujiwara, Ikki
AU - Casanova, Henri
N1 - Funding Information:
This work was partially supported by KAKEN 15K00144 and 16KK0009.
Publisher Copyright:
© 2017 IEEE.
PY - 2017/9/22
Y1 - 2017/9/22
N2 - Designing low-latency network topologies of switches is a key objective for next-generation large-scale clusters. Low latency is preconditioned on low hop counts, but existing network topologies have hop counts much larger than theoretical lower bounds. To alleviate this problem, we propose building network topologies based on uni-directional graphs that are known to have hop counts close to theoretical lower bounds. A practical difficulty with uni-directional topologies is switch-by-switch flow control, which we resolve by using hot-potato routing. Cycle-accurate network simulation experiments for various traffic patterns on uni-directional topologies show that hot-potato routing achieves performance comparable to that of conventional deadlock-free routing. Similar experiments are used to compare several uni-directional topologies to bi-directional topologies, showing that the former achieve significantly lower latency and higher throughput. We quantify end-to-end application performance for parallel application benchmarks via discrete-event simulation, showing that uni-directional topologies can lead to large application performance improvements over their bi-directional counterparts. Finally, we discuss practical issues for uni-directional topologies such as cabling complexity and cost, power consumption, and soft-error tolerance. Our results make a compelling case for considering uni-directional topologies for upcoming large-scale clusters.
AB - Designing low-latency network topologies of switches is a key objective for next-generation large-scale clusters. Low latency is preconditioned on low hop counts, but existing network topologies have hop counts much larger than theoretical lower bounds. To alleviate this problem, we propose building network topologies based on uni-directional graphs that are known to have hop counts close to theoretical lower bounds. A practical difficulty with uni-directional topologies is switch-by-switch flow control, which we resolve by using hot-potato routing. Cycle-accurate network simulation experiments for various traffic patterns on uni-directional topologies show that hot-potato routing achieves performance comparable to that of conventional deadlock-free routing. Similar experiments are used to compare several uni-directional topologies to bi-directional topologies, showing that the former achieve significantly lower latency and higher throughput. We quantify end-to-end application performance for parallel application benchmarks via discrete-event simulation, showing that uni-directional topologies can lead to large application performance improvements over their bi-directional counterparts. Finally, we discuss practical issues for uni-directional topologies such as cabling complexity and cost, power consumption, and soft-error tolerance. Our results make a compelling case for considering uni-directional topologies for upcoming large-scale clusters.
KW - HPC clusters
KW - Hot-potato routing
KW - Interconnection networks
KW - Uni-directional network topologies
UR - http://www.scopus.com/inward/record.url?scp=85032626894&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85032626894&partnerID=8YFLogxK
U2 - 10.1109/CLUSTER.2017.33
DO - 10.1109/CLUSTER.2017.33
M3 - Conference contribution
AN - SCOPUS:85032626894
T3 - Proceedings - IEEE International Conference on Cluster Computing, ICCC
SP - 178
EP - 187
BT - Proceedings - 2017 IEEE International Conference on Cluster Computing, CLUSTER 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE International Conference on Cluster Computing, CLUSTER 2017
Y2 - 5 September 2017 through 8 September 2017
ER -