TY - GEN
T1 - Performance analysis of ClearSpeed's CSX600 interconnects
AU - Nishikawa, Yuri
AU - Koibuchi, Michihiro
AU - Yoshimi, Masato
AU - Shitara, Akihiro
AU - Miura, Kenichi
AU - Amano, Hideharu
PY - 2009/11/19
Y1 - 2009/11/19
N2 - ClearSpeed's CSX600, which consists of 96 Processing Elements (PEs), employs a one-dimensional array topology for simple SIMD processing. To clearly show the performance factors and practical issues of networks-on-chip (NoCs) in an existing modern many-core SIMD system, this paper measures and analyzes the NoCs of the CSX600, called Swazzle and ClearConnect. Evaluation and analysis results show that the sending and receiving overheads are the major limiting factors for the effective network bandwidth. We found that (1) the number of PEs used, (2) the size of the transferred data, and (3) the data alignment of shared memory are the three main points for making the best use of the bandwidth. In addition, we estimated the best- and worst-case latencies of data transfers in parallel applications.
AB - ClearSpeed's CSX600, which consists of 96 Processing Elements (PEs), employs a one-dimensional array topology for simple SIMD processing. To clearly show the performance factors and practical issues of networks-on-chip (NoCs) in an existing modern many-core SIMD system, this paper measures and analyzes the NoCs of the CSX600, called Swazzle and ClearConnect. Evaluation and analysis results show that the sending and receiving overheads are the major limiting factors for the effective network bandwidth. We found that (1) the number of PEs used, (2) the size of the transferred data, and (3) the data alignment of shared memory are the three main points for making the best use of the bandwidth. In addition, we estimated the best- and worst-case latencies of data transfers in parallel applications.
UR - http://www.scopus.com/inward/record.url?scp=70449466935&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=70449466935&partnerID=8YFLogxK
U2 - 10.1109/ISPA.2009.102
DO - 10.1109/ISPA.2009.102
M3 - Conference contribution
AN - SCOPUS:70449466935
SN - 9780769537474
T3 - Proceedings - 2009 IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2009
SP - 203
EP - 210
BT - Proceedings - 2009 IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2009
T2 - 2009 IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2009
Y2 - 9 August 2009 through 12 August 2009
ER -