TY - GEN
T1 - A proposal of thread virtualization environment for Cell Broadband Engine
AU - Yamada, Masahiro
AU - Nishikawa, Yuri
AU - Yoshimi, Masato
AU - Amano, Hideharu
PY - 2010
Y1 - 2010
N2 - Effective parallel programming for a PC cluster of multi-core processors requires two different kinds of skill from programmers: multi-thread programming to use multiple cores, and programming for inter-node communication with libraries such as MPI. To ease program development, we propose a Thread Virtualization Environment (TVE), which virtualizes multiple cores in multiple network-connected nodes as if they were in a single node. With this environment, knowledge of multi-thread programming techniques alone is enough to use the computing resources of multiple nodes effectively. Because long inter-node communication delays can severely degrade performance in some applications, we implemented a caching mechanism on each node to reduce the number of inter-node communications. When we executed a Monte Carlo method, whose algorithm requires few data transfers, on TVE, we confirmed that performance scaled well as the number of cores increased. In contrast, for Levenshtein distance computation, which involves frequent data transfers, performance with 30 cores was 0.029 times that with 6 cores. However, by adopting the cache mechanism, inter-node data transfer time was reduced to 5% for the same program.
AB - Effective parallel programming for a PC cluster of multi-core processors requires two different kinds of skill from programmers: multi-thread programming to use multiple cores, and programming for inter-node communication with libraries such as MPI. To ease program development, we propose a Thread Virtualization Environment (TVE), which virtualizes multiple cores in multiple network-connected nodes as if they were in a single node. With this environment, knowledge of multi-thread programming techniques alone is enough to use the computing resources of multiple nodes effectively. Because long inter-node communication delays can severely degrade performance in some applications, we implemented a caching mechanism on each node to reduce the number of inter-node communications. When we executed a Monte Carlo method, whose algorithm requires few data transfers, on TVE, we confirmed that performance scaled well as the number of cores increased. In contrast, for Levenshtein distance computation, which involves frequent data transfers, performance with 30 cores was 0.029 times that with 6 cores. However, by adopting the cache mechanism, inter-node data transfer time was reduced to 5% for the same program.
KW - Cell/B.E.
KW - Multicore programming
KW - PC cluster
KW - Parallel programming
KW - Playstation3
UR - http://www.scopus.com/inward/record.url?scp=84858645572&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84858645572&partnerID=8YFLogxK
U2 - 10.2316/P.2010.724-027
DO - 10.2316/P.2010.724-027
M3 - Conference contribution
AN - SCOPUS:84858645572
SN - 9780889868786
T3 - Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Systems
SP - 32
EP - 39
BT - Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Systems, PDCS 2010
T2 - IASTED International Conference on Parallel and Distributed Computing and Systems, PDCS 2010
Y2 - 8 November 2010 through 10 November 2010
ER -