TY - GEN
T1 - GraphDEAR
T2 - 30th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2022
AU - Hu, Siyi
AU - Kondo, Masaaki
AU - He, Yuan
AU - Sakamoto, Ryuichi
AU - Zhang, Hao
AU - Zhou, Jun
AU - Nakamura, Hiroshi
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Data structures are key in Edge Computing, where various types of data are continuously generated by ubiquitous devices. Among all common data structures, graphs are used to express relationships and dependencies among human identities, objects, and locations, and they are expected to become one of the most important forms of data infrastructure in the near future. However, as graph processing often requires random accesses to vast memory spaces, conventional memory hierarchies with caches cannot perform efficiently. To alleviate such memory access bottlenecks in graph processing, we present a solution based on vertex access scheduling and edge array re-ordering, carried out in parallel with the execution of graph processing applications to improve both the temporal and spatial locality of memory accesses, especially for edge-centric graphs, which are a popular means of handling dynamic graphs. Our proposed architecture is evaluated and tested through both trace-based cache simulations and cycle-accurate FPGA-based prototyping. Evaluation results show that our proposal has the potential to significantly reduce the Misses Per Kilo Instructions (MPKI) of the Last Level Cache (LLC), by 56.27% on average.
AB - Data structures are key in Edge Computing, where various types of data are continuously generated by ubiquitous devices. Among all common data structures, graphs are used to express relationships and dependencies among human identities, objects, and locations, and they are expected to become one of the most important forms of data infrastructure in the near future. However, as graph processing often requires random accesses to vast memory spaces, conventional memory hierarchies with caches cannot perform efficiently. To alleviate such memory access bottlenecks in graph processing, we present a solution based on vertex access scheduling and edge array re-ordering, carried out in parallel with the execution of graph processing applications to improve both the temporal and spatial locality of memory accesses, especially for edge-centric graphs, which are a popular means of handling dynamic graphs. Our proposed architecture is evaluated and tested through both trace-based cache simulations and cycle-accurate FPGA-based prototyping. Evaluation results show that our proposal has the potential to significantly reduce the Misses Per Kilo Instructions (MPKI) of the Last Level Cache (LLC), by 56.27% on average.
KW - cache
KW - data locality
KW - domain-specific acceleration
KW - graph processing
UR - http://www.scopus.com/inward/record.url?scp=85129633731&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85129633731&partnerID=8YFLogxK
U2 - 10.1109/PDP55904.2022.00029
DO - 10.1109/PDP55904.2022.00029
M3 - Conference contribution
AN - SCOPUS:85129633731
T3 - Proceedings - 30th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2022
SP - 135
EP - 143
BT - Proceedings - 30th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2022
A2 - Gonzalez-Escribano, Arturo
A2 - Garcia, Jose Daniel
A2 - Torquati, Massimo
A2 - Skavhaug, Amund
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 9 March 2022 through 11 March 2022
ER -