TY - JOUR
T1 - Cache-based network processor architecture
T2 - evaluation with real network traffic
AU - Okuno, Michitaka
AU - Nishimura, Shinji
AU - Ishida, Shin Ichi
AU - Nishi, Hiroaki
PY - 2006/11
Y1 - 2006/11
N2 - A novel cache-based network processor (NP) architecture that can catch up with next generation 100-Gbps packet-processing throughput by exploiting a nature of network traffic is proposed, and the prototype is evaluated with real network traffic traces. This architecture consists of several small processing units (PUs) and a bit-stream manipulation hardware called a burst-stream path (BSP) that has a special cache mechanism called a process-learning cache (PLC) and a cache-miss handler (CMH). The PLC memorizes a packet-processing method with all table-lookup results, and applies it to subsequent packets that have the same information in their header. To avoid packet-processing blocking, the CMH handles cache-miss packets while registration processing is performed at the PLC. The combination of the PLC and CMH enables most packets to skip the execution at the PUs, which dissipate huge power in conventional NPs. We evaluated an FPGA-based prototype with real core network traffic traces of a WIDE backbone router. From the experimental results, we observed a special case where the packet of minimum size appeared in large quantities, and the cache-based NP was able to achieve 100 throughput with only the 10-throughput PUs due to the existence of very high temporal locality of network traffic. From the whole results, the cache-based NP would be able to achieve 100-Gbps throughput by using 10- to 40-Gbps throughput PUs. The power consumption of the cache-based NP, which consists of 40-Gbps throughput PUs, is estimated to be only 44.7 that of a conventional NP.
AB - A novel cache-based network processor (NP) architecture that can catch up with next generation 100-Gbps packet-processing throughput by exploiting a nature of network traffic is proposed, and the prototype is evaluated with real network traffic traces. This architecture consists of several small processing units (PUs) and a bit-stream manipulation hardware called a burst-stream path (BSP) that has a special cache mechanism called a process-learning cache (PLC) and a cache-miss handler (CMH). The PLC memorizes a packet-processing method with all table-lookup results, and applies it to subsequent packets that have the same information in their header. To avoid packet-processing blocking, the CMH handles cache-miss packets while registration processing is performed at the PLC. The combination of the PLC and CMH enables most packets to skip the execution at the PUs, which dissipate huge power in conventional NPs. We evaluated an FPGA-based prototype with real core network traffic traces of a WIDE backbone router. From the experimental results, we observed a special case where the packet of minimum size appeared in large quantities, and the cache-based NP was able to achieve 100 throughput with only the 10-throughput PUs due to the existence of very high temporal locality of network traffic. From the whole results, the cache-based NP would be able to achieve 100-Gbps throughput by using 10- to 40-Gbps throughput PUs. The power consumption of the cache-based NP, which consists of 40-Gbps throughput PUs, is estimated to be only 44.7 that of a conventional NP.
KW - 100-Gbps Ethernet
KW - Cache
KW - Low power
KW - Network processor
KW - Network traffic
UR - http://www.scopus.com/inward/record.url?scp=33845641155&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33845641155&partnerID=8YFLogxK
U2 - 10.1093/ietele/e89-c.11.1620
DO - 10.1093/ietele/e89-c.11.1620
M3 - Article
AN - SCOPUS:33845641155
VL - E89-C
SP - 1620
EP - 1628
JO - IEICE Transactions on Electronics
JF - IEICE Transactions on Electronics
SN - 0916-8524
IS - 11
ER -