We propose a novel cache-based network processor (NP) architecture that can reach next-generation 100-Gbps packet-processing throughput by exploiting the temporal locality of network traffic, and we evaluate a prototype with real network traffic traces. The architecture consists of several small processing units (PUs) and bit-stream manipulation hardware called a burst-stream path (BSP), which contains a special cache mechanism called a process-learning cache (PLC) and a cache-miss handler (CMH). The PLC memorizes how a packet was processed, including all table-lookup results, and applies the same processing to subsequent packets that carry the same header information. To keep packet processing from blocking, the CMH handles cache-miss packets while the PLC performs registration. Together, the PLC and CMH allow most packets to skip execution at the PUs, which dissipate a large amount of power in conventional NPs. We evaluated an FPGA-based prototype with real core-network traffic traces from a WIDE backbone router. In one special case, in which minimum-size packets appeared in large quantities, the cache-based NP achieved 100-Gbps throughput with only 10-Gbps throughput PUs, owing to the very high temporal locality of the traffic. Taken as a whole, the results indicate that the cache-based NP would be able to achieve 100-Gbps throughput using 10- to 40-Gbps throughput PUs. The power consumption of a cache-based NP built from 40-Gbps throughput PUs is estimated to be only 44.7% that of a conventional NP.
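The PLC idea described above can be illustrated with a minimal software sketch. This is not the hardware design evaluated in the paper: the class name, the 5-tuple flow key, the `slow_path` callback standing in for execution on the PUs, and the helper `required_pu_throughput_gbps` are all hypothetical, and the CMH's non-blocking behaviour is not modeled (misses are simply resolved synchronously). The throughput estimate further assumes that hit and miss packets have the same size distribution.

```python
from typing import Callable, Dict, Tuple

# Hypothetical flow key: (src_ip, dst_ip, src_port, dst_port, protocol).
Header = Tuple[str, str, int, int, int]


class ProcessLearningCache:
    """Toy model of a PLC: memoizes the processing result per header."""

    def __init__(self, slow_path: Callable[[Header], str]) -> None:
        self.slow_path = slow_path          # stands in for execution on the PUs
        self.table: Dict[Header, str] = {}  # header -> learned processing result
        self.hits = 0
        self.misses = 0

    def process(self, header: Header) -> str:
        action = self.table.get(header)
        if action is not None:
            # Cache hit: the packet skips the PUs entirely.
            self.hits += 1
            return action
        # Cache miss: run the slow path once, then register the result.
        self.misses += 1
        action = self.slow_path(header)
        self.table[header] = action
        return action


def required_pu_throughput_gbps(line_rate_gbps: float, plc: ProcessLearningCache) -> float:
    """Estimate the PU throughput needed when only miss packets reach the PUs.

    Assumption: hit and miss packets have the same size distribution.
    """
    total = plc.hits + plc.misses
    if total == 0:
        return 0.0
    return line_rate_gbps * plc.misses / total
```

For example, if ten packets of a single flow pass through this model, only the first one reaches the slow path, so a 100-Gbps line rate would need roughly 10 Gbps of PU throughput under the stated assumption. Real traces contain many flows and a finite cache, so the actual miss rate depends on the traffic and the cache size.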