This paper presents a novel hardware-oriented stereo vision system based on 1-D cost aggregation. Many researchers have implemented hardware-efficient stereo matching to realize real-time systems. However, such methods require a large amount of memory. We propose a system based on a hardware-software hybrid architecture that reduces memory usage. It consists of hardware (HW) for grayscale 1-D cost aggregation and software (SW) for 2-D disparity refinement. The 1-D processing reduces the RAM in our HW to 266 kb for an input image size of 1024×768. The average error rate on the Middlebury datasets is 6.24%. The processing time is 56.6 ms for 1024×768 images and 8.6 ms on average for the Middlebury datasets, which have an average size of 400×380. At the resolution of the Middlebury datasets, our system can perform real-time depth-aided image processing.
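
To illustrate what 1-D cost aggregation means in general, the following is a minimal software sketch and not the HW design described here. It assumes an absolute-difference matching cost, a fixed horizontal averaging window, and winner-take-all selection, none of which are specified in this abstract; the function name `scanline_disparity` and its parameters are hypothetical. Because each row is processed using only pixels from the same row, a scheme of this kind needs no multi-line buffers, which is the intuition behind the memory saving.

```python
def scanline_disparity(left_row, right_row, max_disp, radius):
    """Winner-take-all disparity for one scanline pair with 1-D aggregation.

    Assumptions (not from the paper): 8-bit grayscale input, absolute-difference
    cost, and a horizontal averaging window of half-width `radius`.
    """
    width = len(left_row)
    invalid_cost = 255  # cost assigned where the right-image pixel is out of range
    best_cost = [float("inf")] * width
    best_disp = [0] * width

    for d in range(max_disp + 1):
        # Per-pixel cost: left pixel x is compared with right pixel x - d.
        cost = [
            abs(left_row[x] - right_row[x - d]) if x - d >= 0 else invalid_cost
            for x in range(width)
        ]

        # Prefix sums let each window sum be computed in constant time.
        prefix = [0]
        for c in cost:
            prefix.append(prefix[-1] + c)

        # 1-D aggregation: average the cost over a window along the row only.
        for x in range(width):
            lo = max(0, x - radius)
            hi = min(width, x + radius + 1)
            aggregated = (prefix[hi] - prefix[lo]) / (hi - lo)
            if aggregated < best_cost[x]:
                best_cost[x] = aggregated
                best_disp[x] = d

    return best_disp
```

Calling `scanline_disparity(left_row, right_row, max_disp=4, radius=1)` on two equal-length lists of grayscale values returns one disparity per pixel of the left row. A 2-D refinement step, such as the SW stage mentioned above, would then operate on the disparity map assembled from these rows.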