Superpixel segmentation is a popular image segmentation technique used in a variety of computer vision tasks. In recent years, a number of superpixel algorithms have been proposed in the literature. One of them, Simple Linear Iterative Clustering (SLIC), is considered the state of the art in superpixel segmentation. However, its original implementation has a long execution time on the high-performance processors used in common mobile and enterprise applications, as well as on high-end processors such as the Intel Xeon. The execution time of the single-threaded implementation is therefore a critical limitation for real-time or near-real-time applications. In this paper, we explore the possibility of accelerating the performance-critical parts of SLIC by designing an image segmentation accelerator for Intel's Arria 10 SoC. We propose a novel architecture that enables hardware acceleration by addressing the problem of hardware/software partitioning so as to minimize the overall program latency.