The performance gap between the processor and main memory is a serious problem, especially in high-performance computing. To overcome this problem, we have proposed a new processor architecture called SCIMA, which integrates software-controllable memory (SCM) into the processor chip as a part of main memory, in addition to an ordinary cache. SCIMA is defined as an extension of a general-purpose microprocessor whose load/store unit is extended to control data accesses to the SCM. In this paper, we present a load/store unit for SCIMA, designed by extending that of the MIPS R10000 processor, and evaluate the impact of the extension on area and clock frequency. The evaluation results reveal that SCIMA achieves 1.5 to 10 times higher performance than a cache-based architecture, although the cycle time of SCIMA is 5.7% longer.