The performance gap between processor and memory is a very serious problem in high performance computing because effective performance is limited by memory performance. To overcome this problem, we propose a new VLSI architecture called SCIMA, which integrates software-controllable memory into the processor chip in addition to an ordinary data cache. Most data accesses in high performance computing are regular, and software-controllable memory exploits this regularity better than a conventional cache. This paper presents the architecture and its performance evaluation. In SCIMA, the ratio of software-controllable memory to cache can be changed dynamically. Owing to this feature, SCIMA is upward compatible with conventional memory architectures. Performance is evaluated using the CG and FT kernels of the NAS Parallel Benchmarks (NPB) and a real QCD (Quantum ChromoDynamics) application. The evaluation results reveal that SCIMA is superior to a conventional cache-based architecture. They also reveal that the superiority of SCIMA grows as the access latency of off-chip memory increases or as its relative throughput decreases.