Programming clusters equipped with accelerators such as GPGPUs tends to require a mixture of an intra-node parallel library based on CUDA or OpenCL and an inter-node communication library such as MPI. In this work, we propose, implement, and evaluate VEGETA, a middleware that takes the tasks of an OpenCL program written for multiple OpenCL accelerators in a single chassis and dispatches them to multiple OpenCL accelerators installed in multiple chassis. Furthermore, we add a new feature called the Virtual Direct Memory Access (VDMA) scheme, which supports direct data transfer to another node without writing the data back to a memory region of the user application. A matrix multiplication benchmark executed on two, three, and four nodes showed performance improvements of 1.9, 2.8, and 3.8 times, respectively. Moreover, an advection term computation based on the Cartesian grid method achieved 78% of the performance of the MPI version even without VDMA, and 96% with VDMA.
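To put the reported matrix multiplication speedups in perspective, they can be converted to parallel efficiency (speedup divided by node count). The short sketch below does this arithmetic. It assumes the speedups are measured relative to a single-node run, which the text above does not state explicitly. The names `reported_speedup` and `parallel_efficiency` are illustrative and not part of VEGETA.

```python
# Speedups reported for the matrix multiplication benchmark, keyed by node count.
# Assumption: each figure is relative to a single-node execution.
reported_speedup = {2: 1.9, 3: 2.8, 4: 3.8}


def parallel_efficiency(speedup, nodes):
    """Return speedup divided by node count; 1.0 would be ideal linear scaling."""
    return speedup / nodes


efficiencies = {
    nodes: parallel_efficiency(speedup, nodes)
    for nodes, speedup in reported_speedup.items()
}

for nodes, eff in efficiencies.items():
    print(f"{nodes} nodes: parallel efficiency = {eff:.1%}")
```

Under that assumption, the efficiency stays at roughly 93 to 95% across two to four nodes, which suggests near-linear scaling for this benchmark.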