Many countries face a labor shortage caused by an aging population and a declining birthrate. Robot manipulators are expected to take over human work. However, it is still difficult for manipulators to perform seemingly simple tasks such as harvesting fruit, cooking food, or assembling toys. One obstacle to robotic automation is the difficulty of teaching a manipulator how much force to apply when executing a task. A motion reproduction system, which uses bilateral control to store motion data, is one method for teaching manipulators motions that include both position and force. The problem with a motion reproduction system is that reproduction fails if the environment changes between the motion saving phase and the motion reproducing phase. A motion reproduction system that can perceive and adapt to its environment is therefore required. Vision sensors can sense the environment, but computer vision focuses mainly on object classification, and vision information is seldom combined with motion control, especially motion involving force. Therefore, I propose a motion reproduction system in which the reproduced motion is decided based on several saved motions and collected depth data. A convolutional neural network (CNN) was used to estimate a motion command from a depth image. Saved force data were used to generate the labels for training. This way of deciding labels differs from that of conventional machine learning algorithms.
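
The abstract does not state how labels are derived from the saved force data. As a minimal sketch of one possible reading, assume that each candidate motion is replayed in the environment captured by a depth image, and that the label for that image is the index of the motion whose replayed force trace stays closest to the force trace stored during teaching. The function names `rms_error` and `label_from_force`, the RMS criterion, and the sample numbers are all hypothetical and are not taken from the work itself.

```python
import math


def rms_error(saved, reproduced):
    """Root-mean-square difference between two equal-length force traces."""
    if len(saved) != len(reproduced) or not saved:
        raise ValueError("force traces must be non-empty and equal in length")
    squared = sum((s - r) ** 2 for s, r in zip(saved, reproduced))
    return math.sqrt(squared / len(saved))


def label_from_force(saved_forces, reproduced_forces):
    """Return the index of the candidate motion with the smallest force error.

    saved_forces[i]      : force trace stored when motion i was taught
    reproduced_forces[i] : force trace measured when motion i was replayed
                           in the environment seen in the depth image
    (Assumed labeling rule; the source does not specify one.)
    """
    errors = [rms_error(s, r) for s, r in zip(saved_forces, reproduced_forces)]
    return min(range(len(errors)), key=errors.__getitem__)


# Made-up force traces (in newtons) for two candidate motions.
saved = [[0.0, 1.0, 2.0], [0.0, 0.5, 1.0]]
reproduced = [[0.0, 0.2, 0.4], [0.0, 0.5, 0.9]]

label = label_from_force(saved, reproduced)
print(label)
```

Under this assumption, each depth image would be paired with such an index, and the CNN would be trained as a classifier over the saved motions, so the labels come from measured force rather than from manual annotation.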