In recent years, Convolutional Neural Networks (CNNs) have repeatedly achieved state-of-the-art accuracy in object detection, but their heavy computational cost impedes real-time detection when the supporting system is moving, particularly when it is accelerating. At the same time, recent progress on visual-inertial systems takes great advantage of motion information to robustly estimate the robot state and its surroundings. This paper proposes to exploit the advances of visual-inertial odometry research for real-time object detection on mobile robots. We combine a CNN detector with VINS-Mono, a monocular visual-inertial odometry system, and show a reliable improvement in the detection process, especially when the robot accelerates or decelerates. Our system is ready to use in that it has a very low deployment cost and requires no calibration. The resulting system allows for simultaneous robot state estimation and object detection, as well as object tracking. Lastly, the architecture proves to be flexible because it is not restricted to a specific object type or detector.