Human motion capture (MoCap) plays a key role in healthcare and human–robot collaboration. Some researchers have combined orientation measurements from inertial measurement units (IMUs) and positional inference from cameras to reconstruct the 3D human motion. Their works utilize multiple cameras or depth sensors to localize the human in three dimensions. Such multiple cameras are not always available in our daily life, but just a single camera attached in a smart IP devices has recently been popular. Therefore, we present a 3D pose estimation approach from IMUs and a single camera. In order to resolve the depth ambiguity of the single camera configuration and localize the global position of the subject, we present a constraint which optimizes the foot-ground contact points. The timing and 3D positions of the ground contact are calculated from the acceleration of IMUs on foot and geometric transformation of foot position detected on image, respectively. Since the results of pose estimation is greatly affected by the failure of the detection, we design the image-based constraints to handle the outliers of positional estimates. We evaluated the performance of our approach on public 3D human pose dataset. The experiments demonstrated that the proposed constraints contributed to improve the accuracy of pose estimation in single and multiple camera setting.
ASJC Scopus subject areas