TY - JOUR
T1 - Optical Non-Line-of-Sight Physics-Based 3D Human Pose Estimation
AU - Isogawa, Mariko
AU - Yuan, Ye
AU - O'Toole, Matthew
AU - Kitani, Kris
N1 - Funding Information:
While the primary technical focus of this work is to better understand how visual information should be represented and processed to enable 3D pose estimation from NLOS imaging, the technology described in this work also has some practical applications for next generation autonomous systems. In the context of autonomous driving, the ability to detect and track people outside of the line of sight of its sensors can be instrumental in informing planning algorithms and preventing accidents. In the context of domestic robots, the ability to see around walls could help robots make more informed decisions when entering a room or avoiding collisions. Though more research is necessary to lower the financial cost and computational complexity of the NLOS imaging system described in this work, we believe that this preliminary work shows the remarkable potential for higher-level reasoning using NLOS imaging in the real world. Acknowledgements. We thank Ioannis Gkioulekas for many helpful suggestions. M. Isogawa is supported by NTT Corporation. M. O’Toole is supported by the DARPA REVEAL program.
Publisher Copyright:
© 2020 IEEE
PY - 2020
Y1 - 2020
N2 - We describe a method for 3D human pose estimation from transient images (i.e., a 3D spatio-temporal histogram of photons) acquired by an optical non-line-of-sight (NLOS) imaging system. Our method can perceive 3D human pose by 'looking around corners' through the use of light indirectly reflected by the environment. We bring together a diverse set of technologies from NLOS imaging, human pose estimation, and deep reinforcement learning to construct an end-to-end data processing pipeline that converts a raw stream of photon measurements into a full 3D human pose sequence estimate. Our contributions are the design of a data representation process which includes (1) a learnable inverse point spread function (PSF) to convert raw transient images into a deep feature vector; (2) a neural humanoid control policy conditioned on the transient image feature and learned from interactions with a physics simulator; and (3) a data synthesis and augmentation strategy based on depth data that can be transferred to a real-world NLOS imaging system. Our preliminary experiments suggest that our method is able to generalize to real-world NLOS measurements to estimate physically valid 3D human poses.
AB - We describe a method for 3D human pose estimation from transient images (i.e., a 3D spatio-temporal histogram of photons) acquired by an optical non-line-of-sight (NLOS) imaging system. Our method can perceive 3D human pose by 'looking around corners' through the use of light indirectly reflected by the environment. We bring together a diverse set of technologies from NLOS imaging, human pose estimation, and deep reinforcement learning to construct an end-to-end data processing pipeline that converts a raw stream of photon measurements into a full 3D human pose sequence estimate. Our contributions are the design of a data representation process which includes (1) a learnable inverse point spread function (PSF) to convert raw transient images into a deep feature vector; (2) a neural humanoid control policy conditioned on the transient image feature and learned from interactions with a physics simulator; and (3) a data synthesis and augmentation strategy based on depth data that can be transferred to a real-world NLOS imaging system. Our preliminary experiments suggest that our method is able to generalize to real-world NLOS measurements to estimate physically valid 3D human poses.
UR - http://www.scopus.com/inward/record.url?scp=85094838244&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85094838244&partnerID=8YFLogxK
U2 - 10.1109/CVPR42600.2020.00704
DO - 10.1109/CVPR42600.2020.00704
M3 - Conference article
AN - SCOPUS:85094838244
SN - 1063-6919
SP - 7011
EP - 7020
JO - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
JF - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
M1 - 9157058
T2 - 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020
Y2 - 14 June 2020 through 19 June 2020
ER -