Constant velocity 3d convolution

Yusuke Sekikawa, Kohta Ishikawa, Hideo Saito

Research output: Contribution to journalArticlepeer-review

6 Citations (Scopus)


We propose a novel 3-D convolution method, cv3dconv, for extracting spatiotemporal features from videos. It reduces the number of sum-of-products operations in 3-D convolution by thousands of times by assuming the constant moving velocity of the features. We observed that a specific class of video sequences, such as video captured by an in-vehicle camera, can be well approximated with piece-wise linear movements of 2-D features in a temporal dimension. Our principal finding is that a 3-D kernel, represented by constant velocity, can be decomposed into a convolution of a 2-D-shaped kernel and a 3-D-velocity kernel, which is parameterized using only two parameters. We derived an efficient recursive algorithm for this class of 3-D convolution, which is exceptionally suited for sparse spatiotemporal data, and this parameterized decomposed representation imposes a structured regularization along a temporal direction. We experimentally verified the validity of our approximation using a controlled dataset, and we also showed the effectiveness of the cv3dconv by adopting it for deep neural networks (DNNs) in visual odometry estimation task using publicly available event-based camera dataset captured in urban road scene. Our DNN architecture improves the estimation accuracy for about 30% compared with the existing states-of-the-arts architecture designed for event data.

Original languageEnglish
Article number8543783
Pages (from-to)76490-76501
Number of pages12
JournalIEEE Access
Publication statusPublished - 2018


  • 3D convolution
  • Convolutional neural network
  • event-based camera
  • spatiotemporal convolution

ASJC Scopus subject areas

  • Computer Science(all)
  • Materials Science(all)
  • Engineering(all)


Dive into the research topics of 'Constant velocity 3d convolution'. Together they form a unique fingerprint.

Cite this