Dynamic learning, retrieval, and tracking to augment hundreds of photographs

Julien Pilet, Hideo Saito

Research output: Contribution to journal › Article

Abstract

Tracking is a major issue of virtual and augmented reality applications. Single object tracking on monocular video streams is fairly well understood. However, when it comes to multiple objects, existing methods lack scalability and can recognize only a limited number of objects. Thanks to recent progress in feature matching, state-of-the-art image retrieval techniques can deal with millions of images. However, these methods do not focus on real-time video processing and cannot track retrieved objects. In this paper, we present a method that combines the speed and accuracy of tracking with the scalability of image retrieval. At the heart of our approach is a bi-layer clustering process that allows our system to index and retrieve objects based on tracks of features, thereby effectively summarizing the information available on multiple video frames. Dynamic learning of new viewpoints as the camera moves naturally yields the kind of robustness and reliability expected from an augmented reality engine. As a result, our system is able to track in real-time multiple objects, recognized with low delay from a database of more than 300 entries. We released the source code of our system in a package called Polyora.

Original language: English
Pages (from-to): 89-100
Number of pages: 12
Journal: Virtual Reality
Volume: 18
Issue number: 2
DOI: 10.1007/s10055-013-0228-7
Publication status: Published - 2014

Fingerprint

Augmented reality
Image retrieval
Scalability
Virtual reality
Cameras
Engines
Processing

Keywords

  • Augmented reality
  • Image retrieval
  • Multiple object tracking

ASJC Scopus subject areas

  • Software
  • Computer Graphics and Computer-Aided Design
  • Human-Computer Interaction

Cite this

Dynamic learning, retrieval, and tracking to augment hundreds of photographs. / Pilet, Julien; Saito, Hideo.

In: Virtual Reality, Vol. 18, No. 2, 2014, p. 89-100.


@article{719263378d0047c895564d4fdfc926a7,
title = "Dynamic learning, retrieval, and tracking to augment hundreds of photographs",
abstract = "Tracking is a major issue of virtual and augmented reality applications. Single object tracking on monocular video streams is fairly well understood. However, when it comes to multiple objects, existing methods lack scalability and can recognize only a limited number of objects. Thanks to recent progress in feature matching, state-of-the-art image retrieval techniques can deal with millions of images. However, these methods do not focus on real-time video processing and cannot track retrieved objects. In this paper, we present a method that combines the speed and accuracy of tracking with the scalability of image retrieval. At the heart of our approach is a bi-layer clustering process that allows our system to index and retrieve objects based on tracks of features, thereby effectively summarizing the information available on multiple video frames. Dynamic learning of new viewpoints as the camera moves naturally yields the kind of robustness and reliability expected from an augmented reality engine. As a result, our system is able to track in real-time multiple objects, recognized with low delay from a database of more than 300 entries. We released the source code of our system in a package called Polyora.",
keywords = "Augmented reality, Image retrieval, Multiple object tracking",
author = "Julien Pilet and Hideo Saito",
year = "2014",
doi = "10.1007/s10055-013-0228-7",
language = "English",
volume = "18",
pages = "89--100",
journal = "Virtual Reality",
issn = "1359-4338",
publisher = "Springer London",
number = "2",

}

TY - JOUR
T1 - Dynamic learning, retrieval, and tracking to augment hundreds of photographs
AU - Pilet, Julien
AU - Saito, Hideo
PY - 2014
Y1 - 2014

AB - Tracking is a major issue of virtual and augmented reality applications. Single object tracking on monocular video streams is fairly well understood. However, when it comes to multiple objects, existing methods lack scalability and can recognize only a limited number of objects. Thanks to recent progress in feature matching, state-of-the-art image retrieval techniques can deal with millions of images. However, these methods do not focus on real-time video processing and cannot track retrieved objects. In this paper, we present a method that combines the speed and accuracy of tracking with the scalability of image retrieval. At the heart of our approach is a bi-layer clustering process that allows our system to index and retrieve objects based on tracks of features, thereby effectively summarizing the information available on multiple video frames. Dynamic learning of new viewpoints as the camera moves naturally yields the kind of robustness and reliability expected from an augmented reality engine. As a result, our system is able to track in real-time multiple objects, recognized with low delay from a database of more than 300 entries. We released the source code of our system in a package called Polyora.

KW - Augmented reality
KW - Image retrieval
KW - Multiple object tracking
UR - http://www.scopus.com/inward/record.url?scp=84900425265&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84900425265&partnerID=8YFLogxK
U2 - 10.1007/s10055-013-0228-7
DO - 10.1007/s10055-013-0228-7
M3 - Article
AN - SCOPUS:84900425265
VL - 18
SP - 89
EP - 100
JO - Virtual Reality
JF - Virtual Reality
SN - 1359-4338
IS - 2
ER -