VISTURE: A System for Video-Based Gesture and Speech Generation by Robots

Kaon Shimoyama, Kohei Okuoka, Mitsuhiko Kimoto, Michita Imai

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper proposes VISTURE, a system that generates a robot's gestures and speech from video input. VISTURE assumes a situation in which a robot conveys what it saw through its camera to a person who was absent. The contribution of this paper is a case study investigating the expressions that Japanese people use to describe video scenes, the results of which were used to build VISTURE. In particular, the case study yielded a classification of the expressions that depict video scenes: Foreground information, which is the relevant event of the scene, and Background information, which is not the main point of the description but conveys the entire scene. Foreground and Background are referred to in combination. VISTURE employs this classification to generate human-like expressions. Moreover, we designed a method for determining Foreground and Background, and it can generate multiple combinations of expressions. To evaluate the quality of the gestures and speech generated by VISTURE, we investigated people's impressions of a robot performing them. The results showed that the robot was perceived as more likable and capable when it performed gestures.

Original language: English
Title of host publication: HAI 2022 - Proceedings of the 10th Conference on Human-Agent Interaction
Publisher: Association for Computing Machinery, Inc
Pages: 185-193
Number of pages: 9
ISBN (Electronic): 9781450393232
Publication status: Published - 2022 Dec 5
Event: 10th Conference on Human-Agent Interaction, HAI 2022 - Christchurch, New Zealand
Duration: 2022 Dec 5 → 2022 Dec 8

Conference

Conference: 10th Conference on Human-Agent Interaction, HAI 2022
Country/Territory: New Zealand
City: Christchurch
Period: 22/12/5 → 22/12/8

Keywords

  • Gesture generation
  • Human-robot interaction
  • Speech generation

ASJC Scopus subject areas

  • Artificial Intelligence
  • Human-Computer Interaction
  • Software
