TY - GEN
T1 - Grounded language understanding for manipulation instructions using GAN-based classification
AU - Sugiura, Komei
AU - Kawai, Hisashi
N1 - Funding Information: This work was partially supported by JSPS KAKENHI Grant Number 15K16074. The authors thank Dr. Peng Shen for his suggestions. Publisher Copyright: © 2017 IEEE.
PY - 2018/1/24
Y1 - 2018/1/24
AB - The target task of this study is grounded language understanding for domestic service robots (DSRs). In particular, we focus on instruction understanding for short sentences where verbs are missing. This task is critically important for building communicative DSRs because manipulation is essential for DSRs. Existing instruction understanding methods usually estimate missing information only from non-grounded knowledge; therefore, it is unclear whether the predicted action is physically executable. In this paper, we present a grounded instruction understanding method to estimate appropriate objects given an instruction and situation. We extend Generative Adversarial Nets (GAN) and build a GAN-based classifier using latent representations. To quantitatively evaluate the proposed method, we have developed a data set based on the standard data set used for visual question answering (VQA). Experimental results show that the proposed method gives better results than the baseline methods.
KW - domestic service robots
KW - grounded language understanding
KW - human-robot communication
UR - http://www.scopus.com/inward/record.url?scp=85050606195&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85050606195&partnerID=8YFLogxK
U2 - 10.1109/ASRU.2017.8268980
DO - 10.1109/ASRU.2017.8268980
M3 - Conference contribution
AN - SCOPUS:85050606195
T3 - 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings
SP - 519
EP - 524
BT - 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE Automatic Speech Recognition and Understanding Workshop, ASRU 2017
Y2 - 16 December 2017 through 20 December 2017
ER -