Improving Goal-Oriented Visual Dialogue by Asking Fewer Questions

Soma Kanazawa, Shoya Matsumori, Michita Imai

Research output: Conference contribution

Abstract

An agent that adaptively asks the user questions to seek information is a crucial element in designing a real-world artificial intelligence agent. In particular, an agent for goal-oriented visual dialogue, a task in which an object of interest is located among a group of visually presented objects by asking verbal questions, must be able to narrow down and identify objects efficiently through question generation. Several models based on GuessWhat?! and CLEVR Ask have been published, most of which leverage reinforcement learning to maximize the task success rate. However, existing models follow a policy of asking questions up to a predefined limit, which results in redundant questions. Moreover, the generated questions often refer to only a limited number of objects, which prevents efficient narrowing down and the identification of a wide range of attributes. This paper proposes the Two-Stream Splitter (TSS) for redundant question reduction and efficient question generation. TSS applies a self-attention structure to the image features and location features of objects, combining the information content of both to narrow down candidate objects efficiently. Experimental results on the CLEVR Ask dataset show that the proposed method reduces redundant questions and enables more efficient interaction than previous models.

Original language: English
Title of host publication: Neural Information Processing - 28th International Conference, ICONIP 2021, Proceedings
Editors: Teddy Mantoro, Minho Lee, Media Anugerah Ayu, Kok Wai Wong, Achmad Nizar Hidayanto
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 158-169
Number of pages: 12
ISBN (Print): 9783030922696
Publication status: Published - 2021
Event: 28th International Conference on Neural Information Processing, ICONIP 2021 - Virtual, Online
Duration: 8 Dec 2021 to 12 Dec 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13109 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 28th International Conference on Neural Information Processing, ICONIP 2021
City: Virtual, Online
Period: 8 Dec 2021 to 12 Dec 2021

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
