Improving Goal-Oriented Visual Dialogue by Asking Fewer Questions

Soma Kanazawa, Shoya Matsumori, Michita Imai

Research output: Conference contribution


An agent that adaptively asks the user questions to seek information is a crucial element in designing a real-world artificial intelligence agent. In particular, in goal-oriented visual dialogue, where the agent locates an object of interest among a group of visually presented objects by asking verbal questions, the agent must be able to narrow down and identify objects efficiently through question generation. Several models based on GuessWhat?! and CLEVR Ask have been published, most of which leverage reinforcement learning to maximize the task success rate. However, existing models follow a policy of asking questions up to a predefined limit, which results in redundant questions. Moreover, the generated questions often refer to only a limited number of objects, which prevents efficient narrowing down and the identification of a wide range of attributes. This paper proposes the Two-Stream Splitter (TSS) for redundant question reduction and efficient question generation. TSS applies a self-attention structure to the image features and location features of objects, combining the information in both to narrow down candidate objects efficiently. Experimental results on the CLEVR Ask dataset show that the proposed method reduces redundant questions and enables more efficient interaction than previous models.
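
As a rough illustration of the self-attention idea mentioned in the abstract, the Python sketch below lets each object's combined image and location features attend to those of every other object. It is not the TSS architecture from the paper: the toy feature values, the combination of the two streams by concatenation, and the use of identity projections instead of learned weights are all assumptions made only for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with identity projections.

    A real model would apply learned query/key/value matrices; they are
    omitted here (assumption) to keep the sketch dependency-free.
    """
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Similarity of this object to every object, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        # Weighted average of all object vectors.
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out


# Hypothetical toy inputs: one image-feature vector and one location
# vector per candidate object (values are made up).
image_feats = [[0.9, 0.1], [0.2, 0.8], [0.85, 0.15]]
loc_feats = [[0.1, 0.1], [0.9, 0.9], [0.2, 0.1]]

# Combine the two streams per object by concatenation (assumption).
tokens = [img + loc for img, loc in zip(image_feats, loc_feats)]
attended = self_attention(tokens)

for i, row in enumerate(attended):
    print(i, [round(x, 3) for x in row])
```

Each output row is a weighted average of all object vectors, so objects with similar appearance and position end up with similar representations, which is the kind of signal a questioner could use to group or split candidates.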

Title of host publication: Neural Information Processing - 28th International Conference, ICONIP 2021, Proceedings
Editors: Teddy Mantoro, Minho Lee, Media Anugerah Ayu, Kok Wai Wong, Achmad Nizar Hidayanto
Publisher: Springer Science and Business Media Deutschland GmbH
Publication status: Published - 2021
Event: 28th International Conference on Neural Information Processing, ICONIP 2021 - Virtual, Online
Duration: 8 Dec 2021 to 12 Dec 2021


Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 13109 LNCS


Conference: 28th International Conference on Neural Information Processing, ICONIP 2021
City: Virtual, Online

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

