LDNN: Linguistic Knowledge Injectable Deep Neural Network for Group Cohesiveness Understanding

Yanan Wang, Jianming Wu, Jinfa Huang, Gen Hattori, Yasuhiro Takishima, Shinya Wada, Rui Kimura, Jie Chen, Satoshi Kurihara

Research output: Conference contribution

Abstract

Group cohesiveness reflects the level of intimacy that people feel with each other, and the development of a dialogue robot that can understand group cohesiveness will lead to the promotion of human communication. However, group cohesiveness is a complex concept that is difficult to predict based only on image pixels. Inspired by the fact that humans intuitively associate linguistic knowledge accumulated in the brain with the visual images they see, we propose a linguistic knowledge injectable deep neural network (LDNN) that builds a visual model (visual LDNN) for predicting group cohesiveness that can automatically associate the linguistic knowledge hidden behind images. LDNN consists of a visual encoder and a language encoder, and applies domain adaptation and linguistic knowledge transition mechanisms to transform linguistic knowledge from a language model to the visual LDNN. We train LDNN by adding descriptions to the training and validation sets of the Group AFfect Dataset 3.0 (GAF 3.0), and test the visual LDNN without any description. A comparison of the visual LDNN with various fine-tuned DNN models and three state-of-the-art models on the test set demonstrates that the visual LDNN not only improves the performance of the fine-tuned DNN model, leading to an MSE very similar to that of the state-of-the-art model, but is also a practical and efficient method that requires relatively little preprocessing. Furthermore, ablation studies confirm that LDNN is an effective method to inject linguistic knowledge into visual models.

Original language: English
Title of host publication: ICMI 2020 - Proceedings of the 2020 International Conference on Multimodal Interaction
Publisher: Association for Computing Machinery, Inc
Pages: 343-350
Number of pages: 8
ISBN (Electronic): 9781450375818
DOI
Publication status: Published - 2020-10-21
Event: 22nd ACM International Conference on Multimodal Interaction, ICMI 2020 - Virtual, Online, Netherlands
Duration: 2020-10-25 to 2020-10-29

Publication series

Name: ICMI 2020 - Proceedings of the 2020 International Conference on Multimodal Interaction

Conference

Conference: 22nd ACM International Conference on Multimodal Interaction, ICMI 2020
Country: Netherlands
City: Virtual, Online
Period: 2020-10-25 to 2020-10-29

ASJC Scopus subject areas

  • Hardware and Architecture
  • Human-Computer Interaction
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
