Demo: Situation-aware conversational agent with kinetic earables

Shin Katayama, Akhil Mathur, Tadashi Okoshi, Jin Nakazawa, Fahim Kawsar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Conversational agents are increasingly becoming digital partners in our everyday computing experiences, offering a variety of purposeful information and utility services. Although rich in competency, today's agents are entirely oblivious to their users' situational and emotional context and incapable of adjusting their interaction style and tone accordingly. To this end, we present a first-of-its-kind situation-aware conversational agent on a kinetic earable that dynamically adjusts its conversation style, tone and volume in response to the user's emotional, environmental, social and activity context, gathered through speech prosody, ambient sound and motion signatures. In particular, the system is composed of the following components:

• Perception Builder: builds an approximate view of the user's momentary experience by sensing their 1) physical activity, 2) emotional state, 3) social context and 4) environmental context using purpose-built acoustic and motion sensory models [4, 5].

• Conversation Builder: lets the user interact with the agent through a predefined dialogue base; for this demo, we use Dialogflow [1] populated with a set of situation-specific dialogues.

• Affect Adapter: guides the adaptation strategy for the agent's response according to the user's context, combining the output of the Perception Builder with a data-driven rule engine. We have devised a set of adaptation rules, informed by multiple quantitative and qualitative studies, that describe the prosody, volume and speed used to shape the agent's response.

• Text-to-Speech Builder: synthesises the agent's response in a voice that accurately reflects the user's situation using the IBM Bluemix Voice service [2]. The synthesis process interplays various voice attributes, e.g., pitch, rate, breathiness and glottal tension, to transform the agent's voice according to the rules of the Affect Adapter.
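To make the data flow between these components concrete, the Python sketch below shows one way an Affect Adapter of this kind could map the Perception Builder's output to prosody, volume and rate settings and wrap the Conversation Builder's reply in SSML for a speech synthesiser. The Context fields, rule predicates and style values are illustrative assumptions, not the authors' implementation, which used Dialogflow for dialogue management and IBM's voice service for synthesis.

```python
# Illustrative sketch of the Affect Adapter / Text-to-Speech Builder hand-off.
# Class names, rule values and thresholds are assumptions for illustration only.
from dataclasses import dataclass


@dataclass
class Context:
    """Hypothetical Perception Builder output for one moment in time."""
    activity: str      # e.g. "walking", "sitting"
    emotion: str       # e.g. "stressed", "neutral", "happy"
    environment: str   # e.g. "quiet", "noisy"
    social: str        # e.g. "alone", "in_conversation"


# Illustrative adaptation rules: each rule maps a perceived context to the
# prosody, volume and speed used to shape the agent's spoken response.
RULES = [
    (lambda c: c.emotion == "stressed",  {"rate": "slow",   "pitch": "-10%", "volume": "soft"}),
    (lambda c: c.environment == "noisy", {"rate": "medium", "pitch": "+5%",  "volume": "loud"}),
    (lambda c: c.activity == "walking",  {"rate": "fast",   "pitch": "+0%",  "volume": "medium"}),
]
DEFAULT_STYLE = {"rate": "medium", "pitch": "+0%", "volume": "medium"}


def adapt(context: Context) -> dict:
    """Return the first matching voice style for the perceived context."""
    for predicate, style in RULES:
        if predicate(context):
            return style
    return DEFAULT_STYLE


def to_ssml(text: str, style: dict) -> str:
    """Wrap the Conversation Builder's reply in SSML prosody markup,
    which a speech-synthesis service can then render as adapted audio."""
    return (
        f'<speak><prosody rate="{style["rate"]}" '
        f'pitch="{style["pitch"]}" volume="{style["volume"]}">'
        f"{text}</prosody></speak>"
    )


if __name__ == "__main__":
    ctx = Context(activity="walking", emotion="stressed", environment="noisy", social="alone")
    reply = "You have a meeting in ten minutes."  # stand-in for a Dialogflow reply
    print(to_ssml(reply, adapt(ctx)))
```

In the actual demo the adaptation rules were derived from quantitative and qualitative user studies rather than hand-written predicates like these; the sketch only mirrors the pipeline of perceived context, rule lookup and voice transformation described above.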

Original language: English
Title of host publication: MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services
Publisher: Association for Computing Machinery, Inc
Pages: 657-658
Number of pages: 2
ISBN (Electronic): 9781450366618
DOIs: https://doi.org/10.1145/3307334.3328569
Publication status: Published - 2019 Jun 12
Event: 17th ACM International Conference on Mobile Systems, Applications, and Services, MobiSys 2019 - Seoul, Korea, Republic of
Duration: 2019 Jun 17 - 2019 Jun 21

Publication series

Name: MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services

Conference

Conference: 17th ACM International Conference on Mobile Systems, Applications, and Services, MobiSys 2019
Country: Korea, Republic of
City: Seoul
Period: 19/6/17 - 19/6/21

Fingerprint

Kinetics
Acoustics
Acoustic waves
Engines

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications

Cite this

Katayama, S., Mathur, A., Okoshi, T., Nakazawa, J., & Kawsar, F. (2019). Demo: Situation-aware conversational agent with kinetic earables. In MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services (pp. 657-658). (MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services). Association for Computing Machinery, Inc. https://doi.org/10.1145/3307334.3328569

Demo: Situation-aware conversational agent with kinetic earables. / Katayama, Shin; Mathur, Akhil; Okoshi, Tadashi; Nakazawa, Jin; Kawsar, Fahim.

MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services. Association for Computing Machinery, Inc, 2019. p. 657-658 (MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Katayama, S, Mathur, A, Okoshi, T, Nakazawa, J & Kawsar, F 2019, Demo: Situation-aware conversational agent with kinetic earables. in MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services. MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services, Association for Computing Machinery, Inc, pp. 657-658, 17th ACM International Conference on Mobile Systems, Applications, and Services, MobiSys 2019, Seoul, Korea, Republic of, 19/6/17. https://doi.org/10.1145/3307334.3328569
Katayama S, Mathur A, Okoshi T, Nakazawa J, Kawsar F. Demo: Situation-aware conversational agent with kinetic earables. In MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services. Association for Computing Machinery, Inc. 2019. p. 657-658. (MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services). https://doi.org/10.1145/3307334.3328569
Katayama, Shin; Mathur, Akhil; Okoshi, Tadashi; Nakazawa, Jin; Kawsar, Fahim. / Demo: Situation-aware conversational agent with kinetic earables. MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services. Association for Computing Machinery, Inc, 2019. pp. 657-658 (MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services).
@inproceedings{b2751453802947468bb71ec511811e94,
title = "Demo: Situation-aware conversational agent with kinetic earables",
author = "Shin Katayama and Akhil Mathur and Tadashi Okoshi and Jin Nakazawa and Fahim Kawsar",
year = "2019",
month = "6",
day = "12",
doi = "10.1145/3307334.3328569",
language = "English",
series = "MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services",
publisher = "Association for Computing Machinery, Inc",
pages = "657--658",
booktitle = "MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services",

}

TY - GEN

T1 - Demo

T2 - Situation-aware conversational agent with kinetic earables

AU - Katayama, Shin

AU - Mathur, Akhil

AU - Okoshi, Tadashi

AU - Nakazawa, Jin

AU - Kawsar, Fahim

PY - 2019/6/12

Y1 - 2019/6/12

UR - http://www.scopus.com/inward/record.url?scp=85069186099&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85069186099&partnerID=8YFLogxK

U2 - 10.1145/3307334.3328569

DO - 10.1145/3307334.3328569

M3 - Conference contribution

AN - SCOPUS:85069186099

T3 - MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services

SP - 657

EP - 658

BT - MobiSys 2019 - Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services

PB - Association for Computing Machinery, Inc

ER -