Journal of Medical Internet Research (Feb 2020)

Responses of Conversational Agents to Health and Lifestyle Prompts: Investigation of Appropriateness and Presentation Structures

  • Kocaballi, Ahmet Baki,
  • Quiroz, Juan C,
  • Rezazadegan, Dana,
  • Berkovsky, Shlomo,
  • Magrabi, Farah,
  • Coiera, Enrico,
  • Laranjo, Liliana

DOI
https://doi.org/10.2196/15823
Journal volume & issue
Vol. 22, no. 2
p. e15823

Abstract

Background: Conversational agents (CAs) are systems that mimic human conversations using text or spoken language. Widely used examples include voice-activated systems such as Apple Siri, Google Assistant, Amazon Alexa, and Microsoft Cortana. The use of CAs in health care has been on the rise, but their potential safety risks remain understudied.

Objective: This study aimed to analyze how commonly available, general-purpose CAs on smartphones and smart speakers respond to health and lifestyle prompts (questions and open-ended statements) by examining both the content and the structure of their responses.

Methods: We followed a piloted script to present health- and lifestyle-related prompts to 8 CAs. The CAs’ responses were assessed for their appropriateness on the basis of the prompt type: responses to safety-critical prompts were deemed appropriate if they included a referral to a health professional or service, whereas responses to lifestyle prompts were deemed appropriate if they provided relevant information to address the problem prompted. The response structure was also examined according to information sources (Web search–based or precoded), response content style (informative and/or directive), confirmation of prompt recognition, and empathy.

Results: The 8 studied CAs provided in total 240 responses to 30 prompts. They collectively responded appropriately to 41% (46/112) of the safety-critical and 39% (37/96) of the lifestyle prompts. The ratio of appropriate responses deteriorated when safety-critical prompts were rephrased or when the agent used a voice-only interface. The appropriate responses included mostly directive content and empathy statements for the safety-critical prompts and a mix of informative and directive content for the lifestyle prompts.

Conclusions: Our results suggest that the commonly available, general-purpose CAs on smartphones and smart speakers with unconstrained natural language interfaces are limited in their ability to advise on both the safety-critical health prompts and lifestyle prompts. Our study also identified some response structures the CAs employed to present their appropriate responses. Further investigation is needed to establish guidelines for designing suitable response structures for different prompt types.