Journal of Medical Internet Research (May 2020)
An Informatics Framework to Assess Consumer Health Language Complexity Differences: Proof-of-Concept Study
Abstract
BackgroundThe language gap between health consumers and health professionals has been long recognized as the main hindrance to effective health information comprehension. Although providing health information access in consumer health language (CHL) is widely accepted as the solution to the problem, health consumers are found to have varying health language preferences and proficiencies. To simplify health documents for heterogeneous consumer groups, it is important to quantify how CHLs are different in terms of complexity among various consumer groups. ObjectiveThis study aimed to propose an informatics framework (consumer health language complexity [CHELC]) to assess the complexity differences of CHL using syntax-level, text-level, term-level, and semantic-level complexity metrics. Specifically, we identified 8 language complexity metrics validated in previous literature and combined them into a 4-faceted framework. Through a rank-based algorithm, we developed unifying scores (CHELC scores [CHELCS]) to quantify syntax-level, text-level, term-level, semantic-level, and overall CHL complexity. We applied CHELCS to compare posts of each individual on online health forums designed for (1) the general public, (2) deaf and hearing-impaired people, and (3) people with autism spectrum disorder (ASD). MethodsWe examined posts with more than 4 sentences of each user from 3 health forums to understand CHL complexity differences among these groups: 12,560 posts from 3756 users in Yahoo! Answers, 25,545 posts from 1623 users in AllDeaf, and 26,484 posts from 2751 users in Wrong Planet. We calculated CHELCS for each user and compared the scores of 3 user groups (ie, deaf and hearing-impaired people, people with ASD, and the public) through 2-sample Kolmogorov-Smirnov tests and analysis of covariance tests. ResultsThe results suggest that users in the public forum used more complex CHL, particularly more diverse semantics and more complex health terms compared with users in the ASD and deaf and hearing-impaired user forums. However, between the latter 2 groups, people with ASD used more complex words, and deaf and hearing-impaired users used more complex syntax. ConclusionsOur results show that the users in 3 online forums had significantly different CHL complexities in different facets. The proposed framework and detailed measurements help to quantify these CHL complexity differences comprehensively. The results emphasize the importance of tailoring health-related content for different consumer groups with varying CHL complexities.