PLOS Digital Health (Dec 2022)
Is artificial intelligence capable of generating hospital discharge summaries from inpatient records?
Abstract
Medical professionals are burdened by clerical work, and artificial intelligence may support physicians efficiently by generating clinical summaries. However, whether hospital discharge summaries can be generated automatically from inpatient records stored in electronic health records remains unclear. Therefore, this study investigated the sources of information in discharge summaries. First, the discharge summaries were automatically split into fine-grained segments, such as those representing medical expressions, using a machine learning model from a previous study. Second, segments that did not originate from inpatient records were identified by calculating the n-gram overlap between the inpatient records and the discharge summaries; the final decision on source origin was made manually. Finally, to reveal the specific sources (e.g., referral documents, prescriptions, and physician’s memory) from which these segments originated, they were manually classified in consultation with medical professionals. For deeper analysis, this study also designed and annotated clinical role labels that represent the subjectivity of the expressions and built a machine learning model to assign them automatically. The analysis revealed the following: First, 39% of the information in the discharge summary originated from external sources other than inpatient records. Second, patients’ past clinical records constituted 43%, and patient referral documents constituted 18%, of the expressions derived from external sources. Third, 11% of the missing information was not derived from any document and possibly came from physicians’ memories or reasoning. According to these results, end-to-end summarization using machine learning is considered infeasible. Machine summarization with an assisted post-editing process is the best fit for this problem domain.
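The n-gram overlap step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: whitespace tokenization, the bigram size, the 0.5 threshold, and the function names `ngrams`, `overlap_ratio`, and `flag_candidates` are all assumptions chosen for illustration. Segments with low overlap are only flagged as candidates, since the abstract states that the final source decision was made manually.

```python
from typing import List, Set, Tuple


def ngrams(tokens: List[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of contiguous n-grams in a token sequence."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def overlap_ratio(segment: str, record: str, n: int = 2) -> float:
    """Fraction of the segment's n-grams that also occur in the record.

    Whitespace tokenization is an assumption made for this sketch.
    """
    seg_grams = ngrams(segment.lower().split(), n)
    if not seg_grams:
        # Segment shorter than n tokens: no evidence of overlap.
        return 0.0
    rec_grams = ngrams(record.lower().split(), n)
    return len(seg_grams & rec_grams) / len(seg_grams)


def flag_candidates(
    segments: List[str], record: str, n: int = 2, threshold: float = 0.5
) -> Tuple[List[str], List[str]]:
    """Split segments into (likely from record, candidates for manual review).

    The threshold is a hypothetical value, not one taken from the study.
    """
    from_record: List[str] = []
    needs_review: List[str] = []
    for seg in segments:
        if overlap_ratio(seg, record, n) >= threshold:
            from_record.append(seg)
        else:
            needs_review.append(seg)
    return from_record, needs_review


if __name__ == "__main__":
    record = "The patient admitted with chest pain on Monday"
    segments = ["patient admitted with chest pain", "referred by family doctor"]
    kept, review = flag_candidates(segments, record)
    print("likely from inpatient record:", kept)
    print("flag for manual source check:", review)
```

In this sketch the ratio is computed relative to the segment, so a short segment fully contained in a long record still scores 1.0; a different normalization would change which segments are flagged.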
Author summary
Medical practice necessitates significant paperwork; thus, the automated processing of clinical records can reduce the burden on medical professionals. To this end, some research efforts have attempted to automatically summarize the inpatient records that physicians write while patients are hospitalized. This study investigated whether discharge summaries can be constructed automatically from inpatient records to facilitate further processing. For this purpose, each piece of information in the discharge summaries was manually labeled to determine whether it originated from inpatient records. If it did not, we attempted to identify its possible source. The results revealed that 61% of the information in the discharge summary was derived from inpatient records, whereas the remaining 39% was derived from other sources. These external sources included patients’ past clinical records (43%) and patient referral documents (18%). Furthermore, 11% of the information did not originate from any document, indicating that it was possibly derived from physicians’ memories or reasoning. This suggests that fully automated generation of discharge summaries is infeasible and that future research should be directed toward semi-automated generation aimed at optimal human-machine collaboration in the authoring of discharge summaries.