PLoS ONE (Jan 2021)
MEDLINE search retrieval issues: A longitudinal query analysis of five vendor platforms.
Abstract
This study reports a longitudinal query analysis of the MEDLINE database as hosted on five platforms: PubMed, EBSCOHost, Ovid, ProQuest, and Web of Science. The goal was to identify variations among the search results on the platforms after controlling for search query syntax. We devised twenty-nine search cases, each composed of five semantically equivalent queries, to search against the five MEDLINE platforms. We ran the queries monthly for a year and collected search result counts to observe changes. We found that search results varied considerably depending on the MEDLINE platform. One reason for the variation was trends in scholarly publication, such as publishing individual papers online first versus in complete issues. Other reasons were metadata differences in bibliographic records, differences in the levels of specificity of the search fields provided by the platforms, and large fluctuations in monthly search results for the same query. Database integrity and currency issues were observed as each platform updated its MEDLINE data throughout the year. Specific biomedical bibliographic databases are used to inform clinical decision-making, create systematic reviews, and construct knowledge bases for clinical decision support systems. They serve as essential information retrieval and discovery tools that help identify and collect research data, and they are used in a broad range of fields and as the basis of multiple research designs. This study should help clinicians, researchers, librarians, informationists, and others understand how these platforms differ, and it should inform future work on their standardization.