Search results outliers among MEDLINE platforms




Keywords: Information Retrieval, Search Queries, Medical Subject Headings (MeSH), MEDLINE


Objective: In principle, the content of MEDLINE records is consistent across platforms. Although platforms differ in their interfaces and query syntax requirements, results should be similar when syntax is controlled for across the platforms. We investigated how search result counts varied when searching records among five MEDLINE platforms.

Methods: We created 29 sets of search queries targeting various metadata fields and operators. Within each search set, we adapted 5 distinct, compatible queries to search 5 MEDLINE platforms (PubMed, ProQuest, EBSCOhost, Web of Science, and Ovid), for a total of 145 final queries. The 5 queries in each set were designed to be logically and semantically equivalent and were modified only to match platform syntax requirements. We compared PubMed’s MEDLINE result counts to the result counts from the other platforms. We identified outliers by measuring result count deviations using modified z-scores centered around PubMed’s MEDLINE results.
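The outlier step described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the conventional modified z-score (scaling constant 0.6745, cutoff 3.5), with PubMed's count substituted for the median as the center. The function name, the cutoff, and all result counts below are hypothetical and are not data from the study.

```python
from statistics import median

# Assumed conventional values for modified z-scores; the study's own
# constants are not stated in the abstract.
SCALE = 0.6745
CUTOFF = 3.5


def modified_z_scores(reference, counts):
    """Score each count by its deviation from a reference count.

    The deviations are scaled by the median absolute deviation (MAD)
    taken around the reference rather than around the median.
    """
    deviations = [count - reference for count in counts]
    mad = median(abs(d) for d in deviations)
    if mad == 0:
        # No spread to scale by: identical counts score 0, any other
        # count is treated as infinitely far from the reference.
        return [0.0 if d == 0 else float("inf") * (1 if d > 0 else -1)
                for d in deviations]
    return [SCALE * d / mad for d in deviations]


# Hypothetical result counts for one search set (illustration only).
pubmed_count = 1000
platform_counts = {
    "ProQuest": 1010,
    "EBSCOhost": 1002,
    "Web of Science": 1500,
    "Ovid": 995,
}

scores = dict(zip(platform_counts,
                  modified_z_scores(pubmed_count, platform_counts.values())))
flagged = [name for name, score in scores.items() if abs(score) > CUTOFF]

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
print("Outliers:", flagged)
```

Centering on PubMed rather than on the median means each score expresses how far a platform departs from the PubMed baseline, which matches the comparison the study describes. The sign of the score shows whether a platform returned more or fewer records than PubMed.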

Results: Web of Science and ProQuest searches were the most likely to deviate from the equivalent PubMed searches. EBSCOhost and Ovid were less likely to deviate from PubMed searches. Ovid’s results were the most consistent with PubMed’s, but Ovid appeared to apply an indexing algorithm that produced smaller retrieval sets than equivalent PubMed searches. Web of Science exhibited problems with exploding, or not exploding, Medical Subject Headings (MeSH) terms.

Conclusion: Platform-specific enhancements affect record retrieval and challenge the expectation that MEDLINE platforms should, by default, be treated as MEDLINE. Substantial inconsistencies in search result counts, as demonstrated here, should raise concerns about the impact of platform-specific influences on search results.

This article has been approved for the Medical Library Association’s Independent Reading Program.

Author Biographies

Christopher Sean Burns, Associate Professor, School of Information Science, University of Kentucky, Lexington, KY


Robert M. Shapiro II, Assistant Professor, School of Information Science, University of Kentucky, Lexington, KY


Tyler Nix, Informationist, Taubman Health Sciences Library, University of Michigan, Ann Arbor, MI



Jeffrey T. Huber, Professor, School of Information Science, University of Kentucky, Lexington, KY




