ORCID

0009-0004-8759-6720 (Nelson, Houston), 0009-0003-1139-4121 (Beauchamp), 0000-0002-9760-7639 (Pace)

Document Type

Article

Publication Date

2025

DOI

10.7759/cureus.86543

Publication Title

Cureus

Volume

17

Issue

6

Pages

e86543

Abstract

Background: The internet has become a primary source of health information for the public, with important implications for patient decision-making and public health outcomes. However, the quality and readability of this content vary widely. With the rise of generative artificial intelligence (AI) tools such as ChatGPT and Gemini, new challenges and opportunities have emerged in how patients access and interpret medical information.

Objective: To evaluate and compare the quality, credibility, and readability of consumer health information provided by traditional search engines (Google, Bing) and generative AI platforms (ChatGPT, Gemini) using three validated instruments: DISCERN, JAMA Benchmark Criteria, and Flesch-Kincaid Readability Metrics.

Methods: Twenty health-related webpages from each platform were collected using a standardized query across Google, Bing, Gemini, and ChatGPT. Each source was assessed independently by two reviewers using the DISCERN instrument and the adapted JAMA benchmark criteria. Readability was evaluated using the Flesch Reading Ease and Grade Level scores. One-way ANOVA with Bonferroni correction was used to compare platform performance, and Cohen's Kappa measured inter-rater reliability.

Results: Google achieved the highest mean scores for both quality and credibility (DISCERN: 3.33 ± 0.53; JAMA: 3.70 ± 0.44), followed by Bing, Gemini, and ChatGPT. ChatGPT received the lowest scores across all quality measures. Readability analysis revealed no statistically significant differences between platforms; however, all content exceeded recommended reading levels for public health information. Cohen's Kappa indicated strong inter-rater agreement across DISCERN items.

Conclusion: Google remains the most reliable source of high-quality, readable health information among the evaluated platforms. Generative AI tools such as ChatGPT and Gemini, while increasingly popular, exhibited significant limitations in accuracy, transparency, and complexity. These findings highlight the need for improved oversight, transparency, and user education regarding AI-generated health content.

Rights

© Copyright 2025 Nelson et al.

This is an open access article distributed under the terms of the Creative Commons Attribution License CC-BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Original Publication Citation

Nelson, H. C., Beauchamp, M. T., & Pace, A. A. (2025). The reliability gap: How traditional search engines outperform artificial intelligence (AI) chatbots in rosacea public health information quality. Cureus, 17(6), Article e86543. https://doi.org/10.7759/cureus.86543
