Navigating the minefield of AI in healthcare: Balancing innovation with accuracy


In a recent ‘Fast Facts’ article published in the journal BMJ, researchers discuss recent advances in generative artificial intelligence (AI), the importance of the technology in the world today, and the potential dangers that need to be addressed before large language models (LLMs) such as ChatGPT can become the trustworthy sources of factual information that many users already assume them to be.

BMJ Fast Facts: Quality and safety of artificial intelligence generated health information. Image Credit: Le Panda / Shutterstock

What is generative AI? 

‘Generative artificial intelligence (AI)’ is a subset of AI models that create context-dependent content (text, images, audio, and video) and form the basis of the natural language models powering AI assistants (Google Assistant, Amazon Alexa, and Siri) and productivity applications including ChatGPT and Grammarly AI. This technology represents one of the fastest-growing sectors in digital computation and has the potential to substantially advance many aspects of society, including healthcare and medical research.

Unfortunately, advancements in generative AI, especially large language models (LLMs) like ChatGPT, have far outpaced ethical and safety checks, introducing the potential for severe consequences, both accidental and deliberate (malicious). Research estimates that more than 70% of people use the internet as their primary source of health and medical information, with more individuals tapping into LLMs such as Gemini, ChatGPT, and Copilot with their queries each day. The present article focuses on three vulnerable aspects of AI, namely AI errors, health disinformation, and privacy concerns. It highlights the efforts of novel disciplines, including AI Safety and Ethical AI, in addressing these vulnerabilities.

AI errors

Errors in data processing are a common challenge across all AI technologies. As input datasets become more extensive and model outputs (text, audio, pictures, or video) become more sophisticated, erroneous or misleading information becomes increasingly challenging to detect.

“The phenomenon of ‘AI hallucination’ has gained prominence with the widespread use of AI chatbots (e.g., ChatGPT) powered by LLMs. In the health information context, AI hallucinations are particularly concerning because individuals may receive incorrect or misleading health information from LLMs that are presented as fact.”

For lay members of society unable to distinguish factual from inaccurate information, these errors can become very costly very fast, especially in cases of erroneous medical information. Even trained medical professionals may be affected by these errors, given the growing amount of research that uses LLMs and generative AI for data analysis.

Thankfully, numerous technological strategies aimed at mitigating AI errors are currently being developed. The most promising involves developing generative AI models that “ground” themselves in information derived from credible and authoritative sources. Another method is incorporating ‘uncertainty’ into the AI model’s output: alongside each result, the model also reports its degree of confidence in the validity of the information presented, allowing the user to consult credible information repositories in instances of high uncertainty. Some generative AI models already incorporate citations as part of their results, thereby encouraging the user to educate themselves further before accepting the model’s output at face value.
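The uncertainty-reporting idea described above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in rather than a real LLM API: the model is assumed to return an answer together with a confidence score between 0 and 1, and the threshold is an arbitrary illustrative choice.

```python
# Sketch: attach a confidence score to a model's answer and defer to
# authoritative sources when confidence is low. The model interface and
# the toy model below are illustrative assumptions, not a real LLM API.

CONFIDENCE_THRESHOLD = 0.75  # arbitrary cut-off for this illustration

def answer_with_uncertainty(question, model):
    """Return the model's answer, flagging it when confidence is low."""
    answer, confidence = model(question)  # assumed: (text, score in 0..1)
    if confidence < CONFIDENCE_THRESHOLD:
        return (f"{answer}\n[Low confidence ({confidence:.2f}): "
                "please verify with a credible source.]")
    return answer

# Toy stand-in "model" for demonstration only.
def toy_model(question):
    known = {
        "What is a normal resting heart rate?":
            ("About 60-100 beats per minute for adults.", 0.95),
    }
    return known.get(question, ("I am not sure.", 0.30))

print(answer_with_uncertainty("What is a normal resting heart rate?", toy_model))
```

A real system would derive the confidence score from the model itself (for example, from token probabilities or an auxiliary verifier) rather than from a hand-written lookup table.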

Health disinformation

Disinformation is distinct from AI hallucinations in that the latter is accidental and inadvertent, while the former is deliberate and malicious. While the practice of disinformation is as old as human society itself, generative AI presents an unprecedented platform for the generation of ‘diverse, high-quality, targeted disinformation at scale’ at almost no financial cost to the malicious actor.

“One option for preventing AI-generated health disinformation involves fine-tuning models to align with human values and preferences, including avoiding known harmful or disinformation responses from being generated. An alternative is to build a specialized model (separate from the generative AI model) to detect inappropriate or harmful requests and responses.”

While both of the above techniques are viable in the war against disinformation, they remain experimental and model-side. To prevent inaccurate data from ever reaching the model for processing, initiatives such as digital watermarks, designed to validate accurate data and flag AI-generated content, are currently in the works. Equally important, the establishment of AI vigilance agencies would be required before AI can be unquestioningly trusted as a robust information delivery system.
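The “specialized model to detect inappropriate or harmful requests” from the quoted passage can be illustrated with a deliberately simple sketch. Real safety systems use trained classifiers, not keyword lists; the patterns and messages below are illustrative assumptions only.

```python
# Toy illustration of a separate safety screen placed in front of a
# generative model. Production systems use trained classifier models;
# this keyword-pattern list is a stand-in for demonstration purposes.
import re

BLOCKED_PATTERNS = [
    r"\bfake (?:study|trial|statistics)\b",
    r"\bdisinformation campaign\b",
    r"\bfabricate .* evidence\b",
]

def is_request_allowed(request: str) -> bool:
    """Return False if the request matches a known-harmful pattern."""
    lowered = request.lower()
    return not any(re.search(p, lowered) for p in BLOCKED_PATTERNS)

def guarded_generate(request, model):
    """Only pass the request to the generative model if it is allowed."""
    if not is_request_allowed(request):
        return "Request declined: potential health disinformation."
    return model(request)
```

The design point is the separation of concerns: the screening component can be audited, updated, and tested independently of the generative model it protects.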

Privacy and bias

Data used for generative AI model training, especially medical data, must be screened to ensure that no identifiable information is included, thereby respecting the privacy of users and of the patients whose data the models were trained on. For crowdsourced data, AI models usually include privacy terms and conditions, and contributors must abide by these terms and avoid providing information that can be traced back to the individual in question.
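Screening text for identifiable information before training can be sketched as a simple redaction pass. Real de-identification pipelines are far more thorough (names, dates, locations, medical record numbers); the regular expressions below cover only a few obvious patterns and are illustrative assumptions.

```python
# Minimal sketch of redacting identifiable information from training text.
# These patterns are illustrative only; production de-identification uses
# much more comprehensive, often model-based, pipelines.
import re

PII_PATTERNS = {
    "EMAIL": r"[\w.+-]+@[\w-]+\.[\w.]+",
    "PHONE": r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b",
    "SSN":   r"\b\d{3}-\d{2}-\d{4}\b",
}

def redact_pii(text: str) -> str:
    """Replace matched identifiers with placeholder tags."""
    for label, pattern in PII_PATTERNS.items():
        text = re.sub(pattern, f"[{label}]", text)
    return text

print(redact_pii("Contact jane.doe@example.com or 555-123-4567."))
```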

Bias is the inherent risk that an AI model will skew its outputs according to the model’s training source material. Most AI models are trained on extensive datasets, usually obtained from the internet.

“Despite efforts by developers to mitigate biases, it remains challenging to fully identify and understand the biases of accessible LLMs owing to a lack of transparency about the training data and process. Ultimately, strategies aimed at minimizing these risks include exercising greater discretion in the selection of training data, thorough auditing of generative AI outputs, and taking corrective steps to minimize biases identified.”
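The “thorough auditing of generative AI outputs” mentioned in the quote can be illustrated with a toy frequency count: tally how often terms of interest appear across a sample of model outputs to surface possible skew. The sample texts and term lists here are invented for illustration.

```python
# Toy output audit: count word-level occurrences of chosen terms across a
# sample of generated outputs to surface possible skew. The sample texts
# and term list are illustrative assumptions, not a real audit protocol.
import re
from collections import Counter

def audit_outputs(outputs, terms):
    """Count whole-word occurrences of each term across all outputs."""
    counts = Counter({t: 0 for t in terms})
    for text in outputs:
        tokens = re.findall(r"\b\w+\b", text.lower())
        for term in terms:
            counts[term] += tokens.count(term)
    return counts

sample = [
    "The doctor said he would call back.",
    "The doctor said he was busy.",
    "The doctor said she would call back.",
]
print(audit_outputs(sample, ["he", "she"]))
```

Real audits compare such counts against a reference distribution and across many prompts, but even this toy version shows how skew in generated text can be made measurable.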


Generative AI models, the most popular of which include LLMs such as ChatGPT, Microsoft Copilot, Gemini AI, and Sora, represent some of the best human productivity enhancements of the modern age. Unfortunately, advancements in these fields have far outpaced credibility checks, resulting in the potential for errors, disinformation, and bias, which could lead to severe consequences, especially in healthcare. The present article summarizes some of the dangers of generative AI in its current form and highlights techniques under development to mitigate these dangers.

Journal reference:

  • Sorich, M. J., Menz, B. D., & Hopkins, A. M. (2024). Quality and safety of artificial intelligence generated health information. BMJ, q596. DOI: 10.1136/bmj.q596

