A new study from Mount Sinai highlights an important weakness: large language models, the systems behind AI chat tools, can still accept and repeat false medical information. This happens most often when the falsehood sounds professional and confident, as if a doctor wrote it.
In the study, the models were tested with 158,000 prompts. In 50,108 of those cases, almost one in three, they accepted or repeated fabricated medical data, an error rate of roughly 32 percent.
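The rate follows directly from the two figures the study reports:

    50,108 / 158,000 ≈ 0.317, or roughly 32 percent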
Here’s the surprising part: when the same false claims were written in a sloppy or obviously flawed way, the models were less likely to accept them. In other words, the AI was fooled more easily by misinformation that sounded polished and official.
The lesson is simple: making models bigger will not fix this on its own. Safer AI will need systems that check claims against trusted sources, and clearer rules that teach models to separate confident wording from actual truth.
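To make that idea concrete, here is a minimal, purely illustrative sketch of such a verification layer, written in Python. Nothing below comes from the study: the TRUSTED_SOURCES list, the sounds_confident heuristic, and the verify function are all hypothetical placeholders meant only to show the principle that tone should never substitute for a source match.

    # Hypothetical sketch of a fact-verification layer for model output.
    # Everything here (the source list, the tone heuristic, the matching
    # rule) is an illustrative placeholder, not the study's method.

    TRUSTED_SOURCES = {
        "insulin lowers blood glucose",
        "aspirin is an antiplatelet agent",
    }

    def sounds_confident(claim: str) -> bool:
        # Professional-sounding phrasing is treated as a warning sign,
        # never as evidence of truth.
        markers = ("studies show", "it is well established", "clinicians agree")
        return any(marker in claim.lower() for marker in markers)

    def verify(claim: str) -> str:
        # Pass a claim only if it matches a trusted source; tone is ignored.
        if claim.lower().strip(" .") in TRUSTED_SOURCES:
            return "verified"
        if sounds_confident(claim):
            # Confident wording with no source match is exactly the failure
            # mode the study describes, so it is flagged, not trusted.
            return "unverified (confident tone, no source match): needs review"
        return "unverified: needs review"

    print(verify("Insulin lowers blood glucose."))
    print(verify("It is well established that aspirin cures the flu."))

The design choice worth noticing: the confident-tone check can only demote a claim, never promote one. Matching a trusted source is the only path to "verified".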