GPT-4o generates near-authentic retinal fundus images

OpenAI's GPT-4o now produces highly realistic synthetic fundus images, raising both promise and questions for medical imaging and artificial intelligence research.

On March 25, 2025, OpenAI introduced "ChatGPT-4o Image Generation," a new feature enabling text-to-image synthesis within the large language model GPT-4o. This innovation brought notable advances in image photorealism and prompt adherence compared to previous models. Researchers immediately tested the system's capabilities to generate realistic ophthalmological images, a domain traditionally challenging for large language models owing to their limited interpretation skills in medical imaging.

The team began by disabling the ChatGPT Memory feature to ensure a clean session, then asked GPT-4o to generate an image of a healthy retinal fundus. The initial output looked convincing at first glance, but detailed scrutiny revealed subtle yet important anomalies: the retinal background was overly uniform, lacking the intricate choroidal vascular patterns typical of real fundus images. Blood vessel structure was also atypical, showing unnatural crossings, excessive axial light reflexes, and abrupt caliber changes. These features mark the image as synthetic despite its overall realism.

To further enhance authenticity, the researchers uploaded a real fundus photograph as a reference and prompted GPT-4o to generate an image as similar as possible to the original. The resulting synthetic image displayed improved visual fidelity. Features like choroidal vasculature became evident, and vessel configuration better matched normal anatomy, although some artifacts persisted, such as a reduced optic disc cup and continued exaggeration of the light reflex. The study suggests that prompting strategies that specify age, macula, or vessel appearance in greater detail might yield superior results.
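As a rough illustration of what a more detailed prompt could look like, the sketch below assembles a text prompt from a handful of attributes. It is not the prompt used in the study: the class name `FundusPromptSpec`, the function `build_fundus_prompt`, the default values, and the wording are all hypothetical, and the code only builds a string that a user might paste into the ChatGPT interface. It makes no call to any image-generation service.

```python
from dataclasses import dataclass


@dataclass
class FundusPromptSpec:
    """Hypothetical container for details a prompt might spell out."""
    age: int = 40
    eye: str = "right"
    macula: str = "a normal macula with a visible foveal reflex"
    vessels: str = (
        "vessels that taper gradually, without abrupt caliber changes "
        "or exaggerated light reflexes"
    )
    background: str = "a faintly visible choroidal vascular pattern"


def build_fundus_prompt(spec: FundusPromptSpec) -> str:
    """Assemble a single text prompt from the fields of `spec`.

    The phrasing is an assumption made for illustration only.
    """
    return (
        f"Generate a color fundus photograph of the healthy {spec.eye} eye "
        f"of a {spec.age}-year-old patient, showing {spec.macula}, "
        f"{spec.vessels}, and {spec.background}."
    )


if __name__ == "__main__":
    # Override only the attributes of interest; the rest keep their defaults.
    print(build_fundus_prompt(FundusPromptSpec(age=62, eye="left")))
```

Keeping each attribute in its own field makes it easy to vary one detail at a time and compare the generated images, which is one way the effect of prompt detail could be explored.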

This work arrives amid a surge in using deep learning models, particularly generative adversarial networks (GANs), to synthesize high-quality images for training data augmentation in ophthalmology. GANs, however, require substantial technical expertise and computational resources. By contrast, GPT-4o and similar language model-based generators promise a faster and more accessible route for image synthesis. The authors note that this is the first documented instance of a publicly accessible large language model producing high-resolution, authentic-looking retinal photographs. Nonetheless, whether such images can serve reliably in training clinical deep learning systems or pass professional diagnostic muster remains an open question, and further research is warranted to determine their real-world utility.
