PadChest-GR sets new bilingual benchmark for grounded radiology reporting

PadChest-GR debuts as the first multimodal, bilingual chest X-ray dataset, built to make AI-generated radiology reports easier to interpret and evaluate in clinical settings.

PadChest-GR, created by the University of Alicante in collaboration with Microsoft Research, University Hospital Sant Joan d’Alacant, and MedBravo, is presented as the world’s first multimodal, bilingual radiology dataset focused on chest X-rays. The dataset contains 4,555 chest X-ray studies, each paired with sentence-level descriptions in Spanish and English that cover both positive and negative findings, along with bounding box annotations that localize findings on the image. It marks a shift from traditional, unstructured radiology narratives to a systematic, grounded approach in which each statement in a report can be traced to a region of the image. That traceability is intended to support collaboration between clinicians and AI systems and to reduce the risk of fabricated or ambiguous interpretations.
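To make the idea of a grounded report concrete, the following is a minimal Python sketch of how one such study could be represented in memory. Every name in it (`BoundingBox`, `FindingSentence`, `GroundedStudy`, the field names, and the normalized 0 to 1 coordinate convention) is an assumption chosen for illustration and is not the dataset's published schema; the example sentences are invented, not taken from PadChest-GR.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BoundingBox:
    """Axis-aligned box in normalized image coordinates (assumed convention)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def __post_init__(self) -> None:
        # Reject boxes that are inverted or fall outside the image.
        if not (0.0 <= self.x_min < self.x_max <= 1.0):
            raise ValueError("x coordinates must satisfy 0 <= x_min < x_max <= 1")
        if not (0.0 <= self.y_min < self.y_max <= 1.0):
            raise ValueError("y coordinates must satisfy 0 <= y_min < y_max <= 1")


@dataclass
class FindingSentence:
    """One report sentence in both languages, optionally tied to image regions."""
    text_en: str
    text_es: str
    is_positive: bool  # True if the finding is present, False if it is ruled out
    boxes: List[BoundingBox] = field(default_factory=list)


@dataclass
class GroundedStudy:
    """A single chest X-ray study with its sentence-level grounded report."""
    study_id: str  # hypothetical identifier
    sentences: List[FindingSentence]

    def positive_findings(self) -> List[FindingSentence]:
        return [s for s in self.sentences if s.is_positive]

    def report(self, lang: str = "en") -> str:
        """Join the sentences into a plain-text report in the requested language."""
        if lang not in ("en", "es"):
            raise ValueError("lang must be 'en' or 'es'")
        return " ".join(s.text_es if lang == "es" else s.text_en for s in self.sentences)


# Illustrative record; the sentences and coordinates are made up.
example = GroundedStudy(
    study_id="demo-001",
    sentences=[
        FindingSentence(
            text_en="Right pleural effusion.",
            text_es="Derrame pleural derecho.",
            is_positive=True,
            boxes=[BoundingBox(0.10, 0.55, 0.45, 0.95)],
        ),
        FindingSentence(
            text_en="No pneumothorax.",
            text_es="Sin neumotórax.",
            is_positive=False,
        ),
    ],
)
```

Keeping the two language versions on the same sentence object, rather than in two separate reports, means any attached box applies to both texts at once, which is the property that makes a bilingual grounded report straightforward to evaluate sentence by sentence.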

The benchmark is designed to catalyze research on vision-language models for healthcare by offering the first public resource for training and evaluating fully grounded radiology reports in both English and Spanish. PadChest-GR underpins AI systems such as MAIRA-2, a state-of-the-art model developed by Microsoft Research for interpretable report generation. To build the dataset, the team used GPT-4 via Microsoft Azure OpenAI Service for sentence extraction and translation, and expert radiologists reviewed and annotated the results through the Centaur Labs platform to keep the annotations accurate and clinically relevant.

Collaboration was central to realizing PadChest-GR, which combined the technical strengths of Microsoft Research with the clinical insight of the University of Alicante. The annotation protocol enforced consistency and accuracy, yielding a robust resource for developing grounded radiology reporting models. The dataset is already supporting published research and informing new models and evaluation standards. The collaborators emphasize the importance of open scientific collaboration and expect that broad access to PadChest-GR will accelerate progress in medical imaging AI and ultimately improve patient care. The resource is now available to the global research community.
