Artificial Intelligence model trained on NVIDIA GPUs identifies over a million species

Tanya Berger-Wolf’s BioCLIP 2 is an Artificial Intelligence foundation model trained on NVIDIA GPUs with a dataset of 214 million images spanning more than 925,000 taxonomic classes. The model is available as open source on Hugging Face and will be presented at NeurIPS.

Tanya Berger-Wolf, director of the Translational Data Analytics Institute and a professor at The Ohio State University, has developed BioCLIP 2, an Artificial Intelligence foundation model trained on the largest and most diverse organism dataset to date. The project grew from Berger-Wolf’s early work identifying individual zebras and now targets broad biodiversity tasks. The underlying dataset, TREEOFLIFE-200M, contains 214 million images spanning more than 925,000 taxonomic classes and was curated with partners including the Imageomics Institute and the Smithsonian Institution.

BioCLIP 2 goes beyond image recognition to learn species traits, taxonomic hierarchies and intra-species variation without explicit labels. In experiments, the model arranged Darwin’s finches by beak size, distinguished adult from juvenile and male from female animals, and separated healthy plant leaves from diseased ones while also identifying different disease types. The team trained the model for 10 days on 32 NVIDIA H100 GPUs, used a cluster of 64 NVIDIA Tensor Core GPUs to accelerate training, and ran inference on individual Tensor Core GPUs. The model is available under an open-source license on Hugging Face and was downloaded more than 45,000 times in the last month. BioCLIP 2 builds on an earlier version, also trained on NVIDIA GPUs, that received the Best Student Paper award at CVPR.
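For readers curious how a CLIP-style model can separate categories such as healthy and diseased leaves without a dedicated classifier, the sketch below shows the general idea: an image embedding is compared with text-label embeddings by cosine similarity, and the similarities are turned into probabilities. This is an illustration under stated assumptions, not the BioCLIP 2 code. The three-dimensional vectors, the label names, the `zero_shot_scores` helper and the temperature value are all made up for the example; a real model would produce much larger embeddings from its own image and text encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_scores(image_vec, label_vecs, temperature=0.07):
    """Softmax over temperature-scaled cosine similarities.

    `temperature` is an assumed value chosen for illustration only.
    """
    sims = {
        label: cosine(image_vec, vec) / temperature
        for label, vec in label_vecs.items()
    }
    top = max(sims.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in sims.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Toy, hand-written embeddings (hypothetical, not model outputs).
label_vecs = {
    "healthy leaf": [0.9, 0.1, 0.0],
    "diseased leaf": [0.1, 0.9, 0.2],
}
image_vec = [0.8, 0.2, 0.1]

scores = zero_shot_scores(image_vec, label_vecs)
best = max(scores, key=scores.get)
print(best, scores)
```

With these toy vectors the image embedding lies closer to the "healthy leaf" label, so that label receives the higher score. Swapping in a different set of label embeddings changes the candidate categories without retraining anything, which is what makes this style of model convenient for open-ended tasks.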

The researchers plan to extend the work into interactive wildlife digital twins that visualize and simulate ecological interactions and species perspectives, enabling low-impact experimentation and public-facing experiences such as zoo installations. Berger-Wolf and her team emphasize that these tools can help address data deficiencies in conservation biology by serving as a biological encyclopedia and inference platform to fill gaps for under-documented species. The BioCLIP 2 paper will be presented at NeurIPS, with sessions scheduled Nov. 30 to Dec. 5 in Mexico City and Dec. 2 to 7 in San Diego.
