Google and DeepMind have unveiled VaultGemma, a one-billion-parameter AI language model trained from scratch with differential privacy. The teams describe VaultGemma as the largest open-source model of its kind and say it is freely available on Hugging Face and Kaggle. The model is positioned for research and practical deployments in sensitive areas such as healthcare and finance, where keeping training and user data private is a primary concern.
VaultGemma runs on Google's Gemma architecture and uses Multi-Query Attention with a 1,024-token context window, a configuration intended to balance inference speed with privacy protections. Training was conducted across 2,048 TPUv6e chips and followed new privacy scaling rules. According to the teams, the training process matched predicted accuracy targets while preserving stronger privacy guarantees than models that apply privacy measures only after pre-training.
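For readers unfamiliar with Multi-Query Attention, the idea is that every query head shares a single key projection and a single value projection, which shrinks the key/value cache at inference time. The NumPy sketch below illustrates that general technique only. The function name, weight shapes, and causal mask are assumptions chosen for illustration and are not taken from VaultGemma's actual implementation.

```python
import numpy as np

def multi_query_attention(x, Wq, Wk, Wv, n_heads):
    """Illustrative causal multi-query attention (hypothetical helper).

    x:  (seq, d_model) input activations
    Wq: (d_model, n_heads * head_dim) per-head query projections
    Wk: (d_model, head_dim) single key projection shared by all heads
    Wv: (d_model, head_dim) single value projection shared by all heads
    """
    seq = x.shape[0]
    head_dim = Wk.shape[1]

    # Each head gets its own queries: (n_heads, seq, head_dim).
    q = (x @ Wq).reshape(seq, n_heads, head_dim).transpose(1, 0, 2)
    # Keys and values are computed once and reused by every head.
    k = x @ Wk  # (seq, head_dim)
    v = x @ Wv  # (seq, head_dim)

    # Scaled dot-product scores: (n_heads, seq, seq).
    scores = q @ k.T / np.sqrt(head_dim)
    # Causal mask: a position may not attend to later positions.
    future = np.triu(np.ones((seq, seq), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # Weighted sum of the shared values, then concatenate the heads.
    out = weights @ v  # (n_heads, seq, head_dim)
    return out.transpose(1, 0, 2).reshape(seq, n_heads * head_dim)
```

Because only one key tensor and one value tensor are stored per layer, the cache is smaller than in standard multi-head attention by a factor equal to the number of heads, which is the usual motivation for the design.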
On standard benchmarks such as ARC-C and TriviaQA, VaultGemma scores roughly on par with non-private models from five years ago, a performance level described as solid though not state of the art. The crucial distinction is that VaultGemma bakes differential privacy into pre-training rather than applying it only at the end, which reduces the risk that the model memorizes or leaks training data. Google and DeepMind frame the release as a new standard for privacy-first AI models that could be especially valuable for developers and organizations that must meet strict data protection requirements.
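To make "differential privacy during training" concrete, a widely used approach is differentially private SGD (DP-SGD): each example's gradient is clipped to a fixed norm, and Gaussian noise scaled to that bound is added before the update. The article does not spell out VaultGemma's exact training recipe, so the sketch below is a generic illustration on a toy linear-regression problem. The function name and the parameters `clip_norm` and `noise_multiplier` are assumptions for this example.

```python
import numpy as np

def dp_sgd_step(w, X, y, lr, clip_norm, noise_multiplier, rng):
    """One illustrative DP-SGD step for least-squares regression (toy example)."""
    # Per-example gradients of 0.5 * (x . w - y)^2, shape (batch, dim).
    residuals = X @ w - y
    per_example_grads = residuals[:, None] * X

    # Clip each example's gradient so its L2 norm is at most clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale

    # Add Gaussian noise calibrated to the clipping bound, then average.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    noisy_mean = (clipped.sum(axis=0) + noise) / X.shape[0]

    return w - lr * noisy_mean
```

Clipping bounds how much any single training example can move the weights, and the noise masks what influence remains. This is why the privacy guarantee covers training itself rather than being added afterwards. The guarantee costs some accuracy, consistent with the benchmark gap described above.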