Raspberry Pi has announced the Raspberry Pi AI HAT+ 2, an add-on board for the Raspberry Pi 5 designed to run generative artificial intelligence (AI) workloads entirely on device. The new HAT addresses a key limitation of the original AI HAT+, which excelled at vision tasks such as object detection, pose estimation, and scene segmentation but could not run generative models. By keeping all processing on the Raspberry Pi and removing the need for network connectivity or cloud-based AI services, the AI HAT+ 2 continues the platform's focus on privacy, security, and cost efficiency.
The Raspberry Pi AI HAT+ 2 is built around the Hailo-10H neural network accelerator and delivers 40 TOPS (INT4) of inferencing performance, enough to run generative AI workloads smoothly on the Raspberry Pi 5. The board includes 8GB of dedicated on-board RAM, which lets the accelerator handle much larger models than before and allows the Hailo-10H to accelerate large language models (LLMs), vision language models, and other generative AI applications. For vision workloads such as YOLO object detection, pose estimation, and scene segmentation, its computer vision performance is broadly equivalent to the earlier 26 TOPS AI HAT+, helped by the added memory and reuse of the existing camera software stack, including libcamera, rpicam-apps, and Picamera2. Users already working with the previous HAT will find that most software integration carries over unchanged, although model files must be recompiled for the Hailo-10H NPU.
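Because the camera stack carries over, existing Picamera2 capture code can feed frames to models on the new HAT largely unchanged. The minimal sketch below shows only the capture side; the inference call is a hypothetical placeholder, since the specifics of the Hailo runtime API are not covered here.

```python
# Minimal frame-capture sketch with Picamera2, the same camera stack used with
# the original AI HAT+. Only the capture side is real; `run_inference` is a
# hypothetical placeholder for the Hailo runtime call, not an actual SDK API.
from picamera2 import Picamera2

picam2 = Picamera2()
# Request RGB frames at a resolution typical for detection models.
config = picam2.create_video_configuration(
    main={"size": (640, 640), "format": "RGB888"}
)
picam2.configure(config)
picam2.start()

try:
    for _ in range(100):
        frame = picam2.capture_array()      # numpy array, shape (640, 640, 3)
        # detections = run_inference(frame) # hypothetical Hailo inference call
finally:
    picam2.stop()
```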
At launch, several LLMs are available to install, including DeepSeek-R1-Distill (1.5 billion parameters), Llama 3.2 (1 billion), Qwen2.5-Coder (1.5 billion), Qwen2.5-Instruct (1.5 billion), and Qwen2 (1.5 billion), with more and larger models planned soon after release. Example applications use the hailo-ollama LLM backend from Hailo’s Developer Zone together with the Open WebUI frontend to provide a browser-based chat interface, all running locally on a Raspberry Pi AI HAT+ 2 attached to a Raspberry Pi 5. Demonstrations include Qwen2 for question answering, Qwen2.5-Coder for programming tasks, Qwen2 for simple French-to-English translation, and a vision language model for describing live camera scenes. Raspberry Pi notes that cloud-based LLMs from OpenAI, Meta, and Anthropic range from 500 billion to 2 trillion parameters, while the edge-focused models sized to fit in the AI HAT+ 2’s on-board RAM typically run at 1 to 7 billion parameters and are intended to operate over constrained domains rather than match the full knowledge of large cloud models.
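Since hailo-ollama is described as an Ollama-style backend, a client can presumably talk to it over an Ollama-compatible REST API. The sketch below assumes the standard Ollama endpoint on its default port (11434) and an illustrative model tag; the actual host, port, and tag for a given installation may differ.

```python
# Minimal chat client sketch against an Ollama-compatible server such as
# hailo-ollama. The URL, port, and model tag below are illustrative
# assumptions, not confirmed values from Hailo's documentation.
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default endpoint

def ask(prompt: str, model: str = "qwen2:1.5b") -> str:
    """Send a single prompt to the local LLM server and return its reply."""
    resp = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    print(ask("Translate to English: Le chat dort sur le canapé."))
```

Open WebUI offers the same interaction through a browser; this script is simply a minimal programmatic equivalent.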
To address these limits, Raspberry Pi highlights fine-tuning as a way to adapt models to specific tasks. As with the original AI HAT+, users can retrain visual models such as YOLO on image datasets matched to their applications and compile the results for the accelerator with the Hailo Dataflow Compiler. The AI HAT+ 2 also supports fine-tuning of language models based on Low-Rank Adaptation (LoRA), so users can create adapters for task-specific customization while keeping most base-model parameters frozen and then run the adapted models on the HAT; two short sketches of these workflows appear below. The Raspberry Pi AI HAT+ 2 is available now at $130, and Raspberry Pi directs users to its AI HAT guide for setup instructions. Hailo’s GitHub repository and Developer Zone provide examples, demos, and frameworks for vision and generative AI applications, including vision language models, voice assistants, and speech recognition, as well as documentation, tutorials, and downloads for the Dataflow Compiler and the hailo-ollama server.
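As an illustration of the vision retraining path, the sketch below uses the Ultralytics YOLO API with a hypothetical dataset file; compiling the exported model for the Hailo-10H with the Dataflow Compiler is a separate, vendor-specific step not shown here.

```python
# Minimal YOLO retraining sketch using the Ultralytics API. `my_dataset.yaml`
# is a hypothetical dataset definition; the ONNX export would then be compiled
# for the Hailo-10H with the Hailo Dataflow Compiler (not shown).
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                     # pretrained base model
model.train(data="my_dataset.yaml", epochs=50) # fine-tune on custom images
model.export(format="onnx")                    # ONNX as input to the Hailo toolchain
```

For the language-model path, the following sketch shows LoRA adapter creation with the Hugging Face peft library, which freezes the base weights and trains only small injected matrices. The checkpoint name and hyperparameters are illustrative assumptions, and deploying the adapted model to the HAT again goes through Hailo's toolchain.

```python
# Minimal LoRA setup sketch with the Hugging Face peft library. This shows the
# general LoRA workflow only; model name and hyperparameters are illustrative,
# and conversion for the Hailo-10H is a separate vendor-specific step.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "Qwen/Qwen2-1.5B"  # assumed checkpoint matching the on-device model
tokenizer = AutoTokenizer.from_pretrained(base)  # used to prepare training data (not shown)
model = AutoModelForCausalLM.from_pretrained(base)

lora_cfg = LoraConfig(
    r=8,                                  # adapter rank: size of injected matrices
    lora_alpha=16,                        # scaling factor for adapter updates
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # base weights stay frozen; only adapters train
```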
