Y Combinator backs new wave of computer vision startups in 2026

Y Combinator’s 2026 computer vision cohort spans infrastructure, developer tools, and industry-specific applications from retail security to aquaculture and healthcare. Startups are increasingly pairing computer vision with large vision language models and foundation models to tackle real-time video, automation, and domain-specific analysis.

Y Combinator’s latest batch of computer vision startups reflects a shift toward systems that combine real-time video understanding, foundation models, and automation across both digital and physical environments. Overshoot targets developers building real-time vision applications: it connects live video feeds to what it describes as the largest collection of vision language models (VLMs) with three lines of code and returns responses in under 200 ms, which it claims is 10x faster than any existing inference platform. OnDeck positions itself as infrastructure for VLMs, letting enterprises find any object, behavior, or event in footage without training models or collecting data; the company published a NeurIPS workshop paper reporting that new VLM-based methods beat traditional computer vision even at niche tasks.

Eventual and Encord focus on the data layer. Eventual’s Daft engine is built for petabytes of multimodal data, while Encord offers human-in-the-loop labeling, testing, and curation for AI teams shipping models to production.

Several startups are applying computer vision to operational problems in specific verticals. In retail, Lexius turns store cameras into an AI security guard that detects shoplifting and organized crime and automates case generation. Kirana AI deploys an on-premise GPU to monitor grocery stores 24/7 for theft, safety, and customer service issues, and plans to automate ordering and pricing through point-of-sale integrations. In aquaculture, OctaPulse uses vision to automate hatchery QA in fish farms, cutting inspection time from about 5 minutes to under 30 seconds per fish with more than 90 percent accuracy. It targets the $300B aquaculture industry, where farms can spend more than $200K a year on technicians and geneticists.

In healthcare, Mecha Health builds x-ray foundation models that it says beat Microsoft, Google, and OpenAI on clinical accuracy metrics, helping radiologists go from reading one scan per hour to one scan every 5 minutes; it frames x-ray report generation as a 40B+ market. MICSI offers AI software that it says doubles MRI resolution and halves scan time, and it estimates an additional 2 million in revenue per MRI scanner.

Manufacturing and industrial automation emerge as another major theme. Allus AI develops vision foundation models that let factories see and improve production in real time, and Bucket Robotics converts CAD and sample data into production-ready models that adapt as parts and lines change, targeting what it calls a $700B automation push in American manufacturing. LineWise and Optifye.ai capture live video on the shop floor, either to extract know-how into structured standard operating procedures or to monitor worker performance in real time. Cerrion uses standard CCTV cameras to detect bottlenecks, citing a Pepsi supplier whose line produces 500 bottles per minute and now reacts automatically to fallen bottles.

Other startups extend computer vision into public safety and infrastructure: EdgeTrace unifies camera networks for public safety teams, DeepNight builds next-generation night vision with AI, and Mach9 transforms geospatial imagery into insights for urban development.

Across the cohort, many teams bring experience from companies such as Uber, Meta, and AssemblyAI and from self-driving car programs. They frequently combine computer vision with AI agents, generative models, and multimodal analytics to automate tasks that were previously manual, slow, or impossible at scale.


