The GenAI paradox: superhuman models but mixed success with enterprise Artificial Intelligence

Frontier models are showing superhuman performance on targeted tasks, yet enterprise adoption is lagging because the bottleneck is integration and process rather than model quality. New studies of GPT-5 and voice agents demonstrate both the capability and a clear route to practical impact once workflows are redesigned.

Frontier Artificial Intelligence models continue to advance on high‑stakes benchmarks while real‑world enterprise returns remain uneven. A controlled evaluation of GPT‑5 on multimodal medical reasoning found large gains over GPT‑4o on the MedXpertQA benchmark (+29.26% in reasoning and +26.18% in understanding) and reported performance above pre‑licensed human experts (+24.23% and +29.40% respectively). The paper also noted a nuance: on the smaller VQA‑RAD dataset, GPT‑5‑mini slightly outperformed the full GPT‑5 model, suggesting that right‑sizing can sometimes beat brute‑force scaling for niche tasks.

A separate large field experiment examined voice agents in hiring. It randomized more than 70,000 applicants for customer service roles in the Philippines to one of three conditions: a human interviewer, an Artificial Intelligence voice agent, or a choice between the two. The AI‑led interviews produced materially better hiring outcomes: 12% more job offers, 18% more job starts, and 17% higher 30‑day retention. When given a choice, 78% of applicants chose the AI interviewer. Transcript analysis pointed to greater consistency across interviews as the likely mechanism, and reports of gender discrimination by interviewers were nearly halved in the AI condition.

These capability wins contrast with the enterprise adoption picture described in The GenAI Divide, a report from MIT’s NANDA initiative. The report finds that only about 5% of corporate Artificial Intelligence pilots drive rapid revenue acceleration, and that most pilots stall because of a learning and integration gap rather than model quality. Purchasing specialized tools and partnering succeed roughly two‑thirds of the time, while internal builds succeed only about one‑third as often.

The newsletter draws five practical lessons: treat adoption as a process problem; separate interaction from adjudication so humans make final decisions; redesign workflows for consistent, auditable signal capture; right‑size models to the task; and favor buy‑then‑integrate when speed and reliability are critical. The piece also flags related industry moves such as NVIDIA’s Granary dataset and model updates from Anthropic and Mistral, underscoring a fast‑moving technical landscape alongside persistent organizational challenges.
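The lesson about separating interaction from adjudication, with consistent and auditable signal capture, can be illustrated with a minimal Python sketch. Nothing here comes from the studies or the report: the names `Signal`, `CaseRecord`, `capture_signals`, and `adjudicate`, and the example question IDs, are hypothetical and chosen only for illustration. The assumed idea is that an automated step asks every case the same scripted questions and logs the answers, while only a named human reviewer can record the outcome.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Signal:
    """One structured observation captured during the automated interaction."""
    question_id: str
    answer: str
    captured_at: str  # ISO 8601 timestamp in UTC


@dataclass
class CaseRecord:
    """Audit trail for one case: captured signals plus the human decision."""
    case_id: str
    signals: List[Signal] = field(default_factory=list)
    decision: Optional[str] = None
    decided_by: Optional[str] = None


def capture_signals(case_id: str, answers: Dict[str, str],
                    script: List[str]) -> CaseRecord:
    """Interaction step (hypothetical): record an answer for every scripted
    question, in the same order for every case. Missing answers are stored
    as empty strings so gaps stay visible in the audit trail."""
    record = CaseRecord(case_id=case_id)
    for question_id in script:
        record.signals.append(Signal(
            question_id=question_id,
            answer=answers.get(question_id, ""),
            captured_at=datetime.now(timezone.utc).isoformat(),
        ))
    return record


def adjudicate(record: CaseRecord, reviewer: str, decision: str) -> CaseRecord:
    """Adjudication step (hypothetical): a decision is only recorded when a
    named human reviewer is supplied."""
    if not reviewer.strip():
        raise ValueError("a human reviewer is required to record a decision")
    record.decision = decision
    record.decided_by = reviewer
    return record
```

In this sketch, a call such as `capture_signals("case-001", {"availability": "full time"}, ["availability", "prior_experience"])` returns a record with two signals and no decision. The outcome is only set by a later `adjudicate(record, reviewer="recruiter_a", decision="advance")`. Keeping the two steps in separate functions is one possible way to make the human sign-off explicit, not a prescribed design.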

Analog computing from waste heat

MIT researchers developed an analog computing approach that uses waste heat in electronic devices to process data without electricity. The technique performs matrix vector multiplication with strong accuracy and could also help monitor heat in chips without extra energy use.

How Artificial Intelligence is reshaping financial services oversight

Financial services regulators are largely treating Artificial Intelligence as another technology governed by existing rules rather than building new securities-specific frameworks. History suggests that clearer expectations will emerge through examinations, enforcement, and supervisory guidance.

Nvidia faces gamer backlash over Artificial Intelligence shift

Nvidia is facing growing frustration from gamers as memory supply is steered toward data center chips and DLSS 5 becomes more central to game performance. The dispute highlights how far the company’s priorities have shifted toward enterprise Artificial Intelligence.

Executives see limited Artificial Intelligence productivity gains so far

Corporate enthusiasm around Artificial Intelligence has yet to translate into broad gains in employment or productivity, reviving comparisons to the long lag between early computing breakthroughs and measurable economic impact. Recent surveys and studies show mixed results, with strong expectations for future benefits but little consensus on present gains.
