How Grok 3 compares to ChatGPT, DeepSeek and other Artificial Intelligence rivals

October 12, 2025

xAI’s Grok 3 is live with new reasoning models and bold benchmark claims. Experts say it is competitive at the frontier, but not a clear reason to abandon ChatGPT.

xAI has officially launched Grok 3, along with Grok 3 Reasoning in beta and a smaller Grok 3 mini Reasoning model. Reasoning models aim to step beyond standard generative systems by iteratively working through problems, which can reduce hallucinations. xAI is promoting Grok 3 as best in class, saying it surpasses models from OpenAI, Google, Anthropic, and DeepSeek on key benchmarks, and it performed strongly under the codename “chocolate” in Chatbot Arena’s blind tests. The model appears to have largely caught up to rivals despite xAI’s late start, though it still inherits some familiar frontier-model limitations.

Early user assessments frame Grok 3 as competitive but not definitively superior. Andrej Karpathy, a founding member of OpenAI and former Tesla director of Artificial Intelligence, said Grok 3 with its Deep Search reasoning feature feels comparable to the state of the art set by OpenAI’s strongest models and slightly ahead of DeepSeek-R1 and Gemini 2.0 Flash Thinking on his stress tests. Wharton professor Ethan Mollick called the release in line with expectations, arguing that it does not alter the broader consensus: rapid progress continues, speed is a moat, compute still matters, and there is no obvious secret sauce if a team has talent and chips. In other words, Grok 3 may satisfy enthusiasts, but it is not an obvious reason for most users to cancel a ChatGPT subscription.

After xAI’s benchmark slides circulated, OpenAI product engineer Rex Asabor posted an “updated” comparison indicating OpenAI’s unreleased o3 beats Grok 3 Reasoning on math and science benchmarks. Since o3 is not public, xAI may not have had access to those scores, but the exchange tempers claims that Grok 3 is the outright leader in reasoning.

Observers also highlighted the pace of xAI’s catch-up. Mollick noted how quickly xAI got to the frontier and said the key question is whether the trend continues. Elon Musk said Grok 3 training used 10 times the compute of Grok 2, powered by 200,000 GPUs, reinforcing the near-term view that more compute correlates with better performance. Still, researcher Gary Marcus remains skeptical that scaling laws will continue to yield linear gains in intelligence.

Grok 3 shows familiar shortcomings. Karpathy described its humor as limited to punny dad jokes, calling this a common large model issue tied to mode collapse. In a prompt to generate an SVG of a pelican riding a bicycle, Grok 3 did better than some peers but still missed elements. On politically charged prompts, Karpathy said the model produced a cautious, noncommittal essay, suggesting it may be more sensitive on ethics than some of Elon Musk’s supporters expect. Previous Grok versions have leaned left on political questions, which Musk has attributed to public training data, and he has vowed to steer the system toward political neutrality. First access to Grok 3 goes to X Premium+ subscribers, a plan whose price was recently increased.
