Modular artificial intelligence agents outperform fine-tuned monoliths

New multi-institution research suggests that small, specialized tools wrapped around a frozen large language model can match the accuracy of heavily fine-tuned agents while using 70x less training data, validating a modular approach one developer discovered through trial and error.

The article describes how the author spent 6 months building artificial intelligence agents in a way that felt wrong, only to see new research from 12 institutions show that the simple modular approach he used can be 70x more efficient than complex fine-tuned agents. The research, from UIUC, Stanford, Princeton, Harvard, UW, Caltech, UC Berkeley, UCSD, Georgia Tech, Northwestern, TAMU, and Unity, identifies 4 main ways to optimize an artificial intelligence agent system and gives names to architectures the author had already built in practice. The key finding is that small, specialized tools built around a frozen large language model (the T2 approach) match the accuracy of fine-tuned monolithic agents (the A2 approach) while being far more data-efficient.

The framework described in the research divides agent design into 4 approaches: T1, T2, A1, and A2. The two tool-side approaches leave the model untouched: T1 uses portable, agent-agnostic tools such as markdown files and git that work with any large language model, while T2 builds agent-supervised tools around a frozen model such as Claude 3.5. The two agent-side approaches train the model itself: A1 trains agents from tool feedback when outcomes are verifiable, and A2 fine-tunes entire agents from final answers. The A2 example is Search-R1, which needed 170,000 training examples and weeks of compute. In contrast, an S3 system following the T2 approach used 2,400 training examples. The article states that the two systems perform the same task and that the S3 system matches Search-R1’s accuracy with 70x less training data.
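The 70x figure can be checked from the two example counts quoted above. The snippet below is only that arithmetic; the variable names are illustrative and do not come from the research.

```python
# Training-example counts as quoted in the article.
search_r1_examples = 170_000  # A2: fine-tuned agent (Search-R1)
s3_examples = 2_400           # T2: tools around a frozen model (S3)

# How many times more training data the fine-tuned agent needed.
ratio = search_r1_examples / s3_examples
print(f"Search-R1 used about {ratio:.1f}x as many examples")
```

The ratio comes out slightly above 70, which is consistent with the rounded "70x" in the article.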

The author explains that his own workflows map directly to the T1 and T2 patterns identified by the research. For T2, he describes a research agent architecture built around a frozen agent, Gemini 2.0 Flash via OpenRouter, that is never trained or modified. It is paired with four specialized tools: query expansion that generates four different search angles, parallel searches via SerpAPI, content extraction using Jina AI to convert webpages into markdown, and a synthesis-with-memory step that produces a comprehensive report. The whole pipeline costs $0 per query within free tiers.

For T1, he uses local markdown files in git, searchable with grep, that work with Claude, GPT, or Gemini and require no vector database, embeddings, or training. He ties this to his earlier “Memory Palace” idea of specialized stores instead of a single massive context. His argument is that simple, portable tools that are independent of any specific agent are more robust because they do not break when switching models, and that the new research formally validates this modular philosophy.
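The four-tool T2 pipeline described above can be sketched as plain functions around a model that is only ever called, never trained. This is a minimal sketch under stated assumptions, not the author's code: the names `expand_query` and `research` are hypothetical, and the model, search, and extraction steps are passed in as ordinary callables so that no real OpenRouter, SerpAPI, or Jina AI calls are assumed. The memory part of the synthesis step is left out.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical interfaces: each tool is just a callable supplied by the caller.
Model = Callable[[str], str]         # frozen model: prompt in, text out
Search = Callable[[str], List[str]]  # query in, list of result URLs out
Extract = Callable[[str], str]       # URL in, page content as markdown out


def expand_query(model: Model, question: str, n: int = 4) -> List[str]:
    """Tool 1: ask the frozen model for n different search angles."""
    reply = model(f"List {n} distinct search queries, one per line, for: {question}")
    lines = [line.strip() for line in reply.splitlines() if line.strip()]
    return lines[:n]


def research(model: Model, search: Search, extract: Extract, question: str) -> str:
    """Run expansion, parallel search, extraction, and synthesis in order."""
    queries = expand_query(model, question)
    # Tool 2: run the searches in parallel, one worker per query.
    with ThreadPoolExecutor(max_workers=len(queries) or 1) as pool:
        result_lists = list(pool.map(search, queries))
    # Drop duplicate URLs while keeping first-seen order.
    urls = list(dict.fromkeys(url for results in result_lists for url in results))
    # Tool 3: convert each page to markdown.
    pages = [extract(url) for url in urls]
    # Tool 4: the same frozen model writes the report from the gathered pages.
    sources = "\n\n".join(pages)
    return model(f"Write a report answering: {question}\n\nSources:\n{sources}")


if __name__ == "__main__":
    # Stand-in tools so the sketch runs without any network access.
    def stub_model(prompt: str) -> str:
        return "angle one\nangle two" if prompt.startswith("List") else "stub report"

    print(research(stub_model, lambda q: [f"https://example.com/{q}"],
                   lambda url: f"# Page at {url}", "What is T2?"))
```

Because the model is only an argument, swapping Gemini for another model means passing a different callable, which is the robustness property the author attributes to agent-independent tools.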

Anumana wins FDA clearance for pulmonary hypertension ECG artificial intelligence tool

Anumana has received FDA 510(k) clearance for an artificial intelligence-enabled pulmonary hypertension algorithm designed for use with standard 12-lead electrocardiograms. The company says the software can help clinicians spot early signs of disease within existing workflows and without moving patient data outside the health system environment.

Anu Bradford on tech sovereignty and regulatory fragmentation

Anu Bradford argues that Europe is wavering in its role as the world’s digital rule-setter just as governments everywhere move toward more state control over technology. Global companies are being pushed to treat geopolitical risk, data sovereignty, and artificial intelligence governance as core strategic issues.

Mistral launches text-to-speech model

Mistral has expanded its Voxtral family with a text-to-speech system aimed at enterprise voice applications. The company is positioning the open-weights model as a flexible alternative for organizations that want more control over deployment, cost and customization.

UK Parliament opens workforce inquiry on artificial intelligence

A UK Parliament committee is examining how artificial intelligence is changing business and work, with a focus on both economic opportunity and labour disruption. The inquiry is seeking evidence on government priorities as adoption expands across the economy.
