Enterprises shift from building to operating Artificial Intelligence agents

Enterprises are moving from rapidly building Artificial Intelligence agents to the harder discipline of governing, observing, and securing them in production, as agent-based automation spreads across core business functions.

Enterprises are entering a new phase of generative Artificial Intelligence adoption in which the main challenge is no longer how to build agents but how to operate them safely, predictably, and at scale inside real business systems, according to Maryam Ashoori, vice president of product and engineering at IBM watsonx. She said that since 2023 the market has cycled from early experimentation with models and prompts to a surge of enthusiasm for agents, and is now settling into a more sober focus on control, visibility, and governance. In 2023, most companies treated generative Artificial Intelligence as an exploratory investment concentrated on narrow use cases such as summarization, classification, question answering, information extraction, code generation, and content creation. Where organizations had a clear, production-relevant use case, the value was immediate; otherwise, pilots produced insight without measurable outcomes.

The landscape shifted in early 2024 when large language models gained the ability to take actions by calling application programming interfaces, a capability often described as agentic Artificial Intelligence that let models interact directly with enterprise software and legacy systems. Ashoori said the enterprise reaction was immediate, with chief information officers requesting agents across the business even when concrete tasks were not fully defined, and speed frequently winning out over structure. By late 2025, she said, enterprises had accumulated dozens or even hundreds of agents built by developers, business users, and external providers, often running across different platforms with varying assumptions. An agent can be built in less than five minutes, she said, but the real problem is what happens once it is deployed and connected to sensitive systems.

As agents multiplied, so did exposure to risk, because agents inherit model limitations and those weaknesses are amplified when systems can act rather than just respond. Ashoori noted that hallucinations, which are familiar at the model layer, can become operational failures at the agent layer: if the model hallucinates and picks the wrong tool, and that tool has access to unauthorized data, the result is a data leak. She said enterprises have therefore shifted their focus from build time to run time as they move from experimentation to production and discover that managing and governing a collection of agents is significantly more complex than creating them. That shift, she said, has made observability unavoidable.

Ashoori defined observability in terms of tracing, which records every action an agent takes, including which model was involved, which tool was called, the inputs and outputs, how long each step took, and what it cost. She illustrated the need for tracing with a customer support scenario in which an agent diagnosing a non-working camera might consult internal manuals via retrieval, then escalate to online search tools or public forums. Each action is a decision point where incorrect or inappropriate information can surface. With tracing, enterprises can go back and see exactly what happened, support audits and compliance, and identify optimization opportunities by seeing where latency arises, where cost is incurred, and which models or workflows should be replaced. Despite these benefits and the growing risks, she cited figures indicating that only about 19% of organizations currently focus on observability and monitoring in production, even as costs and exposure increase.
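
To make the idea of a trace concrete, here is a minimal Python sketch of what one recorded step might hold and how per-tool latency and cost could be totaled. It is an illustration only, not IBM's or any vendor's schema: the `TraceStep` fields, the `summarize` and `slowest_step` helpers, the model and tool names, and all durations and costs are hypothetical values loosely modeled on the camera support scenario.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TraceStep:
    """One recorded agent action. All field names are hypothetical."""
    action: str        # what the agent did at this step
    model: str         # which model was involved
    tool: str          # which tool was called
    inputs: str        # what was sent to the tool
    outputs: str       # what came back
    duration_s: float  # how long the step took, in seconds
    cost_usd: float    # what the step cost, in US dollars


def summarize(steps):
    """Total duration and cost per tool, to show where each is incurred."""
    totals = defaultdict(lambda: {"duration_s": 0.0, "cost_usd": 0.0})
    for step in steps:
        totals[step.tool]["duration_s"] += step.duration_s
        totals[step.tool]["cost_usd"] += step.cost_usd
    return dict(totals)


def slowest_step(steps):
    """Return the single step that took the longest."""
    return max(steps, key=lambda step: step.duration_s)


# Made-up trace for the camera scenario; every value here is invented.
trace = [
    TraceStep("retrieve_manual", "model-a", "retrieval",
              "camera will not power on", "manual troubleshooting section",
              0.5, 0.002),
    TraceStep("web_search", "model-a", "web_search",
              "camera will not power on fix", "three search results",
              2.0, 0.010),
    TraceStep("forum_lookup", "model-a", "web_search",
              "camera power issue forum", "one forum thread",
              1.5, 0.008),
]

summary = summarize(trace)
for tool, totals in summary.items():
    print(f"{tool}: {totals['duration_s']:.1f}s, ${totals['cost_usd']:.3f}")
print("slowest step:", slowest_step(trace).action)
```

Grouping by tool is one choice among several; the same records could be grouped by model or by action to decide which models or workflows to replace.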

Looking ahead, Ashoori said that this gap will close quickly, pointing to Gartner estimates suggesting that by 2028 roughly one-third of interactions with generative Artificial Intelligence systems will occur through agents, which reinforces her view that agents are going to be everywhere. However, she stressed that observability is only one part of the challenge, and the next phase of adoption will be shaped heavily by policy enforcement and security. As agents gain autonomy, assigning responsibility becomes more difficult because the builder, model provider, tool owner, and end user may all share accountability when something goes wrong. She described the security of agents as a very hot topic, as risk, compliance, and security teams set non-negotiable policies to govern agent behavior, especially in highly regulated environments.
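
The article does not say how such policies are enforced. One simple possibility, sketched below in Python, is a deny-by-default allowlist that is checked before every tool call, so that a wrongly selected tool is blocked rather than executed. The `ToolPolicy` class, the `is_tool_call_allowed` function, and the agent and tool names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    """Hypothetical policy: the set of tools an agent may call."""
    allowed_tools: frozenset


def is_tool_call_allowed(agent_id, tool, policies):
    """Deny by default: allow a call only if this agent's policy lists the tool."""
    policy = policies.get(agent_id)
    return policy is not None and tool in policy.allowed_tools


# Hypothetical policy table, as a risk or compliance team might define it.
POLICIES = {
    "support-agent": ToolPolicy(frozenset({"retrieval", "web_search"})),
}

print(is_tool_call_allowed("support-agent", "retrieval", POLICIES))
print(is_tool_call_allowed("support-agent", "hr_records", POLICIES))
print(is_tool_call_allowed("unknown-agent", "retrieval", POLICIES))
```

Keeping the policy table separate from the agent code means the check does not depend on how or where an agent was built.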

Fragmentation further complicates governance because agents are created on different platforms and frameworks both inside and outside organizations, which is driving interest in approaches that are agnostic to how agents are built or where they run. From Ashoori’s perspective, best practice involves separating systems that build agents from systems that govern them so that enterprises can monitor, evaluate, and optimize agent behavior regardless of origin. Even with heightened concern around risk, she said momentum behind agents is not slowing but rather becoming more practical, and the next wave of adoption will emphasize domain-specific automation in areas like HR, sales and marketing, procurement, commerce, and IT support, where workflows are repeatable and the value of automation is easier to measure.

Ashoori pointed to IBM’s internal HR agent “AskHR” as an example of this more grounded approach, explaining that it is used to provide complementary analytics during performance evaluations and other sensitive processes involving highly personal information. She expects that successful workloads such as HR agents will increasingly be reused across enterprises because if an agent works well in one large organization, it is likely to be relevant to many others facing similar processes and constraints. In her view, the market is moving away from pure invention toward disciplined execution, and she said the energy has shifted so that the core priority is now learning how to operate Artificial Intelligence agents with confidence and at scale rather than simply launching new pilots.
