A governance blueprint for securing agentic systems

Enterprises are being urged to manage agentic systems as powerful, semi-autonomous users by shifting security from prompt-level guardrails to boundary-focused governance. A new eight-step plan outlines how CEOs can demand concrete controls and evidence around capabilities, data, behavior, and oversight.

Enterprises adopting agentic systems are being advised to treat them as powerful, semi-autonomous users and to govern them at the boundaries where they interact with identity, tools, data, and outputs. Rather than relying on prompt-level constraints, security guidance from standards bodies, regulators, and major providers converges on boundary controls that can be governed and audited. A proposed eight-step plan groups these controls into three pillars: constrain capabilities, control data and behavior, and prove governance and resilience. The pillars give boards and CEOs concrete questions they can ask and expect to see answered with evidence rather than assurances.

The first pillar constrains capabilities by redefining how agents are identified and what they can do. Agents should be treated as non-human principals with narrowly scoped roles, running as the requesting user in the correct tenant, with permissions aligned to role and geography, and with high-impact actions requiring explicit human approval and a recorded rationale. Tooling must be controlled like a supply chain: remote tool servers should be pinned to specific versions, additions of tools, scopes, or data sources should pass formal approval, and any automatic tool-chaining should follow explicit policy, in line with OWASP concerns about excessive agency and the EU AI Act Article 15 obligations on robustness and cybersecurity. Permissions should be bound to tools and tasks rather than to models, with credentials and scopes rotated and auditable. Under this model, an agent such as a finance operations assistant can be allowed to read but not write ledgers without specific approval, and individual capabilities can be revoked without a full system redesign.

The second pillar targets data and behavior. External content is treated as hostile until vetted, system instructions are separated from user content, and every new retrieval source is gated before it enters memory or retrieval-augmented generation workflows; this includes tagging sources, disabling persistent memory in untrusted contexts, and tracking provenance. Outputs that can trigger side effects must never execute solely because a model produced them. Instead, validators inspect agent outputs before code, credentials, or other artifacts reach production environments or users, following OWASP insecure output handling guidance and browser origin-boundary practices. Runtime data privacy protects data first and models second: tokenization or masking is the default, detokenization at output boundaries is policy-controlled, and logging is comprehensive. This limits the blast radius if an agent is compromised and provides evidence of risk control under regimes such as the EU AI Act, GDPR, and sector regulations.
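Two of these controls — an output validator that sits between the model and anything with side effects, and policy-controlled detokenization at the output boundary — can be sketched in a few lines. The patterns, the token format `<tok:...>`, and the toy vault are all assumed for illustration; a real deployment would use a proper secrets scanner and tokenization service:

```python
import re

# Hypothetical boundary controls: agent output is inspected before it can
# trigger side effects, and masked tokens are resolved only under policy.
TOKEN_PATTERN = re.compile(r"<tok:(\w+)>")
VAULT = {"cust42": "alice@example.com"}   # token -> real value (illustrative)

def validate_output(text: str) -> bool:
    """Reject outputs containing destructive shell commands or
    credential-shaped strings (simplified deny-list for the sketch)."""
    banned = [r"rm\s+-rf", r"AKIA[0-9A-Z]{16}", r"curl .*\|\s*sh"]
    return not any(re.search(p, text) for p in banned)

def detokenize(text: str, policy_allows: bool) -> str:
    """Replace tokens with real values only when policy permits it
    at the output boundary; otherwise data stays masked."""
    if not policy_allows:
        return text
    return TOKEN_PATTERN.sub(lambda m: VAULT.get(m.group(1), m.group(0)), text)

draft = "Email <tok:cust42> about the invoice."
assert validate_output(draft)
assert detokenize(draft, policy_allows=False) == draft          # stays masked
assert "alice@example.com" in detokenize(draft, policy_allows=True)
```

Keeping real values out of the agent's working context until the final boundary is what limits the blast radius: a compromised agent only ever sees tokens.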

The third pillar addresses governance and resilience through continuous evaluation and robust inventory and audit practices. Agents should be instrumented with deep observability and subjected to regular red teaming and adversarial test suites, with failures captured as regression tests and used to drive policy updates, reflecting concerns about sleeper agents described in recent research. Organizations are urged to maintain a living catalog and unified logs covering which agents exist on which platforms; what scopes, tools, and data each can access; and the details of every approval, detokenization, and high-impact action, so that specific decision chains can be reconstructed. A system-level threat model should assume a sophisticated adversary is already present, align with frameworks such as MITRE ATLAS, and recognize that attackers target systems rather than individual models. Collectively, these controls place Artificial Intelligence access and actions within the familiar enterprise security frames used for powerful users and systems, and shift board-level focus from generic guardrails to verifiable answers to specific governance questions.
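The unified-log requirement reduces to something concrete: every approval, detokenization, and high-impact action is appended to one store, keyed by agent, so a decision chain can be replayed in order. A minimal sketch, with all event names and fields assumed rather than drawn from any cited framework:

```python
import time

# Hypothetical unified audit log: one append-only list of structured events,
# enough to reconstruct what any given agent did and why.
audit_log: list[dict] = []

def record(agent_id: str, event: str, detail: dict) -> None:
    """Append a timestamped event for one agent."""
    audit_log.append({"ts": time.time(), "agent": agent_id,
                      "event": event, "detail": detail})

def decision_chain(agent_id: str) -> list[dict]:
    """Reconstruct, in append order, everything a given agent did."""
    return [e for e in audit_log if e["agent"] == agent_id]

record("finops-1", "approval",
       {"action": "ledger:write", "by": "CFO", "ticket": "#123"})
record("finops-1", "high_impact_action",
       {"action": "ledger:write", "rows": 3})
record("support-2", "detokenization", {"token": "cust42"})

chain = decision_chain("finops-1")
assert [e["event"] for e in chain] == ["approval", "high_impact_action"]
```

The point of the single log is exactly the board-level question the article closes on: not "do we have guardrails?" but "show me the chain for this action", answered with records rather than assurances.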

Impact Score: 56

Anumana wins FDA clearance for pulmonary hypertension ECG Artificial Intelligence tool

Anumana has received FDA 510(k) clearance for an Artificial Intelligence-enabled pulmonary hypertension algorithm designed for use with standard 12-lead electrocardiograms. The company says the software can help clinicians spot early signs of disease within existing workflows and without moving patient data outside the health system environment.

Anu Bradford on tech sovereignty and regulatory fragmentation

Anu Bradford argues that Europe is wavering in its role as the world’s digital rule-setter just as governments everywhere move toward more state control over technology. Global companies are being pushed to treat geopolitical risk, data sovereignty, and Artificial Intelligence governance as core strategic issues.

Mistral launches text-to-speech model

Mistral has expanded its Voxtral family with a text-to-speech system aimed at enterprise voice applications. The company is positioning the open-weights model as a flexible alternative for organizations that want more control over deployment, cost and customization.

UK Parliament opens workforce inquiry on Artificial Intelligence

A UK Parliament committee is examining how Artificial Intelligence is changing business and work, with a focus on both economic opportunity and labour disruption. The inquiry is seeking evidence on government priorities as adoption expands across the economy.

Windows 11 tightens kernel trust for older drivers

Microsoft is changing Windows 11 kernel policy so new drivers must be signed through the Windows Hardware Compatibility Program. Older trusted drivers will still be allowed in some cases to preserve compatibility during the transition.
