A governance blueprint for securing agentic systems

Enterprises are being urged to manage agentic systems as powerful, semi-autonomous users by shifting security from prompt-level guardrails to boundary-focused governance. A new eight-step plan outlines how CEOs can demand concrete controls and evidence around capabilities, data, behavior, and oversight.

Enterprises adopting agentic systems are being advised to treat these systems as powerful, semi-autonomous users and to govern them at the boundaries where they interact with identity, tools, data, and outputs. Rather than relying on prompt-level constraints, security guidance from standards bodies, regulators, and major providers converges on boundary controls that can be governed and audited. A proposed eight-step plan groups controls into three pillars that constrain capabilities, control data and behavior, and prove governance and resilience, giving boards and CEOs concrete questions they can ask and expect to see answered with evidence instead of assurances.

The first pillar constrains capabilities by redefining how agents are identified and what they can do. Agents should be treated as non-human principals with narrowly scoped roles: they run as the requesting user in the correct tenant, hold permissions aligned to role and geography, and need explicit human approval with a recorded rationale for high-impact actions. Tooling must be controlled like a supply chain, with pinned versions of remote tool servers, formal approvals for adding tools, scopes, or data sources, and explicit policies for any automatic tool-chaining. This aligns with OWASP concerns about excessive agency and with the EU AI Act's Article 15 obligations on robustness and cybersecurity. Permissions should be bound to tools and tasks rather than to models, with credentials and scopes rotated and auditable. An agent such as a finance operations assistant can then be allowed to read ledgers but not write to them without specific approval, and individual capabilities can be revoked without a full system redesign.
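The finance operations example can be sketched as a small policy check. This is a minimal illustration, not a reference implementation: `AgentPolicy`, `Approval`, the `ledger` tool name, and the log fields are all hypothetical names chosen for this sketch. It assumes permissions are attached to (tool, action) pairs rather than to a model, that high-impact actions pass only with a human approval carrying a non-empty rationale, and that every decision is recorded.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Approval:
    """Hypothetical record of a human sign-off."""
    approver: str
    rationale: str


@dataclass
class AgentPolicy:
    """Hypothetical per-agent policy: scopes are bound to tools, not models."""
    agent_id: str
    # tool name -> actions the agent may take without approval
    scopes: dict[str, set[str]] = field(default_factory=dict)
    # (tool, action) pairs that always need explicit human approval
    high_impact: set[tuple[str, str]] = field(default_factory=set)
    audit_log: list[dict] = field(default_factory=list)

    def authorize(self, tool: str, action: str,
                  approval: Optional[Approval] = None) -> bool:
        if (tool, action) in self.high_impact:
            # High-impact: allowed only with an approval and a stated rationale.
            allowed = approval is not None and bool(approval.rationale.strip())
        else:
            allowed = action in self.scopes.get(tool, set())
        # Record every decision, including denials, for later audit.
        self.audit_log.append({
            "agent": self.agent_id,
            "tool": tool,
            "action": action,
            "allowed": allowed,
            "approver": approval.approver if approval else None,
            "rationale": approval.rationale if approval else None,
        })
        return allowed

    def revoke(self, tool: str, action: str) -> None:
        """Remove a single capability without touching the rest of the policy."""
        self.scopes.get(tool, set()).discard(action)
        self.high_impact.discard((tool, action))


policy = AgentPolicy(
    agent_id="finance-ops-assistant",
    scopes={"ledger": {"read"}},
    high_impact={("ledger", "write")},
)
can_read = policy.authorize("ledger", "read")
can_write = policy.authorize("ledger", "write")
approved_write = policy.authorize(
    "ledger", "write",
    Approval(approver="controller@example.com", rationale="Quarter-end correction"),
)
```

Keeping revocation at the level of a single (tool, action) pair is what lets one capability be withdrawn without redesigning the rest of the system.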

The second pillar targets data and behavior. External content is treated as hostile until vetted, system instructions are kept separate from user content, and every new retrieval source is gated before it enters memory or retrieval-augmented generation workflows. Gating includes tagging sources, disabling persistent memory in untrusted contexts, and tracking provenance. Outputs that can trigger side effects must never execute solely because a model produced them. Instead, validators inspect agent outputs before code, credentials, or other artifacts reach production environments or users, following OWASP insecure output handling guidance and browser origin-boundary practices. Runtime data privacy is framed as protecting data first and models second: tokenization or masking is the default, detokenization is policy-controlled at output boundaries, and logging is comprehensive. This limits the blast radius if an agent is compromised and provides evidence of risk control under regimes such as the EU AI Act, GDPR, and sector regulations.
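Tokenization by default with policy-controlled detokenization might look like the sketch below. It is an illustration under stated assumptions: `TokenVault`, the role names, and the log fields are hypothetical, the in-memory dictionary stands in for a real token store, and the policy is reduced to a simple role allow-list. The point is that the agent only ever handles tokens, and the sensitive value is released at the output boundary only when policy permits, with each attempt logged.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class TokenVault:
    """Hypothetical vault: agents see tokens, never the raw values."""
    allowed_roles: set[str]                                  # roles that may detokenize
    _store: dict[str, str] = field(default_factory=dict)     # token -> raw value
    log: list[dict] = field(default_factory=list)

    def tokenize(self, value: str) -> str:
        # Random token, so nothing about the value leaks into agent context.
        token = f"tok_{secrets.token_hex(8)}"
        self._store[token] = value
        self.log.append({"event": "tokenize", "token": token})
        return token

    def detokenize(self, token: str, role: str, purpose: str) -> str:
        permitted = role in self.allowed_roles and token in self._store
        # Log every attempt, permitted or not, with who asked and why.
        self.log.append({
            "event": "detokenize",
            "token": token,
            "role": role,
            "purpose": purpose,
            "permitted": permitted,
        })
        # On denial the caller gets the token back, so the output stays masked.
        return self._store[token] if permitted else token


vault = TokenVault(allowed_roles={"payments-officer"})
iban_token = vault.tokenize("DE89 3704 0044 0532 0130 00")
seen_by_agent = vault.detokenize(iban_token, role="agent", purpose="draft email")
seen_by_officer = vault.detokenize(
    iban_token, role="payments-officer", purpose="release payment"
)
```

If the agent is compromised, what it can exfiltrate is limited to tokens, and the log shows which detokenization requests were made and refused.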

The third pillar addresses governance and resilience through continuous evaluation and robust inventory and audit practices. Agents should be instrumented with deep observability and subjected to regular red teaming and adversarial test suites, with failures captured as regression tests and used to drive policy updates, reflecting concerns about sleeper agents described in recent research. Organizations are urged to maintain a living catalog and unified logs that record which agents exist on which platforms, what scopes, tools, and data each can access, and the details of every approval, detokenization, and high-impact action, so that specific decision chains can be reconstructed. The plan recommends a system-level threat model that assumes a sophisticated adversary is already present, aligns with frameworks like MITRE ATLAS, and recognizes that attackers target systems rather than individual models. Collectively, these controls place AI access and actions within the familiar enterprise security frames used for powerful users and systems, and shift board-level focus from generic guardrails to verifiable answers to specific governance questions.
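The catalog and unified log described above can be sketched as two small record types plus two queries. Everything here is hypothetical and chosen for illustration: the `AgentRecord` and `LogEvent` fields, the `request_id` correlation key, and the sample data. The sketch assumes each logged event carries a sequence number and a request identifier, which is what makes it possible to reconstruct one decision chain and to spot agents that act without appearing in the inventory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRecord:
    """Hypothetical catalog entry: what exists, where, and what it can reach."""
    agent_id: str
    platform: str
    scopes: tuple[str, ...]
    tools: tuple[str, ...]


@dataclass(frozen=True)
class LogEvent:
    """Hypothetical unified log entry."""
    seq: int          # global ordering of events
    request_id: str   # ties together all events in one decision chain
    agent_id: str
    kind: str         # e.g. "approval", "detokenization", "action"
    detail: str


def reconstruct_chain(events: list[LogEvent], request_id: str) -> list[LogEvent]:
    """Return the ordered events for a single request."""
    return sorted(
        (e for e in events if e.request_id == request_id),
        key=lambda e: e.seq,
    )


def unregistered_agents(catalog: dict[str, AgentRecord],
                        events: list[LogEvent]) -> set[str]:
    """Agents that appear in the logs but not in the catalog."""
    return {e.agent_id for e in events} - set(catalog)


catalog = {
    "finance-ops-assistant": AgentRecord(
        agent_id="finance-ops-assistant",
        platform="internal-orchestrator",
        scopes=("ledger:read",),
        tools=("ledger",),
    ),
}
events = [
    LogEvent(3, "req-17", "finance-ops-assistant", "action", "ledger write posted"),
    LogEvent(1, "req-17", "finance-ops-assistant", "approval", "controller approved write"),
    LogEvent(2, "req-17", "finance-ops-assistant", "detokenization", "account number released"),
    LogEvent(4, "req-18", "shadow-agent", "action", "export attempted"),
]
chain = reconstruct_chain(events, "req-17")
unknown = unregistered_agents(catalog, events)
```

In this sample, the chain for `req-17` comes back in approval, detokenization, action order, and `shadow-agent` is flagged because it has no catalog entry.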


