A veteran of endpoint security and network management, Rupert Marais has spent years navigating the intersection of emerging technology and rigid regulatory frameworks. With the Australian Prudential Regulation Authority recently flagging significant gaps in how financial giants manage autonomous systems, Marais offers a crucial perspective on the shift from experimental AI use to systemic integration. This conversation explores the necessity of technical depth at the board level, the evolution of identity management for non-human agents, and the hidden dangers of upstream dependencies. Drawing on the findings of the 2025 review of large regulated entities, we delve into the core challenges of maintaining operational resilience in an era when software can make high-stakes financial decisions.
Many financial institutions prioritize productivity and customer experience when deploying AI, yet board-level oversight of technical risks often lags. How can leadership transition from relying on vendor presentations to a deeper technical understanding, and what specific procedures should be triggered when a model displays unpredictable behavior?
Leadership must move beyond the glossy summaries provided by vendors and start asking uncomfortable questions about the underlying mechanics of their models. During the targeted review in late 2025, it became clear that while boards are enthusiastic about productivity, they often lack the technical depth to scrutinize how a failure might ripple through critical operations. To fix this, institutions need to align their AI strategy directly with a formal risk appetite statement that specifically accounts for model volatility. When a model exhibits unpredictable behavior, firms should have “kill-switch” protocols and defined rollback procedures that can be triggered immediately to prevent cascading errors. This isn’t just about turning the system off; it involves a coordinated response that includes forensic logging and an immediate shift to manual oversight until the root cause of the deviation is identified.
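To make that escalation path concrete, here is a minimal sketch of a kill-switch guard, assuming an illustrative deviation threshold and a simple in-memory audit trail; the `KillSwitch` class and its parameters are hypothetical, not a reference to any vendor's tooling.

```python
import logging
from dataclasses import dataclass, field

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("model-guardrail")

@dataclass
class KillSwitch:
    """Illustrative kill-switch: halts a model and reroutes decisions to
    manual oversight while keeping a forensic log of every check."""
    drift_threshold: float = 0.15  # assumed tolerance; set via risk appetite
    halted: bool = False
    audit_trail: list = field(default_factory=list)

    def check(self, prediction: float, baseline: float):
        deviation = abs(prediction - baseline)
        # Forensic logging: record every decision, not just failures.
        self.audit_trail.append(
            {"prediction": prediction, "baseline": baseline,
             "deviation": deviation, "halted": self.halted})
        if not self.halted and deviation > self.drift_threshold:
            self.halted = True
            log.warning("Kill-switch tripped: deviation %.2f > %.2f; "
                        "routing all decisions to manual review",
                        deviation, self.drift_threshold)
        return "MANUAL_REVIEW" if self.halted else prediction

switch = KillSwitch()
print(switch.check(prediction=0.91, baseline=0.88))  # within tolerance
print(switch.check(prediction=0.40, baseline=0.88))  # trips the switch
print(switch.check(prediction=0.89, baseline=0.88))  # stays halted until reset
```

The key design choice is that the switch is sticky: once tripped, every subsequent decision is routed to a human until a root-cause review explicitly resets it.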
Organizations are increasingly using AI for high-stakes tasks like loan application processing and claims triage. What specific protocols ensure that human-in-the-loop requirements are effectively maintained, and how can firms move beyond treating AI as standard IT by accounting for inherent model bias?
The biggest mistake a firm can make is treating a probabilistic AI model like a deterministic piece of legacy software. Because these systems can inherit and amplify bias, especially in sensitive areas like claims triage, we must mandate human intervention for any high-risk decision that affects a customer’s financial standing. Effective protocols require that a “named person” be held accountable for each AI instance, ensuring a clear line of responsibility rather than vague corporate oversight. Monitoring must go beyond simple uptime and delve into “behavioral drift,” where the model’s outputs are periodically audited against a baseline of human-approved decisions. By implementing these rigorous checks, firms can transition from a passive “set and forget” mentality to an active management style that respects the unique risks of machine learning.
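A sketch of what such a periodic drift audit could look like, assuming a simple agreement metric against a human-approved baseline; the 90% floor and the claims-triage labels are illustrative assumptions, not prescribed values.

```python
import random

def audit_behavioral_drift(model_decisions, human_baseline, min_agreement=0.90):
    """Compare model outputs with human-approved decisions on the same cases.
    The 90% agreement floor is an illustrative assumption; real thresholds
    belong in the firm's risk appetite statement."""
    matches = sum(m == h for m, h in zip(model_decisions, human_baseline))
    agreement = matches / len(human_baseline)
    return agreement, agreement >= min_agreement

# Hypothetical claims-triage sample: 200 cases scored by both the model
# and human reviewers, with simulated drift toward blanket approval.
random.seed(7)
human = [random.choice(["approve", "escalate"]) for _ in range(200)]
model = [h if random.random() > 0.08 else "approve" for h in human]

rate, within_appetite = audit_behavioral_drift(model, human)
print(f"agreement={rate:.1%}, within appetite={within_appetite}")
```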
Identity and access management systems are traditionally designed for human interaction, but AI agents now perform delegated actions. What are the primary challenges in securing non-human identities, and how can service providers verify that a software agent is acting under valid, authorized conditions?
Traditional identity frameworks were never built to handle a software agent that can initiate commerce or move data autonomously, which is why the FIDO Alliance is currently racing to develop new specifications. The primary challenge lies in verifying the intent behind a delegated action: ensuring that when an agent acts, it is doing so within a strictly defined and authorized scope. We are seeing progress with frameworks like Google’s Agent Payments Protocol and Mastercard’s Verifiable Intent, which aim to provide a digital “paper trail” for non-human actions. To secure these identities, firms must implement non-human identity management that subjects an AI agent to the same scrutiny as a highly privileged administrator. This means every action must be verifiable, time-bound, and tied to a specific, authorized human intent to prevent unauthorized or malicious “agentic” behavior.
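As a rough illustration of those three properties (verifiable, time-bound, tied to human intent), here is a sketch of a signed delegation mandate; the field names and HMAC scheme are simplifying assumptions, not the actual wire formats of the Agent Payments Protocol or Verifiable Intent.

```python
import hashlib
import hmac
import json
import time

SECRET = b"demo-signing-key"  # in production: an HSM-held key, not a constant

def issue_mandate(agent_id, scope, authorized_by, ttl_seconds=300):
    """Issue a signed, time-bound mandate tying an agent action to a named
    human approver. Field names are assumptions for this sketch."""
    mandate = {"agent": agent_id, "scope": scope,
               "authorized_by": authorized_by,
               "expires": time.time() + ttl_seconds}
    payload = json.dumps(mandate, sort_keys=True).encode()
    mandate["sig"] = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return mandate

def verify_action(mandate, requested_scope):
    """An action is valid only if the signature checks out, the scope
    matches exactly, and the mandate has not expired."""
    sig = mandate.pop("sig")
    payload = json.dumps(mandate, sort_keys=True).encode()
    valid_sig = hmac.compare_digest(
        sig, hmac.new(SECRET, payload, hashlib.sha256).hexdigest())
    return valid_sig and requested_scope == mandate["scope"] \
        and time.time() < mandate["expires"]

m = issue_mandate("claims-bot-17", "payments:refund<=500", "jane.doe")
print(verify_action(dict(m), "payments:refund<=500"))  # True
print(verify_action(dict(m), "payments:transfer"))     # False: out of scope
```

The scope string doubles as an authorization boundary: an agent holding a refund mandate cannot replay it for a transfer, and once the TTL lapses the mandate is dead regardless of signature validity.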
Integrating AI into software engineering and customer-facing roles introduces new vulnerabilities like prompt injection and insecure integrations. What step-by-step controls are necessary for managing agentic workflows, and how should organizations approach security testing for code that is generated by AI?
Managing agentic workflows requires a multi-layered defense strategy that begins with strict privileged access management and ends with rigorous configuration hardening and patching. To combat threats like prompt injection, organizations should adopt the CIS Controls v8.1, mapping them specifically to their Large Language Model and Model Context Protocol environments. For AI-generated code, the speed of development can feel overwhelming, but we cannot bypass traditional security testing; instead, we must automate the scanning of this code for vulnerabilities before it ever hits production. This involves a dedicated pipeline where AI-generated snippets are treated as “untrusted” by default and subjected to the same static and dynamic analysis as code written by a human. Only by enforcing these guardrails can we benefit from the speed of AI without opening the door to insecure integrations that compromise the entire network.
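A minimal sketch of that “untrusted by default” gate, assuming a toy pattern scanner standing in for a real static-analysis stage; in practice the pipeline would invoke a dedicated SAST tool such as Semgrep or Bandit before merge.

```python
import re

# Illustrative stand-ins for a real SAST ruleset; deliberately simplistic.
FORBIDDEN_PATTERNS = {
    "exec/eval call": re.compile(r"\b(exec|eval)\s*\("),
    "hardcoded secret": re.compile(r"(api[_-]?key|password)\s*=\s*['\"]"),
    "shell injection risk": re.compile(r"subprocess\..*shell\s*=\s*True"),
}

def gate_ai_generated_code(snippet: str) -> list[str]:
    """Treat AI-generated code as untrusted by default: block merge until
    static checks pass. Returns the list of findings (empty means pass)."""
    return [name for name, pattern in FORBIDDEN_PATTERNS.items()
            if pattern.search(snippet)]

snippet = 'password = "hunter2"\nsubprocess.run(cmd, shell=True)'
findings = gate_ai_generated_code(snippet)
print("BLOCKED:" if findings else "PASSED", findings)
```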
Many entities depend on a single AI provider for multiple core functions without a clear substitution strategy. What are the systemic risks of these upstream dependencies, and what specific elements must be included in a robust exit plan to ensure operational resilience?
The concentration of risk in a single upstream AI provider is a ticking time bomb for operational resilience. If that provider experiences a catastrophic failure or a significant change in its terms of service, an institution without a substitution strategy could see its claims processing or fraud detection grind to a halt overnight. A robust exit plan must include a detailed inventory of every AI instance and a clear mapping of which functions depend on specific third-party models. It should also outline “portability” requirements, ensuring that data and prompts can be migrated to an alternative provider with minimal friction. Without these elements, a firm is essentially handing over its operational sovereignty to a vendor, leaving it vulnerable to any disruption that occurs further up the supply chain.
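One way to operationalize that inventory-and-mapping requirement is sketched below; the instance names, vendors, and `substitute` field are hypothetical placeholders for illustration.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class AIInstance:
    name: str
    business_function: str
    provider: str
    substitute: str | None  # None = no documented exit path

# Hypothetical inventory; names and providers are illustrative only.
inventory = [
    AIInstance("claims-triage-v2", "claims processing", "VendorA", "VendorB"),
    AIInstance("fraud-scorer", "fraud detection", "VendorA", None),
    AIInstance("kyc-extractor", "customer onboarding", "VendorA", None),
]

def concentration_report(instances):
    """Flag upstream concentration: functions sharing one provider without
    a documented substitute are single points of failure."""
    by_provider = Counter(i.provider for i in instances)
    gaps = [i for i in instances if i.substitute is None]
    return by_provider, gaps

providers, gaps = concentration_report(inventory)
print("provider load:", dict(providers))
for i in gaps:
    print(f"NO EXIT PATH: {i.name} ({i.business_function}) on {i.provider}")
```

Any row without a documented substitute is, by definition, a gap in the exit plan.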
The speed of AI-assisted development is placing significant pressure on traditional change and release controls. How can firms maintain an accurate inventory of AI instances, and what metrics should be used to determine who holds “named-person ownership” of specific autonomous tools?
Maintaining an accurate inventory is becoming a monumental task as the volume of AI-assisted software development surges, putting immense pressure on release controls. Firms must move toward automated asset discovery tools that can identify and categorize AI instances across the enterprise in real time. To establish “named-person ownership,” we should look at metrics such as the specific business unit benefiting from the tool, the technical lead responsible for its deployment, and the risk officer assigned to its oversight. This creates a tripartite ownership structure in which there is no ambiguity about who is responsible for the tool’s behavior, its decommissioning, or its failure. Ownership shouldn’t just be a name on a spreadsheet; it should be a functional role with the authority and resources to manage the tool’s entire lifecycle from pilot to sunset.
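That tripartite structure could be captured in a record as simple as the one below, assuming illustrative field names rather than any regulatory schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class OwnershipRecord:
    """Tripartite named-person ownership for one AI instance.
    Field names are illustrative, not a regulatory schema."""
    instance_id: str
    business_owner: str   # unit benefiting from the tool
    technical_lead: str   # accountable for deployment and rollback
    risk_officer: str     # oversight and sign-off
    lifecycle_stage: str  # e.g. "pilot", "production", "sunset"

def validate(record: OwnershipRecord) -> bool:
    """No field may be blank: an unnamed owner is no owner at all."""
    return all(getattr(record, f) for f in record.__dataclass_fields__)

rec = OwnershipRecord("loan-doc-summarizer", "Retail Lending",
                      "a.ngata", "r.okafor", "production")
print(validate(rec))  # True: every role has a named person
```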
What is your forecast for AI agent governance?
I predict that within the next twenty-four months, we will see a mandatory shift toward standardized “Agentic Passports” that provide a universal way to audit the permissions and origins of any autonomous software. As regulators like APRA continue to expose gaps in maturity, the industry will be forced to move away from fragmented, internal risk models toward a unified global framework for non-human identity. We will likely see the emergence of “Governance-as-Code,” where compliance checks are baked directly into the AI’s operational environment, preventing any agent from executing a command that violates pre-set risk parameters. Ultimately, the firms that survive the coming regulatory crackdown will be those that view governance not as a hurdle to innovation, but as the essential foundation that makes autonomous commerce possible.
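A toy example of the “Governance-as-Code” idea: compliance rules evaluated inline before an agent command executes. The rule set and command format are assumptions for illustration; a production deployment would more likely lean on a dedicated policy engine such as Open Policy Agent.

```python
# Illustrative policy: pre-set risk parameters the agent can never exceed.
POLICY = {
    "max_payment": 1000.0,
    "blocked_actions": {"delete_customer_record", "modify_audit_log"},
}

def authorize(command: dict) -> bool:
    """Return True only if the command satisfies every compliance rule."""
    if command["action"] in POLICY["blocked_actions"]:
        return False
    if command["action"] == "payment" and \
            command.get("amount", 0) > POLICY["max_payment"]:
        return False
    return True

print(authorize({"action": "payment", "amount": 250.0}))   # True
print(authorize({"action": "payment", "amount": 5000.0}))  # False: over limit
print(authorize({"action": "modify_audit_log"}))           # False: blocked
```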
