The AI Trust Letter
AI agents can now make payments. What could go wrong?

Top AI and Cybersecurity news you should check out today

Rodrigo Fernandez
May 18, 2026

Welcome Back to The AI Trust Letter

Once a week, we distill the most critical AI & cybersecurity stories for builders, strategists, and researchers. Let’s dive in!

🛡️ Securing the Agentic Payment Layer

The Story:

Payment rails were built on the assumption that a human is behind every transaction. AI agents break that assumption, creating a "Human-Not-Present" (HNP) gap that 3D Secure, biometrics, and CAPTCHAs cannot close without destroying the agent's autonomy. The industry is now building a new stack of protocols to handle machine-initiated commerce.

The details:

The Agent Payments Protocol (AP2) uses Verifiable Digital Credentials and two-stage mandates (Checkout and Payment) to cryptographically bind an agent's action to a specific user intent
Skyfire's KYAPay protocol replaces session-based trust with transaction-level authentication, requiring a signed JWT for every payment that carries the owner identity, authorization scope, and transaction parameters
Scoped Payment Tokens from providers like Stripe, Visa, and Mastercard restrict agents by merchant category, time-to-live, and maximum spend, so a compromised agent cannot exceed its mandate
Pre-flight defenses (deterministic policy guardrails, anomaly detection, recursive loop protection) sit between the reasoning engine and the payment gateway to block hallucinated or injected purchases
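
The scoped-token and pre-flight ideas above can be illustrated with a minimal Python sketch. It is not the API of Stripe, Visa, Mastercard, or any other provider: `ScopedToken`, `PaymentRequest`, and `preflight_check` are hypothetical names, and the three limits checked (merchant category, time-to-live, maximum spend) are simply the ones listed above.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal


@dataclass(frozen=True)
class ScopedToken:
    """Hypothetical scoped payment token; field names are illustrative."""
    allowed_categories: frozenset[str]
    expires_at: datetime
    max_spend: Decimal


@dataclass(frozen=True)
class PaymentRequest:
    """A purchase the agent wants to make (illustrative shape)."""
    merchant_category: str
    amount: Decimal


def preflight_check(token: ScopedToken, request: PaymentRequest,
                    spent_so_far: Decimal, now: datetime) -> list[str]:
    """Return every policy violation found; an empty list means allow."""
    violations = []
    if now >= token.expires_at:
        violations.append("token expired")
    if request.merchant_category not in token.allowed_categories:
        violations.append("merchant category not allowed")
    if request.amount <= 0:
        violations.append("non-positive amount")
    if spent_so_far + request.amount > token.max_spend:
        violations.append("spend limit exceeded")
    return violations


# Example: a one-hour token limited to groceries and 100.00 in total spend.
now = datetime(2026, 5, 18, 12, 0, tzinfo=timezone.utc)
token = ScopedToken(frozenset({"groceries"}), now + timedelta(hours=1), Decimal("100.00"))
print(preflight_check(token, PaymentRequest("electronics", Decimal("60.00")), Decimal("50.00"), now))
```

The check is deterministic on purpose: it runs outside the reasoning engine, so an injected or hallucinated purchase fails on the same rules regardless of what the model was persuaded to do.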

Why it matters:

Liability is the unsolved part. When an agent makes a disputed purchase, the question of whether the user, developer, credential provider, or orchestration layer is responsible has no clear answer yet. The fix is not better firewalls but non-repudiable audit trails that link signed user mandates to traceable reasoning logs and verified execution metadata. Without this, agentic commerce stays stuck in pilot mode.
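
One building block for such an audit trail is a hash-chained log, sketched below in Python under stated assumptions. The entry kinds (`mandate`, `reasoning`, `execution`) and the helper names `append_entry` and `verify_chain` are hypothetical. A hash chain only makes later edits detectable; actual non-repudiation would also require each party to digitally sign its entries, which this sketch leaves out.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list, kind: str, payload: dict) -> dict:
    """Append an entry whose hash covers its content and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    body = {"kind": kind, "payload": payload, "prev_hash": prev_hash}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash and check that each entry links to the one before."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {"kind": entry["kind"], "payload": entry["payload"],
                "prev_hash": entry["prev_hash"]}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


# Example: link a mandate, a reasoning summary, and the executed payment.
log = []
append_entry(log, "mandate", {"user": "u1", "max_spend": "100.00"})
append_entry(log, "reasoning", {"summary": "picked the cheapest in-scope offer"})
append_entry(log, "execution", {"amount": "40.00"})
print(verify_chain(log))
```

Editing any earlier entry after the fact changes its recomputed hash, so `verify_chain` returns `False` for the altered log.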

🔐 The Race to Keep AI Agents From Going Rogue With Your Credit Card

The Story:

The FIDO Alliance, the body behind passkeys, is now the custodian of the standards that will govern how AI agents pay for things. Google has handed over ownership of AP2, and Mastercard has contributed a parallel framework called Verifiable Intent. The move shifts agentic commerce from vendor-led protocols to a neutral standards process.

The details:

FIDO has formed an Agentic Authentication Technical Working Group focused on how users delegate actions to agents with phishing-resistant authentication, separating what a user does directly from what an agent does on their behalf
A second group, the Payments Technical Working Group, is chaired by Mastercard and Visa and is responsible for the agent-initiated commerce specifications
Mastercard's Verifiable Intent, co-developed with Google and compatible with AP2, creates a shared record of what a user actually approved an agent to do, so merchants and issuers can check an action against the original authorization rather than trust the agent's claim
The combined work targets three gaps: verifiable user instructions, agent authentication that proves the agent is acting for a real authenticated user rather than replaying a session, and trusted delegation that can be audited if a charge is later disputed
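
The idea of checking an action against the original authorization, rather than trusting the agent's claim, can be sketched in a few lines of Python. This is not the Verifiable Intent or AP2 wire format: the mandate fields and the functions `sign_mandate` and `action_is_authorized` are hypothetical, and HMAC with a shared key stands in for the asymmetric signatures and verifiable credentials a real deployment would presumably use.

```python
import hashlib
import hmac
import json


def sign_mandate(mandate: dict, key: bytes) -> str:
    """Sign a canonical JSON encoding of what the user approved."""
    message = json.dumps(mandate, sort_keys=True).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def action_is_authorized(mandate: dict, signature: str, action: dict, key: bytes) -> bool:
    """Accept the action only if the mandate is authentic and covers it."""
    expected = sign_mandate(mandate, key)
    if not hmac.compare_digest(expected, signature):
        return False  # mandate was altered or was never signed with this key
    return (
        action["merchant"] == mandate["merchant"]
        and action["amount_cents"] <= mandate["max_amount_cents"]
    )


# Example: the user approved up to 50.00 at one merchant.
key = b"demo-key-not-for-production"
mandate = {"user": "alice", "merchant": "shop.example", "max_amount_cents": 5000}
signature = sign_mandate(mandate, key)
print(action_is_authorized(mandate, signature,
                           {"merchant": "shop.example", "amount_cents": 4200}, key))
```

The verifier never consults the agent's own description of what it was allowed to do; it checks the signed mandate and the concrete action only.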

Why it matters:

A protocol owned by one vendor is a product. A protocol owned by a standards body is infrastructure. Google giving up AP2 is the signal that the agentic payments stack is moving past the land-grab phase and into the interoperability phase, where issuers, merchants, and wallets need a common definition of "the user approved this" before they will accept the liability. FIDO is the same venue that got the industry to align on passkeys, and it is now being asked to do the same for machine-initiated commerce, on a much shorter timeline.

🤖 OpenAI Daybreak: The Dawn of Agentic Cybersecurity

The Story:

OpenAI has launched Daybreak, an initiative that turns its GPT-5.5 models plus Codex into an agent that works inside the software development lifecycle to find, validate, and patch vulnerabilities. It is positioned as a direct response to the remediation bottleneck created by the volume of zero-days and supply chain bugs, and as a counterweight to Anthropic's Claude Mythos in the same space.

The details:

Codex acts as the agentic harness, running secure code review across large codebases, building editable threat models per repository, generating and testing patches in-repo, and analyzing dependency risk for supply chain exposure
Access is split into three tiers: GPT-5.5 with default safeguards for general dev and triage work, GPT-5.5 with Trusted Access for Cyber for verified defensive tasks like malware analysis and detection engineering, and GPT-5.5-Cyber for authorized red teaming and pen testing under account-level controls and stronger verification
The pitch is "resilient by design" rather than reactive patching, with the AI engaged at coding time rather than after disclosure
Daybreak ships with industry partnerships rather than as a standalone product, signaling distribution through existing security vendors instead of a direct buyer relationship

Why it matters:

Cybersecurity is becoming the first enterprise domain where agents are running with real autonomy on critical systems, which is also where dual-use risk is highest. The tiered model is the interesting part: OpenAI is openly acknowledging that the same agent useful for patch validation is useful for finding exploits, and is gating capability behind account verification rather than safety filters alone. If this becomes the template, the security posture of an AI vendor's customer account turns into a control surface as important as the model itself.

🚨 AI shrinks vulnerability exploitation window to hours

The Story:

A new report finds that AI-enabled adversaries are closing the gap between a CVE being published and the first observed exploitation from weeks to hours. Defenders are responding: mean time to remediation dropped about 47% across all severity levels in 2025, signaling a shift from periodic pentesting to continuous validation.

The details:

Published CVEs hit 48,244 in 2025, a 20% year-over-year increase, with high-severity findings up 10% while low and medium severity declined
Mean time to remediation fell from 63 days in 2024 to 38 days in 2025, with critical vulnerabilities patched 25 days faster
The most common findings were still cross-site scripting and authorization/permission issues, but content injection, brute force, and remote code execution rose through the year, matching the pattern of AI-driven probing of identity and access controls
Asset sprawl is widening exposure: organizations average around 40,000 subdomains, web app counts grew year over year (a side effect of AI coding assistants shipping more code faster), and manufacturing asset counts jumped from 2,053 to 2,486 per organization
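
For readers who want to check the relative changes behind these figures, here is a small Python sketch. It uses only numbers quoted in the list above; `percent_change` is an illustrative helper, not something taken from the report.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (negative means a drop)."""
    return (new - old) / old * 100


# Mean time to remediation, in days: 63 (2024) to 38 (2025).
mttr_change = percent_change(63, 38)

# Manufacturing assets per organization: 2,053 to 2,486.
asset_change = percent_change(2053, 2486)

print(f"MTTR change: {mttr_change:.1f}%")                  # MTTR change: -39.7%
print(f"Manufacturing asset change: {asset_change:.1f}%")  # Manufacturing asset change: 21.1%
```

The 63-to-38-day averages work out to a drop of roughly 40%, which is not the same number as the approximately 47% reduction quoted in the story summary. The summary does not say how that headline figure was computed, so the two are best read as separate statistics.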

Why it matters:

The remediation clock and the exploitation clock are both speeding up, but they are not speeding up symmetrically. Automated scanners catch known signatures and miss the logic flaws, misconfigurations, and unexpected behaviors that agentic systems introduce, which is precisely where AI-enabled attackers are spending their time. The takeaway for security teams is that "patch on a sprint cadence" is no longer a viable posture for critical findings. Continuous validation, with humans focused on the logic and architectural issues machines miss, is becoming the floor rather than the ceiling.

🍎 Apple's Last Shot at Making Siri Matter in the AI Era

The Story:

Apple is preparing to relaunch Siri at WWDC in June, and according to Bloomberg's Mark Gurman, privacy will be the centerpiece of the pitch. The new Siri will arrive as Apple's first standalone AI app, with a ChatGPT-style chatbot interface and retention controls borrowed from Messages, but the model doing the work will be Google Gemini.

The details:

The Siri app will let users auto-delete conversations after 30 days, after one year, or keep them indefinitely, the same retention pattern Apple uses for iMessage
The chatbot is powered by Google Gemini under a partnership announced earlier this year, with Apple positioning itself as the layer that limits how long user data is used and stored
Apple executives are expected to frame the privacy controls as a differentiator against ChatGPT, Claude, and Gemini's own consumer app
Gurman flags that the privacy framing also serves to soften two awkward points: Siri's continued gap behind frontier chatbots and the fact that a Google model is handling part of the security and processing

Why it matters:

Apple is doing what it did with App Tracking Transparency, turning a competitive weakness into a positioning advantage. The interesting tension is that the privacy story rests on a vendor relationship Apple does not control. Retention limits at the app layer matter, but they sit on top of inference happening somewhere in Google's stack, which complicates the usual "your data stays on your device" narrative Apple has built Siri around. For users, this is the first mainstream consumer AI product where chat retention is a setting rather than a default, and that norm is likely to spread.

What's next?

Thanks for reading! If this brought you value, share it with a colleague or post it to your feed. For more curated insight into the world of AI and security, stay connected.

NeuralTrust | The leading security platform for generative AI

Our platform uncovers vulnerabilities, blocks attacks, monitors performance, and ensures regulatory compliance — everything enterprises need to scale AI

neuraltrust.ai