Executives Are the Biggest Users of Shadow AI
Top AI and Cybersecurity news you should check out today

Welcome Back to The AI Trust Letter
Once a week, we distill the most critical AI & cybersecurity stories for builders, strategists, and researchers. Let’s dive in!
🕶️ Shadow AI Is a Leadership Problem, Not an Intern Problem

The Story:
A new report flips the usual assumption about shadow AI. It is not junior staff sneaking ChatGPT past IT. The people setting the policies are the ones breaking them most often.
The details:
65% of senior decision-makers use unapproved AI tools at work, compared to 31% of employees below that level
78% of decision-makers feel confident using AI, against 43% of the rest of the workforce, and some avoid approved platforms specifically because they do not want their activity tracked
44% of workers say their organization provides no training on safe AI use, and self-learning through videos and blogs is the main way employees pick up skills
Nearly one-third of respondents said they would keep using AI tools even if their employer banned them and threatened disciplinary action
Why it matters:
Shadow AI policies tend to be written as if the problem lives at the bottom of the org chart. This data points the other way. When leadership moves faster than its own governance, the policy stops functioning as a control and starts functioning as something only junior staff are expected to follow. The practical fix is less about tighter restrictions and more about closing the gap between what approved tools can do and what people actually need to get their work done.
🛡️ Why Red Teaming Is Becoming Non-Negotiable for Enterprise AI Agents

The Story:
Enterprises are racing to put AI agents into production, but most are skipping the adversarial testing that would expose how those agents actually break. A Forbes Tech Council piece argues that red teaming is no longer a security nice-to-have for agentic systems. It is the only practical way to know what you are deploying.
The details:
AI agents do not just generate text. They call APIs, write to databases, and execute actions, which means a successful prompt injection or jailbreak is no longer a content problem but an unauthorized transaction
Traditional penetration testing assumes deterministic systems. Agents are non-deterministic, so a single test pass tells you almost nothing about how the agent will behave across thousands of real interactions
Indirect prompt injection, where malicious instructions sit hidden inside documents the agent reads, is the attack vector most enterprise teams are not testing for
Regulatory pressure is catching up. The EU AI Act and emerging frameworks like MITRE ATLAS are pushing red teaming from an optional exercise to a documented requirement for high-risk systems
Gartner expects more than 40% of agentic AI projects to be canceled by the end of 2027, with inadequate risk controls cited as one of the main reasons
Why it matters:
The gap between piloting an AI agent and running one in production is mostly a gap in adversarial testing. An agent that handles refunds, queries customer data, or files tickets is a new piece of attack surface, and the people who will probe it hardest are not your QA team. Building a red teaming practice now is cheaper than discovering the failure modes after the agent is live and connected to your systems of record.
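The point about a single test pass can be made concrete with a little arithmetic. The sketch below is an illustration, not something taken from the Forbes piece: `run_attack` is a hypothetical callable that fires one injection attempt at your agent and returns True if the agent performed the unauthorized action, and the 95% Wilson score interval is just one reasonable way to bound the true success rate of an attack against a non-deterministic system.

```python
import math
from typing import Callable, Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials == 0:
        return (0.0, 1.0)  # no data, no information
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def attack_success_rate(
    run_attack: Callable[[], bool], trials: int
) -> Tuple[float, Tuple[float, float]]:
    """Repeat one attack `trials` times and report the observed rate plus its interval.

    `run_attack` is a hypothetical hook into your own test harness.
    """
    successes = sum(1 for _ in range(trials) if run_attack())
    return successes / trials, wilson_interval(successes, trials)


# One clean pass versus a thousand clean passes of the same attack:
one_pass_upper = wilson_interval(0, 1)[1]
thousand_pass_upper = wilson_interval(0, 1000)[1]
```

Under these assumptions, one clean pass still leaves an upper bound of roughly 0.79 on the attack's true success rate, while a thousand clean passes bring it down to about 0.004. That is the difference between "we tested it once" and a number you can put in a risk review.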
🌐 Even Google Is Figuring Out AI Security on the Fly

The Story:
TechCrunch's Connie Loizos sat down with Francis de Souza, COO of Google Cloud, who laid out the standard advice for companies trying to secure AI: bake security in from the start, get governance in place, demand auditability from your platforms. Sound advice. The catch is that Google itself is currently demonstrating how hard this is to do well, with developers waking up to five-figure Gemini bills they never authorized.
The details:
De Souza's core argument is that security cannot be bolted on later and that the threat landscape has shifted fundamentally. The average time between an initial breach and the next stage of an attack has dropped from eight hours to 22 seconds
He flagged a specific risk most teams underestimate: AI agents roaming internal systems will surface forgotten SharePoint servers and stale access controls that nobody has cleaned up in years, exposing data that was previously protected by obscurity
His proposed answer is fully agentic defense, with humans overseeing AI agents that respond at machine speed rather than sitting in the loop on every decision
Reports have documented Google Cloud developers hit with bills like $10,138 in 30 minutes and AUD $17,000 overnight after Google quietly expanded the scope of Maps API keys to also call Gemini, and after its automated systems raised billing tiers up to $100,000 without explicit consent
Researchers found that revoked Google API keys remain usable for up to 23 minutes after deletion, long enough for attackers to exfiltrate cached Gemini conversation data. Google's newer credential formats revoke in five seconds to one minute, which suggests the issue is prioritization rather than engineering
Why it matters:
There is a widening gap between what platform vendors are telling customers to do and how fast those same vendors are adapting their own products. The advice to take a platform approach and demand security from your providers is correct, but it only works if the platforms can actually deliver on it. For teams building on any major AI provider right now, the practical takeaways are concrete: audit what scopes your API keys actually have today rather than what they had when you deployed them, set hard billing caps and verify they hold, and treat agent rollouts as a trigger to clean up legacy access controls before the agents find them for you.
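The first takeaway, comparing what a key can do today against what it could do at deployment, is essentially a set difference. Here is a minimal sketch, assuming you can export both snapshots as a mapping from key ID to a set of scope names. The key IDs and scope names are made up for illustration and are not real Google Cloud identifiers, and how you obtain the snapshots depends on your provider.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class KeyDrift:
    key_id: str
    added: FrozenSet[str]    # scopes the key has now but did not have at deploy time
    removed: FrozenSet[str]  # scopes the key has lost since deploy time


def find_scope_drift(
    baseline: Dict[str, Set[str]], current: Dict[str, Set[str]]
) -> List[KeyDrift]:
    """Return every key whose current scopes differ from the recorded baseline.

    Keys missing from the baseline are treated as having had no scopes,
    so everything they can do today is reported as added.
    """
    drift = []
    for key_id in sorted(current):
        before = baseline.get(key_id, set())
        now = current[key_id]
        added, removed = now - before, before - now
        if added or removed:
            drift.append(KeyDrift(key_id, frozenset(added), frozenset(removed)))
    return drift


# Hypothetical snapshots: a maps-only key that has quietly gained another scope.
baseline = {"maps-prod": {"maps"}, "billing-bot": {"billing.read"}}
current = {"maps-prod": {"maps", "generative-ai"}, "billing-bot": {"billing.read"}}

for item in find_scope_drift(baseline, current):
    print(item.key_id, "gained", sorted(item.added))
```

Running a check like this on a schedule, and alerting on any non-empty `added` set, turns "audit your scopes" from a one-off task into a standing control.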

The Story:
A new attack class is getting attention from AI security researchers. It does not jailbreak the model or trick it into breaking its rules. Instead, it manipulates an image at the pixel level so the model perceives something completely different from what a human sees, then lets the model honestly report that false reality. NeuralTrust calls it AI authority laundering, borrowing the term from money laundering: a "dirty" narrative gets passed through a trusted AI and comes out looking clean.
The details:
The mechanism is a perceptual discrepancy attack. Tiny, invisible pixel changes leave the image looking benign to humans, but the AI's vision encoder reads it as a different concept entirely
The model is not misbehaving. It is doing exactly what it was trained to do: describe what it sees with confidence. Alignment techniques like RLHF do not help here because the model is not being asked to break any rules
Two channels are at risk. Epistemic authority covers the AI as a source of truth (fact-checking, identification, recommendations). Compliance authority covers the AI as a gatekeeper (content moderation, NSFW filters, brand safety scans). Both can be laundered with the same technique
Practical scenarios include a fact-checking bot confidently misidentifying a public figure based on a manipulated photo, a shopping assistant recommending an inferior product because the image was perturbed to look superior, and harmful content getting a clean bill of health from moderation filters
Researchers note the attack bar is low. The optimization techniques required have existed for over a decade and do not need cutting-edge math to execute
Why it matters:
Most AI security thinking still revolves around what the model says. This shifts the problem to what the model sees, and current defenses are not built for that. For anyone deploying vision-capable models in production, especially for moderation, verification, or recommendation, the takeaway is to stop treating model output as ground truth. Cross-checking with differently architected models, keeping humans in the loop on high-stakes visual decisions, and framing AI conclusions as interpretations rather than facts are the practical first steps. The era of trusting what the AI says it saw is closing fast.
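The cross-checking idea can be sketched in a few lines. This is an assumption-laden illustration rather than NeuralTrust's method: each classifier is a hypothetical callable wrapping a differently architected vision model, and anything short of the required agreement is routed to a person instead of being reported as fact.

```python
from collections import Counter
from typing import Callable, Dict, Sequence

# Hypothetical interface: takes raw image bytes, returns a label string.
Classifier = Callable[[bytes], str]


def cross_check(
    image: bytes, classifiers: Sequence[Classifier], min_agreement: float = 1.0
) -> Dict[str, object]:
    """Ask several independent models and only report a label they agree on."""
    if not classifiers:
        raise ValueError("at least one classifier is required")
    labels = [clf(image) for clf in classifiers]
    top_label, votes = Counter(labels).most_common(1)[0]
    agreement = votes / len(labels)
    if agreement >= min_agreement:
        # Framed as an interpretation, not ground truth.
        return {"status": "interpretation", "label": top_label, "agreement": agreement}
    return {"status": "needs_human_review", "labels": labels, "agreement": agreement}


# Stub models standing in for real ones: the second "sees" something different.
model_a = lambda img: "cat"
model_b = lambda img: "guacamole"
model_c = lambda img: "cat"

print(cross_check(b"...", [model_a, model_b, model_c]))
```

Agreement is a filter, not a guarantee: adversarial perturbations can transfer between models, so a unanimous answer on a high-stakes image still deserves a human look.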
⚡ AI Agents Are Creating Production Incidents No One Is Tracking

The Story:
Every autonomous agent action that touches infrastructure is a chaos engineering event, but most engineering teams do not treat it that way. Agents are running in production, taking actions, and generating cascading failures that postmortems log as connection pool saturations or latency spikes, with the agent itself invisible in the incident report.
The details:
79% of organizations now have AI agents in production and 96% plan to expand. Gartner projects 33% of enterprise software will include agentic AI by 2028, while also warning that more than 40% of agentic AI projects will be canceled by the end of 2027, with poor risk controls among the main reasons
The failure mode sits between those numbers. An agent detects elevated latency on a microservice and restarts the cluster, a reasonable action in isolation. What the agent does not see: the shared connection pool at 87% utilization, the dependent database in the middle of an index rebuild, and three other services handling peak traffic. The restart triggers a cascade nobody's chaos engineering program tested for
When a human runs a chaos experiment, they check error budget burn rate, dependency stability, and blast radius. When an agent acts on an anomaly, none of those checks happen. The agent has no shared view of how much stress the system can currently absorb
Reported AI-related incidents rose 21% from 2024 to 2025, and that number almost certainly understates the real exposure because most organizations have no incident classification for "agent action triggered cascade"
One proposed fix is to treat absorb capacity as a continuously recomputed resilience budget drawing on four signals: SLO burn rate, P99 latency trend, dependency saturation state, and application behavioral signals like session completion rates. Every chaos experiment and every agent action draws from the same budget
Language models can usefully generate chaos hypotheses from postmortem corpora, where the signal is grounded in real production failures. They fail at generating hypotheses from dependency graphs because stale graphs produce confidently wrong blast radius estimates
Why it matters:
Most enterprises treat autonomous remediation agents and chaos engineering as separate disciplines run by separate teams with separate vocabularies. They are the same discipline. The practical first step is unglamorous but concrete: audit every autonomous agent currently touching infrastructure, map its action surface against your live SLO burn rate signals, and define explicit floor conditions below which the agent must wait or escalate to a human. Most organizations running agents at scale already have several acting entirely outside their resilience accounting. The choice is to find them now or have production find them first.
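To make the resilience budget and floor condition less abstract, here is a minimal sketch. Everything numeric in it is an assumption chosen for illustration: the normalization thresholds, the weakest-link `min()` rule, the action costs, and the 0.2 floor would all need tuning against your own systems. The only parts taken from the proposal are the four input signals and the idea that chaos experiments and agent actions draw from one shared budget.

```python
from dataclasses import dataclass


def _clamp(x: float) -> float:
    return max(0.0, min(1.0, x))


@dataclass
class Signals:
    slo_burn_rate: float            # 1.0 = consuming error budget at the sustainable rate
    p99_latency_trend: float        # fractional change in P99, e.g. 0.2 = 20% slower
    dependency_saturation: float    # 0..1, worst utilization among shared dependencies
    session_completion_rate: float  # 0..1, application behavioral signal


def resilience_budget(s: Signals) -> float:
    """Headroom in [0, 1]; the weakest signal sets the budget (an assumed rule)."""
    headroom = [
        _clamp(1.0 - s.slo_burn_rate / 2.0),              # assumed: 2x burn = no headroom
        _clamp(1.0 - s.p99_latency_trend / 0.5),          # assumed: +50% P99 = no headroom
        _clamp(1.0 - s.dependency_saturation),
        _clamp((s.session_completion_rate - 0.9) / 0.1),  # assumed: below 90% = no headroom
    ]
    return min(headroom)


class ResilienceLedger:
    """One ledger shared by chaos experiments and agent actions alike."""

    def __init__(self, floor: float = 0.2) -> None:
        self.floor = floor    # below this, actions must wait or escalate
        self.reserved = 0.0   # budget held by actions still in flight

    def request(self, signals: Signals, cost: float) -> str:
        available = resilience_budget(signals) - self.reserved
        if available - cost >= self.floor:
            self.reserved += cost
            return "proceed"
        return "escalate"

    def release(self, cost: float) -> None:
        self.reserved = max(0.0, self.reserved - cost)


# The restart scenario from above: connection pool at 87% utilization.
stressed = Signals(0.5, 0.1, 0.87, 0.99)
ledger = ResilienceLedger()
print(ledger.request(stressed, cost=0.3))
```

With the pool at 87%, the dependency signal leaves about 0.13 of headroom, so the restart is escalated rather than executed. The same request against a system with the pool at 30% would proceed, and a second identical request arriving while the first is still in flight would be held back by the shared ledger.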
What's next?
Thanks for reading! If this brought you value, share it with a colleague or post it to your feed. For more curated insight into the world of AI and security, stay connected.
