The AI Trust Letter
Posts
Remote code injection and AI court drama - Issue 3

Remote code injection and AI court drama - Issue 3

Top AI and Cybersecurity news you should check out today

Rodrigo Fernandez
May 26, 2025

What is The AI Trust Letter?

Once a week, we distill the five most critical AI & cybersecurity stories for builders, strategists, and researchers. Let’s dive in!

🚨 Remote Prompt Injection hits GitLab Duo

The Story:
Researchers found that attackers can plant hidden instructions in GitLab Duo’s context, merge requests, comments, commit messages or source files and trick the AI assistant into leaking private source code.

The details:

Hidden prompts in MR descriptions, issue comments, commits and code influenced Duo’s answers
Encoding tricks (Base16, KaTeX, invisible text) kept malicious instructions out of view
Streaming markdown rendering allowed injected <img> tags to exfiltrate base64-encoded code via HTTP requests
The same method can leak confidential issue content, including zero-day vulnerability details
GitLab patched both prompt and HTML injection in duo-ui!52, blocking unsafe tags to external domains

Why it matters:
Any AI assistant that ingests full page context can be manipulated into exposing sensitive data. Treat LLM inputs as untrusted, sanitize user content and restrict rendered HTML to safe elements.

Read full article

🚀 Anthropic Overtakes OpenAI?

The Story:
Claude Opus 4 handled a full day’s worth of coding without losing track. At Rakuten, it worked on a complex code refactor for seven hours straight, never needing a reset or reminder.

The Details:

Marathon focus: It treated a multi-hour session as one continuous task, picking up right where it left off.
Benchmark victory: On the SWE-Bench coding test, it scored 72.5%, beating GPT-4.1’s 54.6%.
Dual-mode smarts: It delivers instant fixes for simple edits, then shifts into deep-dive mode for harder bugs.
Memory that sticks: It keeps track of earlier edits and can generate summaries, so you don’t have to repeat yourself.
Built for your tools: It plugs into GitHub Actions, VS Code, and JetBrains out of the box and offers APIs for file handling and prompt caching.

Why It Matters:
AI is no longer just a quick helper. Claude Opus 4 can own an entire project, freeing your team to focus on review and strategy. To get the most out of this all-day AI partner, set up clear review steps, track its changes, and choose the right tasks for human vs. machine.

Read full article

⚖️ AI Hallucinations Haunt Courtrooms

The Story:
Judges are spotting made-up cases, authors and laws in filings drafted with AI, and they’re not amused.

The details:

California judge Michael Wilner fined Ellis George $31,000 after an AI-assisted brief cited articles that don’t exist.
In a record-label lawsuit, Anthropic’s own lawyers slipped a wrong title and author into a citation generated by Claude.
Israeli prosecutors cited statutes that don’t exist when asking to hold a suspect’s phone as evidence.
Legal experts warn these errors persist because AI feels authoritative and lawyers under tight deadlines often skip thorough checks.

Why it matters:
AI hallucinations can lead to sanctions, flawed rulings and lost trust in the justice system. Lawyers must treat AI drafts as untrusted, verify every fact and citation, and keep human review front and center.

Read full article

✈️ GenAI Security Takes Flight for Airlines

The Story:
Airlines are using Generative AI for everything from chatbots to maintenance forecasts. That brings big gains but also new risks that could ground flights or expose passenger data.

The details:

Customer‐facing chatbots can be tricked by malicious prompts into revealing booking records or executing unintended actions.
Poisoned training data may hide real engine faults or inject false safety warnings into maintenance recommendations.
AI systems handling passenger name records risk leaking PII if their outputs aren’t strictly filtered.
Theft or hijack of an AI model controlling ground traffic or baggage sorting could cause major operational disruptions.

Why it matters:
GenAI now underpins critical airline functions. Without prompt‐injection defenses, data‐poisoning checks and zero-trust controls, these systems become entry points for service failures, safety lapses and privacy breaches.

Read full post

😊 We are finalists at South Summit!

Out of 4,500 startups, NeuralTrust made the top 100 for South Summit Madrid’s 2025 Startup Competition.

The details:

When & where: June 4–6 at La Nave, Madrid’s innovation hub; finalist pitches on June 5 and one-on-one meetings on June 6
Who’s there: 30,000 founders, investors and executives across ten vertical tracks
Finalist perks: Main-stage pitch to investors, expert workshops, VC and media exposure, plus global networking
Track record: Past winner Invopop used its South Summit win to expand into five new markets and close a decisive Series A

Why it matters:
This nod underlines how crucial AI security is for every industry. If you’re at South Summit, stop by our Trust Tech booth for live demos of TrustGate, TrustLens and TrustTest. Our full suite for secure, compliant GenAI.

Read full article

👀 We are hiring!

We’re looking for passionate teammates to join us:

Business Development Representative
Market Research Intern

Interested? Check our positions on LinkedIn or reach out for details.

What´s next?

Thanks for reading! If this brought you value, share it with a colleague or post it to your feed. For more curated insight into the world of AI and security, stay connected.

NeuralTrust | The leading security platform for generative AI

Our platform uncovers vulnerabilities, blocks attacks, monitors performance, and ensures regulatory compliance — everything enterprises need to scale AI

neuraltrust.ai