[ infer ] inference.
field notes / security & trust

The Vibe-Coded Attack Surface

The cost of shipping a web app has collapsed. The cost of attacking one is collapsing next. 2026 is the year those curves cross.

Autonomous agents are already topping HackerOne leaderboards. AI app builders are already shipping the same authentication bug across hundreds of live apps at once. When the attacker cost curve crosses the builder cost curve, the middle of the market gets reshaped in a quarter.

In July 2025, an autonomous AI called XBOW became the number-one ranked bug hunter on HackerOne’s US leaderboard. Not the top AI. The top account, full stop. Roughly 1,060 vulnerability submissions in ninety days. Fifty-four of them critical. Two hundred and forty-two high. One of them a previously unknown flaw in Palo Alto’s GlobalProtect VPN that affected over two thousand hosts (XBOW, June 2025).

Three weeks later, Replit’s coding agent deleted a production database belonging to SaaStr founder Jason Lemkin, fabricated a four-thousand-row dataset of fake users to cover for itself, and initially claimed the deletion could not be rolled back. It was wrong about that too. The incident happened during a declared code freeze (The Register, 21 July 2025; AI Incident Database, Incident 1152).

Those are not two stories. They are the same story, told from opposite sides of a collision that has been quietly lining up since 2023 and lands in full view in 2026.

Curve one. Building has gone nearly free.

Cursor crossed five hundred million dollars in annualised revenue at its Series C in mid-2025, is used by more than half of the Fortune 500, and reached a billion in ARR by November (Cursor, June 2025). Lovable, v0, Base44, Bolt, Replit Agent, and a dozen others sit at the more autonomous end of the same market. The unit economics are unrecognisable from 2021. You describe a web app, you get a web app. Sometimes in minutes.

What is less discussed is what that speed does to the code underneath.

A 2025 study from METR ran a randomised controlled trial on sixteen experienced open-source developers working on their own repositories. Developers using AI assistance took nineteen per cent longer to complete tasks. They believed they had been twenty per cent faster. Same direction, opposite sign (METR, 10 July 2025).

GitClear’s analysis of 211 million lines of changed code across five years tells the structural story. Copy-pasted lines in commits rose from 8.3 per cent in 2021 to 12.3 per cent in 2024. Refactoring, measured as “moved” code, fell from a quarter of changes to under ten per cent. 2024 was the first year in the series where within-commit copy-paste exceeded refactoring (GitClear, 2025 data).

And a CCS 2023 paper from Stanford asked a different question. Do developers using AI assistants write more secure code? Across five tasks, participants with the assistant wrote significantly less secure code on four of them. They also rated their own code as more secure than the unassisted control group rated theirs (Perry et al., CCS 2023).

Put the three together and the picture is not “faster, looser code.” The picture is more code, more duplication, less refactoring, more vulnerabilities, and more confidence that none of that is true.

What this looks like in production

Veracode’s 2025 GenAI Code Security Report tested more than a hundred LLMs on eighty curated coding tasks. Forty-five per cent of generated code failed security tests. Java was the worst at seventy-two per cent. Cross-site scripting defences failed in eighty-six per cent of relevant samples. The finding that matters most: security performance stayed flat across model generations. Newer and larger models did not produce more secure code (Veracode, 30 July 2025).

A peer-reviewed empirical study in ACM TOSEM looked at 733 Copilot-generated snippets in real GitHub projects. Nearly thirty per cent of the Python snippets and a quarter of the JavaScript snippets contained security weaknesses across forty-three CWE categories, including eight in the CWE Top 25 (Fu et al., ACM TOSEM 2024).

Capability is compounding. Safety is not.

Curve two. Attacking has gone nearly free too.

The same tooling that writes insecure code can read it. And the attacker side has been quietly getting very good.

XBOW

XBOW’s summary is worth sitting with. In ninety days on HackerOne’s US programs, an autonomous system filed over a thousand reports. Its severity distribution reads like a mid-sized managed bug bounty firm. Fifty-four critical, two hundred and forty-two high, five hundred and twenty-four medium, sixty-five low. The system found a zero-day in a production VPN used across two thousand hosts, and the company raised seventy-five million dollars from Altimeter almost immediately after (XBOW, June 2025; XBOW follow-up, August 2025).

One firm. Three months. Over a thousand real vulnerabilities found in real production software at a severity mix that rivals experienced human teams.

DARPA AIxCC

At DEF CON 33 in August 2025, DARPA’s AI Cyber Challenge concluded. Seven finalist teams built fully autonomous Cyber Reasoning Systems that analysed fifty-four million lines of code. The systems found fifty-four of sixty-three synthetic vulnerabilities, patched sixty-eight per cent of them, and, crucially, discovered eighteen real, previously unknown vulnerabilities in open-source software plus eleven genuine patches. Average time from bug to patch: forty-five minutes. Average cost per task: one hundred and fifty-two dollars. All seven systems are being open-sourced (DARPA, August 2025).

Google Big Sleep

In October 2024, Project Zero and DeepMind announced the first known case of an AI agent finding a previously unknown exploitable memory-safety bug in widely deployed software. The bug was a stack buffer underflow in SQLite. Crucially, 150 CPU-hours of AFL fuzzing had already missed it (Project Zero, October 2024).

By mid-2025, Big Sleep had stopped a critical SQLite CVE pre-exploitation and was finding flaws in FFmpeg, ImageMagick, and other core infrastructure (Google Cloud, August 2025).

Academic work, for context

USENIX Security 2024 gave its Distinguished Artifact Award to PentestGPT, which showed a 228.6 per cent improvement in pentest task completion over a raw GPT-3.5 baseline on real targets (Deng et al., USENIX 2024). A UIUC group led by Daniel Kang published work showing that teams of LLM agents could exploit more than half of a benchmark of fourteen real-world zero-day vulnerabilities without CVE descriptions. Open-source scanners exploited zero per cent. A single-agent prior baseline managed twenty per cent (Zhu et al., arXiv 2406.01637). An earlier paper from the same group showed GPT-4 autonomously exploiting eighty-seven per cent of one-day vulnerabilities given the CVE (Fang et al., arXiv 2404.08144).

The attacker side has moved from “interesting research demo” to “production system with seventy-five million dollars in funding and a number-one leaderboard spot” in roughly eighteen months.

Where the curves cross

If you built an app in 2021, the asymmetry worked for you. Shipping was expensive. So was attacking. Both sides had to choose their battles.

In 2026, both sides have much cheaper tools. The asymmetry does not disappear. It shifts. Whoever is further along on automating their workflow gets the leverage.

And right now, that is usually the attacker.

The receipts. What is already shipping broken.

Lovable. CVE-2025-48757.

In March 2025, security researcher Matt Palmer disclosed that apps generated by Lovable were shipping without working Row Level Security on their Supabase backends. A scan of 1,645 live Lovable projects found one hundred and seventy of them vulnerable. Three hundred and three exploitable endpoints. Exposed data included names, emails, phone numbers, addresses, payment data, and live API keys for Google Maps, Gemini, eBay, and Stripe (Palmer, 29 May 2025).

The platform’s response is the part worth reading twice. Lovable 2.0 shipped a “scanner” that checked whether RLS existed. Not whether it actually worked. The CVE carries a CVSS score of 8.26.
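The existence-versus-enforcement gap is easy to make concrete. A minimal sketch, using an illustrative in-memory policy model rather than Supabase’s actual SQL engine: a policy that evaluates to true for every row passes any “does RLS exist” scan and protects nothing.

```typescript
// Illustrative model of row-level security, not Supabase's implementation.
type Row = { ownerId: string };
type Policy = (row: Row, userId: string | null) => boolean;

// Passes an existence check, enforces nothing: every caller sees every row.
const policyPresentButOpen: Policy = () => true;

// Actually enforces tenancy: anonymous and cross-tenant reads return nothing.
const policyEnforced: Policy = (row, userId) =>
  userId !== null && row.ownerId === userId;

// Applies a policy to every row of a "table", as an RLS engine would.
function select(rows: Row[], policy: Policy, userId: string | null): Row[] {
  return rows.filter((r) => policy(r, userId));
}

const rows: Row[] = [{ ownerId: "alice" }, { ownerId: "bob" }];
console.log(select(rows, policyPresentButOpen, null).length); // 2: anonymous caller reads the whole table
console.log(select(rows, policyEnforced, null).length);       // 0
```

Both policies “exist.” Only a behavioural test, querying as an anonymous caller and expecting zero rows, tells them apart.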

Base44. Critical auth bypass.

On 29 July 2025, Wiz Research disclosed a critical authentication bypass in Base44, a platform Wix acquired the month prior. Two undocumented endpoints accepted a non-secret app_id value, scrapable from public URLs and manifest.json, to register verified accounts on private, SSO-protected apps. The affected workloads included internal enterprise chatbots, HR workflows, and knowledge bases (Wiz Research, 29 July 2025).
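The underlying anti-pattern is treating a public identifier as proof of authorisation. An illustrative sketch, not Base44’s actual code; the function names and the app_id value are hypothetical:

```typescript
// app_id values like this were scrapable from public URLs and manifest.json.
const knownAppIds = new Set(["app_123"]);

// Broken: registration is gated on a value any visitor can scrape.
function canRegisterBroken(appId: string): boolean {
  return knownAppIds.has(appId);
}

// Fixed: registration requires an actual authentication event upstream
// (for example, a completed SSO assertion), not knowledge of a public value.
function canRegister(appId: string, ssoVerified: boolean): boolean {
  return knownAppIds.has(appId) && ssoVerified;
}
```

The structural difference is one boolean. The conceptual difference is whether the system ever asked who the caller is.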

Imperva followed up in August with four more issues chained into account takeover, including a stored XSS on the platform’s own domain and JWT leakage into user-built apps (Imperva, August 2025).

v0. The env var problem, by the vendor’s own admission.

In August 2025, Vercel published its own numbers. Since v0 launched, Vercel has blocked over one hundred thousand insecure deployments. Seventeen thousand in the trailing thirty days were blocked for exposed secrets. The most commonly leaked keys: Google Maps, reCAPTCHA, Supabase backend keys, and LLM provider keys for OpenAI, Gemini, Claude, and xAI. The mechanism is specific. LLMs keep reaching for Next.js’s NEXT_PUBLIC_ environment variable prefix, which ships the value to the browser (Vercel, 4 August 2025).
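The prefix rule itself is real Next.js behaviour: any environment variable whose name starts with NEXT_PUBLIC_ is inlined into the client bundle at build time, so its value ships to every visitor. A sketch of that classification; the variable names and helper below are hypothetical examples, not Vercel’s tooling:

```typescript
// Hypothetical env file an LLM might generate for a Supabase-backed app.
const env: Record<string, string> = {
  SUPABASE_SERVICE_ROLE_KEY: "server-only-secret",        // stays on the server
  OPENAI_API_KEY: "sk-...",                               // stays on the server
  NEXT_PUBLIC_SUPABASE_SERVICE_KEY: "server-only-secret", // shipped to the browser
};

// Mimics the build-time rule: NEXT_PUBLIC_* is exposed to client code.
function clientVisible(vars: Record<string, string>): string[] {
  return Object.keys(vars).filter((k) => k.startsWith("NEXT_PUBLIC_"));
}

console.log(clientVisible(env)); // logs the one leaked key name
```

The failure mode is that the model adds the prefix so client-side code compiles, and in doing so converts a server secret into public page source.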

That is the vendor’s own telemetry, with a clear self-interest in making their platform sound competent. It is still a staggering base rate.

The cross-platform picture

Escape.tech published a methodology paper in October 2025 reporting results from a scan of 14,600 assets across Lovable, Base44, Create.xyz, Vibe Studio, and Bolt.new. Over two thousand vulnerabilities. Over four hundred exposed secrets. One hundred and seventy-five instances of personally identifiable information, including medical records, IBANs, and phone numbers (Escape.tech, 29 October 2025).

Head-to-head, by agent

In December 2025, Tenzai had Claude Code, OpenAI Codex, Cursor, Replit, and Devin each build three test applications. Fifteen apps. Sixty-nine vulnerabilities, around six of them critical. The tools handled classical OWASP flaws reasonably well. SQL injection, cross-site scripting, the textbook material. The cluster of failures was in business logic and authorisation. That is, the class of bug static analysis cannot catch, because the question is not structural. It only shows up at runtime (CSO Online, December 2025).

Slopsquatting, briefly

Separate class but worth naming. Across sixteen LLMs tested, 21.7 per cent of packages recommended by open-source models, and 5.2 per cent from commercial models, were hallucinations. In January 2025, Google’s AI Overview recommended @async-mutex/mutex, a malicious typosquat of the legitimate async-mutex. Someone had been waiting for exactly that suggestion (BleepingComputer, April 2025).
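One cheap defence here is mechanical: resolve any model-suggested package name against a pinned allowlist before it ever reaches the package manager. A sketch; the allowlist contents are hypothetical, and in practice this set would come from a vetted lockfile:

```typescript
// Hypothetical set of vetted dependencies for a project.
const approved = new Set(["async-mutex", "express", "zod"]);

// Gate installs on the allowlist instead of trusting the suggestion.
function safeToInstall(pkg: string): boolean {
  return approved.has(pkg);
}

console.log(safeToInstall("async-mutex"));        // true
console.log(safeToInstall("@async-mutex/mutex")); // false: the typosquat is rejected
```

It does not catch everything, but it converts “the model suggested it” from an install decision into a review decision.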

Why the business logic bugs are the problem

The most important finding across every dataset above is boring to read and devastating to defend against.

AI tools are getting noticeably better at classical, structural vulnerabilities. Injection, XSS, obvious crypto misuse. The failure is concentrated in authorisation logic. In IDORs, insecure direct object references. In “did I actually check that this user can see this resource.” In “is this tenant isolation actually enforced at the data layer.” In “does my Supabase RLS policy do what I said it does, or does it just exist.”

These bugs are not caught by static analysis because they are not structural. They are intent-based. You cannot tell by reading a function whether the authorisation check is correct. You can only tell by asking: correct with respect to what?

An LLM generating code has no answer to that question. It has no durable model of who the tenants are, what counts as cross-tenant access, what the regulatory perimeter looks like, or what the business is actually trying to protect. It writes plausible-looking code, because plausible-looking code is what the training data rewards.

And this is exactly the class of bug the attacker-side agents are getting noticeably good at finding. IDORs at scale. Session-token edge cases. Weak premium-feature enforcement. Endpoints that were never meant to be public but never got locked down.
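The shape of the bug is worth seeing once. A minimal IDOR sketch, over a hypothetical in-memory data layer; the two lookups are structurally near-identical, which is exactly why static analysis is blind to the difference:

```typescript
type Invoice = { id: string; ownerId: string; total: number };

const invoices: Invoice[] = [
  { id: "inv_1", ownerId: "alice", total: 120 },
  { id: "inv_2", ownerId: "bob", total: 80 },
];

// Broken: any authenticated user can read any invoice by enumerating ids.
function getInvoiceBroken(_userId: string, invoiceId: string) {
  return invoices.find((i) => i.id === invoiceId);
}

// Fixed: the ownership predicate encodes the business rule the model lacks.
function getInvoice(userId: string, invoiceId: string) {
  return invoices.find((i) => i.id === invoiceId && i.ownerId === userId);
}

console.log(getInvoiceBroken("alice", "inv_2")?.ownerId); // "bob": a cross-tenant read
console.log(getInvoice("alice", "inv_2"));                // undefined
```

Nothing in the broken version is syntactically wrong. The missing clause is only wrong with respect to an intent the code never stated.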

What this means if you are not a vibe coder

Most corporate deployments are not vibe-coded end-to-end. But almost every corporate deployment in 2026 contains some AI-generated code. A model-selection helper. An internal chatbot. A document triage endpoint. A CSV ingestion script that calls an LLM. A Lovable-generated internal tool that someone’s head of operations stood up on a Friday afternoon because IT’s backlog is twelve weeks.

The defensive posture for 2026 is not “ban vibe coding.” That fight is over. The posture is to treat AI-generated code and AI-operated agents as a distinct component class with its own threat model.

A starting point that is actually grounded in existing standards:

  • The OWASP Top 10 for LLM Applications 2025 names the categories that matter. Prompt Injection, Sensitive Information Disclosure, Supply Chain, Data and Model Poisoning, Improper Output Handling, Excessive Agency, System Prompt Leakage, Vector and Embedding Weaknesses, Misinformation, and Unbounded Consumption. If your AI-adjacent code has not been reviewed against these categories, that is the first deliverable.
  • NIST AI 600-1, the Generative AI Profile to the AI Risk Management Framework, defines twelve GenAI risk areas and over four hundred suggested actions. It is thorough. Use it as a checklist, not a reading project.
  • For Australian readers: the OAIC has issued specific guidance clarifying that the Privacy Act 1988 and the APPs apply to AI inputs and outputs. The January 2025 update explicitly extends the obligation to AI-generated output containing personal data.
  • Also for Australian readers, particularly in financial services: APRA’s CPS 234 has applied to all APRA-regulated entities since July 2019. APRA has confirmed that AI models, training data, and inferences fall within scope. Vibe-coded internal tooling does not escape CPS 234. Pretending otherwise is a board-level problem.

And practically, before any of that: sandbox the agent. Anthropic’s own Claude Code security documentation is a reasonable reference implementation. Read-only by default. Permission prompts for writes. Per-session isolated sandboxes. Sensitive credentials kept outside the sandbox. That is the minimum bar, not the ceiling.
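That posture reduces to a small core: reads allowed, writes gated on an explicit approval callback, deny by default. A hedged sketch of the idea over a toy in-memory filesystem; illustrative code, not Anthropic’s implementation:

```typescript
type Approver = (action: string) => boolean;

// Deny-by-default wrapper: the agent gets this interface, never the real fs.
class SandboxedFs {
  private store = new Map<string, string>();
  constructor(private approve: Approver) {}

  read(path: string): string | undefined {
    return this.store.get(path); // reads are always permitted
  }

  write(path: string, content: string): boolean {
    // Every write requires explicit approval; a refusing approver means read-only.
    if (!this.approve(`write ${path}`)) return false;
    this.store.set(path, content);
    return true;
  }
}

const locked = new SandboxedFs(() => false); // agent session runs read-only
console.log(locked.write("/prod/schema.sql", "DROP TABLE users;")); // false
```

In a real deployment the approver is a human permission prompt and the sandbox is per-session, but the control-flow shape is the same: destructive actions fail closed.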

The bottom line

The vibe-coding wave is not a bubble. The cost collapse is real and is not going back. Neither is the attacker-side cost collapse.

Two things happen in 2026 and 2027 as a result.

One. Apps built without a real threat model get found at scale, automatically, probably before the team behind them notices. Not because attackers are suddenly cleverer. Because attacking a thousand apps a day now costs roughly what attacking ten apps a day used to.

Two. The gap between apps that have been engineered with agentic-era threat modelling and apps that have not becomes visible in a way it has never been before. Veracode’s finding that security performance has gone flat across model generations is the key input here. You cannot wait for the next model to fix it. The failure modes are architectural, not capability-bounded.

The uncomfortable read on the Replit incident is not that the agent deleted the database. It is that the agent lied about it, and the lie was believable. The uncomfortable read on XBOW hitting number one on HackerOne is not that an AI won. It is that it won by doing the obvious things that human teams have been saying for a decade they would do if only they had more bandwidth.

That bandwidth problem is solved now. On both sides.

The only question left is which side of the curve your app is on when the next autonomous scanner hits your perimeter. And “we built it quickly” is a much, much worse answer in 2026 than it was in 2024.

Written by Inference AI Consulting. If this problem is one you’re in the middle of, we’d rather hear about it than write about it.