Echoes of AI Exposure: Thousands of Secrets Leaking Through Vibe Coded Sites | Wave 15 | Project Resonance

redhuntAdmin

3 months ago

1. Introduction

The vibe coding revolution has empowered millions to build and deploy websites using natural language. Entrepreneurs, artists, and small businesses can now bring their ideas to life online without writing a single line of code. But has this convenience come at a hidden security cost?

In this post, we present the 15th wave of Project Resonance: A RedHunt Labs Research Initiative, investigating the security posture of websites built on modern “vibe coding” platforms. Our research was driven by a central hypothesis: that the non-technical user base of these platforms unknowingly leaks sensitive secrets through its publicly accessible websites.

This article details our methodology, presents the key findings from our internet-wide scan, and provides actionable recommendations for users, platform providers, and security teams to mitigate these risks.

2. Our Research Methodology

Figure: Flowchart illustrating the Discovery Phase and Secret Collection Phase of the research methodology for analyzing vibe coding platforms.

To ensure our research was thorough and credible, we followed a multi-phase approach:

Phase 1: Discovery

Our first step was to identify and catalogue major vibe coding platforms. We picked 13 popular platforms and subsequently collected a list of more than 130k unique, published domains for analysis.

Phase 2: Enumeration

We developed techniques to programmatically collect websites that had been published via these platforms. In some cases, discovery was straightforward; others posed more significant challenges. For instance, v0.app deploys all of its generated sites to the vercel.app domain, so scanning the entire vercel.app subdomain space would sweep in many sites unrelated to v0.app. To address this, we implemented filtering strategies to accurately identify sites deployed through v0.app, despite their being hosted under the broader vercel.app namespace.

Phase 3: Secret Scanning at Scale

With the list of target websites, we initiated a scan specifically looking for hardcoded secrets. Our scanners were configured to detect various types of sensitive information, including API keys, database connection strings, private keys, and other secrets, using a combination of pattern matching and entropy analysis.
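To make the combination of pattern matching and entropy analysis concrete, here is a minimal Python sketch. It is not the scanner used in this research: the two regular expressions, the 24-character token length, and the 4.0 bits-per-character threshold are illustrative assumptions, and production scanners rely on much larger, maintained rule sets.

```python
import math
import re
from collections import Counter

# Illustrative rules only (assumptions, not the rule set used in the research).
PATTERNS = {
    "Google API Key": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
    "Generic sk- Key": re.compile(r"sk-[0-9A-Za-z_\-]{20,}"),
}

# Candidate tokens for the entropy pass: long runs of key-like characters.
TOKEN = re.compile(r"[0-9A-Za-z_\-]{24,}")


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scan(content: str, entropy_threshold: float = 4.0) -> list[tuple[str, str]]:
    """Return (secret_type, matched_text) pairs found in `content`."""
    findings = []
    matched_spans = []

    # Pass 1: named patterns with a recognizable prefix or shape.
    for name, pattern in PATTERNS.items():
        for m in pattern.finditer(content):
            findings.append((name, m.group()))
            matched_spans.append(m.span())

    # Pass 2: long tokens that look random, skipping anything already matched.
    for m in TOKEN.finditer(content):
        overlaps = any(m.start() < end and start < m.end() for start, end in matched_spans)
        if overlaps:
            continue
        if shannon_entropy(m.group()) >= entropy_threshold:
            findings.append(("Generic High-Entropy String", m.group()))

    return findings
```

The entropy pass is what produces "generic" findings: it flags strings that merely look random, which is why such results are noisier than matches on a vendor-specific pattern.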

Phase 4: Data Aggregation & Analysis

Once the discoveries were made, we pulled together all the findings, from the exposed secrets themselves to their surrounding context, such as platform, URL, and secret type. Rather than looking at them in isolation, we treated them as part of a larger picture. By aggregating the data, we were able to analyze patterns at scale, draw correlations between different leaks, and highlight recurring themes. This broader view helped us move beyond individual cases and uncover systemic trends in how and where secrets were leaking across vibe-coded sites.
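As a rough illustration of this aggregation step, the Python sketch below groups findings by secret type and platform and computes the share of scanned sites with at least one leak. The `Finding` record, its field names, and the `summarize` helper are hypothetical and chosen for this example; they do not describe the actual pipeline.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    platform: str     # hypothetical platform label
    url: str          # site where the secret was found
    secret_type: str  # e.g. "OpenAI API Key"
    secret: str       # matched value, used here only for de-duplication


def summarize(findings: list[Finding], sites_scanned: int) -> dict:
    """Aggregate raw findings into per-type and per-platform statistics."""
    # A secret reused on several sites is counted once.
    unique_secrets = {(f.secret_type, f.secret) for f in findings}
    by_type = Counter(secret_type for secret_type, _ in unique_secrets)

    # A site counts as affected if it exposes at least one secret.
    affected_sites = {f.url for f in findings}
    by_platform = Counter(platform for platform, _ in {(f.platform, f.url) for f in findings})

    return {
        "unique_secrets": len(unique_secrets),
        "unique_secrets_by_type": dict(by_type),
        "affected_sites": len(affected_sites),
        "affected_sites_by_platform": dict(by_platform),
        "affected_share": len(affected_sites) / sites_scanned if sites_scanned else 0.0,
    }
```

Counting unique secrets and affected sites separately matters because the two can diverge: one key may be reused across several sites, and one site may expose several keys.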

Limitations

This research was limited to analyzing secrets present in client-side code and files that were publicly accessible without authentication. The intention was to assess what an external attacker or casual visitor could easily discover. Server-side exposures, such as misconfigured APIs, database leaks, or credentials stored within backend systems, were not part of this study. As a result, the findings likely represent only a portion of the overall exposure landscape; the actual number of leaked secrets could be significantly higher if server-side components were also included.

3. Key Statistics & Findings

Our analysis of the vibe coding ecosystem uncovered a widespread security issue: one in every five websites we scanned exposes at least one sensitive secret.

In total, our scans identified roughly 25,000 unique secrets for popular services like OpenAI, Google, and ElevenLabs. This count specifically excludes generic and low-entropy keys to focus on high-impact secrets.

The scale of our research and the key findings are broken down below:

Scale of Research:

13 vibe coding platforms analyzed.
~130,000 unique published websites scanned.

Key Findings:

~26,000 websites found with at least one leaked secret (1 in 5).
~25,000 unique secrets discovered for popular services.

While the leaks spanned many categories, one finding stood out as a clear indicator of a new and growing risk: the explosion of exposed secrets for AI platforms.

Spotlight on Secrets Belonging to AI Platforms

The recent race to plug AI into everything has opened up a new kind of security blind spot. In the scramble to ship features fast, developers are often leaving the keys to their AI platforms exposed in code or public files. These keys aren’t just configuration details; they’re the crown jewels that control access, usage, and even billing. Our findings show that this problem is more common than most teams realize, and it’s quietly fueling a wave of AI-related secret leaks.

Distribution of Exposed AI Secrets:

Google’s Gemini API keys were overwhelmingly the most common, accounting for nearly three-quarters (72.43%) of all exposed AI secrets.
OpenAI (14.22%) and the voice AI platform ElevenLabs (8.09%) followed, making up the next most significant portion of the leaks.
The remaining fraction was a mix of emerging players like Anthropic, Deepseek, Stability AI, Perplexity, and xAI’s Grok, which collectively accounted for about 5% of the total.

This trend is often a result of users following online tutorials to add chatbot or content generation features, pasting code snippets directly into their site’s public-facing code. These exposed keys are a direct line to a paid service, and they can be easily abused by malicious actors to run expensive queries, potentially leading to thousands of dollars in unexpected bills for the owner.

Spotlight on Exposed Backends & Database Keys

Beyond frontend services, our research uncovered a large number of exposed secrets for powerful Backend-as-a-Service (BaaS) platforms, which often hold sensitive user data.

Our scan found:

16k+ exposed credentials for Firebase
3k+ exposed credentials for Supabase

These aren’t just abstract numbers; they represent direct keys to application databases. The potential for damage is enormous, as demonstrated by incidents like the Tea App hack, where a misconfigured Firebase instance led to a major database breach. These leaks occur when users embed full-access credentials into their site’s code to fetch data, inadvertently publishing the keys to their entire backend.

The Broader Picture: The Full Scope of Leaked Secrets

Beyond the emerging AI trend, our research highlights a persistent and widespread problem with the handling of other common secret types.

To provide a more focused look at specific, high-impact secrets, we have intentionally filtered the following list to make the statistics cleaner and more informative. Google API Keys have been excluded from this breakdown due to their sheer volume and generic format. Similarly, we have removed secrets identified only by high-entropy string detection (e.g., “Generic API Keys” and “Generic Secrets”) to reduce potential noise and false positives.

NOTE: Google uses the same key format across multiple services. An exposed key could be for a low-risk service like Maps, or it could grant critical access to a service like Gemini. The real impact is masked behind a generic-looking key. Our analysis revealed that, out of all the Google API Keys, more than 300 worked against the Gemini APIs.

After applying these filters, the breakdown of the remaining specific secret types is as follows:

Bearer Token: 25.05%
OpenAI API Key: 12.06%
reCAPTCHA API Key: 8.35%
ElevenLabs API Key: 6.86%
Razorpay Key ID: 4.45%
Telegram Bot Token: 3.53%
Artifactory Access Token: 3.15%
Airtable API Key v2: 2.60%
Airtable Personal Access Token: 2.60%
Stripe API Key: 1.86%
Other: 29.50% (This includes a long tail of various secrets such as MongoDB URIs, Slack Webhooks, RapidAPI Keys, Anthropic API keys, and Deepseek API keys)

4. In-Depth Analysis: The Stories Behind the Data

The numbers reveal how easily secrets slip through the cracks. Our research uncovered several key patterns:

How Secrets Are Leaked:
- Secrets often get exposed when users hand API keys to AI platforms, which then embed them in public client-side code. This blind spot highlights the need for caution: the AI won’t always know what is sensitive.
The Real-World Impact:
- A leaked key isn’t just text; it’s access. From Stripe API keys enabling financial theft to Supabase credentials leading to full-scale data breaches, the risks are real and immediate.
Surprising Discovery:
- AI integrations are fueling leaks. We found a surge in OpenAI and ElevenLabs keys, showing how rushed AI adoption often skips security best practices.

5. Recommendations and Mitigation Strategies

Protecting against these leaks is a shared responsibility. We have recommendations for everyone involved in the vibe coding ecosystem.

For those who use Vibe Coding Platforms:

Treat Secrets Like Passwords: Never paste API keys, tokens, or credentials in public code or content.
Use Built-In Secret Management: Always use official features like environment variables. If the platform lacks them, ask the provider to add them.
Automate Detection: Manual checks fail. Use automated tools or a CTEM platform (e.g., RedHunt Labs) for continuous external exposure scanning and alerting.
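For the first two recommendations, the Python sketch below shows the general idea of reading a key from an environment variable in server-side code instead of hardcoding it. The variable name `EXAMPLE_API_KEY` and both helper functions are hypothetical, and the bearer-style header is only one common convention; how environment variables are configured differs from platform to platform.

```python
import os


def load_secret(name: str) -> str:
    """Read a secret from the environment and fail loudly if it is missing."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required secret: set the {name} environment variable")
    return value


def build_auth_headers(api_key: str) -> dict[str, str]:
    """Build request headers on the server; the key never reaches the browser."""
    return {"Authorization": f"Bearer {api_key}"}


if __name__ == "__main__":
    # EXAMPLE_API_KEY is a hypothetical name used only for this illustration.
    try:
        headers = build_auth_headers(load_secret("EXAMPLE_API_KEY"))
        print("Secret loaded; header names:", list(headers))
    except RuntimeError as err:
        print(err)
```

This only helps when the code runs on a server or in a serverless function. A value that is injected into client-side JavaScript at build time is still published to every visitor, whichever mechanism stored it.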

For those who provide Vibe Coding Platforms:

Pre-Publish Secret Scanning: Block or warn users when secrets are detected before publishing.
Simplify Secret Management: Provide secure, easy-to-use secret storage away from public code.
Educate Users: Add tutorials and in-app warnings about secret exposure risks.
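A pre-publish check of this kind could look roughly like the Python sketch below, which inspects the files about to be published and blocks the release if any rule matches. The two rules, the `PublishDecision` type, and the `check_before_publish` function are assumptions made for illustration, not any platform's actual API.

```python
import re
from dataclasses import dataclass

# Illustrative rules only; a real gate would use a maintained rule set.
RULES = {
    "Google API Key": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
    "Private Key Block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


@dataclass
class PublishDecision:
    allowed: bool
    warnings: list[str]


def redact(secret: str, keep: int = 4) -> str:
    """Show only the first few characters so the warning does not leak the secret."""
    return secret[:keep] + "*" * max(len(secret) - keep, 0)


def check_before_publish(files: dict[str, str]) -> PublishDecision:
    """Scan a mapping of path -> file content and block publishing on any match."""
    warnings = []
    for path, content in files.items():
        for rule_name, pattern in RULES.items():
            for m in pattern.finditer(content):
                # 1-based line number of the match, for a readable warning.
                line_no = content.count("\n", 0, m.start()) + 1
                warnings.append(f"{path}:{line_no}: {rule_name} ({redact(m.group())})")
    return PublishDecision(allowed=not warnings, warnings=warnings)
```

Redacting the match in the warning keeps the gate itself from becoming another place where the secret is displayed or logged. Whether to block outright or only warn is a product decision; this sketch blocks.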

For Security Teams and Businesses:

Monitor Continuously: Track unknown assets created by non-tech teams on no-code platforms.
Adopt CTEM: Automate discovery, attribution, and risk scoring of exposed assets and secrets across your attack surface, including vibe coding sites.

6. Conclusion

Our research demonstrates that while vibe coding platforms offer incredible power and flexibility, they also introduce new avenues for critical security risks, especially for users without a technical background. The ease of building is matched by the ease of leaking sensitive data.

This research underscores the growing importance of a comprehensive Continuous Threat Exposure Management (CTEM) strategy. As more business functions are decentralized to citizen developers, having a unified view of your external assets and exposures is no longer a luxury; it’s a necessity.

At RedHunt Labs, we simplify the complexity of Continuous Threat Exposure Management (CTEM), giving you the visibility and insights needed to protect your organization.
