Introduction
If you are in the Information Technology sector, you most likely have come across the terms Cloud Object Storage, S3 buckets, and/or buckets. Cloud object storage buckets are like digital containers in the cloud where you can store files, documents, and data. Think of them as virtual folders that make it easy to organize and access your digital stuff securely from anywhere.
Many cloud service providers offer this solution under different names: Amazon Web Services offers Amazon S3, Google Cloud Platform offers buckets under its Cloud Storage product, and DigitalOcean offers Spaces. Despite how widely adopted this technology is, instance owners still make configuration mistakes now and then, and those mistakes can be disastrous for an organization given the data that is potentially at risk.
But how? Well, if the security settings for a bucket aren’t properly set up, unauthorized users can view and retrieve the files stored in it. To make cloud object storage security easier, we presented our in-house tool, BucketLoot, at Black Hat USA 2023 and Black Hat MEA 2023. The idea is not just to identify misconfigured storage buckets on AWS, DO, and GCP, but also to scan them to extract assets, flag secret exposures, and search for custom keywords or regular expressions.
We decided to dive deep into this issue and look at the state of security of the storage buckets we discovered across the internet. This wave is all about that, so brace yourselves for the insights we were able to pull from the data!
Our Approach

Our approach for this wave was pretty straightforward. Starting with target collection, we looked into various sources, including the in-house database behind our ASM Platform, which stores a whopping 6 billion+ records about domains, subdomains, third-party SaaS platforms, etc. We specifically targeted DigitalOcean, Google Cloud Platform, and Amazon Web Services, since right now BucketLoot only supports these three providers.
By the end of this step, we had a sample of roughly 170,000 targets: 141,670 belonged to AWS, 26,169 to DO, and 5,080 to GCP.
Since our targets for this wave are storage buckets, what else could be a better option than our in-house developed BucketLoot? For those unaware, BucketLoot is an automated S3-compatible bucket inspector that can help users extract assets, flag secret exposures, and even search for custom keywords as well as Regular Expressions from publicly exposed storage buckets by scanning files that store data in plain text. To know more about this cool tool, check out our tool release blog here.
For this scan, we ran BucketLoot with the following configuration:
- Default mode (fast), where the requests happen concurrently
- It only looked at files of 10 MB or less, since we were mostly interested in configuration, log, and code files (etc.) in a textual format.
- The scan ran in guest mode (default), which limits the scan to 1,000 files per bucket.
- We logged the output in JSON format.
So the command in this case would look like the one below:
./bucketloot "$url1" "$url2" "$url3" "$url4" "$url5" -max-size 10000000 -save "output/group-${group_index}.json"
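The variable names in the command above suggest that targets were processed in groups of five, with one JSON file per group. As a minimal sketch of that batching step (the group size, the index numbering, and the example target URLs are our assumptions for illustration; the flags are the ones shown above), it could look like this in Python:

```python
from typing import Iterator, List


def chunked(targets: List[str], size: int = 5) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` targets."""
    for start in range(0, len(targets), size):
        yield targets[start:start + size]


def build_command(group: List[str], group_index: int,
                  binary: str = "./bucketloot",
                  max_size: int = 10_000_000) -> List[str]:
    """Build the argument list for one run, mirroring the command shown above."""
    return [binary, *group,
            "-max-size", str(max_size),
            "-save", f"output/group-{group_index}.json"]


if __name__ == "__main__":
    # Hypothetical target URLs, for illustration only.
    targets = [f"https://example-bucket-{i}.s3.amazonaws.com" for i in range(12)]
    for index, group in enumerate(chunked(targets)):
        # Print instead of executing; pass the list to subprocess.run to run it.
        print(" ".join(build_command(group, index)))
```

Building the command as a list rather than a single string avoids shell quoting problems when a target URL contains special characters.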
And finally, once the results were in, we went ahead to extract the interesting insights that we are about to present 🙂.
The Findings
Misconfigured Buckets
When it comes to misconfigured buckets, we found 23,254 bucket instances that would let users list or retrieve their contents. This means that roughly 1 in every 7 buckets we scanned was misconfigured.
Fun fact: these buckets don’t just belong to individual owners, but also to some well-known companies. We even discovered an exposed bucket instance belonging to a Biomedical Research and Development NPO, which can have a major impact since it includes a lot of medical records and information. In cases like these, exposure of data can be critical not just for an organization’s safety but also for its end-users or customers.
The above infographic shows the distribution of these 23,254 misconfigured buckets by cloud service provider. Of these, 19,656 instances were on Amazon Web Services, 2,057 on DigitalOcean, and 1,541 on Google Cloud Platform.
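To put the per-provider counts in context, they can be divided by the number of targets scanned for each provider. The short Python sketch below does that arithmetic using only the figures quoted in this post; the resulting rates are derived by us here and were not part of the original infographic.

```python
# Figures copied from this post: targets scanned and misconfigured buckets per provider.
TARGETS = {"AWS": 141_670, "DO": 26_169, "GCP": 5_080}
MISCONFIGURED = {"AWS": 19_656, "DO": 2_057, "GCP": 1_541}


def misconfiguration_rates(targets, misconfigured):
    """Return the share of scanned targets that were misconfigured, per provider."""
    return {provider: misconfigured[provider] / targets[provider]
            for provider in targets}


rates = misconfiguration_rates(TARGETS, MISCONFIGURED)
overall = sum(MISCONFIGURED.values()) / sum(TARGETS.values())

for provider, rate in rates.items():
    print(f"{provider}: {rate:.1%} of scanned targets misconfigured")
print(f"Overall: {overall:.1%}")
```

On these numbers the GCP targets show the highest share of misconfigured buckets, although GCP is also the smallest group in the sample, so the comparison should be read with some caution.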
Secret Exposures
While we were at it, we also unearthed a whopping 13,425 secret exposures from these misconfigured buckets, indicating that a large number of buckets are not just publicly exposed but also leak sensitive secrets. As part of our responsible disclosure policy, we are in the process of reporting these exposures to the instance owners we could correlate with the misconfigured buckets.
In the above infographic, we can see the distribution of “Different Types of Secret Exposures” unearthed from the publicly exposed buckets that we found in our scan. Google API key was the top contender with a massive count of 6,781 entries, followed by AWS Access Key ID with 2,665 entries, and finally Amazon SNS Topic Disclosure with 1,732 entries.
Buckets belonging to Amazon Web Services had the highest number of secret exposures with 12,084 occurrences, followed by DigitalOcean with 672 and Google Cloud Platform with 669.
All in all, the results are pretty scary. The potential impact of these exposures ranges from performing actions on GCP using service accounts to interacting with AWS through the CLI using the leaked keys. This is only a small part of the consequences that these secret exposures could bring to an affected individual or organization, so securing these buckets should be a top priority.
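To make two of the secret types above more concrete, here is a minimal Python sketch of regex-based detection in plain-text content. The patterns are the commonly cited formats for Google API keys and AWS access key IDs; they are our assumption for illustration and are not necessarily the rules BucketLoot itself uses.

```python
import re
from typing import Dict, List

# Commonly cited formats for two of the secret types named above.
# Assumption: illustrative patterns only, not BucketLoot's actual rule set.
PATTERNS = {
    "Google API Key": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
    "AWS Access Key ID": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
}


def find_secrets(text: str) -> Dict[str, List[str]]:
    """Return every pattern match found in `text`, grouped by secret type."""
    hits: Dict[str, List[str]] = {}
    for name, pattern in PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            hits[name] = matches
    return hits


if __name__ == "__main__":
    # AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
    sample = "aws_access_key_id = AKIAIOSFODNN7EXAMPLE\n"
    print(find_secrets(sample))
```

A pattern match only shows that a string has the right shape. Whether the key is live and what it grants access to has to be verified separately, and only by its owner or with their permission.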
Assets Uncovered
As we are an Attack Surface Management company and are largely interested in collecting Asset intelligence across the internet, we also extracted assets from these publicly exposed buckets (i.e., URLs, domains, and subdomains). We discovered a total of 543,482,065 URL occurrences out of which 160,274,068 were unique entries. Interesting enough. 😉
The infographic below shows the top 10 most frequent domains in URLs extracted from the misconfigured buckets during the scan.
credifi.com was the top contributor to the list with a whopping 6,200,161 occurrences, followed by jupix.co.uk and s3browser.com with 3,124,893 and 2,733,403 occurrences respectively. From these URLs, we further extracted 221,311 unique domains and 228,669 unique subdomains, thanks to our Recon API.
Quite often, assets like these uncover hidden endpoints and URLs that should never have been disclosed in the first place. They can also be helpful for bug hunters and penetration testers from a recon point of view.
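The asset extraction described in this section boils down to pulling URLs out of text and counting the hosts they point to. A rough Python sketch of that idea follows; the URL regex is a deliberately simple assumption of ours, and it counts full hostnames only. Splitting a hostname into its registrable domain and subdomain needs the Public Suffix List, which this sketch does not attempt (we used our Recon API for that step).

```python
import re
from collections import Counter
from typing import List
from urllib.parse import urlsplit

# Simplified URL pattern (assumption): stop at whitespace, quotes, and angle brackets.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def extract_urls(text: str) -> List[str]:
    """Return every http(s) URL occurrence found in `text`."""
    return URL_RE.findall(text)


def count_hosts(urls: List[str]) -> Counter:
    """Count how often each hostname appears across the given URLs."""
    hosts: Counter = Counter()
    for url in urls:
        try:
            host = urlsplit(url).hostname  # lower-cased by urlsplit
        except ValueError:
            continue  # skip malformed URLs
        if host:
            hosts[host] += 1
    return hosts


if __name__ == "__main__":
    sample = 'see https://a.example.com/x and http://A.example.com/y "https://b.example.org"'
    urls = extract_urls(sample)
    print(len(urls), "occurrences,", len(set(urls)), "unique")
    print(count_hosts(urls).most_common(10))
```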
Mitigation
Considering the insights we were able to draw from this research, we recommend the following three points to anyone handling cloud object storage buckets:
- Configure the security settings properly: Ensure that access controls, permissions, and policies for your cloud object storage buckets are correctly configured. This way you can restrict your bucket and its contents from being viewable or retrievable by unauthorized users.
- Avoid storing secrets in buckets: Refrain from storing sensitive information, such as access keys, API tokens, or any confidential data, directly in your cloud object storage buckets. You can instead use a dedicated secrets management solution or a secure key vault specifically designed for storing and managing sensitive information.
- Regularly monitor and audit your buckets: As a business or individual dealing with cloud object storage buckets, it’s crucial to monitor them regularly to leave no stone unturned. This way security misconfigurations can be avoided and the bucket contents will remain secure.
By following these recommendations, organizations and individuals can significantly improve their security posture around cloud object storage buckets, thus reducing the chances of misconfigurations or data leaks.
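As a small aid for the monitoring recommendation, the Python sketch below classifies the response to an unauthenticated request against a bucket URL that you own. It assumes the S3-style XML listing format, in which a publicly listable bucket typically answers with a `ListBucketResult` document. Fetching the response is intentionally left out, and the function name is ours.

```python
import xml.etree.ElementTree as ET


def is_public_listing(status_code: int, body: str) -> bool:
    """Return True if an unauthenticated response looks like an S3-style object listing."""
    if status_code != 200:
        return False
    try:
        # For untrusted input, a hardened parser such as defusedxml is a safer choice.
        root = ET.fromstring(body)
    except ET.ParseError:
        return False
    # Drop the XML namespace prefix, e.g. "{http://s3.amazonaws.com/doc/2006-03-01/}".
    tag = root.tag.rsplit("}", 1)[-1]
    return tag == "ListBucketResult"


if __name__ == "__main__":
    listing = (
        '<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
        "<Name>example-bucket</Name><Contents><Key>config.json</Key></Contents>"
        "</ListBucketResult>"
    )
    denied = "<Error><Code>AccessDenied</Code></Error>"
    print(is_public_listing(200, listing))
    print(is_public_listing(403, denied))
```

A check like this only covers public listing. Individual objects can still be publicly readable when listing is denied, so it complements a review of bucket policies and access controls rather than replacing it.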
Conclusion
Through this Project Resonance wave, we were able to draw some interesting insights. All in all, it demonstrates that the security posture around cloud object storage buckets is still a concern that needs to be addressed quickly. Individuals and organizations dealing with storage buckets should review their instances now and make sure they are secure and free of security lapses.
How can we help?
Our SaaS-based Attack Surface Management solution, NVADR, continuously keeps track of your organization’s external digital footprint by identifying and profiling the assets as they surface on the internet. The assets we identify are way beyond IP Addresses and Subdomains, and we cover a wide variety, including Docker Containers, Mobile Applications, Code Repositories, and much more. Once these assets are identified, we find security misconfigurations across all asset classes (including secret exposures and misconfigured cloud storage buckets).
To understand how NVADR can help your organization improve its external digital footprint and security posture, Request a Demo.