At RedHunt Labs, we conduct extensive internet-wide studies as part of Project Resonance to stay ahead of the evolving cyberspace and enhance our Attack Surface Management (ASM) platform. This blog highlights our recent research, in which we probed billions of IP addresses for an open port 80 and uncovered some fascinating insights.
Introduction
The internet is massive – an ever-growing network of roughly 3.7 billion publicly accessible IP addresses. Each of these hosts can run services on up to 65,535 ports, creating an almost unimaginable 240 trillion potential points to explore. Connected to this digital grid are all sorts of devices: phones, laptops, industrial machines, even power plants. Yes, power plants!
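As a quick sanity check on that figure, the arithmetic can be reproduced in a couple of lines. The constant names are ours, and the host count is the rough estimate quoted above rather than a measured value.

```python
# Back-of-the-envelope size of the IPv4 "address x port" space.
PUBLIC_IPV4_HOSTS = 3_700_000_000  # rough estimate quoted in the text
PORTS_PER_HOST = 65_535            # ports 1 through 65535

endpoints = PUBLIC_IPV4_HOSTS * PORTS_PER_HOST
print(f"{endpoints:,} potential endpoints (~{endpoints / 1e12:.0f} trillion)")
```

The exact product lands slightly above the rounded 240 trillion used in the text.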
In such a complex and chaotic system, vulnerabilities often hide in plain sight. To better understand this landscape, we picked up results from one of our routine public internet scans, focusing on port 80. While we weren’t actively searching for flaws, what we uncovered told a fascinating story about the state of online security today.
During our scan, we gathered the following information from the HTTP responses –
- Response Headers
- Favicon Hashes
- Social Media Links
- Email Addresses
- IP Addresses
- Cloud Buckets
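To give a feel for how artifacts like the ones listed above can be pulled out of a response body, here is a minimal sketch using regular expressions. The patterns, the list of social platforms, and the function name `extract_artifacts` are illustrative assumptions, not the parsers used in the actual scan, and they only cover virtual-hosted-style S3 URLs.

```python
import ipaddress
import re

# Illustrative patterns only; production extraction needs stricter rules.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SOCIAL_RE = re.compile(
    r"https?://(?:www\.)?(?:twitter|facebook|linkedin|instagram)\.com/[^\s\"'<>]+",
    re.IGNORECASE,
)
# Virtual-hosted-style S3 URLs, e.g. my-bucket.s3.amazonaws.com
S3_RE = re.compile(
    r"\b([a-z0-9][a-z0-9.-]{1,61}[a-z0-9])\.s3(?:[.-][a-z0-9-]+)?\.amazonaws\.com",
    re.IGNORECASE,
)
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def _is_ipv4(text: str) -> bool:
    """Reject dotted strings such as 999.1.1.1 that are not valid addresses."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        return False


def extract_artifacts(body: str) -> dict[str, set[str]]:
    """Collect emails, social links, S3 bucket names and IPv4 addresses."""
    return {
        "emails": set(EMAIL_RE.findall(body)),
        "social_links": set(m.group(0) for m in SOCIAL_RE.finditer(body)),
        "s3_buckets": set(S3_RE.findall(body)),
        "ips": {ip for ip in IPV4_RE.findall(body) if _is_ipv4(ip)},
    }


sample = (
    '<a href="mailto:admin@example.com">mail</a> '
    '<a href="https://twitter.com/example">t</a> '
    '<img src="https://my-bucket.s3.amazonaws.com/logo.png"> upstream 10.0.0.5'
)
artifacts = extract_artifacts(sample)
```

Sets are used so that an artifact repeated many times on a page is counted once per response.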
Our methodology enabled extensive data collection but had limitations:
- Virtual Hosting: Sites serving content based on hostnames were missed, as we scanned IPs directly. (We published detailed research on this some time back; did you read the blog?)
- Dynamic Content: Web apps generating user-specific responses limited our ability to extract details like emails, social links, and S3 buckets.
That said, the internet still delivered fascinating results, with plenty of intriguing insights to uncover. Let’s dive in!
Key Statistics
Out of the scanned IPv4 address space, approximately 42 million IPs were found to have port 80 open, making them accessible for HTTP requests.

The scan uncovered over 25,000 unique HTTP response headers, showcasing the diversity of web server configurations and implementations. Believe us, some of these headers made us scratch our heads.
Approximately 2.1 million unique favicon hashes were collected, offering insights ranging from grouping similar organizations and identifying technologies to detecting phishing websites.
Among other artefacts, we found around 500,000 email addresses, 600,000 social media links, and 25,000 cloud storage buckets. The distribution of the buckets is as follows:
Insights
Based on the statistics gathered, we asked ourselves some questions (on your behalf) and tried to answer them here to gather insights and useful information.
1. HTTPS Redirection
In today’s world, where security is supposedly a top priority, you’d expect all – or at least most – HTTP connections to redirect to HTTPS for a secure experience. Right? Not really.
Our findings revealed a significant gap in HTTPS adoption. Only 12.8% of hosts with port 80 open redirected to port 443, leaving most connections vulnerable. The rest stayed on HTTP and communicated in clear text, since no redirection was enforced. A very small number of hosts may implement JavaScript-based redirection, but that count is almost negligible.
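One way to decide whether a response counts as a secure redirect is to look at the status code and the `Location` header. The sketch below is our own simplification: the function name `redirects_to_https` is hypothetical, it checks for an `https` scheme rather than port 443 specifically, and, like the scan, it does not see JavaScript or meta-refresh redirects.

```python
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def redirects_to_https(status: int, headers: dict[str, str], request_url: str) -> bool:
    """Return True if the response is an HTTP-level redirect to an https:// URL."""
    if status not in REDIRECT_CODES:
        return False
    # Header names are case-insensitive, so compare them in lowercase.
    location = next((v for k, v in headers.items() if k.lower() == "location"), None)
    if not location:
        return False
    # Location may be relative; resolve it against the URL that was requested.
    target = urljoin(request_url, location)
    return urlsplit(target).scheme == "https"
```

For example, a `301` with `Location: https://example.com/` counts as a secure redirect, while a `302` to the relative path `/login` resolves back to plain HTTP and does not.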
Approximately 1.7% of the websites that redirected to port 443 were found using the SHA1-RSA signature algorithm. Surprisingly, some still relied on MD5, which has been considered weak for over a decade. Others used outdated ciphers, unsupported protocols, or misconfigured certificates, showing poor security standards.
We also gathered additional certificate details, to be analyzed in later stages for deeper insights and correlations, such as:
- Certificate Subject
- Validity
- Cipher Suite
- Issuer
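For readers who want to collect the same kind of details, here is a minimal sketch built on Python's standard `ssl` module. It is not the tooling used for the scan, and the function names `summarize_tls` and `fetch_tls_summary` are ours. Two caveats apply: the default context verifies certificates, so hosts with misconfigured certificates raise an error instead of returning data, and `cipher()` reports OpenSSL-style suite names rather than IANA names.

```python
import socket
import ssl


def summarize_tls(cert: dict, cipher: tuple) -> dict:
    """Flatten SSLSocket.getpeercert() and SSLSocket.cipher() into one record."""

    def flatten(rdns) -> dict:
        # Names arrive as a tuple of RDNs, each a tuple of (key, value) pairs.
        return {key: value for rdn in rdns for key, value in rdn}

    return {
        "subject": flatten(cert.get("subject", ())),
        "issuer": flatten(cert.get("issuer", ())),
        "not_before": cert.get("notBefore"),
        "not_after": cert.get("notAfter"),
        "cipher_suite": cipher[0],
        "protocol": cipher[1],
    }


def fetch_tls_summary(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect, complete a verified TLS handshake, and summarize the result."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return summarize_tls(tls.getpeercert(), tls.cipher())


# Offline example using data shaped like the ssl module's return values.
example = summarize_tls(
    {
        "subject": ((("commonName", "example.com"),),),
        "issuer": ((("organizationName", "Example CA"),), (("commonName", "Example CA R1"),)),
        "notBefore": "Jan  1 00:00:00 2025 GMT",
        "notAfter": "Apr  1 00:00:00 2025 GMT",
    },
    ("ECDHE-RSA-AES128-GCM-SHA256", "TLSv1.2", 128),
)
```

Inspecting certificates that fail validation would require fetching the raw DER bytes with `getpeercert(binary_form=True)` and parsing them with a separate library.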
2. Certificates and Cipher Suites
An analysis of SSL certificate issuers showed notable adoption patterns, with Amazon Trust Services (ATS) leading among issuers for redirections from port 80 to 443.
Current data shows that a majority of the observed connections (57.18%) use strong, modern cipher suites like TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 💪. However, older and less secure cipher suites, such as TLS_RSA_WITH_AES_256_CBC_SHA256 ☹️ (static RSA key exchange with CBC mode, so no forward secrecy), are still in use, highlighting the need for wider adoption of stronger encryption standards.
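The difference between the two suites named above can be read straight off their IANA names. The helper below is a rough sketch, with a hypothetical function name, that handles only TLS 1.2-style names containing `_WITH_`; TLS 1.3 suite names omit the key exchange and are deliberately rejected.

```python
def classify_cipher_suite(name: str) -> dict[str, bool]:
    """Derive coarse properties from an IANA-style TLS 1.2 cipher suite name."""
    key_exchange, separator, bulk = name.removeprefix("TLS_").partition("_WITH_")
    if not separator:
        raise ValueError(f"not a TLS 1.2-style suite name: {name}")
    return {
        # Ephemeral (EC)DHE key exchange provides forward secrecy.
        "forward_secrecy": key_exchange.startswith(("ECDHE_", "DHE_")),
        # GCM, CCM and ChaCha20-Poly1305 are AEAD constructions; CBC is not.
        "aead": any(mode in bulk for mode in ("_GCM", "_CCM", "CHACHA20_POLY1305")),
    }


modern = classify_cipher_suite("TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256")
legacy = classify_cipher_suite("TLS_RSA_WITH_AES_256_CBC_SHA256")
```

Here `modern` reports both forward secrecy and AEAD, while `legacy` reports neither.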
3. Infrastructure & Technology Insights
The Server header analysis showed widespread usage of popular web servers:
- Nginx (~10 million) and Apache (~6 million) emerged as dominant, showcasing their enduring popularity in hosting environments.
- A significant number of hosts (~1.3 million) were identified as being hosted on AWS infrastructure.
The X-Powered-By header highlighted the adoption of server-side technologies:
- ASP.NET led the pack, showcasing its dominance across the internet.
- Frameworks like Express.js reflected growing adoption of modern JavaScript-based backends.
Legacy PHP versions (e.g., 5.x.x) were surprisingly still prevalent, signaling a high likelihood of vulnerabilities due to outdated software.
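A tally like the one described above can be sketched with a counter over the `Server` header and a simple version check on `X-Powered-By`. The function names and the sample responses are made up for illustration; the real analysis ran over the full scan data.

```python
import re
from collections import Counter

PHP5_RE = re.compile(r"PHP/5\.\d+", re.IGNORECASE)


def server_product(server_header: str) -> str:
    """Reduce a value like 'nginx/1.18.0 (Ubuntu)' to the product name 'nginx'."""
    parts = server_header.strip().split()
    return parts[0].split("/")[0].lower() if parts else ""


def tally(responses: list[dict[str, str]]) -> tuple[Counter, int]:
    """Count server products and responses advertising a PHP 5.x runtime."""
    products: Counter = Counter()
    legacy_php = 0
    for headers in responses:
        lowered = {name.lower(): value for name, value in headers.items()}
        product = server_product(lowered.get("server", ""))
        if product:
            products[product] += 1
        if PHP5_RE.search(lowered.get("x-powered-by", "")):
            legacy_php += 1
    return products, legacy_php


products, legacy_php = tally([
    {"Server": "nginx/1.18.0 (Ubuntu)"},
    {"server": "Apache/2.4.41 (Ubuntu)", "X-Powered-By": "PHP/5.6.40"},
    {"Server": "nginx", "X-Powered-By": "Express"},
])
```

Both headers are self-reported and easy to spoof or strip, so counts derived this way are best treated as indicative.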
4. Headers
Some of the most common headers identified during the scan analysis were:
- Server
- Content-Type
- Cache-Control
From a security standpoint, the usage of some of the security headers is shown below:
Apart from the common headers, we also found a few not-so-common ones –
- bar: foo
- X-Drone-Version
- X-Debug-Featureflag-Releaseinprogress
- x_clacks_overhead: GNU Terry Pratchett
- x_xxs_protection: 1
- tools_name: My Seo please go
- x-frame-options: ALLOW-FROM https://*abc.com/ (Nooooo!!!!!)
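Checking a response for security headers, and spotting lookalikes such as `x_xxs_protection`, can be sketched as below. The list of headers is our own assumption about what is worth checking, not the exact set measured in the scan, and `audit_headers` is a hypothetical helper name.

```python
SECURITY_HEADERS = (
    "strict-transport-security",
    "content-security-policy",
    "x-frame-options",
    "x-content-type-options",
    "referrer-policy",
)


def audit_headers(headers: dict[str, str]) -> dict[str, list[str]]:
    """Report which security headers are present, missing, or look misnamed."""
    names = {name.lower() for name in headers}
    return {
        "present": [h for h in SECURITY_HEADERS if h in names],
        "missing": [h for h in SECURITY_HEADERS if h not in names],
        # Underscored names will not match the standard hyphenated header names.
        "suspicious": sorted(n for n in names if "_" in n),
    }


report = audit_headers({
    "Server": "nginx",
    "X-Frame-Options": "DENY",
    "x_xxs_protection": "1",
})
```

In this sample, `X-Frame-Options` is reported as present, the other four as missing, and `x_xxs_protection` is flagged as suspicious.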
5. Favicons
While scanning 42 million IP addresses, we also collected their favicons, uncovering some surprising results along the way.
The top 5 favicons that we found during our scan were:
| Favicon MD5 Hash | Count | Service |
|---|---|---|
| 89b932fcc47cf4ca3faadb0cfdef89cf | 674901 | Hikvision IP Cam |
| d41d8cd98f00b204e9800998ecf8427e | 565511 | Empty |
| 7ef1f0a0093460fe46bb691578c07c95 | 268857 | Dede CMS |
| 60fa7ed2309d77de1f9dc5e7c741ac48 | 156069 | SonicWall Firewall |
| a437e84d20c9cf7442fffab49e0f07e7 | 155602 | Dahua XVR |
These findings highlight the diversity of services exposed online, ranging from Hikvision IP cameras and Dahua XVR recorders to security devices like SonicWall firewalls. Such insights provide a valuable glimpse into the internet’s infrastructure and potential security trends.
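Matching a favicon against a table like the one above comes down to hashing the raw image bytes with MD5 and looking up the digest. The sketch below reuses the five hashes from the table; the function names are ours, and downloading the favicon is left out.

```python
import hashlib

# Digests and labels copied from the top-5 table above.
KNOWN_FAVICONS = {
    "89b932fcc47cf4ca3faadb0cfdef89cf": "Hikvision IP Cam",
    "d41d8cd98f00b204e9800998ecf8427e": "Empty",
    "7ef1f0a0093460fe46bb691578c07c95": "Dede CMS",
    "60fa7ed2309d77de1f9dc5e7c741ac48": "SonicWall Firewall",
    "a437e84d20c9cf7442fffab49e0f07e7": "Dahua XVR",
}


def favicon_md5(data: bytes) -> str:
    """MD5 hex digest of the raw favicon bytes (a fingerprint, not a security control)."""
    return hashlib.md5(data, usedforsecurity=False).hexdigest()


def identify_favicon(data: bytes) -> str:
    """Look the digest up in the known-favicon table."""
    return KNOWN_FAVICONS.get(favicon_md5(data), "unknown")


empty_label = identify_favicon(b"")
```

The second row of the table is simply the MD5 of zero bytes, which is why a host that returns an empty favicon body is labelled "Empty".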
Analyzing all this data at once can feel overwhelming, but it’s essential for seeing the big picture. The insights it offers are immense, and this is just a fraction of the internet: the results come from scanning a single port, one of the default HTTP ports, within one type of IP range (IPv4). There’s still so much more to uncover.
To contribute to the community, we’re also releasing some of our datasets, enabling other researchers and security practitioners to explore and derive their own insights –
- Top 100 Favicon Hashes
- Unique HTTP Response Headers
- Unique “Server” Headers
- Unique X-Powered-By Headers
- Unique X-Frame-Options Headers
- Unique CSP Headers
Dataset: Project Resonance Wave 12 – Open Port Chronicle: What Port 80 Revealed About The Internet Dataset
Conclusion
Analyzing the vast amount of data from internet-wide scans is no small feat. From routine scanning and parsing data to extracting insights and correlating it with other information, the process can quickly become overwhelming. Some day we will write a rant blog on this.
At RedHunt Labs, we simplify this complexity for you, primarily because we are curious and passionate, and secondarily because we are building one of the most technically advanced Attack Surface Management platforms.
By doing the heavy lifting, we provide organizations with the visibility and control they need to secure their online assets. With RedHunt ASM, businesses can stay ahead of risks and ensure their attack surface remains protected in today’s rapidly evolving digital landscape.