Pros
- Completely passive discovery generates zero alarms on client infrastructure
- Maps external exposure across domains, subdomains, and shadow IT
- Enriches asset inventories with Shodan, DNS, and Certificate Transparency data
- Visualizes complex organizational relationships using Maltego graphing
- Provides highly accurate scoping intelligence for subsequent penetration tests
- Identifies historical credential breach exposure without violating privacy boundaries
- Delivers an executive-friendly attack surface narrative
Cons
- Public data is inherently incomplete and subject to false positives
- Requires intensive manual validation to separate signal from noise
- Must strictly navigate data privacy and ethical reconnaissance boundaries
- Limited internal context requires eventual client alignment to confirm asset ownership
Here’s the reality: attackers never rush in blindly. They spend weeks mapping your organization like a burglar casing a house—hunting for forgotten subdomains, exposed cloud storage, and developer credentials left on GitHub. The perimeter you think you’re defending? It’s probably not what the attacker sees.
The OSINT Attack Surface Intelligence Workflow lets you see what the attacker sees, using only passive, legal methods. Before you run a single vulnerability scan or penetration test, this workflow identifies external assets you didn’t know you had, pinpoints the critical exposure gaps you need to close, and gives you a clear picture of your actual attack surface.
Ethical Boundaries and Passive Reconnaissance
This workflow has clear rules. We gather intelligence without touching your systems; we only look at what is already public.
No active scanning: We use Shodan to find what’s publicly exposed, but we don’t run port scans or network probes against your infrastructure.
No credential testing: We’ll identify that your email is in a known breach database, but we don’t try to use those credentials or extract anything.
No social engineering: We map your org structure from LinkedIn and public sources, but we don’t call anyone, send phishing tests, or interact with employees.
Intelligence Objectives
The goal is building a complete picture of what’s exposed about your organization across all the places an attacker looks:
Infrastructure Find all your domains, including the main ones and forgotten subdomains, plus the ASNs (autonomous system numbers) and IP ranges you might not remember owning.
Cloud & Shadow IT Discover orphaned S3 buckets, exposed cloud storage, and SaaS apps that are running but not tracked. Shadow IT is a gold mine for attackers.
Technology Fingerprinting Identify what tech stack you’re running—backend frameworks, WAF providers like Cloudflare, outdated CMSes. Attackers use this to find version-specific exploits.
SSL Certificates & CT Logs Search Certificate Transparency logs to find the publicly logged certificates issued for your domains. This often reveals staging, dev, and internal hostnames you might have forgotten about.
Source Code & Secrets Scan public GitHub repositories for hardcoded API keys, AWS credentials, and CI/CD configs left behind by developers. This is both common and dangerous.
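To make the secrets objective concrete, here’s a minimal sketch of pattern-based secret detection, assuming you already have the text of a public file in hand. The two patterns are illustrative only (the `AKIA`/`ASIA` prefixes are the documented AWS access key ID formats), and the `find_secret_candidates` name is hypothetical. Dedicated scanners ship far larger rule sets, and every hit still needs manual validation.

```python
import re

# Illustrative patterns only; dedicated scanners use far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(
        r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def find_secret_candidates(text):
    """Return (line_number, pattern_name) pairs for lines that look like secrets."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits


# AWS's own documentation placeholder key, not a real credential.
sample = "\n".join([
    "region = us-east-1",
    "aws_access_key_id = AKIAIOSFODNN7EXAMPLE",
    "debug = true",
])
print(find_secret_candidates(sample))  # [(2, 'aws_access_key_id')]
```

A match only says a string has the shape of a credential. Whether it is live, and whether the repository is actually yours, is a question for the validation step.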
The Practitioner Tool Stack
Gathering this data at scale requires specialized tools. Here’s what actually gets used:
Maltego The visual powerhouse. It connects the dots between domains, IPs, email addresses, and registrants. You end up with a visual map of your organization’s internet presence.
Passive DNS Discovery: Amass, Subfinder, DNSdumpster Run in passive mode, these tools enumerate subdomains without ever scanning your network. They use DNS records, certificate logs, and passive data sources to find hostnames connected to your domain.
Shodan & Censys These search engines index the internet’s exposed services and devices. They tell you what’s actually reachable from the internet—open ports, services running, vulnerable versions.
Historical Data: Wayback Machine, crt.sh, WHOIS/RDAP The Wayback Machine (archive.org) shows what your website looked like years ago, revealing old endpoints. crt.sh lists the certificates recorded in Certificate Transparency logs for your domain. WHOIS/RDAP reveals registration details.
Custom Python Scripts Tie it all together. Parse DNS records, aggregate threat feeds, automate the data collection pipeline.
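As one example of the glue code involved, here’s a minimal sketch that turns crt.sh-style JSON into a clean, in-scope hostname list. It assumes each record carries a `name_value` field holding one or more newline-separated names; check that against the actual response before relying on it. The function name and sample records are hypothetical, and the sketch parses saved output, so it makes no network calls.

```python
import json


def hostnames_from_ct(json_text, root_domain):
    """Extract unique in-scope hostnames from crt.sh-style JSON output."""
    root = root_domain.lower().rstrip(".")
    found = set()
    for entry in json.loads(json_text):
        # Assumption: "name_value" may hold several newline-separated names.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower().rstrip(".")
            if name.startswith("*."):
                name = name[2:]  # treat a wildcard as its parent hostname
            # Keep only the root itself or true subdomains of it.
            if name == root or name.endswith("." + root):
                found.add(name)
    return sorted(found)


sample = json.dumps([
    {"name_value": "www.example.com\nDev.Example.com"},
    {"name_value": "*.staging.example.com"},
    {"name_value": "example.com.evil.net"},  # look-alike, out of scope
])
print(hostnames_from_ct(sample, "example.com"))
# ['dev.example.com', 'staging.example.com', 'www.example.com']
```

The suffix check is deliberate: a plain substring match would let look-alike names such as `example.com.evil.net` slip into your inventory.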
The Intelligence Workflow
1. Scoping & Authorization Agree on what domains are in scope and what’s actually authorized. Get it in writing. Legal boundaries matter.
2. Passive Data Collection Run everything in parallel—DNS enumeration, GitHub searches for secrets, breach databases for your email addresses, Shodan queries for exposed services. This phase generates a lot of raw data.
3. Data Enrichment Cross-reference everything. Take an IP you found and run it through Shodan. Take a domain and check its SSL certificate history. Start seeing patterns and connections.
4. Validation & Filtering Here’s where you eliminate noise. A GitHub repository might look like it belongs to you but actually be a fork or a fan project. Manually verify the findings are real and actually yours.
5. Risk Scoring Not all findings are equally critical. An exposed internal API is high risk. A parked marketing domain is low risk. Rank them so the important stuff gets fixed first.
6. Integration Hand off the results to the penetration testing team or your vulnerability management program. Use this as the scope for what to actually test.
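Steps 4 and 5 can be sketched in a few lines, assuming a human analyst has already marked each finding as confirmed or not. The `Finding` fields and the severity weights here are hypothetical placeholders rather than a standard scoring model.

```python
from dataclasses import dataclass

# Hypothetical weights; tune these to your own risk model.
SEVERITY_WEIGHT = {"high": 3, "medium": 2, "low": 1}


@dataclass
class Finding:
    asset: str
    title: str
    severity: str              # "high", "medium", or "low"
    ownership_confirmed: bool  # set during manual validation


def prioritize(findings):
    """Drop unverified findings, then order the rest from most to least severe."""
    validated = [f for f in findings if f.ownership_confirmed]
    return sorted(validated, key=lambda f: SEVERITY_WEIGHT[f.severity], reverse=True)


findings = [
    Finding("parked.example.com", "Parked marketing domain", "low", True),
    Finding("dev.api.example.com", "Exposed internal API", "high", True),
    Finding("github.com/fan/project", "Look-alike repository", "high", False),
]
for f in prioritize(findings):
    print(f.severity.upper(), f.asset, "-", f.title)
```

The look-alike repository is dropped even though it is rated high, because a finding you can’t tie to the organization shouldn’t drive remediation work.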
How This Feeds Into Security Testing
OSINT is the foundation for everything that comes next.
Red Team Operations You now know your org’s structure, who the key people are, what communication tools you use, and employee roles. This means spear-phishing scenarios aren’t generic—they’re targeted and credible because they’re based on real intel.
Penetration Testing You find all the hidden v1 API endpoints, staging subdomains, and developer portals that aren’t on the main website. Without this OSINT, penetration testers only test the obvious stuff. With it, they test what actually matters: the real attack surface.
Sample OSINT Findings Matrix
| Risk Level | Finding / Exposure | Attack Vector & Implication | Priority |
|---|---|---|---|
| High | Exposed .git Directory | A forgotten development subdomain (dev.api.target.com) is exposing source code, potentially revealing backend logic and credentials. | Immediate |
| High | GitHub Secrets Leak | A developer committed an AWS access key (key ID and secret) to a public repository linked to their corporate email address. | Immediate |
| Medium | Exposed RDP (Shodan) | An IP address registered to the organization has port 3389 open to the internet, exposing the network to brute-force and ransomware attacks. | 1-3 Days |
| Low | Breached Email Exposure | 45 corporate email addresses found in the “Collection #1” data breach. Increases the risk of credential stuffing if MFA is not enforced. | 1-2 Weeks |
Core Deliverables
The intelligence gathered is translated into actionable engineering and security deliverables:
- Attack Surface OSINT Report: A comprehensive narrative of what an attacker can see, categorized by risk.
- Visual Entity Graph: A Maltego chart illustrating the relationships between the organization’s domains, IPs, and third-party vendors.
- External Asset Register: A clean CSV/Excel inventory of all discovered subdomains, IPs, and identified tech stacks, ready for import into an Asset Management system.
- Prioritized Remediation List: Immediate actions required to remove sensitive data from public view.
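The External Asset Register is the easiest deliverable to automate. Here’s a minimal sketch using Python’s standard `csv` module. The column names and sample rows are assumptions for illustration, so match them to whatever your Asset Management system expects on import.

```python
import csv
import io

# Assumed column layout; adjust to your asset management import format.
FIELDS = ["hostname", "ip", "tech_stack", "source"]


def build_asset_register(assets):
    """Render asset dicts as CSV text, de-duplicated by hostname and sorted."""
    unique = {}
    for asset in assets:
        # First sighting wins; hostnames compare case-insensitively.
        unique.setdefault(asset["hostname"].lower(), asset)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=FIELDS, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for hostname in sorted(unique):
        writer.writerow(dict(unique[hostname], hostname=hostname))
    return buffer.getvalue()


assets = [
    {"hostname": "www.example.com", "ip": "192.0.2.10", "tech_stack": "nginx", "source": "crt.sh"},
    {"hostname": "WWW.example.com", "ip": "192.0.2.10", "tech_stack": "nginx", "source": "Shodan"},
    {"hostname": "dev.example.com", "ip": "192.0.2.20", "tech_stack": "WordPress", "source": "Subfinder"},
]
print(build_asset_register(assets))
```

De-duplicating by lowercase hostname keeps the register stable when the same asset turns up in several sources.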
30-Day Attack Surface Reduction Plan
Days 1-7: Immediate Triage Handle the critical findings first. Revoke and rotate any API keys exposed on GitHub at the provider that issued them. Block public access to staging environments and close exposed RDP ports with firewall rules.
Days 8-15: Asset Reconciliation Compare what OSINT found against your internal IT inventory. Identify shadow IT assets, forgotten marketing sites, and legacy infrastructure. Make official decisions—does this stay or go?
Days 16-30: Automate & Monitor Set up continuous subdomain monitoring so new subdomains trigger alerts. Implement GitHub secret scanning with push protection so accidentally committed credentials are blocked or flagged. Create a policy that new infrastructure must be documented before it goes live.
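Continuous subdomain monitoring boils down to comparing today’s enumeration against yesterday’s. Here’s a minimal sketch, assuming each run produces a plain list of hostnames. The `diff_subdomains` name and the print-based alert are placeholders for whatever scheduler and alerting channel you actually use.

```python
def diff_subdomains(previous, current):
    """Compare two snapshots and report which hostnames appeared or disappeared."""
    before = {h.strip().lower() for h in previous}
    after = {h.strip().lower() for h in current}
    return {"added": sorted(after - before), "removed": sorted(before - after)}


yesterday = ["www.example.com", "mail.example.com"]
today = ["www.example.com", "mail.example.com", "staging.example.com"]

changes = diff_subdomains(yesterday, today)
if changes["added"]:
    # Placeholder alert; swap in email, chat, or ticketing as needed.
    print("ALERT: new subdomains:", ", ".join(changes["added"]))
```

Tracking removals as well helps with asset reconciliation, since a hostname that disappears may mean a decommissioned asset your inventory still lists.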
Once you’ve done this reconnaissance workflow, you’ve fundamentally shifted the game. Attackers now have to work much harder to find an entry point. You’ve eliminated the low-hanging fruit and the forgotten assets—and that’s where most breaches start.