Discovery

Turn a handful of in-scope roots into a complete, validated inventory of domains and live hosts.

Discovery is where you expand a short scope list into the real attack surface. A client hands you example.com and three CIDRs; by the end of discovery you want every subdomain, every resolving host, and every live IP that belongs to them.

The pipeline is a funnel — each stage produces input for the next, and every candidate gets filtered against your scope blacklist before it moves forward:

seed roots ─▶ subdomain enumeration ─▶ DNS resolution ─▶ host discovery ─▶ live hosts
     │              (passive+active)        (which resolve)   (which are up)
     └──◀── certificate transparency & reverse DNS feed new roots back in ◀──┘

The stages

  1. Root domain recon — WHOIS, DNS records, org footprint.
  2. Subdomain enumeration — passive + active + brute.
  3. DNS resolution — which of those names actually resolve, and to what.
  4. Certificate & SSL harvesting — pull more names out of TLS certs.
  5. Host discovery — which IPs are actually alive.

Keep looping: cert harvesting and reverse DNS routinely turn up new root domains. Add the in-scope ones back to scope-domains.txt and re-run enumeration. Discovery is “done” when a full loop produces nothing new.
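One way to make that stopping rule concrete is a small wrapper that reruns a pass until the scope file stops growing. This is only a sketch: `run_pass` is a hypothetical command or function standing in for your own enumerate/resolve/harvest steps (it is assumed to append new in-scope roots to the file it is given), and the loop cap is an arbitrary safety net.

```shell
# discover_until_stable SCOPE_FILE RUN_PASS [MAX_LOOPS]
# Reruns RUN_PASS (hypothetical: your own discovery pass) until SCOPE_FILE
# gains no new unique lines, then prints how many passes were run.
discover_until_stable() {
  local scope_file="$1" run_pass="$2" max_loops="${3:-10}"
  local loops=0 before after
  touch "$scope_file"
  while [ "$loops" -lt "$max_loops" ]; do
    before=$(sort -u "$scope_file" | wc -l)
    "$run_pass" "$scope_file"          # expected to append new roots, if any
    after=$(sort -u "$scope_file" | wc -l)
    loops=$((loops + 1))
    [ "$after" -gt "$before" ] || break   # nothing new: the loop has converged
  done
  echo "$loops"
}
```

Called as `discover_until_stable scope-domains.txt my_discovery_pass`, it stops after the first pass that adds nothing.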

Root domain recon

Before enumerating subdomains, fingerprint each root. It’s cheap, passive, and orients everything that follows.

while read -r domain; do
  mkdir -p "recon/domains/$domain"
  whois "$domain"        | tee "recon/domains/$domain/whois.txt"
  for rec in A MX NS TXT SOA; do
    dig +noall +answer "$domain" "$rec"
  done                   | tee "recon/domains/$domain/dig.txt"
  # DMARC / SPF often leak infra and partner domains
  dig +short TXT "_dmarc.$domain"  | tee "recon/domains/$domain/dmarc.txt"
done < scope-domains.txt

What I’m looking for in the output:

  • Registrant org / email in WHOIS — pivot to find sibling domains.
  • NS / MX — who hosts DNS and mail; hints at cloud vs. on-prem.
  • SPF / DMARC TXT records — they frequently list partner and infra domains worth investigating (and adding to scope if they belong to the client).
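To skim the TXT output for those partner and infra domains, a rough filter like the one below can help. It is a sketch, not a full SPF parser: the function name is mine, and it only looks at `include:`, `redirect=` and `mailto:` targets.

```shell
# txt_record_domains: read TXT records on stdin and print the domains they
# reference via SPF include:/redirect= and DMARC mailto: addresses.
# Illustrative helper; it ignores other SPF mechanisms (a:, mx:, exists:).
txt_record_domains() {
  tr -d '"' \
    | grep -oE '(include:|redirect=|mailto:[^@ ]+@)[A-Za-z0-9._-]+' \
    | sed -E 's/^(include:|redirect=|mailto:[^@]+@)//' \
    | tr 'A-Z' 'a-z' \
    | sort -u
}
```

For example, `cat recon/domains/example.com/dig.txt recon/domains/example.com/dmarc.txt | txt_record_domains` lists the candidates to review by hand.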

If the client owns IP space, look up their ASN (whois -h whois.radb.net <ip> or bgp.he.net) and pull the announced prefixes. amass intel -asn <ASN> automates this. The CIDRs you recover become new entries in scope-ips.txt.

Continue to Subdomain Enumeration.

1 - Subdomain Enumeration

Passive, active, and brute-force discovery of subdomains for each in-scope root.

For each in-scope root domain you want every subdomain you can find. There are three techniques and they don’t fully overlap — I run all three, because each one finds names the others miss.

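| Technique   | Source                                     | Finds                                    |
|-------------|--------------------------------------------|------------------------------------------|
| Passive     | OSINT APIs, search engines, cert logs      | Known/indexed names, zero target traffic |
| Active      | DNS queries against the target’s resolvers | Names that resolve but aren’t indexed    |
| Brute force | Wordlist against a resolver                | Predictable names (dev, vpn, staging)    |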

Passive: subfinder + amass

I usually start with subfinder (ProjectDiscovery) for passive enum — it’s fast. amass pulls from a different (overlapping) set of sources, so I run both and merge the results.

# subfinder against every root at once
subfinder -dL scope-domains.txt -all -silent \
  | tee recon/domains/subfinder.txt

# amass passive enum, per root
while read -r domain; do
  amass enum -passive -d "$domain" -o "recon/domains/$domain/amass-passive.txt"
done < scope-domains.txt

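You’ll get a lot more out of passive sources by adding API keys (Censys, SecurityTrails, Shodan, VirusTotal, GitHub, etc.) to ~/.config/subfinder/provider-config.yaml and ~/.config/amass/config.ini (the config.ini path is the amass v3 location; newer releases use YAML config files, so check your version). In my experience, keyed sources roughly double the yield.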

Active: amass

Active enumeration resolves and validates names against the target’s own DNS, catching wildcards and names that exist but aren’t in any OSINT feed.

while read -r domain; do
  amass enum -active -d "$domain" -o "recon/domains/$domain/amass-active.txt"
done < scope-domains.txt

Brute force: predictable names

Brute forcing throws a wordlist of common labels at the domain. I brute with a dedicated resolver tool (puredns or dnsx, covered on the DNS Resolution page) rather than amass’s built-in brute, because you control the rate and the resolver quality:

# Generate candidates from a wordlist, then resolve them
dnsx -d example.com -w /opt/SecLists/Discovery/DNS/subdomains-top1million-110000.txt \
     -silent -o recon/domains/example.com/brute.txt

amass enum -brute -d example.com does the same thing in one shot if you’d rather keep it simple.
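If you want to see (or pre-filter) exactly what gets thrown at the resolver, you can build the candidate list yourself. A minimal sketch, with a helper name of my own choosing; it assumes one label per wordlist line and drops anything that is not a plausible DNS label.

```shell
# brute_candidates DOMAIN < wordlist
# Turns wordlist labels into candidate FQDNs under DOMAIN.
# Blank lines, comments and entries with illegal characters are dropped.
brute_candidates() {
  local domain="$1"
  tr 'A-Z' 'a-z' \
    | grep -E '^[a-z0-9_]([a-z0-9_.-]*[a-z0-9])?$' \
    | sed "s/\$/.$domain/" \
    | sort -u
}
```

Write the result to a file (`brute_candidates example.com < wordlist.txt > candidates.txt`) and resolve it with `dnsx -l candidates.txt -silent`.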

Certificate transparency (crt.sh)

Public CT logs are one of the best passive sources: nearly every certificate issued by a publicly trusted CA is logged along with the names it covers. Certs from private or internal CAs are not logged. Query directly:

curl -s "https://crt.sh/?q=%25.example.com&output=json" \
  | jq -r '.[].name_value' \
  | sed 's/^\*\.//' | tr 'A-Z' 'a-z' | sort -u \
  | tee recon/domains/example.com/crtsh.txt

Merge, validate, and scope

Combine every source, strip the noise, and filter against your blacklist. This is the step that keeps you in scope:

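# Name the sources explicitly: a *.txt glob would also read whois.txt, dig.txt,
# and subdomains.txt itself while tee is overwriting it.
d=recon/domains/example.com
cat recon/domains/subfinder.txt "$d"/amass-passive.txt "$d"/amass-active.txt \
    "$d"/brute.txt "$d"/crtsh.txt 2>/dev/null \
  | sed 's/^\*\.//;s/\.$//' | tr 'A-Z' 'a-z' \
  | grep -E '^[a-z0-9_.-]+$' \
  | grep -E '(^|\.)example\.com$' \
  | grep -vEf blacklist.txt \
  | sort -u \
  | tee "$d"/subdomains.txt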
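One gotcha with `grep -vEf blacklist.txt`: a blank line in the pattern file is an empty pattern, which matches every name, so `-v` then throws everything away. A small wrapper avoids that. This is a sketch under my own conventions (blank lines and lines starting with `#` are treated as comments, and `apply_blacklist` is a name I made up).

```shell
# apply_blacklist [FILE] < names
# Filters stdin against the regexes in FILE (default: blacklist.txt),
# ignoring blank lines and '#' comments. Fails loudly if FILE is missing,
# because silently passing everything through is the worse failure here.
apply_blacklist() {
  local bl="${1:-blacklist.txt}" patterns
  [ -r "$bl" ] || { echo "blacklist not readable: $bl" >&2; return 1; }
  patterns=$(grep -vE '^[[:space:]]*(#|$)' "$bl" || true)
  if [ -z "$patterns" ]; then
    cat                          # no patterns: nothing to exclude
  else
    grep -vE -e "$patterns"      # newline-separated patterns act as alternatives
  fi
}
```

It drops into the pipeline in place of the plain grep, for example `... | apply_blacklist blacklist.txt | sort -u`.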

anew is handy here — it appends only new lines to a file and prints them, so you can see what each run adds:

subfinder -d example.com -silent | anew recon/domains/example.com/subdomains.txt

The output subdomains.txt is the input to DNS Resolution, where you find out which of these names are actually live.

2 - DNS Resolution

Resolve enumerated names to IPs at scale, separate live from dead, and turn resolved addresses into IP scope.

Enumeration gives you a list of candidate names. Most engagements need to know which ones actually resolve, what they resolve to, and which IPs that adds to scope. It’s a mass-resolution problem — you can easily end up with tens of thousands of candidate names.

Mass resolution with dnsx

Arsenic originally used fast-resolv for this. These days I use dnsx (ProjectDiscovery) — it resolves huge lists quickly against a pool of resolvers and it’s actively maintained.

# Resolve every discovered subdomain; keep only those that answer, with their A records
dnsx -l recon/domains/example.com/subdomains.txt \
     -a -resp \
     -silent \
     -o recon/domains/example.com/resolved.txt

Use a curated resolver list to avoid poisoned or rate-limited public resolvers — dnsvalidator builds one:

dnsvalidator -tL https://public-dns.info/nameservers.txt -threads 100 -o resolvers.txt
dnsx -l subdomains.txt -r resolvers.txt -a -resp -silent -o resolved.txt

Watch out for wildcard DNS. Some domains resolve everything to one IP (*.example.com → 203.0.113.9). dnsx has -wd example.com for wildcard filtering; puredns handles it automatically. Without it, your “resolved” list is mostly garbage.
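The idea behind wildcard detection is simple enough to sketch: ask for a name that cannot plausibly exist and see whether it resolves anyway. The helper below is illustrative only (it is not how dnsx or puredns implement it); `is_wildcard` and `resolve_a` are names I made up, and the resolver command is swappable.

```shell
# resolve_a NAME: default resolver, prints A records (needs dig and network).
resolve_a() { dig +short A "$1"; }

# is_wildcard DOMAIN [RESOLVER_CMD]
# Succeeds if a random 16-hex-character label under DOMAIN resolves.
is_wildcard() {
  local domain="$1" resolver="${2:-resolve_a}" label
  label=$(od -An -N8 -tx1 /dev/urandom | tr -d ' \n')
  [ -n "$($resolver "$label.$domain")" ]
}
```

`is_wildcard example.com && echo "wildcard: filter before trusting resolved.txt"` is the typical use. A single probe can be fooled by flaky resolvers, so probing two or three random labels is a reasonable refinement.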

Extract IPs into scope

Every resolved address that falls inside your authorized ranges becomes part of the IP scope for the recon phase:

# Pull the unique IPs out of the resolved output
# (-h suppresses the "file:" prefix grep adds when the glob matches several files)
grep -ohE '\[([0-9]{1,3}\.){3}[0-9]{1,3}\]' recon/domains/*/resolved.txt \
  | tr -d '[]' | sort -u \
  | tee recon/ips/from-domains.txt

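# Merge with seed IP scope. The merge itself filters nothing, so remove any
# address outside your authorized ranges (CDN, SaaS, shared hosting) from
# from-domains.txt before this step.
cat scope-ips.txt recon/ips/from-domains.txt | sort -u > recon/ips/scope-combined.txt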
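Restricting resolved addresses to the authorized ranges can be done with a short awk filter. A minimal sketch, IPv4 only, assuming the scope file holds one address or CIDR per line; `filter_to_scope` is a name I picked for illustration.

```shell
# filter_to_scope SCOPE_FILE < ips
# Prints only the stdin IPv4 addresses that fall inside an address or CIDR
# listed in SCOPE_FILE. IPv6 and malformed lines are silently skipped.
filter_to_scope() {
  awk -v scopefile="$1" '
    function ip2int(ip,   o) {
      if (split(ip, o, ".") != 4) return -1
      return ((o[1] * 256 + o[2]) * 256 + o[3]) * 256 + o[4]
    }
    BEGIN {
      while ((getline line < scopefile) > 0) {
        if (line ~ /^[ \t]*(#|$)/) continue        # skip blanks and comments
        bits = 32
        if (split(line, p, "/") == 2) bits = p[2]  # bare IP counts as /32
        base = ip2int(p[1])
        if (base < 0) continue
        size = 2 ^ (32 - bits)
        n++
        lo[n] = base - (base % size)               # network address
        hi[n] = lo[n] + size - 1                   # last address in range
      }
    }
    {
      v = ip2int($1)
      for (i = 1; i <= n; i++)
        if (v >= lo[i] && v <= hi[i]) { print $1; break }
    }'
}
```

Usage: `filter_to_scope scope-ips.txt < recon/ips/from-domains.txt > recon/ips/from-domains-in-scope.txt`, then merge that file instead of the unfiltered one.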

Reverse DNS (the other direction)

You also want to resolve IPs back to names — PTR records often reveal hostnames (and therefore new domains) you’d never have guessed:

dnsx -l recon/ips/scope-combined.txt -ptr -resp-only -silent \
  | tr 'A-Z' 'a-z' | sort -u \
  | grep -vEf blacklist.txt \
  | tee recon/ips/ptr-names.txt

Any in-scope root domains that show up here go back into scope-domains.txt, and you re-run enumeration. That’s the discovery loop closing on itself.
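To spot the candidates, it helps to list only the PTR names that do not already sit under a known root. The sketch below does that; `names_outside_known_roots` is a hypothetical helper, and its output still needs a manual ownership check before anything goes into scope.

```shell
# names_outside_known_roots ROOTS_FILE < names
# Prints the stdin names that are neither a known root nor a subdomain of one.
names_outside_known_roots() {
  awk -v rootsfile="$1" '
    BEGIN {
      while ((getline line < rootsfile) > 0)
        if (line != "") roots[tolower(line)] = 1
    }
    {
      name = tolower($1); sub(/\.$/, "", name)     # normalise case, trailing dot
      if (name == "") next
      known = 0
      for (r in roots) {
        tail = "." r
        if (name == r || (length(name) > length(tail) &&
            substr(name, length(name) - length(tail) + 1) == tail)) {
          known = 1; break
        }
      }
      if (!known) print name
    }' | sort -u
}
```

For example: `names_outside_known_roots scope-domains.txt < recon/ips/ptr-names.txt`.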

Next: Certificate & SSL Harvesting for one more rich source of hostnames, then Host Discovery to find which IPs are alive.

3 - Certificate & SSL Harvesting

Pull hostnames out of live TLS certificates to find assets nothing else surfaces.

TLS certificates are full of hostnames. A cert’s Common Name (CN) and Subject Alternative Names (SANs) list every name the operator put on it — including internal names, dev hosts, and sibling domains that never show up in DNS enumeration or OSINT.

There are two angles: passively reading certificate transparency logs (covered on the Subdomain Enumeration page) and actively grabbing certs off live hosts. This page is the active side. It’s worth doing because it catches certs that were never logged to CT and certs served directly on IPs with no DNS name at all.

Harvest certs from hosts and IPs

Run an nmap service scan against the TLS ports and let the ssl-cert script dump the certificate details, then parse out the names. This is what Arsenic’s as-domains-from-*-ssl-certs scripts do:

# Scan TLS ports on your resolved hosts (and on bare IPs)
nmap -p 443,8443,993,995,8080,8843 -sV -sC --open \
     -iL recon/ips/scope-combined.txt \
     -oA recon/ips/nmap-tls-check

# Extract CN + SAN entries from the nmap output
{
  grep -ohP 'Subject: commonName=\K[^/]+'        recon/ips/nmap-tls-check.nmap
  grep -ohP 'Subject Alternative Name: DNS:\K.+' recon/ips/nmap-tls-check.nmap \
    | sed 's/ DNS://g; s/,/\n/g'
} \
  | sed 's/^\*\.//' | tr 'A-Z' 'a-z' \
  | grep '\.' \
  | grep -vEf blacklist.txt \
  | sort -u \
  | tee recon/ips/ssl-cert-domains.txt

One-liner with httpx

httpx can grab and parse certs in one pass — faster than nmap when you only care about the names:

httpx -l recon/ips/scope-combined.txt \
      -p 443,8443,8080,8843 \
      -tls-grab -json -silent \
  | jq -r '.tls.subject_an[]?, (.tls.subject_cn // empty)' \
  | sed 's/^\*\.//' | tr 'A-Z' 'a-z' | sort -u \
  | grep -vEf blacklist.txt \
  | tee recon/ips/ssl-cert-domains.txt

Feed it back into scope

In-scope names that came out of certs are new subdomains/roots:

grep -E '\.(example\.com|example\.net)$' recon/ips/ssl-cert-domains.txt \
  | anew scope-domains-generated.txt

Then loop back to DNS Resolution to resolve the new names. Once a full discovery loop yields nothing new, move on to Host Discovery.

4 - Host Discovery

Find which IPs in scope are actually alive before you spend time on full port scans.

You may have hundreds or thousands of IPs in scope, especially after expanding CIDRs. Full port scanning all of them is wasteful — most won’t be up. Host discovery is a fast first pass to find the live ones, so the expensive recon phase only targets hosts that exist.

Expand CIDRs to addresses

First, turn any CIDR ranges into individual addresses so you can scan and track them per-host. nmap -sL (“list scan”) expands ranges without probing the targets, and with -n it skips reverse DNS as well, so nothing is sent at all:

# IPv4
nmap -sL -n -iL recon/ips/scope-combined.txt \
  | awk '/report for/{print $NF}' \
  | sort -u > recon/ips/expanded-ipv4.txt

# IPv6 (if in scope)
nmap -6 -sL -n -iL recon/ips/scope-combined.txt \
  | awk '/report for/{print $NF}' \
  | sort -u > recon/ips/expanded-ipv6.txt
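Before expanding, it can be worth checking how many addresses the scope file will turn into. A quick estimate, as a sketch: IPv4 only, overlapping ranges are counted twice, and `count_scope_addresses` is just an illustrative name.

```shell
# count_scope_addresses < scope file
# Sums 2^(32 - prefix) for every line; a bare IP counts as one address.
count_scope_addresses() {
  awk '
    /^[ \t]*(#|$)/ { next }                        # skip blanks and comments
    {
      bits = 32
      if (split($1, p, "/") == 2) bits = p[2]
      total += 2 ^ (32 - bits)
    }
    END { printf "%.0f\n", total }'
}
```

`count_scope_addresses < recon/ips/scope-combined.txt` gives you a number to sanity-check against what the client actually authorized.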

Smart ping sweep with nmap

A plain ICMP ping sweep misses hosts that block ICMP, which is most hardened hosts. The trick Arsenic uses is to probe the most popular ports for liveness on top of ICMP, so a host that drops ping but answers on tcp/443 still shows up.

Build the popular-port lists straight from nmap’s own frequency data:

TOP=30   # top-N most common ports
TCP=$(sort -r -k3 /usr/share/nmap/nmap-services | awk '/\/tcp/{print $2}' \
      | cut -d/ -f1 | head -n $TOP | paste -sd,)
UDP=$(sort -r -k3 /usr/share/nmap/nmap-services | awk '/\/udp/{print $2}' \
      | cut -d/ -f1 | head -n $TOP | paste -sd,)

Then sweep with multiple probe types — ICMP echo + timestamp, TCP ACK/SYN to the popular ports, and UDP to its popular ports:

sudo nmap -sn -n \
  -PE -PP \
  -PA"$TCP" -PS"$TCP" -PU"$UDP" \
  --randomize-hosts --scan-delay 50ms \
  -T4 \
  -iL recon/ips/expanded-ipv4.txt \
  -oA recon/ips/host-discovery-ipv4

# Extract the live hosts
awk '/Up$/{print $2}' recon/ips/host-discovery-ipv4.gnmap \
  | sort -u > recon/ips/alive.txt

What the flags do:

  • -sn: host discovery only, no port scan.
  • -PE -PP: ICMP echo + timestamp requests.
  • -PA<ports> / -PS<ports>: TCP ACK / SYN probes to popular ports. ACK probes slip past stateless filters that only block inbound SYNs; SYN probes work against stateful firewalls that drop unsolicited ACKs. Sending both covers hosts that drop ICMP.
  • -PU<ports>: UDP probes.
  • --randomize-hosts / --scan-delay: a little quieter and gentler.
  • -T4: timing; drop to -T3 or lower for fragile/monitored networks.

Faster alternative: naabu

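naabu can do liveness + a fast port pass in one step, and it’s nice for large ranges. It only reports hosts with at least one open port among those scanned, so a host that answers ICMP but has nothing open in the top 100 won’t appear: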

naabu -l recon/ips/expanded-ipv4.txt -top-ports 100 -silent \
  | cut -d: -f1 | sort -u | tee recon/ips/alive.txt

Resolve names ↔ live IPs

Cross-reference your resolved domains with the live IP list so you know which hostnames sit on which live host. One IP often serves many vhosts — you want to scan the IP once but remember every name pointing at it (it matters for HTTP vhost routing in recon).
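A minimal way to build that mapping is to join the dnsx output against the live list. This is a sketch that assumes resolved.txt lines look like `name [ip]` (or `name [A] [ip]` in newer dnsx builds) and that alive.txt holds one IP per line; the function name is mine.

```shell
# map_names_to_live_ips RESOLVED_FILE ALIVE_FILE
# Prints one line per live IP: "<ip> <name1>,<name2>,..."
map_names_to_live_ips() {
  awk -v alivefile="$2" '
    BEGIN { while ((getline ip < alivefile) > 0) alive[ip] = 1 }
    {
      ip = $NF; gsub(/\[|\]/, "", ip)              # last field is the [ip]
      if (ip in alive) {
        if (ip in names) names[ip] = names[ip] "," $1
        else             names[ip] = $1
      }
    }
    END { for (ip in names) print ip, names[ip] }' "$1" | sort
}
```

For example, `map_names_to_live_ips recon/domains/example.com/resolved.txt recon/ips/alive.txt > recon/ips/alive-names.txt` (the output filename is only a suggestion).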

The output of this phase — recon/ips/alive.txt plus the per-host name mapping — is the target list for Recon.