Pentest Engagement Handbook

A tool-by-tool walkthrough of how I run an external network/web pentest by hand — the same workflow Arsenic automates.

This handbook is about the methodology and tooling of a pentest engagement, not the Arsenic CLI. Every step here is something you can run by hand with off-the-shelf tools. Arsenic just glues these steps together and keeps the output organized; this is my attempt to write down what it’s gluing together and why.

If you only want to drive Arsenic, the Arsenic docs cover that.

Who this is for

This started as the runbook I built for myself while studying for the OSCP. I kept hardening it into a repeatable process while running a pentest team, and this is where it landed. It assumes you can use a shell, read tool output, and that you have written authorization for every target you point these tools at.

How an engagement flows

A network/web engagement moves through five phases, and each one feeds the next. You turn a handful of in-scope roots into a full asset inventory, then a service map, then a list of likely weaknesses, then confirmed findings.

| Phase | Goal | Page |
|---|---|---|
| 1. Setup | Scope, rules of engagement, workspace, toolbox | Engagement Setup |
| 2. Discovery | Turn scope into a complete asset inventory | Discovery |
| 3. Recon | Map services, web surface, and content | Recon |
| 4. Hunting | Find likely vulnerabilities at scale | Vulnerability Hunting |
| 5. Reporting | Capture evidence and write it up | Evidence & Reporting |

The phases are a loop, not a line. Discovery surfaces new domains, which expand scope, which feeds discovery again. Recon turns up a forgotten admin panel that becomes a new lead. I’ve found it’s easiest to settle into the loop and keep re-running the cheap steps as scope grows.

The toolchain at a glance

Here’s the full set of tools the handbook uses, grouped by phase. Where a tool Arsenic originally shipped has since aged out, I’ve noted what I reach for now. The Toolbox Reference has the install commands and the full mapping.

  • Discovery: amass, subfinder, crt.sh, dnsx (replaces fast-resolv), nmap -sn, naabu
  • Recon: nmap, naabu, httpx, gowitness (replaces aquatone), ffuf, feroxbuster
  • Hunting: nuclei, searchsploit, nuclei/subzy for takeovers
  • Glue: jq, mlr (miller), anew, SecLists wordlists

Using these docs in Docsy

This folder is a self-contained Docsy content section. Drop it into your site under content/en/ (or wherever your docs live) and it renders as a top-level section. Page ordering comes from the weight front matter on each file, so you reorder by editing weights rather than renaming files.

One reminder before you start: everything here is for authorized testing only. Active scans — port scans, brute force, fuzzing, nuclei — generate traffic that’s trivially attributable to you and can knock fragile services over. Get scope, rate limits, and blackout windows in writing before you run any of it against a live target.

1 - Engagement Setup

Scope, rules of engagement, a tracked workspace, and a working toolbox — everything before the first packet leaves your box.

The work you do before scanning is what keeps the engagement clean, repeatable, and defensible. Skip it and you end up scanning out-of-scope hosts, losing evidence, or unable to reconstruct what you ran on day 9.

1. Pin down scope and rules of engagement

Get these in writing before anything else. They decide which commands you’re allowed to run.

  • In-scope assets — root domains, IP ranges/CIDRs, ASNs, specific URLs.
  • Out-of-scope assets — explicit exclusions (shared SaaS, third-party CDNs, partner domains). These become your blacklist (see below).
  • Allowed activity — passive only? Active scanning? Exploitation? Brute force? Phishing?
  • Rate / timing limits — some clients cap requests-per-second or forbid scanning during business hours.
  • Blackout windows & emergency contact — who to call when something falls over, and when not to test.
  • Credentials & test accounts — for authenticated testing.

A wildcard like *.example.com means “enumerate and prove the subdomains”; a bare example.com usually means just that host. Confirm which one the client means — it changes the size of the engagement.

2. Stand up a tracked workspace

I treat the engagement as a git repo from the first minute. Every scan output, every scope change, and every note ends up version-controlled — it’s both the audit trail and the backup.

mkdir ~/engagements/acme && cd ~/engagements/acme
git init
mkdir -p recon hosts report tmp
printf '/tmp\n' > .gitignore
git add .gitignore && git commit -m "init workspace"

A directory convention that scales (this is basically the layout Arsenic enforces):

acme/
├── scope-domains.txt        # in-scope root domains, one per line
├── scope-ips.txt            # in-scope IPs / CIDRs
├── recon/                   # org-wide recon output (domains, ips, discovery)
├── hosts/<host>/recon/      # per-host scan output
├── report/
│   ├── findings/            # one folder per finding
│   └── static/              # screenshots & evidence
└── tmp/                     # scratch (git-ignored)

Commit early and often. A habit I borrowed straight from the Arsenic scripts: after each meaningful scan, git add the new output and commit with a message describing what ran. If you’re collaborating, push between steps so teammates don’t re-scan the same hosts.

3. Define the scope files

The whole pipeline is driven by two seed files. Everything you discover later gets validated back against these plus a blacklist.

# Seed roots — the things you were explicitly told are in scope
printf 'example.com\nexample.net\n' >> scope-domains.txt
printf '203.0.113.0/24\n198.51.100.10\n' >> scope-ips.txt

Keep a blacklist of root domains that show up in results but aren’t yours to test — shared infrastructure that certificate transparency and reverse DNS will constantly surface. Arsenic ships a sensible default; the usual offenders:

1e100.net            akamaitechnologies.com   amazonaws.com
azurewebsites.net    cloudfront.net           cloudapp.net
googleusercontent.com  readthedocs.io         sites.hubspot.net

Every time you generate a new candidate list of domains/IPs, run it through this blacklist before adding it to scope. This one habit prevents the most common engagement mistake: scanning someone else’s CDN.
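One way to turn that habit into a reusable filter is sketched below. It assumes the blacklist lives in a file with one root domain per line; the `filter_blacklist` name and the file layout are my own placeholders, not something Arsenic ships. It matches on whole domain suffixes rather than regexes, so a name like notamazonaws.com is not dropped by accident.

```shell
# filter_blacklist BLACKLIST_FILE
# Reads candidate hostnames on stdin and prints (lowercased) only those that
# are neither a blacklisted root nor a subdomain of one.
# Assumption: BLACKLIST_FILE holds one root domain per line.
filter_blacklist() {
  awk -v bl="$1" '
    BEGIN {
      while ((getline line < bl) > 0)
        if (line != "") roots[tolower(line)] = 1
    }
    {
      host = tolower($0)
      keep = 1
      for (r in roots) {
        if (host == r) { keep = 0; break }
        n = length(host) - length(r)
        # substr(host, n) is the last length(r)+1 characters of host
        if (n > 0 && substr(host, n) == "." r) { keep = 0; break }
      }
      if (keep) print host
    }'
}

# Usage (hypothetical file names):
#   filter_blacklist blacklist.txt < tmp/candidates.txt >> scope-domains.txt
```

The later pages filter with grep -vEf blacklist.txt, which is looser (substring and regex matching); either approach works as long as every candidate list passes through one of them.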

Ingesting scope from a CSV / bug-bounty program

Real scope rarely arrives as a clean list. For a HackerOne-style CSV, I normalize it with mlr (Miller) and jq:

curl -s https://hackerone.com/teams/acme/assets/download_csv.csv \
  | mlr --icsv --ojson cat | jq . | tee acme-scope.json

# Pull eligible, non-wildcard identifiers into the domain scope
jq -r '.[]
  | select(.eligible_for_submission == "true")
  | select(.max_severity != "none")
  | .identifier' acme-scope.json \
  | grep -v '\*' \
  | sort -u >> scope-domains.txt

Handle wildcard entries (*.acme.com) separately — strip the *. and feed the parent to subdomain enumeration in the Discovery phase.
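A minimal sketch of that wildcard handling, assuming the identifiers arrive one per line on stdin. The `wildcard_roots` helper and the wildcard-roots.txt file in the usage comment are illustrative names of mine, not part of the original workflow.

```shell
# wildcard_roots
# Reads scope identifiers on stdin, keeps only wildcard entries (*.acme.com),
# strips the leading "*." and prints the unique, lowercased parent domains.
wildcard_roots() {
  grep -E '^\*\.' | sed 's/^\*\.//' | tr 'A-Z' 'a-z' | sort -u
}

# Usage (hypothetical output file):
#   jq -r '.[].identifier' acme-scope.json | wildcard_roots > wildcard-roots.txt
```

Keeping the wildcard parents in their own file makes it obvious later which roots you were authorized to enumerate and which were single hosts.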

4. Build your toolbox

Install the toolchain once and keep it on $PATH. Full install commands are in the Toolbox Reference; the essentials:

# ProjectDiscovery suite (Go)
go install github.com/projectdiscovery/subfinder/v2/cmd/subfinder@latest
go install github.com/projectdiscovery/dnsx/cmd/dnsx@latest
go install github.com/projectdiscovery/naabu/v2/cmd/naabu@latest
go install github.com/projectdiscovery/httpx/cmd/httpx@latest
go install github.com/projectdiscovery/nuclei/v3/cmd/nuclei@latest

# Content discovery & fuzzing
go install github.com/ffuf/ffuf/v2@latest
# feroxbuster, gobuster — package manager or release binaries

# Classics
sudo apt install -y nmap amass exploitdb jq miller   # searchsploit ships with exploitdb

# Wordlists
git clone https://github.com/danielmiessler/SecLists /opt/SecLists

Let nmap run unprivileged

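Most useful nmap scans need raw sockets. Rather than sudo on every run, grant the binary the capabilities once, then run nmap with --privileged (or with NMAP_PRIVILEGED=1 exported) so it actually uses them, since nmap does not detect capabilities on its own: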

sudo setcap cap_net_raw,cap_net_admin,cap_net_bind_service+eip "$(command -v nmap)"

With setup done, move on to Discovery to turn your seed scope into a full asset inventory.

2 - Discovery

Turn a handful of in-scope roots into a complete, validated inventory of domains and live hosts.

Discovery is where you expand a short scope list into the real attack surface. A client hands you example.com and three CIDRs; by the end of discovery you want every subdomain, every resolving host, and every live IP that belongs to them.

The pipeline is a funnel — each stage produces input for the next, and every candidate gets filtered against your scope blacklist before it moves forward:

seed roots ─▶ subdomain enumeration ─▶ DNS resolution ─▶ host discovery ─▶ live hosts
     │              (passive+active)        (which resolve)   (which are up)
     └──◀── certificate transparency & reverse DNS feed new roots back in ◀──┘

The stages

  1. Root domain recon — WHOIS, DNS records, org footprint.
  2. Subdomain enumeration — passive + active + brute.
  3. DNS resolution — which of those names actually resolve, and to what.
  4. Certificate & SSL harvesting — pull more names out of TLS certs.
  5. Host discovery — which IPs are actually alive.

Keep looping: cert harvesting and reverse DNS routinely turn up new root domains. Add the in-scope ones back to scope-domains.txt and re-run enumeration. Discovery is “done” when a full loop produces nothing new.
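The "nothing new" check can be made mechanical. The sketch below is one way to do it, assuming you snapshot the scope file before each loop; `discovery_converged` and the tmp/before.txt snapshot are hypothetical names I'm using for illustration.

```shell
# discovery_converged BEFORE AFTER
# Succeeds (exit 0) when AFTER contains no line that is missing from BEFORE,
# i.e. the last discovery loop added nothing. Line order does not matter.
discovery_converged() {
  ! grep -Fvxq -f "$1" "$2"
}

# Usage:
#   cp scope-domains.txt tmp/before.txt
#   ... run enumeration, resolution, cert harvesting ...
#   discovery_converged tmp/before.txt scope-domains.txt && echo "loop is dry"
```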

Root domain recon

Before enumerating subdomains, fingerprint each root. It’s cheap, passive, and orients everything that follows.

while read -r domain; do
  mkdir -p "recon/domains/$domain"
  whois "$domain"        | tee "recon/domains/$domain/whois.txt"
  for rec in A MX NS TXT SOA; do
    dig +noall +answer "$domain" "$rec"
  done                   | tee "recon/domains/$domain/dig.txt"
  # DMARC / SPF often leak infra and partner domains
  dig +short TXT "_dmarc.$domain"  | tee "recon/domains/$domain/dmarc.txt"
done < scope-domains.txt

What I’m looking for in the output:

  • Registrant org / email in WHOIS — pivot to find sibling domains.
  • NS / MX — who hosts DNS and mail; hints at cloud vs. on-prem.
  • SPF / DMARC TXT records — they frequently list partner and infra domains worth investigating (and adding to scope if they belong to the client).

If the client owns IP space, look up their ASN (whois -h whois.radb.net <ip> or bgp.he.net) and pull the announced prefixes. amass intel -asn <ASN> automates this. The CIDRs you recover become new entries in scope-ips.txt.

Continue to Subdomain Enumeration.

2.1 - Subdomain Enumeration

Passive, active, and brute-force discovery of subdomains for each in-scope root.

For each in-scope root domain you want every subdomain you can find. There are three techniques and they don’t fully overlap — I run all three, because each one finds names the others miss.

| Technique | Source | Finds |
|---|---|---|
| Passive | OSINT APIs, search engines, cert logs | Known/indexed names, zero target traffic |
| Active | DNS queries against the target’s resolvers | Names that resolve but aren’t indexed |
| Brute force | Wordlist against a resolver | Predictable names (dev, vpn, staging) |

Passive: subfinder + amass

I usually start with subfinder (ProjectDiscovery) for passive enum — it’s fast. amass pulls from a different (overlapping) set of sources, so I run both and merge the results.

# subfinder against every root at once
subfinder -dL scope-domains.txt -all -silent \
  | tee recon/domains/subfinder.txt

# amass passive enum, per root
while read -r domain; do
  amass enum -passive -d "$domain" -o "recon/domains/$domain/amass-passive.txt"
done < scope-domains.txt

You’ll get a lot more out of passive sources by adding API keys (Censys, SecurityTrails, Shodan, VirusTotal, GitHub, etc.) to ~/.config/subfinder/provider-config.yaml and ~/.config/amass/config.ini. Keyed sources roughly double the yield.

Active: amass

Active enumeration resolves and validates names against the target’s own DNS, catching wildcards and names that exist but aren’t in any OSINT feed.

while read -r domain; do
  amass enum -active -d "$domain" -o "recon/domains/$domain/amass-active.txt"
done < scope-domains.txt

Brute force: predictable names

Brute forcing throws a wordlist of common labels at the domain. I brute with a dedicated resolver tool (puredns or dnsx, covered on the DNS Resolution page) rather than amass’s built-in brute, because you control the rate and the resolver quality:

# Generate candidates from a wordlist, then resolve them
dnsx -d example.com -w /opt/SecLists/Discovery/DNS/subdomains-top1million-110000.txt \
     -silent -o recon/domains/example.com/brute.txt

amass enum -brute -d example.com does the same thing in one shot if you’d rather keep it simple.

Certificate transparency (crt.sh)

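Public CT logs are one of the best passive sources: nearly every publicly trusted TLS cert issued for a domain is logged along with the names on it. Certs from private CAs and self-signed certs never show up there. Query crt.sh directly: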

curl -s "https://crt.sh/?q=%25.example.com&output=json" \
  | jq -r '.[].name_value' \
  | sed 's/^\*\.//' | tr 'A-Z' 'a-z' | sort -u \
  | tee recon/domains/example.com/crtsh.txt

Merge, validate, and scope

Combine every source, strip the noise, and filter against your blacklist. This is the step that keeps you in scope:

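# List the source files explicitly so the pipeline never reads its own output
cat recon/domains/example.com/{amass-passive,amass-active,brute,crtsh}.txt \
    recon/domains/subfinder.txt 2>/dev/null \
  | sed 's/^\*\.//;s/\.$//' | tr 'A-Z' 'a-z' \
  | grep -E '^[a-z0-9_.-]+$' \
  | grep -E '\.example\.com$' \
  | grep -vEf blacklist.txt \
  | sort -u \
  | tee recon/domains/example.com/subdomains.txt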

anew is handy here — it appends only new lines to a file and prints them, so you can see what each run adds:

subfinder -d example.com -silent | anew recon/domains/example.com/subdomains.txt
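
If anew isn't installed on a box, its core behavior is easy to approximate in plain shell. This is a rough stand-in I'd only use for small lists (it greps the file once per input line), and `append_new` is a made-up name, not part of anew itself.

```shell
# append_new FILE
# Reads lines on stdin; appends to FILE only the lines it does not already
# contain, and prints each newly added line to stdout.
append_new() {
  touch "$1"
  while IFS= read -r line; do
    if ! grep -Fxq -- "$line" "$1"; then
      printf '%s\n' "$line" >> "$1"
      printf '%s\n' "$line"
    fi
  done
}

# Usage:
#   subfinder -d example.com -silent | append_new recon/domains/example.com/subdomains.txt
```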

The output subdomains.txt is the input to DNS Resolution, where you find out which of these names are actually live.

2.2 - DNS Resolution

Resolve enumerated names to IPs at scale, separate live from dead, and turn resolved addresses into IP scope.

Enumeration gives you a list of candidate names. Next you need to know which ones actually resolve, what they resolve to, and which IPs that adds to scope. It’s a mass-resolution problem: you can easily end up with tens of thousands of candidate names.

Mass resolution with dnsx

Arsenic originally used fast-resolv for this. These days I use dnsx (ProjectDiscovery) — it resolves huge lists quickly against a pool of resolvers and it’s actively maintained.

# Resolve every discovered subdomain; keep only those that answer, with their A records
dnsx -l recon/domains/example.com/subdomains.txt \
     -a -resp \
     -silent \
     -o recon/domains/example.com/resolved.txt

Use a curated resolver list to avoid poisoned or rate-limited public resolvers — dnsvalidator builds one:

dnsvalidator -tL https://public-dns.info/nameservers.txt -threads 100 -o resolvers.txt
dnsx -l subdomains.txt -r resolvers.txt -a -resp -silent -o resolved.txt

Watch out for wildcard DNS. Some domains resolve everything to one IP (*.example.com → 203.0.113.9). dnsx has -wd example.com for wildcard filtering; puredns handles it automatically. Without it, your “resolved” list is mostly garbage.
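A quick sanity check for wildcard noise is to count how many names landed on each IP. The sketch below assumes dnsx-style output lines where the address is the last bracketed field (for example "name [A] [203.0.113.9]"); the `names_per_ip` helper and its default threshold of 10 are my own choices. A high count is only a hint, since a busy shared web host looks the same.

```shell
# names_per_ip RESOLVED_FILE [MIN]
# Prints "count ip" for every IP that at least MIN names resolve to
# (default 10), busiest first.
names_per_ip() {
  awk -v min="${2:-10}" '
    NF >= 2 {
      ip = $NF
      gsub(/\[|\]/, "", ip)   # strip the surrounding brackets
      count[ip]++
    }
    END {
      for (ip in count)
        if (count[ip] >= min) print count[ip], ip
    }
  ' "$1" | sort -rn
}

# Usage:
#   names_per_ip recon/domains/example.com/resolved.txt 50
```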

Extract IPs into scope

Every resolved address that falls inside your authorized ranges becomes part of the IP scope for the recon phase:

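mkdir -p recon/ips

# Pull the unique IPs out of the resolved output (-h keeps file names out of the matches)
grep -ohE '\[([0-9]{1,3}\.){3}[0-9]{1,3}\]' recon/domains/*/resolved.txt \
  | tr -d '[]' | sort -u \
  | tee recon/ips/from-domains.txt

# Expand the authorized ranges (list scan, no packets sent), keep only resolved IPs inside them
nmap -sL -n -iL scope-ips.txt | awk '/report for/{print $NF}' | sort -u > recon/ips/authorized.txt
grep -Fxf recon/ips/authorized.txt recon/ips/from-domains.txt \
  | cat scope-ips.txt - | sort -u > recon/ips/scope-combined.txt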

Reverse DNS (the other direction)

You also want to resolve IPs back to names — PTR records often reveal hostnames (and therefore new domains) you’d never have guessed:

dnsx -l recon/ips/scope-combined.txt -ptr -resp-only -silent \
  | tr 'A-Z' 'a-z' | sort -u \
  | grep -vEf blacklist.txt \
  | tee recon/ips/ptr-names.txt

Any in-scope root domains that show up here go back into scope-domains.txt, and you re-run enumeration. That’s the discovery loop closing on itself.

Next: Certificate & SSL Harvesting for one more rich source of hostnames, then Host Discovery to find which IPs are alive.

2.3 - Certificate & SSL Harvesting

Pull hostnames out of live TLS certificates to find assets nothing else surfaces.

TLS certificates are full of hostnames. A cert’s Common Name (CN) and Subject Alternative Names (SANs) list every name the operator put on it — including internal names, dev hosts, and sibling domains that never show up in DNS enumeration or OSINT.

There are two angles: passively reading certificate transparency logs (covered on the Subdomain Enumeration page) and actively grabbing certs off live hosts. This page is the active side. It’s worth doing because it catches certs that were never logged to CT and certs served directly on IPs with no DNS name at all.

Harvest certs from hosts and IPs

Run an nmap service scan against the TLS ports and let the ssl-cert script dump the certificate details, then parse out the names. This is what Arsenic’s as-domains-from-*-ssl-certs scripts do:

# Scan TLS ports on your resolved hosts (and on bare IPs)
nmap -p 443,8443,993,995,8080,8843 -sV -sC --open \
     -iL recon/ips/scope-combined.txt \
     -oA recon/ips/nmap-tls-check

# Extract CN + SAN entries from the nmap output
{
  grep -ohP 'commonName=\K[^/\s]+'              recon/ips/nmap-tls-check.nmap
  grep -ohP 'Subject Alternative Name: DNS:\K.+' recon/ips/nmap-tls-check.nmap \
    | sed 's/ DNS://g; s/,/\n/g'
} \
  | sed 's/^\*\.//' | tr 'A-Z' 'a-z' \
  | grep '\.' \
  | grep -vEf blacklist.txt \
  | sort -u \
  | tee recon/ips/ssl-cert-domains.txt

One-liner with httpx

httpx can grab and parse certs in one pass — faster than nmap when you only care about the names:

httpx -l recon/ips/scope-combined.txt \
      -p 443,8443,8080,8843 \
      -tls-grab -json -silent \
  | jq -r '.tls.subject_an[]?, .tls.subject_cn?' \
  | sed 's/^\*\.//' | tr 'A-Z' 'a-z' | sort -u \
  | grep -vEf blacklist.txt \
  | tee recon/ips/ssl-cert-domains.txt

Feed it back into scope

In-scope names that came out of certs are new subdomains/roots:

grep -E '\.(example\.com|example\.net)$' recon/ips/ssl-cert-domains.txt \
  | anew scope-domains-generated.txt

Then loop back to DNS Resolution to resolve the new names. Once a full discovery loop yields nothing new, move on to Host Discovery.

2.4 - Host Discovery

Find which IPs in scope are actually alive before you spend time on full port scans.

You may have hundreds or thousands of IPs in scope, especially after expanding CIDRs. Full port scanning all of them is wasteful — most won’t be up. Host discovery is a fast first pass to find the live ones, so the expensive recon phase only targets hosts that exist.

Expand CIDRs to addresses

First, turn any CIDR ranges into individual addresses so you can scan and track them per-host. nmap -sL (“list scan”) expands ranges without sending a single packet:

# IPv4
nmap -sL -n -iL recon/ips/scope-combined.txt \
  | awk '/report for/{print $NF}' \
  | sort -u > recon/ips/expanded-ipv4.txt

# IPv6 (if in scope)
nmap -6 -sL -n -iL recon/ips/scope-combined.txt \
  | awk '/report for/{print $NF}' \
  | sort -u > recon/ips/expanded-ipv6.txt

Smart ping sweep with nmap

A plain ICMP ping sweep misses hosts that block ICMP, which is most hardened hosts. The trick Arsenic uses is to probe the most popular ports for liveness on top of ICMP, so a host that drops ping but answers on tcp/443 still shows up.

Build the popular-port lists straight from nmap’s own frequency data:

TOP=30   # top-N most common ports
TCP=$(sort -r -k3 /usr/share/nmap/nmap-services | awk '/\/tcp/{print $2}' \
      | cut -d/ -f1 | head -n $TOP | paste -sd,)
UDP=$(sort -r -k3 /usr/share/nmap/nmap-services | awk '/\/udp/{print $2}' \
      | cut -d/ -f1 | head -n $TOP | paste -sd,)

Then sweep with multiple probe types — ICMP echo + timestamp, TCP ACK/SYN to the popular ports, and UDP to its popular ports:

sudo nmap -sn -n \
  -PE -PP \
  -PA"$TCP" -PS"$TCP" -PU"$UDP" \
  --randomize-hosts --scan-delay 50ms \
  -T4 \
  -iL recon/ips/expanded-ipv4.txt \
  -oA recon/ips/host-discovery-ipv4

# Extract the live hosts
awk '/Up$/{print $2}' recon/ips/host-discovery-ipv4.gnmap \
  | sort -u > recon/ips/alive.txt

What the flags do:

  • -sn — host discovery only, no port scan.
  • -PE -PP — ICMP echo + timestamp requests.
  • -PA<ports> / -PS<ports> — TCP ACK / SYN probes to popular ports. SYN probes reach hosts that drop ICMP but expose a service; ACK probes can slip past stateless filters that only block incoming SYNs.
  • -PU<ports> — UDP probes.
  • --randomize-hosts / --scan-delay — a little quieter and gentler.
  • -T4 — timing; drop to -T3 or lower for fragile/monitored networks.

Faster alternative: naabu

naabu can do liveness + a fast port pass in one step, and it’s nice for large ranges:

naabu -l recon/ips/expanded-ipv4.txt -top-ports 100 -silent \
  | cut -d: -f1 | sort -u | tee recon/ips/alive.txt

Resolve names ↔ live IPs

Cross-reference your resolved domains with the live IP list so you know which hostnames sit on which live host. One IP often serves many vhosts — you want to scan the IP once but remember every name pointing at it (it matters for HTTP vhost routing in recon).
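Here is a sketch of that cross-reference, assuming dnsx-style resolved lines with the address as the last bracketed field and one IP per line in the alive list. The `live_name_map` and `write_hostnames` helpers are hypothetical names; the hosts/<ip>/recon/hostnames.txt target follows the per-host layout used elsewhere in this handbook.

```shell
# live_name_map RESOLVED_FILE ALIVE_FILE
# Prints "ip hostname" for every resolved name whose IP is in the alive list.
live_name_map() {
  awk '
    FILENAME == ARGV[1] { alive[$1] = 1; next }   # first file: live IPs
    NF >= 2 {
      ip = $NF
      gsub(/\[|\]/, "", ip)                        # strip the brackets
      if (ip in alive) print ip, $1
    }
  ' "$2" "$1" | sort -u
}

# write_hostnames BASE_DIR
# Reads "ip hostname" pairs on stdin and appends each name to
# BASE_DIR/<ip>/recon/hostnames.txt.
write_hostnames() {
  while read -r ip name; do
    mkdir -p "$1/$ip/recon"
    printf '%s\n' "$name" >> "$1/$ip/recon/hostnames.txt"
  done
}

# Usage:
#   live_name_map recon/domains/example.com/resolved.txt recon/ips/alive.txt \
#     | write_hostnames hosts
```

Because write_hostnames appends, re-running it duplicates entries; run sort -u over each hostnames.txt afterwards if you loop through discovery more than once.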

The output of this phase — recon/ips/alive.txt plus the per-host name mapping — is the target list for Recon.

3 - Recon

Map every live host’s services, web surface, and content into a per-host picture you can attack.

Discovery told you what exists. Recon tells you what’s running on it. For each live host you build a profile: open ports, service versions, web apps, screenshots, and discovered content. This is the raw material the hunting phase mines for vulnerabilities.

Organize per host

Recon output is per-host. The convention (Arsenic’s hosts/ layout) keeps each host’s data isolated, so you can hand a teammate one host folder and they have everything:

hosts/
└── 203.0.113.10/
    └── recon/
        ├── nmap-quick-tcp.{nmap,gnmap,xml}   # full TCP port sweep
        ├── nmap-tcp.{nmap,gnmap,xml}         # version/script scan of open ports
        ├── nmap-udp.{nmap,gnmap,xml}
        ├── httpx.txt                          # live web services
        ├── ffuf.*.json                        # content discovery
        └── hostnames.txt                      # vhosts pointing at this IP

The recon pipeline

For each live host, in order:

  1. Port scanning — find every open TCP (and key UDP) port.
  2. Service enumeration — version + default-script scan the open ports to identify what’s listening.
  3. HTTP probing & screenshots — find web services across all hosts/ports and eyeball them fast.
  4. Content discovery — fuzz web roots for hidden paths, files, and endpoints.

Scan once per IP, remember every name

A single IP frequently hosts many domains (name-based virtual hosting). Port scan the IP so you don’t scan the same box ten times — but carry the list of hostnames forward, because the web server may serve completely different apps depending on the Host: header. The HTTP probing step is where vhosts matter most.

A note on pacing

Recon is the loudest phase so far — full port scans and fuzzing are unmistakable. Respect the rate limits from your rules of engagement: tune nmap -T, ffuf -rate, and run during permitted windows. When in doubt, go slower. A knocked-over production service is a bad look and a worse phone call.

Start with Port Scanning.

3.1 - Port Scanning

Find every open port on each live host — fast, then thorough — without scanning the same box twice.

The goal is the complete set of open ports on each host. The pattern that balances speed and completeness is two passes: a fast full-range sweep to find which ports are open, then a detailed scan of only those ports (covered in Service Enumeration).

Pass 1: fast full-range TCP sweep

Scan all 65,535 TCP ports quickly to find what’s open. I reach for one of two tools here.

nmap (the classic)

host=203.0.113.10
mkdir -p "hosts/$host/recon"

sudo nmap -p- --open -Pn -n \
  --min-rate 1500 --max-retries 1 \
  -T4 \
  "$host" \
  -oA "hosts/$host/recon/nmap-quick-tcp"

# Pull the open ports into a comma list for pass 2
ports=$(grep -oE '[0-9]+/open/tcp' "hosts/$host/recon/nmap-quick-tcp.gnmap" \
        | cut -d/ -f1 | sort -un | paste -sd, -)

  • -p- — all 65,535 ports.
  • --open — only report open ports.
  • -Pn — skip host discovery (you already know it’s up).
  • -n — no DNS resolution (faster, quieter).
  • --min-rate / --max-retries — speed knobs; raise min-rate on robust networks, lower it on fragile ones.

naabu (faster for many hosts)

naabu uses a SYN scan and is noticeably quicker across large host lists. Pipe its results straight into nmap for versioning:

naabu -host "$host" -p - -silent -o "hosts/$host/recon/naabu-tcp.txt"
ports=$(cut -d: -f2 "hosts/$host/recon/naabu-tcp.txt" | paste -sd,)

masscan is an option for very large IP ranges (its authors claim it can sweep the whole IPv4 internet for a single port in a few minutes, given enough bandwidth), but it trades accuracy for speed and needs careful rate limiting. For typical engagement-sized scope, nmap --min-rate or naabu is plenty.

Incremental / batched scanning for large scope

When you have many hosts, scanning every port on every host serially takes forever. Arsenic batches this: scan the most popular ports across all hosts first (you get fast, high-value coverage), then work through the remaining port ranges in batches. The idea is to surface the interesting services early instead of waiting for a full sweep of one host before starting the next.

A simple version — popular ports across everything first:

TOP_PORTS=$(sort -r -k3 /usr/share/nmap/nmap-services | awk '/\/tcp/{print $2}' \
            | cut -d/ -f1 | head -n 1000 | paste -sd,)

sudo nmap -sS -p"$TOP_PORTS" --open -Pn -n -T4 \
  --min-hostgroup 255 --max-retries 1 \
  -iL recon/ips/alive.txt \
  -oA recon/nmap-popular-tcp

Then schedule the full -p- sweep per host as time allows.
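For the "remaining ranges in batches" part, a small helper can generate the port ranges so each batch is one bounded nmap run. This is a sketch of the idea, not how Arsenic implements it; `port_batches` and the batch size are placeholders of mine.

```shell
# port_batches SIZE
# Prints consecutive port ranges of SIZE ports covering 1-65535, one per line.
port_batches() {
  size=$1
  start=1
  while [ "$start" -le 65535 ]; do
    end=$((start + size - 1))
    if [ "$end" -gt 65535 ]; then end=65535; fi
    echo "$start-$end"
    start=$((end + 1))
  done
}

# Usage: sweep every live host one 10,000-port slice at a time
#   for range in $(port_batches 10000); do
#     sudo nmap -sS -p"$range" --open -Pn -n -T4 --max-retries 1 \
#       -iL recon/ips/alive.txt -oA "recon/nmap-batch-$range"
#   done
```

Smaller batches give you results (and commit points) sooner at the cost of more nmap start-up overhead.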

UDP — don’t skip it entirely

UDP scanning is slow, but skipping it misses SNMP, DNS, NTP, IKE, TFTP, NetBIOS and other juicy services. Scan the top UDP ports rather than all of them:

sudo nmap -sU --top-ports 100 --open -Pn -n -T4 \
  "$host" -oA "hosts/$host/recon/nmap-udp"

With the open-port list in hand, move to Service Enumeration to find out what’s actually listening.

3.2 - Service Enumeration

Identify the exact software and version behind every open port — the input every later step depends on.

Knowing port 8080 is open tells you little. Knowing it’s Apache Tomcat 9.0.30 tells you what default paths to check, what CVEs apply, and what credentials to try. Service enumeration turns open ports into identified, versioned services.

Version + default-script scan

Run nmap against only the ports you found open in port scanning, with version detection and the default safe scripts. This is the deep, accurate scan, so let it take its time:

host=203.0.113.10
ports=$(grep -oE '[0-9]+/open/tcp' "hosts/$host/recon/nmap-quick-tcp.gnmap" \
        | cut -d/ -f1 | sort -un | paste -sd, -)

sudo nmap -p"$ports" -sV -sC -A -Pn -n \
  --host-timeout 30m \
  "$host" \
  -oA "hosts/$host/recon/nmap-tcp"

What the flags do:

  • -sV — probe for service/version.
  • -sC — run the default NSE script set (HTTP titles, TLS cert details, SSH host keys, and similar common checks).
  • -A — aggressive: shorthand for -O -sV -sC --traceroute, so on top of the other two flags it adds OS detection and traceroute. Drop it if you need to be quieter; -sV -sC alone is the high-signal core.
  • --host-timeout — don’t let one stubborn host stall the whole run.

The two output formats you’ll use constantly:

  • .nmap — human-readable; read it.
  • .xml — machine-readable; feed it to searchsploit, reporting tools, and importers.

What to extract

Walk every host’s .nmap output and pull out:

  • Service + version for each port → drives vulnerability hunting.
  • HTTP/HTTPS services (including odd ports like 8000, 8443, 3000) → feed to HTTP probing.
  • TLS cert names → may surface new vhosts/domains (loop back to discovery).
  • Anonymous/guessable access flagged by NSE scripts (FTP anon login, open SMB shares, exposed RPC).
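
Pulling the HTTP/HTTPS services out of a host's .nmap file can be scripted. The sketch below assumes the normal-format port table ("8443/tcp open ssl/http ...") and treats any service name containing "http" as web; `http_targets` is a hypothetical helper, and the substring match will miss web apps that nmap could not identify.

```shell
# http_targets HOST NMAP_FILE
# Prints host:port for each open TCP port whose nmap service name mentions
# http (http, ssl/http, http-proxy, https-alt, ...).
http_targets() {
  awk -v host="$1" '
    /^[0-9]+\/tcp[ \t]+open/ && $3 ~ /http/ {
      split($1, p, "/")       # "8443/tcp" -> p[1] = "8443"
      print host ":" p[1]
    }
  ' "$2"
}

# Usage:
#   http_targets 203.0.113.10 hosts/203.0.113.10/recon/nmap-tcp.nmap
```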

Targeted NSE for interesting services

When -sC flags something, follow up with service-specific scripts. A few I reach for a lot:

# SMB — shares, users, known vulns
nmap -p139,445 --script "smb-enum-shares,smb-enum-users,smb-vuln-*" "$host"

# HTTP — titles, methods, common files
nmap -p80,443 --script "http-title,http-methods,http-headers,http-enum" "$host"

# SSL/TLS — protocols, ciphers, weaknesses
nmap -p443 --script "ssl-enum-ciphers,ssl-cert" "$host"

Quick service inventory across all hosts

To get a one-line-per-service overview for triage:

grep -hP '^\d+/(tcp|udp)\s+open' hosts/*/recon/nmap-*.nmap \
  | awk '{print $1, $3, $4, $5, $6, $7}' \
  | sort | uniq -c | sort -rn

This tells you at a glance “we have 40 web servers, 12 SSH, 6 RDP, 3 Tomcat” — which decides where to spend the hunting phase.

Next: get eyes on the web surface with HTTP Probing & Screenshots.

3.3 - HTTP Probing & Screenshots

Find every live web service across all hosts and ports, then screenshot them to triage the web surface at a glance.

Web is where most findings live. After port scanning you have a pile of open ports that might be HTTP; this step confirms which ones actually serve web content — on which scheme and port, with what title and technology — then screenshots them so you can eyeball hundreds of apps in minutes.

Probe with httpx

I use httpx (ProjectDiscovery) for this. Feed it every host and every web-ish port; it works out http vs https, follows redirects, and reports a bunch of metadata.

# recon/web-candidates.txt: every hostname + IP you care about, one per line
# (httpx tries each entry on the ports you specify)
httpx -l recon/web-candidates.txt \
  -p 80,443,8000,8001,8080,8443,3000,8843,9000 \
  -title -status-code -tech-detect -web-server -content-length \
  -follow-redirects \
  -json -o recon/httpx.json -silent

# Plain list of live URLs for the next steps
jq -r '.url' recon/httpx.json | sort -u | tee recon/live-urls.txt

The flags that matter:

  • -tech-detect — Wappalyzer-style fingerprinting (CMS, framework, server). Really useful for the hunting phase.
  • -title -status-code -web-server — fast triage columns.
  • -follow-redirects — catches apps that bounce http→https or to a login.

Virtual hosts matter here. The same IP can serve different apps per Host: header, so probe by hostname, not just by IP, or you’ll miss name-based vhosts. If you have many names on one IP, httpx handles the list; just make sure the hostnames (not only IPs) are in your candidate file.
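One way to assemble that candidate file is to merge the live IP list with every per-host hostnames.txt. This is a sketch under the handbook's directory layout; `web_candidates` is a name I made up, and it assumes hostnames.txt files exist under the hosts/ tree.

```shell
# web_candidates ALIVE_FILE HOSTS_DIR
# Prints the unique, lowercased union of the live IPs and every name found in
# HOSTS_DIR/*/recon/hostnames.txt.
web_candidates() {
  {
    cat "$1"
    find "$2" -type f -name hostnames.txt -exec cat {} +
  } 2>/dev/null | tr 'A-Z' 'a-z' | grep -v '^$' | sort -u
}

# Usage:
#   web_candidates recon/ips/alive.txt hosts > recon/web-candidates.txt
```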

Screenshot the web surface

Eyeballing screenshots is the fastest way to spot login panels, default install pages, admin consoles, and abandoned apps across a large estate.

Arsenic originally used aquatone for this. Aquatone is archived now, so I use one of these instead.

Option A — gowitness (what I use)

gowitness scan file -f recon/live-urls.txt \
  --screenshot-path report/static/screenshots \
  --write-db   # SQLite report you can browse
gowitness report server   # browse at http://localhost:7171

Option B — httpx built-in screenshots

If you’d rather not add a tool, httpx can screenshot during the probe:

httpx -l recon/web-candidates.txt -p 80,443,8080,8443 \
  -screenshot -srd report/static/screenshots -silent

Option C — aquatone (still works)

If you’re maintaining an existing aquatone-based flow:

cat recon/live-urls.txt \
  | aquatone -ports 80,443,3000,8000,8001,8080,8443 \
             -out report/static/aquatone

Open the report and bucket what you see:

  • Login panels → credential testing, default creds, auth bypass.
  • Default/install pages → unconfigured apps, often exploitable.
  • Admin consoles (Tomcat Manager, Jenkins, phpMyAdmin, Grafana) → high-value targets; check default creds immediately.
  • Errors / stack traces → version disclosure, debug endpoints.
  • Parked / blank → deprioritize.

Promising apps go to Content Discovery for deeper fuzzing, and the whole live-URL list feeds vulnerability hunting.

3.4 - Content Discovery

Fuzz web roots for hidden directories, files, and endpoints the app doesn’t link to.

Apps expose far more than their navigation shows: /admin, /.git/, /backup.zip, /api/v1, /.env, old /test.php files. Content discovery brute-forces paths against a wordlist to find them. Run it against every live web service from HTTP probing.

Pick a fuzzer

Arsenic supports gobuster, dirb, and ffuf, defaulting to ffuf. The two I actually use:

  • ffuf — fast, flexible, good filtering; my default.
  • feroxbuster — recursive by default, nice for deep trees.

Wordlists

SecLists is where I pull wordlists from. A solid general-purpose stack (this mirrors Arsenic’s default web-content set):

Discovery/Web-Content/common.txt
Discovery/Web-Content/raft-medium-words.txt
Discovery/Web-Content/raft-large-directories.txt
Discovery/Web-Content/quickhits.txt
Discovery/Web-Content/RobotsDisallowed-Top1000.txt

Build a combined, de-duplicated list once:

cat /opt/SecLists/Discovery/Web-Content/{common,raft-medium-words,quickhits}.txt \
  | sort -u > recon/wordlist-web-content.txt

Tailor it to the tech you fingerprinted: a Tomcat box gets tomcat.txt, a Jenkins box gets Jenkins-Hudson.txt, and so on.
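A minimal sketch of that tailoring step, assuming the SecLists checkout used above. The function name build_wordlist and the tech-to-file mapping are illustrative, not part of Arsenic or SecLists:

```shell
# Hypothetical helper: build a de-duplicated wordlist for one target.
# Usage: build_wordlist <seclists-web-content-dir> [<tech> ...]
# The tech-to-file mapping is illustrative; extend it to match your fingerprints.
build_wordlist() {
  dir=$1; shift
  {
    # Base stack: the same three lists as the combined list above
    cat "$dir/common.txt" "$dir/raft-medium-words.txt" "$dir/quickhits.txt"
    for tech in "$@"; do
      case $tech in
        tomcat)  extra="$dir/tomcat.txt" ;;
        jenkins) extra="$dir/Jenkins-Hudson.txt" ;;
        *)       extra="" ;;
      esac
      # Skip techs with no mapping and lists missing from this checkout
      if [ -n "$extra" ] && [ -f "$extra" ]; then cat "$extra"; fi
    done
  } | sort -u
}
```

For a Tomcat host that might look like: build_wordlist /opt/SecLists/Discovery/Web-Content tomcat > "hosts/$host/recon/wordlist.txt"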

Run ffuf

url="https://app.example.com"
host=app.example.com
mkdir -p "hosts/$host/recon"

ffuf -u "$url/FUZZ" \
     -w recon/wordlist-web-content.txt \
     -ac \
     -mc all -fc 404 \
     -recursion -recursion-depth 2 \
     -H "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0" \
     -of json -o "hosts/$host/recon/ffuf.json"

What the flags do (these match the Arsenic as-ffuf defaults):

  • -ac (auto-calibration): ffuf learns the “not found” response shape and filters it automatically. You need this, or you drown in false positives.
  • -mc all -fc 404 — match everything, filter out 404s. Lets you see 401/403 (exists but protected) and 500 (something broke = interesting).
  • -recursion -recursion-depth 2 — dig into discovered directories.
  • -e .php,.bak,.zip,.txt — add extension fuzzing when you know the stack.

Tune signal, not noise

Auto-calibration handles most of the junk, but apps that return 200 for everything need manual filtering. Inspect the size/word/line distribution and filter the dominant bucket:

# How many results per status code?
jq '.results[].status' hosts/app.example.com/recon/ffuf.json | sort | uniq -c

# Filter by response size if a wildcard 200 is flooding results
ffuf -u "$url/FUZZ" -w recon/wordlist-web-content.txt -fs 1234   # filter that exact size

Arsenic’s as-prune-ffuf does exactly this after the fact — trimming the dominant status/size bucket out of a bloated results file so what’s left is signal.
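The pruning idea can be sketched in a few lines of awk. This is not Arsenic's implementation, only an illustration: prune_dominant is a hypothetical name, the tab-separated input shape is my assumption, and so is the "more than half of all rows" threshold.

```shell
# Hypothetical helper: drop the most common status/size pair from ffuf results.
# Input (stdin): one result per line as "status<TAB>size<TAB>url", for example
#   jq -r '.results[] | [.status, .length, .url] | @tsv' ffuf.json
# (field names assumed from ffuf's JSON output).
# Only prunes when that pair covers more than half of all rows.
prune_dominant() {
  awk -F '\t' '
    { line[NR] = $0; key[NR] = $1 FS $2; count[key[NR]]++ }
    END {
      top = ""; max = 0
      for (k in count) if (count[k] > max) { max = count[k]; top = k }
      for (i = 1; i <= NR; i++)
        if (max * 2 <= NR || key[i] != top) print line[i]
    }'
}
```

Keeping the threshold conservative means a small, evenly spread result set passes through untouched, and only a genuine wildcard flood gets trimmed.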

What to chase

From the results, prioritize:

  • Auth panels & admin paths (/admin, /manager, /wp-admin).
  • Source/secrets leakage (/.git/, /.env, /config.php.bak, /backup/).
  • APIs (/api, /swagger, /graphql) — often under-protected.
  • Anything 403 — it exists and someone tried to hide it.
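A rough way to apply those priorities to a results file, as a sketch. The function name tag_results, the "status<TAB>url" input shape, and the exact patterns are assumptions drawn from the list above; tune them per target.

```shell
# Hypothetical triage helper: tag each "status<TAB>url" line with a bucket.
# Patterns are illustrative and mirror the priority list; first match wins.
tag_results() {
  awk -F '\t' '
    {
      bucket = "other"
      if      ($2 ~ /\/(admin|manager|wp-admin)(\/|$)/)  bucket = "auth-admin"
      else if ($2 ~ /\/(\.git|\.env|backup)|\.bak$/)     bucket = "secrets"
      else if ($2 ~ /\/(api|swagger|graphql)(\/|$)/)     bucket = "api"
      else if ($1 == "403")                              bucket = "forbidden"
      print bucket "\t" $1 "\t" $2
    }'
}
```

Sorting the output on the first column groups each bucket together, which makes it easier to work through the high-value paths first.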

Discovered endpoints and the technologies you fingerprinted both feed the Vulnerability Hunting phase.

4 - Vulnerability Hunting

Turn your service and web inventory into a prioritized list of likely vulnerabilities, at scale.

By now you have a full inventory: services with versions, live web apps with fingerprinted technologies, and discovered content. Hunting is where you work through that inventory for known weaknesses — fast, broad, and automated first, then manual verification.

This is identification, not exploitation. The output is a list of probable findings, each of which you then confirm by hand (see Evidence & Reporting). Automated scanners produce false positives, so never report a finding you haven’t reproduced.

The hunting passes

  1. Automated scanning with nuclei — template-driven checks for CVEs, misconfigurations, exposures, and default credentials across every web service at once.
  2. Known exploits with searchsploit — map your nmap service versions to public exploits.
  3. Subdomain takeover — find dangling DNS records pointing at unclaimed cloud resources.

Order of operations

Run the cheap, broad passes first to triage, then go deep on what they flag:

nuclei (broad) ──┐
searchsploit ────┼──▶ triaged candidate findings ──▶ manual verification ──▶ report
takeover checks ─┘

Then pivot to manual, application-specific testing on the high-value targets the recon screenshots surfaced — login panels, admin consoles, APIs. Automation finds the easy 80%; the findings that actually matter usually come out of the manual 20%.

Stay in scope, stay polite

Hunting is the loudest phase — nuclei alone can fire thousands of requests per host. Re-read your rules of engagement: honor rate limits (nuclei -rl), keep intrusive templates off fragile production, and never run exploitation modules without explicit authorization.

Start with nuclei.

4.1 - Automated Scanning with Nuclei

Run template-based vulnerability checks across every web service to surface CVEs, misconfigurations, and exposures fast.

nuclei does most of the heavy lifting in the hunting phase. It runs a big community library of YAML templates — each one a precise check for a specific CVE, misconfiguration, default credential, or information exposure — against your targets. It’s fast, the false-positive rate is low (each template encodes a real matcher), and the template set gets updated constantly.

Setup

nuclei -update            # update the binary
nuclei -update-templates  # pull the latest template library

Run against your live URLs

Feed nuclei the live URL list from HTTP probing. Use a project file so re-runs don’t repeat work:

nuclei -l recon/live-urls.txt \
  -project -project-path .nuclei \
  -severity low,medium,high,critical \
  -o recon/nuclei-all.txt \
  -json-export recon/nuclei-all.json \
  -stats

Scan by template category

Arsenic splits nuclei into focused passes — technologies first (to fingerprint what’s there), then CVEs (to find what’s exploitable). Running targeted template groups is faster and easier to triage than one giant run:

# Fingerprint technologies / detections
nuclei -l recon/live-urls.txt -project -project-path .nuclei \
  -tags tech -o recon/nuclei-technologies.txt

# Known CVEs
nuclei -l recon/live-urls.txt -project -project-path .nuclei \
  -tags cve -severity high,critical -o recon/nuclei-cves.txt

# Exposures: panels, config files, backups, secrets
nuclei -l recon/live-urls.txt -project -project-path .nuclei \
  -t http/exposures/ -t http/exposed-panels/ -o recon/nuclei-exposures.txt

# Default credentials
nuclei -l recon/live-urls.txt -project -project-path .nuclei \
  -t http/default-logins/ -o recon/nuclei-default-logins.txt

Split results back to per-host

To keep findings with their host (Arsenic stores nuclei-cves.txt under each host’s recon/), partition the output by hostname:

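for dir in hosts/*/; do
  [ -d "$dir" ] || continue              # no host directories yet
  out="${dir}recon/nuclei-cves.txt"
  mkdir -p "${dir}recon"
  # -F treats the hostname as a literal string, not a regex
  grep -F "$(basename "$dir")" recon/nuclei-cves.txt > "$out"
  [ -s "$out" ] || rm -f "$out"          # drop empty result files
done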

Be a good guest

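  • -rl 150 — cap requests per second. 150 is nuclei’s default, so set it lower for fragile targets.
  • -c 25 — cap the number of templates run in parallel. 25 is also the default; lower it alongside -rl.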
  • -exclude-tags intrusive,dos,fuzz — skip templates that can damage or destabilize a target unless you’re explicitly cleared for them.
  • -proxy http://127.0.0.1:8080 — route through Burp to log and review every request.
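One way to make these flags hard to forget is a small wrapper. This is a sketch only: polite_nuclei, NUCLEI_RL, NUCLEI_C, and DRY_RUN are hypothetical names, and the values 50 and 10 are arbitrary placeholders that should come from your rules of engagement.

```shell
# Hypothetical wrapper: run nuclei with conservative settings by default.
# NUCLEI_RL and NUCLEI_C are illustrative environment variables, not nuclei settings.
# Set DRY_RUN=1 to print the command instead of running it.
polite_nuclei() {
  set -- nuclei -rl "${NUCLEI_RL:-50}" -c "${NUCLEI_C:-10}" \
         -exclude-tags intrusive,dos,fuzz "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

Running DRY_RUN=1 polite_nuclei -l recon/live-urls.txt prints the full command line, which is a cheap sanity check before anything touches a client host.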

Triage every hit

Nuclei is high-signal but it isn’t infallible. For each result:

  1. Read the matched template (nuclei -td -t <template> prints it; -tl only lists template paths).
  2. Reproduce the finding manually — curl, browser, or Burp.
  3. Only then promote it to a finding.
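To keep that discipline visible, the raw output can be turned into a checklist where nothing counts until someone has reproduced it. A sketch, assuming nuclei's default text line shape of "[template-id] [protocol] [severity] target"; triage_sheet is a hypothetical name.

```shell
# Hypothetical helper: turn nuclei text output into a triage sheet.
# Assumes the default line shape "[template-id] [protocol] [severity] target ...".
# Every row starts as "unverified"; change it by hand once you have reproduced it.
triage_sheet() {
  awk '
    $1 ~ /^\[/ && $2 ~ /^\[/ && $3 ~ /^\[/ {
      for (i = 1; i <= 3; i++) gsub(/^\[|\]$/, "", $i)   # strip the brackets
      print "unverified\t" $3 "\t" $1 "\t" $4
    }'
}
```

For example, triage_sheet < recon/nuclei-cves.txt > recon/nuclei-triage.tsv gives one tab-separated row per hit with status, severity, template ID, and target.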

The template severity is a starting point; the real severity depends on the asset and the business context. A “medium” exposure on an admin panel can be your highest-impact finding.

Next: map service versions to public exploits with searchsploit.

4.2 - Known Exploits with SearchSploit

Map the service versions you found to public exploits in Exploit-DB.

Your service enumeration produced versioned services — vsftpd 2.3.4, Apache Tomcat 8.5.32, OpenSSH 7.2. SearchSploit checks those versions against the offline Exploit-DB archive, so you can find public exploits without leaving your terminal.

Setup

searchsploit ships with the exploitdb package. Keep the database current:

sudo apt install exploitdb     # or: git clone exploitdb to /opt and symlink
searchsploit -u                # update the local exploit database

Feed it your nmap results directly

The part I like: searchsploit reads nmap XML and looks up every detected service automatically. Point it at the version/service scans (not the quick port sweeps). This is what Arsenic’s as-searchsploit does:

find hosts -name 'nmap-tcp.xml' | while read -r xml; do
  echo "[*] $xml"
  searchsploit --nmap "$xml" 2>/dev/null | tee "$xml.searchsploit.txt"
done

Manual lookups

For one-off checks:

searchsploit apache tomcat 8.5
searchsploit --cve 2021-44228          # search by CVE
searchsploit -x linux/remote/12345.c   # view an exploit
searchsploit -m linux/remote/12345.c   # copy it to the cwd to inspect/use

Verify before you trust

Public exploits are a lead, not a finding:

  • Match the version precisely. Exploits are version-specific; “close” often doesn’t fire, and a mismatched exploit can crash the service.
  • Read the code before running it. Exploit-DB hosts unvetted PoCs — some are broken, some are trojaned, some are destructive. Understand what it does first.
  • Confirm exploitability in your authorized scope. Running an RCE exploit is exploitation, not identification — make sure your rules of engagement permit it, and prefer a benign proof (version banner, safe PoC) where you can.
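For the version-matching step, a string comparison is easy to get wrong (8.5.9 sorts after 8.5.10 lexically). A small sketch using version sort; version_in_range is a hypothetical helper, it relies on GNU sort -V, and any range you feed it has to come from the advisory itself.

```shell
# Hypothetical helper: is version $1 within the inclusive range [$2, $3]?
# Relies on "sort -V" (version sort); exit status 0 means "in range".
version_in_range() {
  lowest=$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)
  highest=$(printf '%s\n' "$1" "$3" | sort -V | tail -n 1)
  [ "$lowest" = "$2" ] && [ "$highest" = "$3" ]
}
```

Usage with placeholder bounds (not real advisory data): version_in_range 8.5.32 8.5.0 8.5.34 && echo "version matches the affected range".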

Complement with nuclei and Metasploit

  • nuclei -tags cve (previous page) overlaps usefully — it actively tests many CVEs rather than just matching versions.
  • Metasploit’s search and db_import (of your nmap XML) are another route to map services to modules when you’re cleared to exploit.

Confirmed-exploitable services become findings. Next, check for subdomain takeover.

4.3 - Subdomain Takeover

Find dangling DNS records pointing at unclaimed cloud resources you can hijack.

A subdomain takeover happens when a DNS record (usually a CNAME) points at a third-party service — S3, GitHub Pages, Heroku, Azure, a SaaS app — that’s since been deleted or never claimed. Anyone who registers that resource controls content served on the client’s subdomain. It’s high-impact and often overlooked, and your discovery phase already handed you the full subdomain list to check.

How to spot one

The signature is a CNAME pointing to an external service that returns a service-specific “not found / no such bucket / unclaimed” error. For example, assets.example.com CNAME’d to a non-existent S3 bucket returns NoSuchBucket.

Arsenic originally caught these via aquatone’s takeover tags. The dedicated tools are more reliable now.

subzy (what I use)

subzy checks a list of subdomains against a fingerprint database of vulnerable services:

subzy run --targets recon/domains/example.com/subdomains.txt \
          --hide_fails --verify_ssl

nuclei takeover templates

nuclei ships a maintained set of takeover detection templates — convenient if it’s already in your pipeline:

nuclei -l recon/live-urls.txt -t http/takeovers/ -o recon/takeovers.txt

Manual confirmation

Always confirm before reporting — and do not actually claim the resource unless your rules of engagement explicitly authorize proving the takeover:

# 1. Confirm the dangling CNAME
dig +short CNAME assets.example.com        # -> some-bucket.s3.amazonaws.com

# 2. Confirm the target service returns an unclaimed/error fingerprint
curl -si http://assets.example.com | head -n 20   # -> 404 + NoSuchBucket in the response body

Cross-reference the fingerprint against can-i-take-over-xyz, which catalogs which services are takeover-vulnerable and how to verify each safely.
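The two manual checks combine naturally into one test: the CNAME must point at a known service and the response must carry that service's unclaimed fingerprint. A sketch with a deliberately tiny fingerprint table; takeover_hint is a hypothetical name, and can-i-take-over-xyz stays the source of truth for which services and strings apply.

```shell
# Hypothetical helper: match a CNAME target and response body against a few
# well-known takeover fingerprints. Illustrative subset only.
# Usage: takeover_hint <cname-target> <response-body>
takeover_hint() {
  case $1 in
    *.s3.amazonaws.com|*.s3.amazonaws.com.)
      svc="AWS S3"; needle="NoSuchBucket" ;;
    *.github.io|*.github.io.)
      svc="GitHub Pages"; needle="There isn't a GitHub Pages site here" ;;
    *.herokuapp.com|*.herokuapp.com.)
      svc="Heroku"; needle="No such app" ;;
    *) return 1 ;;   # not a service this sketch knows about
  esac
  case $2 in
    *"$needle"*) echo "candidate: $svc" ;;
    *) return 1 ;;   # resource appears to be claimed
  esac
}
```

Fed from the commands above, that would be: takeover_hint "$(dig +short CNAME assets.example.com)" "$(curl -s http://assets.example.com)". A hit is still only a candidate until you have checked it by hand.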

Reporting

A confirmed dangling record is reportable on its own — you don’t need to seize the resource to prove impact. Document the vulnerable subdomain, the dangling CNAME target, the service fingerprint, and the standard remediation (remove the stale DNS record, or reclaim/repoint the resource). Capture evidence per Evidence & Reporting.

5 - Evidence & Reporting

Capture proof as you go, structure findings consistently, and turn a pile of scan output into a deliverable.

The report is the product. A client doesn’t pay for scans — they pay for a clear, reproducible account of what’s wrong and how to fix it. The single biggest quality multiplier is capturing evidence as you find it, not reconstructing it the night before the deadline.

Capture evidence in the moment

When you confirm something, grab proof immediately — the request/response, a screenshot, the exact command. You won’t be able to recreate a transient condition later.

  • Screenshots of the rendered finding (login bypassed, data exposed, admin console reached). Arsenic’s screenshot helper drops a timestamped PNG into report/static/ and copies a Markdown image tag to your clipboard, so you can paste it straight into notes. The mechanics:

    ts=$(date +'%Y-%m-%d_%H%M')
    flameshot gui -p "report/static/finding-name-$ts.png"   # or: maim -s, scrot -s
    
  • Raw request/response for web findings — save the full HTTP exchange (Burp “Copy to file”, or curl -v). This is what makes a finding reproducible.

  • The exact command you ran, with output. Your git-tracked scan files already hold most of this — another reason to commit as you go (see Engagement Setup).
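For the "exact command, with output" habit, a tiny wrapper removes the friction. A sketch only: evidence_run and EVIDENCE_DIR are hypothetical names, and report/evidence is an assumed location rather than an Arsenic convention.

```shell
# Hypothetical helper: run a command and keep the command line plus its output.
# Usage: evidence_run <label> <command> [args...]
# EVIDENCE_DIR is an illustrative variable; it defaults to report/evidence.
evidence_run() {
  label=$1; shift
  dir=${EVIDENCE_DIR:-report/evidence}
  mkdir -p "$dir"
  out="$dir/$(date +'%Y-%m-%d_%H%M%S')-$label.txt"
  {
    # Header line: UTC timestamp and the command as typed (arguments space-joined)
    echo "# $(date -u +'%Y-%m-%dT%H:%M:%SZ') $*"
    "$@" 2>&1
  } | tee "$out"
}
```

For example, evidence_run tomcat-manager curl -si https://app.example.com/manager/html shows the response on screen and leaves a timestamped copy you can commit.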

Structure each finding consistently

Use one folder per finding with a fixed file layout, so every finding has the same sections (this is Arsenic’s report/findings/ convention):

report/findings/sql-injection-login/
├── 00-metadata.md          # title, severity, CVSS, affected assets, status
├── 01-summary.md           # what it is, in plain language
├── 02-affected-assets.md   # exact URLs / hosts / parameters
├── 03-recommendations.md   # how to fix it
├── 04-references.md        # CWE, CVE, vendor advisories, OWASP
└── 05-steps-to-reproduce.md # numbered steps + evidence screenshots

A good finding answers, in order: what is it? Where is it? How bad is it? How do I prove it? How do I fix it?
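Scaffolding that layout by hand gets old quickly. A sketch of a generator: the file names come from the tree above, while new_finding and the placeholder headings it writes are hypothetical.

```shell
# Hypothetical scaffold for the per-finding layout.
# Usage: new_finding <slug> [<findings-root>]
new_finding() {
  root=${2:-report/findings}
  mkdir -p "$root/$1" || return 1
  for f in 00-metadata 01-summary 02-affected-assets \
           03-recommendations 04-references 05-steps-to-reproduce; do
    # Never overwrite notes that already exist
    [ -e "$root/$1/$f.md" ] || printf '# %s\n\n' "${f#*-}" > "$root/$1/$f.md"
  done
}
```

Running new_finding sql-injection-login creates the six files under report/findings/ and is safe to re-run, since existing files are left alone.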

Rate severity defensibly

Use a consistent scale — CVSS 3.1 is the common denominator — but adjust for business context. A medium-severity exposure on an internet-facing admin panel with customer data outranks a “high” on a decommissioned staging box. State your reasoning so the rating holds up in the readout.
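The baseline half of that is mechanical: CVSS v3.1 maps base scores to qualitative ratings with fixed cut-offs. A small sketch (cvss_rating is a hypothetical name); the business-context adjustment stays a human judgment on top of it.

```shell
# Map a CVSS v3.1 base score to its qualitative rating
# (None 0.0, Low 0.1-3.9, Medium 4.0-6.9, High 7.0-8.9, Critical 9.0-10.0).
cvss_rating() {
  awk -v s="$1" 'BEGIN {
    if (s !~ /^[0-9]+(\.[0-9]+)?$/ || s + 0 > 10) { print "invalid"; exit 1 }
    s += 0
    if      (s == 0)  print "None"
    else if (s < 4.0) print "Low"
    else if (s < 7.0) print "Medium"
    else if (s < 9.0) print "High"
    else              print "Critical"
  }'
}
```

Recording both the computed rating and your adjusted one in the finding metadata makes the reasoning easy to defend later.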

Assemble the deliverable

A typical report structure:

  1. Executive summary — risk in business terms, for non-technical readers.
  2. Methodology & scope — what you tested, what you didn’t, when, and how. Your phase-by-phase workflow is this section.
  3. Findings — sorted by severity, each in the structure above.
  4. Appendices — full host/service inventory, tool versions, raw output references.
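Ordering the findings section can be scripted once the metadata is consistent. A sketch under a loud assumption: each 00-metadata.md contains a line such as "severity: high". That format, like the name findings_by_severity, is hypothetical and not something Arsenic defines.

```shell
# Hypothetical helper: list finding folders ordered critical -> info.
# Assumes each 00-metadata.md has a line like "severity: high" (illustrative format).
# Usage: findings_by_severity [<findings-root>]
findings_by_severity() {
  for meta in "${1:-report/findings}"/*/00-metadata.md; do
    [ -f "$meta" ] || continue
    sev=$(sed -n 's/^severity:[[:space:]]*//p' "$meta" | head -n 1 | tr 'A-Z' 'a-z')
    case $sev in
      critical) rank=0 ;; high) rank=1 ;; medium) rank=2 ;;
      low) rank=3 ;; info*) rank=4 ;; *) rank=9; sev=unrated ;;
    esac
    printf '%s\t%s\t%s\n' "$rank" "$sev" "$(basename "$(dirname "$meta")")"
  done | sort -k1,1n -k3,3 | cut -f2-
}
```

Anything without a recognizable severity lands at the bottom as "unrated", which doubles as a reminder to finish the metadata.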

Hugo / Docsy as a report engine

Because this whole workspace is Markdown in git, a static-site generator makes a natural reporting front end. arsenic-hugo renders the hosts/, recon/, and report/ trees into a browsable site — host inventory, screenshot galleries, and findings. It’s handy for collaborating with a team and for handing clients an interactive deliverable alongside the PDF.

Close out cleanly

  • Re-test any findings the client remediates and note the verification.
  • Archive the git repo and evidence per your data-retention agreement, then securely delete client data when the retention window closes.
  • Capture lessons learned — a new tool, a new wordlist, a step worth automating next time. That feedback loop is exactly how this toolchain (and Arsenic) grew in the first place.

6 - Toolbox Reference

Install commands for the full toolchain, plus a mapping from the tools Arsenic originally automated to what I use now.

This is the install-and-cheat-sheet companion to the handbook. It lists every tool the methodology uses, where it fits, and — where the original Arsenic tooling has aged out — what I replaced it with.

Original → current tool mapping

Arsenic was built around a late-2010s OSCP-era toolchain. Most of it still holds up; a few tools have been superseded. The handbook uses the right-hand column.

Phase | Job | Arsenic originally used | What I use now
Discovery | Subdomain enum (passive) | amass, crt.sh | subfinder + amass + crt.sh
Discovery | Subdomain brute | amass -brute, gobuster dns | dnsx / puredns
Discovery | Mass DNS resolution | fast-resolv | dnsx (or puredns/massdns)
Discovery | Host liveness | nmap -sn | nmap -sn + naabu
Recon | Port scan | nmap | nmap + naabu (masscan for huge ranges)
Recon | Service/version | nmap -sV -sC -A | same
Recon | HTTP probe | httpx | httpx
Recon | Screenshots | aquatone (archived) | gowitness (or httpx -screenshot)
Recon | Content discovery | ffuf, gobuster, dirb | ffuf + feroxbuster
Hunting | Vuln templates | nuclei | nuclei
Hunting | Known exploits | searchsploit (Exploit-DB) | same
Hunting | Subdomain takeover | aquatone tags | subzy / nuclei takeover templates
Glue | Scope ingest | mlr (miller), jq | same
Glue | Dedup/diff lists | custom sort/comm | anew
Reporting | Screenshots | flameshot/maim + xclip | same
Reporting | Report site | hugo + arsenic-hugo | same

Install

ProjectDiscovery suite (Go)

Most of the modern recon flow runs on these. Install Go first and make sure $(go env GOPATH)/bin is on your $PATH.

go install github.com/projectdiscovery/subfinder/v2/cmd/subfinder@latest
go install github.com/projectdiscovery/dnsx/cmd/dnsx@latest
go install github.com/projectdiscovery/naabu/v2/cmd/naabu@latest
go install github.com/projectdiscovery/httpx/cmd/httpx@latest
go install github.com/projectdiscovery/nuclei/v3/cmd/nuclei@latest
nuclei -update-templates

naabu needs libpcap (sudo apt install libpcap-dev).

Fuzzing & content discovery

go install github.com/ffuf/ffuf/v2@latest
# feroxbuster
curl -sL https://raw.githubusercontent.com/epi052/feroxbuster/main/install-nix.sh | bash
# gobuster (optional)
go install github.com/OJ/gobuster/v3@latest

Screenshots

go install github.com/sensepost/gowitness@latest
# gowitness needs a Chromium/Chrome browser present

Classics & glue (Debian/Kali)

sudo apt install -y nmap amass exploitdb jq miller curl whois dnsutils \
                    flameshot maim xclip
# anew
go install github.com/tomnomnom/anew@latest

Takeover & DNS helpers

go install github.com/PentestPad/subzy@latest
# puredns (needs massdns) — optional, a really good resolver
go install github.com/d3mondev/puredns/v2@latest

Wordlists

git clone https://github.com/danielmiessler/SecLists /opt/SecLists

The SecLists paths this handbook references:

  • Discovery/Web-Content/ — content discovery wordlists.
  • Discovery/DNS/ — subdomain brute-force wordlists.
  • Fuzzing/ — injection payload lists (SQLi, XSS).

nmap unprivileged setup

sudo setcap cap_net_raw,cap_net_admin,cap_net_bind_service+eip "$(command -v nmap)"
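After installing, a quick preflight check catches a missing binary or a $PATH problem before an engagement starts. A sketch; missing_tools is a hypothetical name and the tool list in the usage line simply mirrors the install commands above.

```shell
# Hypothetical preflight check: print every listed tool that is not on PATH.
# Exit status is non-zero when anything is missing.
missing_tools() {
  missing=0
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || { echo "$tool"; missing=1; }
  done
  return "$missing"
}
```

For example: missing_tools subfinder dnsx naabu httpx nuclei ffuf feroxbuster gowitness subzy nmap searchsploit jq mlr anew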

API keys

Passive discovery yield roughly doubles with API keys. Add them to:

  • ~/.config/subfinder/provider-config.yaml
  • ~/.config/amass/config.ini

Providers worth setting up (free or cheap): Censys, SecurityTrails, Shodan, VirusTotal, GitHub, Chaos, BeVigil. Never commit these keys to your engagement repo.
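A crude guard against committing keys by accident is to grep the workspace before each commit. This is a heuristic sketch, not a replacement for a real secret scanner: find_key_like is a hypothetical name, the pattern is my own guess at what key assignments look like, and it relies on GNU grep options.

```shell
# Hypothetical pre-commit check: flag lines that look like hard-coded credentials.
# The pattern is a rough heuristic; it will both miss things and over-match.
# Usage: find_key_like [<dir>]   (exit status 0 means something was found)
find_key_like() {
  grep -rEniI --exclude-dir=.git \
    '(api[_-]?key|secret|token|passw(or)?d)[[:space:]]*[:=][[:space:]]*[^[:space:]]{12,}' \
    "${1:-.}"
}
```

Run it from the engagement root; any output is a file and line number worth checking before you commit.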