Content Discovery
Apps expose far more than their navigation shows: /admin, /.git/,
/backup.zip, /api/v1, /.env, old /test.php files. Content discovery
requests every path in a wordlist and keeps the ones that respond, which is how
you find them. Run it against every live web service identified during HTTP probing.
Pick a fuzzer
Arsenic supports gobuster, dirb, and ffuf, defaulting to ffuf. The two
I actually use:
- ffuf — fast, flexible, good filtering; my default.
- feroxbuster — recursive by default, nice for deep trees.
Wordlists
I pull wordlists from SecLists. A solid general-purpose stack (this mirrors
Arsenic’s default web-content set):
Discovery/Web-Content/common.txt
Discovery/Web-Content/raft-medium-words.txt
Discovery/Web-Content/raft-large-directories.txt
Discovery/Web-Content/quickhits.txt
Discovery/Web-Content/RobotsDisallowed-Top1000.txt
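Build a combined, de-duplicated list once (this merges three of the lists above;
add the others inside the braces if you want them):
mkdir -p recon
cat /opt/SecLists/Discovery/Web-Content/{common,raft-medium-words,quickhits}.txt \
| sort -u > recon/wordlist-web-content.txt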
Tailor it to the tech you fingerprinted: a Tomcat box gets tomcat.txt, a
Jenkins box gets Jenkins-Hudson.txt, and so on.
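One way to fold a tech-specific list into the base list is a small merge helper. This is only a sketch: `build_wordlist` is a hypothetical function name, and the location of `tomcat.txt` inside your SecLists checkout is an assumption (it has moved between releases), so adjust the paths.

```shell
#!/usr/bin/env bash
# Merge several wordlists into one sorted, de-duplicated file.
# Usage: build_wordlist OUTPUT INPUT...
# Unreadable inputs are skipped with a warning instead of aborting the build.
build_wordlist() {
  local out=$1; shift
  local f
  local -a found=()
  for f in "$@"; do
    if [ -r "$f" ]; then
      found+=("$f")
    else
      echo "skipping missing wordlist: $f" >&2
    fi
  done
  [ "${#found[@]}" -gt 0 ] || { echo "no readable wordlists" >&2; return 1; }
  mkdir -p "$(dirname "$out")"
  # LC_ALL=C keeps the ordering byte-wise and locale-independent;
  # -o lets sort write the result even if OUTPUT is also one of the inputs.
  LC_ALL=C sort -u -o "$out" "${found[@]}"
}

# Example (tech_list is an assumed path; point it at your own copy):
# tech_list=/opt/SecLists/Discovery/Web-Content/tomcat.txt
# build_wordlist "hosts/$host/recon/wordlist.txt" \
#   recon/wordlist-web-content.txt "$tech_list"
```

Keeping the per-host list separate from the shared base list means a Tomcat-specific run does not bloat every other scan.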
Run ffuf
url="https://app.example.com"
host=app.example.com
mkdir -p "hosts/$host/recon"
ffuf -u "$url/FUZZ" \
-w recon/wordlist-web-content.txt \
-ac \
-mc all -fc 404 \
-recursion -recursion-depth 2 \
-H "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Firefox/68.0" \
-of json -o "hosts/$host/recon/ffuf.json"
What the flags do (these match the Arsenic as-ffuf defaults):
- -ac: auto-calibration. ffuf learns the “not found” response shape and filters it automatically. You need this, or you drown in false positives.
- -mc all -fc 404: match everything, filter out 404s. Lets you see 401/403 (exists but protected) and 500 (something broke = interesting).
- -recursion -recursion-depth 2: dig into discovered directories.
- -e .php,.bak,.zip,.txt: add extension fuzzing when you know the stack.
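The command above sets url and host by hand. To run it across every live service, something like the following loop could work. It is a sketch under assumptions: `host_from_url` and `fuzz_all` are hypothetical helper names, and `recon/live-urls.txt` (one URL per line) is an assumed file, not something HTTP probing is guaranteed to produce in that form.

```shell
#!/usr/bin/env bash
# Derive the host[:port] part of a URL so it can be used as a directory name.
host_from_url() {
  local rest=${1#*://}   # drop the scheme, if any
  rest=${rest%%/*}       # drop the path
  rest=${rest##*@}       # drop userinfo, if any
  printf '%s\n' "$rest"
}

# Fuzz every URL listed in a file (one per line), writing results per host.
# Note: http and https on the same host:port-less name share one directory.
fuzz_all() {
  local list=$1 url host
  while IFS= read -r url; do
    [ -n "$url" ] || continue
    url=${url%/}                       # avoid a double slash before FUZZ
    host=$(host_from_url "$url")
    mkdir -p "hosts/$host/recon"
    # </dev/null stops ffuf from swallowing the rest of the URL list on stdin.
    ffuf -u "$url/FUZZ" \
      -w recon/wordlist-web-content.txt \
      -ac -mc all -fc 404 \
      -recursion -recursion-depth 2 \
      -of json -o "hosts/$host/recon/ffuf.json" </dev/null
  done < "$list"
}

# Example (assumed input file):
# fuzz_all recon/live-urls.txt
```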
Tune signal, not noise
Auto-calibration handles most of the junk, but apps that return 200 for
everything need manual filtering. Inspect the size/word/line distribution and
filter the dominant bucket:
# How many results per status code?
jq '.results[].status' hosts/app.example.com/recon/ffuf.json | sort | uniq -c
# Filter by response size if a wildcard 200 is flooding results (1234 is a placeholder)
ffuf -u "$url/FUZZ" -w recon/wordlist-web-content.txt -fs 1234 # drop responses of exactly that size
Arsenic’s as-prune-ffuf does exactly this after the fact — trimming the
dominant status/size bucket out of a bloated results file so what’s left is
signal.
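To find the dominant bucket by hand, you can group the results by status and size. A minimal sketch, assuming the ffuf JSON output exposes `status` and `length` on each entry of `.results` (it does in the versions I have used, but check yours); `size_buckets` and `dominant_size` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Print "count status length" for each status/size bucket, most common first.
size_buckets() {
  jq -r '.results[] | "\(.status) \(.length)"' "$1" | sort | uniq -c | sort -rn
}

# Print the single most common response length, ready to pass to -fs.
# If two sizes tie for first place, which one is printed is not guaranteed.
dominant_size() {
  jq -r '.results[].length' "$1" | sort -n | uniq -c | sort -rn \
    | awk 'NR==1 {print $2}'
}

# Example:
# size_buckets hosts/app.example.com/recon/ffuf.json | head
# ffuf -u "$url/FUZZ" -w recon/wordlist-web-content.txt \
#   -fs "$(dominant_size hosts/app.example.com/recon/ffuf.json)"
```

Look at the bucket table before filtering: if the top bucket holds only a handful of hits, it is probably real content rather than a wildcard response.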
What to chase
From the results, prioritize:
- Auth panels & admin paths (/admin, /manager, /wp-admin).
- Source/secrets leakage (/.git/, /.env, /config.php.bak, /backup/).
- APIs (/api, /swagger, /graphql), which are often under-protected.
- Anything 403: it exists and someone tried to hide it.
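A first pass over the results file can pull out those categories automatically. This is a rough sketch, not a replacement for reading the output: `triage_hits` is a hypothetical name, the path keywords are just the examples from the list above, and it assumes each ffuf result carries `status` and `url` fields.

```shell
#!/usr/bin/env bash
# Print "status url" for hits worth a manual look: any 403, plus paths that
# contain one of the priority keywords or end in a backup-style extension.
triage_hits() {
  jq -r '
    .results[]
    # Match against the path only, so a host named api.example.com
    # does not make every result look like an API hit.
    | (.url | sub("^[a-zA-Z]+://[^/]+"; "")) as $path
    | select(.status == 403
        or ($path | test("/(admin|manager|wp-admin|\\.git|\\.env|backup|api|swagger|graphql)"; "i"))
        or ($path | test("\\.(bak|zip)$"; "i")))
    | "\(.status) \(.url)"
  ' "$1" | sort -u
}

# Example:
# triage_hits hosts/app.example.com/recon/ffuf.json
```

The keyword match is deliberately loose (it also catches /administrator, for instance), on the theory that a few extra lines to skim cost less than a missed panel.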
Discovered endpoints and the technologies you fingerprinted both feed the Vulnerability Hunting phase.