Katana: fast crawler cheat-sheet
Katana is a fast web crawler built for use in automation pipelines, offering both headless and non-headless crawling.
Crawl a list of URLs:
katana -list https://example.com,https://google.com,...
Crawl a URL in headless mode using Chromium:
katana -u https://example.com [-hl|-headless]
Use subfinder to find subdomains, and then use passive sources (Wayback Machine, Common Crawl, and AlienVault) for URL discovery:
subfinder [-dL|-list] path/to/domains.txt | katana -passive
Pass requests through a proxy (http/socks5) and use custom headers from a file:
katana -proxy http://127.0.0.1:8080 [-H|-headers] path/to/headers.txt -u https://example.com
Specify the crawling strategy, depth of subdirectories to crawl, and rate limiting (requests per second):
katana [-s|-strategy] depth-first|breadth-first [-d|-depth] value [-rl|-rate-limit] value -u https://example.com
Find subdomains using subfinder, crawl each for a maximum number of seconds, and write results to an output file:
subfinder [-dL|-list] path/to/domains.txt | katana [-ct|-crawl-duration] value [-o|-output] path/to/output.txt
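Putting the pieces above together, a full pipeline might look like the sketch below. The file names (domains.txt, urls.txt) and the numeric values for depth, rate limit, and crawl duration are all hypothetical placeholders; the command is stored in a variable and echoed as a dry run so it can be inspected before subfinder and katana are installed.

```shell
# Hypothetical end-to-end run: enumerate subdomains, crawl each target at
# depth 2, capped at 100 requests/second and 60 seconds per target, and
# write discovered URLs to urls.txt. All paths and numbers are placeholders.
cmd="subfinder -dL domains.txt | katana -d 2 -rl 100 -ct 60 -o urls.txt"

# Echoed as a dry run; run "$cmd" directly once both tools are installed.
echo "$cmd"
```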
See also: gau, scrapy, waymore.
More information: https://github.com/projectdiscovery/katana.