Katana: fast crawler cheat-sheet
Katana is a fast web crawler built for use in automation pipelines, offering both headless and non-headless crawling.
Crawl a list of URLs:
katana -list https://example.com,https://google.com,...
Crawl a URL in headless mode using Chromium:
katana -u https://example.com [-hl|-headless]
Use subfinder to find subdomains, and then use passive sources (Wayback Machine, Common Crawl, and AlienVault) for URL discovery:
subfinder [-dL|-list] path/to/domains.txt | katana -passive
Pass requests through a proxy (http/socks5) and use custom headers from a file:
katana -proxy http://127.0.0.1:8080 [-H|-headers] path/to/headers.txt -u https://example.com
Specify the crawling strategy, depth of subdirectories to crawl, and rate limiting (requests per second):
katana [-s|-strategy] depth-first|breadth-first [-d|-depth] value [-rl|-rate-limit] value -u https://example.com
Find subdomains using subfinder, crawl each for a maximum number of seconds, and write results to an output file:
subfinder [-dL|-list] path/to/domains.txt | katana [-ct|-crawl-duration] value [-o|-output] path/to/output.txt
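Putting the pieces above together, a full pipeline might look like the sketch below. The file names (domains.txt, urls.txt) and the numeric values for depth, rate limit, and crawl duration are all hypothetical placeholders; the command is stored in a variable and echoed as a dry run so it can be inspected before subfinder and katana are installed.

```shell
# Hypothetical end-to-end run: enumerate subdomains, crawl each target at
# depth 2, capped at 100 requests/second and 60 seconds per target, and
# write discovered URLs to urls.txt. All paths and numbers are placeholders.
cmd="subfinder -dL domains.txt | katana -d 2 -rl 100 -ct 60 -o urls.txt"

# Echoed as a dry run; run "$cmd" directly once both tools are installed.
echo "$cmd"
```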
See also: gau, scrapy, waymore.
More information: https://github.com/projectdiscovery/katana.