To see which URLs we crawled during the scan, go to your latest report and look at your Information findings (in green). Click the finding called “Crawled URLs”.
At the bottom of the finding, click the link under “Found at”.
Here you’ll see how many URLs we tried to access during the scan and how many of those we identified as unique. Under “References”, at the very bottom, you will find a downloadable CSV file that you can go through to make sure we have visited all the relevant parts of your site.
In the CSV you’ll see a list of the URLs we tried to visit and the status code we got for each of them.
Content-length is the length of the response we received.
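If you prefer to review the CSV with a script rather than by eye, the following is a minimal sketch of how that could look. It is an illustration only: the column names (url, status_code, content_length), the sample rows, and the helper functions are assumptions, so adjust them to match the header row of the file you actually downloaded.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical sample data; the real export's column names may differ.
SAMPLE_CSV = """url,status_code,content_length
https://example.com/,200,5120
https://example.com/admin,403,312
https://example.com/old-page,404,0
https://example.com/shop,200,20480
"""

def load_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def status_summary(rows):
    """Count how many crawled URLs returned each status code."""
    return Counter(row["status_code"] for row in rows)

def missing_paths(rows, expected_paths):
    """Return the expected paths that do not appear among the crawled URLs."""
    crawled = {urlsplit(row["url"]).path for row in rows}
    return [path for path in expected_paths if path not in crawled]

rows = load_rows(SAMPLE_CSV)
summary = status_summary(rows)
missing = missing_paths(rows, ["/shop", "/checkout"])
print(summary)
print(missing)
```

To run this against a downloaded file instead of the sample, open it with open(path, newline="") and pass the file object to csv.DictReader. Any path reported as missing is a candidate for closer inspection.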
We have a set of common paths that we always try to visit (e.g. /admin) to see if we can find anything there. These paths are marked as “seed” in your Crawled URLs report. Here you can also see paths that we picked up from indexes, e.g. a sitemap or robots.txt.
If the source of a path says “crawling”, it means that we found the path while navigating through your site and clicking around.
“scanner”, on the other hand, means that the URL was provided internally by the scanner (e.g. paths provided by a module) or by the user (via configurations like allowed paths or Trails).
"saved for security testing" - shows which paths we performed tests on (e.g. sent payloads to).
"filtered as duplicate" - our tool does its best to find pages with similar structure and content, so it should not need to crawl every page or product in a web shop. This function is not 100% foolproof: we prefer to treat more pages as unique so that we don't filter out pages that could lead to additional unique pages, and we use a wide range of heuristics to detect duplicates, so the results may vary. However, if you think we are filtering out pages that actually have different functionality, reach out to firstname.lastname@example.org and we'll lower the filtering threshold for you.
Why do some paths appear in the CSV file several times?
You may see the same path appear several times in the file. That can happen when, for example:
- we get a different response code the second time we visit the path
- the path is a duplicate
- we use a different method on the path (e.g. GET and POST) or
- we return to the path via a different referrer
*Referrer URLs are the URLs that pointed us to a specific URL.
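To check why a particular path is repeated, you could group the rows by URL and compare the remaining columns. The sketch below is an illustration under assumptions: the column names (url, method, status_code, referrer), the sample rows, and the explain_repeats helper are hypothetical and may not match the layout of your CSV.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data; the real export's column names may differ.
SAMPLE_CSV = """url,method,status_code,referrer
https://example.com/login,GET,200,https://example.com/
https://example.com/login,POST,302,https://example.com/login
https://example.com/about,GET,200,https://example.com/
https://example.com/cart,GET,200,https://example.com/shop
https://example.com/cart,GET,200,https://example.com/checkout
"""

def explain_repeats(csv_text):
    """Map each repeated URL to the columns whose values differ between its rows."""
    rows_by_url = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        rows_by_url[row["url"]].append(row)

    report = {}
    for url, rows in rows_by_url.items():
        if len(rows) < 2:
            continue  # the URL appears only once, nothing to explain
        report[url] = sorted(
            column
            for column in rows[0]
            if column != "url" and len({r[column] for r in rows}) > 1
        )
    return report

report = explain_repeats(SAMPLE_CSV)
for url, columns in report.items():
    print(url, "differs in:", ", ".join(columns))
```

In this sample, /login is repeated because the method, status code, and referrer all differ between visits, while /cart is repeated only because it was reached via two different referrers.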