e621 accepts sources from almost anywhere, but which sites account for the most sources on e621? It is not hard to run a query and sort by URL, but we want something a little more fine-tuned than that. I would have liked to use the built-in tools for URL manipulation, but they do not seem able to discard subdomains, that is, to recognize that www.site.com and site.com are the same site. Because of that, I used parse-domain to extract the domain and TLD of each source URL.
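To make the subdomain problem concrete, here is a minimal sketch of the idea in Python, using only the standard library. It is not the parse-domain call used for the actual analysis. The function names, the sample URLs, and the tiny `MULTI_LABEL_SUFFIXES` set are all assumptions made for illustration; a real run would rely on the full Public Suffix List, which is what a library like parse-domain is for.

```python
from collections import Counter
from urllib.parse import urlsplit

# Assumption: a tiny hand-picked set of two-label public suffixes.
# The real Public Suffix List is far larger.
MULTI_LABEL_SUFFIXES = {"co.uk", "com.au", "co.jp"}


def registrable_domain(url):
    """Return 'domain.tld' for a URL, dropping any subdomains.

    Returns None when no usable hostname is found. IP addresses are
    not handled specially in this sketch.
    """
    # urlsplit only fills in the hostname when the URL has a scheme or
    # starts with '//', so bare hosts like 'www.site.com/x' get a prefix.
    if "://" not in url:
        url = "//" + url
    host = urlsplit(url).hostname
    if not host:
        return None
    labels = host.rstrip(".").split(".")
    if len(labels) < 2:
        return host
    # Decide whether the public suffix is one label ('com') or two ('co.uk').
    suffix_len = 2 if ".".join(labels[-2:]) in MULTI_LABEL_SUFFIXES else 1
    if len(labels) <= suffix_len:
        return None  # the host is itself a bare suffix such as 'co.uk'
    # Keep the suffix plus exactly one label in front of it.
    return ".".join(labels[-(suffix_len + 1):])


def count_sources(urls):
    """Count how many URLs fall under each registrable domain."""
    domains = (registrable_domain(u) for u in urls)
    return Counter(d for d in domains if d is not None)


# Hypothetical sample data, not taken from e621.
sample = [
    "https://www.site.com/post/1",
    "http://site.com/post/2",
    "https://img.example.co.uk/a.png",
]
counts = count_sources(sample)
print(counts.most_common())  # [('site.com', 2), ('example.co.uk', 1)]
```

Both www.site.com and site.com land in the same bucket here, which is the behavior a plain URL parser does not give you on its own.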
A full list of the sources can be found here.