Dark Crawler, a Useful Tool to Assess Child Exploitation from Online Communities | WhoisXML API

Dark Crawler, a Useful Tool to Assess Child Exploitation from Online Communities

The research organization

Simon Fraser University International Cybercrime Research Centre

The International Cybercrime Research Centre (ICCRC) was launched on July 8, 2008, at SFU's Surrey campus by the B.C. Minister of Labour and Citizens' Services, in partnership with the School of Criminology and the Society for the Policing of Cyberspace (POLCYB).

Mission statement

To promote education and conduct research in cybercrime prevention, detection, and response, in collaboration with the public and private sectors at the regional, national, and international levels.

Members of the research team

Dr. Martin Bouchard

Director, International Cybercrime Research Centre. Professor, School of Criminology, Simon Fraser University.

Dr. Richard Frank

Associate Director, International Cybercrime Research Centre. Assistant Professor, School of Criminology, Simon Fraser University.

Bryan Monk

Master’s student in the School of Criminology at Simon Fraser University. His primary research interests include social network analysis, cybercrime, geocoding IP addresses, the Dark Web, cryptocurrencies, and cryptography.

Russell Allsup

International Cybercrime Research Centre, School of Criminology, Simon Fraser University, Burnaby, Canada

Evan Thomas

International Cybercrime Research Centre, School of Criminology, Simon Fraser University, Burnaby, Canada


Child sexual offenders have historically been quick to adopt technological advances, such as photography and film, for the purpose of exploiting children. The movement of child exploitation material (CEM) to the Internet has enabled them to form virtual communities online, where they can more easily and secretively access and trade CEM, recruit co-offenders and/or business partners, and validate their deviant behavior among other offenders.

Despite the established harm inherent in the online distribution of child exploitation imagery, attempts to limit such content have so far been largely unsuccessful. Law enforcement strategies intended to target CEM online have included chat-room stings, honey trap sites, injunctions issued against websites hosting child pornography, and traditional criminal investigations and investigatory techniques adapted for online use.


The Dark Crawler is a web-crawler, the kind of tool that search engines use to automatically navigate the Internet and collect information about each website and webpage. Search engines use the collected data to let users run queries and find information. Web-crawlers can also be used to seek out specific content, such as child exploitation material, as in the study presented here. Given a starting webpage, a web-crawler recursively follows the links out of that webpage until some user-specified termination condition applies. During this process, the crawler keeps track of the links between websites and follows them to retrieve the linked pages as well.
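The link-following loop described above can be sketched in a few lines of Python. This is a minimal illustration and not the Dark Crawler's actual code: the function name `crawl`, the `fetch` callback, the `max_pages` and `max_depth` termination conditions, and the `*.example` pages are all assumptions made for the example. Pages are served from an in-memory dictionary so that the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed_url, fetch, max_pages=100, max_depth=3):
    """Breadth-first crawl starting at seed_url.

    fetch is a caller-supplied function (hypothetical here) that maps a URL
    to its HTML text, or returns None if the page cannot be retrieved.
    Returns (visited, edges); edges is a list of (source_url, target_url).
    """
    visited, edges = [], []
    seen = {seed_url}
    queue = deque([(seed_url, 0)])
    # Termination conditions: nothing left to visit, or the page budget is spent.
    while queue and len(visited) < max_pages:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            target, _ = urldefrag(urljoin(url, href))
            edges.append((url, target))
            # Only queue unseen pages that are still within the depth limit.
            if target not in seen and depth < max_depth:
                seen.add(target)
                queue.append((target, depth + 1))
    return visited, edges


# Made-up pages standing in for the web.
PAGES = {
    "http://a.example/": '<a href="/about">About</a> <a href="http://b.example/">B</a>',
    "http://a.example/about": '<a href="/">Home</a>',
    "http://b.example/": '<a href="http://a.example/">A</a>',
}
visited, edges = crawl("http://a.example/", PAGES.get, max_pages=10, max_depth=2)
```

In a real crawl, `fetch` would issue HTTP requests; keeping it as a parameter makes the traversal logic easy to test on its own.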

To perform this research, a software tool called the “Location Extraction of Child Exploitation Networks” (LECEN) was used. LECEN is a customized web-crawler with a unique ability to identify registrants, their physical addresses, and the domains belonging to them, which allows researchers to identify potential major players based on an individual’s location within the network.

Phases of solution

Phase 1 – data collection

LECEN starts by downloading a set of webpages which have been identified by the operator as containing CEM.

Phase 2 – constructing the network

The resulting web-crawler data was used to construct two networks. It should be stated that at no point does LECEN circumvent or enter password-protected websites. The first network, referred to as the “Domain Network”, focused on the domains of the websites: the nodes consisted only of website domains, while each edge represented the number of hyperlinks between the two corresponding domains.

The second network, referred to as the “Registrant Network”, focused on the registrant data, where the nodes represented the legal owners of those same domains identified in the Domain Network, with the edges representing the number of hyperlinks between the sites that those registrants owned.
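As a rough illustration of how the two networks relate, the sketch below aggregates page-level links into a Domain Network and then collapses it into a Registrant Network. It is not LECEN's implementation. The function names, the sample URLs, and the registrant labels are hypothetical, and several details are assumptions: edges are treated as directed, links within the same domain or registrant are dropped, and the URL's host name stands in for the registered domain.

```python
from collections import Counter
from urllib.parse import urlsplit


def domain_of(url):
    """Return the lower-cased host of a URL (a simplification of 'domain')."""
    return (urlsplit(url).hostname or "").lower()


def build_domain_network(page_edges):
    """Count hyperlinks between each ordered pair of distinct domains."""
    weights = Counter()
    for source, target in page_edges:
        src, dst = domain_of(source), domain_of(target)
        if src and dst and src != dst:
            weights[(src, dst)] += 1
    return weights


def build_registrant_network(domain_weights, registrant_of):
    """Merge domain-level edges into edges between the domains' registrants."""
    weights = Counter()
    for (src, dst), count in domain_weights.items():
        a, b = registrant_of.get(src), registrant_of.get(dst)
        # Skip domains with no known registrant and links within one registrant.
        if a and b and a != b:
            weights[(a, b)] += count
    return weights


# Made-up crawl output: (source page, target page) pairs.
page_edges = [
    ("http://a.example/1", "http://b.example/"),
    ("http://a.example/2", "http://b.example/x"),
    ("http://b.example/", "http://c.example/"),
    ("http://a.example/1", "http://a.example/2"),  # same domain, ignored
]
# Made-up ownership data, as a WHOIS lookup might provide.
registrant_of = {
    "a.example": "Registrant 1",
    "b.example": "Registrant 2",
    "c.example": "Registrant 1",
}

domain_net = build_domain_network(page_edges)
registrant_net = build_registrant_network(domain_net, registrant_of)
```

Here two domains owned by the same registrant collapse into a single node, which is what lets the Registrant Network surface owners who control several linked sites.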

Phase 3 – WHOIS

The Internet’s WHOIS service, originally referred to as NICNAME, is a text-based query-response protocol that allows individuals to look up an Internet domain’s registrant information. This lookup allows IP addresses to be traced beyond the simple connection to the hosted site and provides details regarding the individual who owns an account linked to the domain in question.
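To make the query-response exchange concrete, here is a minimal sketch of a raw WHOIS lookup and a simple reply parser. A WHOIS client opens a TCP connection (conventionally on port 43), sends the query followed by a line break, and reads text until the server closes the connection. The function names are hypothetical, the server to query is left as a parameter because it depends on the domain's registry, and the `key: value` layout assumed by the parser is common but not standardized, so real replies may need per-registry handling. The sample reply is made up.

```python
import socket


def whois_query(domain, server, port=43, timeout=10.0):
    """Send one WHOIS query and return the server's raw text reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        # The query is a single line terminated by CRLF.
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_whois(text):
    """Collect 'key: value' lines into a dict of lists, skipping comments."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("%", "#")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        value = value.strip()
        if value:
            fields.setdefault(key.strip().lower(), []).append(value)
    return fields


# Made-up reply used in place of a live lookup.
SAMPLE_REPLY = """\
% sample reply, not real data
Domain Name: example.com
Registrant Name: Example Registrant
Registrant Country: CA
Name Server: ns1.example.net
Name Server: ns2.example.net
"""
record = parse_whois(SAMPLE_REPLY)
# A live lookup would look like: parse_whois(whois_query("example.com", server))
```

Values are stored as lists because fields such as name servers often repeat within one reply.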

Phase 4 – geolocation

Geolocation refers to the process of identifying the locations of Internet-connected devices (such as a cellphone or a computer terminal, typically identified by an IP address) and involves mapping Internet Protocol (IP) addresses to the real-world geographic locations of their hosts. The end result is either an address in the form of city/state/country or a longitude/latitude pair.
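One common way to perform this mapping is a lookup table from IP address ranges to locations. The sketch below assumes such a table and picks the most specific matching range. The table contents are entirely made up (the address blocks are reserved for documentation), and the names `GEO_TABLE` and `geolocate` are hypothetical; a real system would load the ranges from a geolocation database or service.

```python
import ipaddress

# Made-up range-to-location table; the address blocks are documentation ranges.
GEO_TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"),
     {"city": "Example City", "country": "CA", "latitude": 49.0, "longitude": -123.0}),
    (ipaddress.ip_network("192.0.2.128/25"),
     {"city": "Sample Town", "country": "CA", "latitude": 45.0, "longitude": -75.0}),
    (ipaddress.ip_network("203.0.113.0/24"),
     {"city": "Placeholder Port", "country": "AU", "latitude": -33.0, "longitude": 151.0}),
]


def geolocate(ip):
    """Return the location of the most specific range containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for network, location in GEO_TABLE:
        if addr.version == network.version and addr in network:
            # Prefer the longest prefix, i.e. the narrowest matching range.
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, location)
    return best[1] if best else None


place = geolocate("192.0.2.200")
coordinates = (place["longitude"], place["latitude"])
```

The result carries both forms mentioned above: a city/country address and a longitude/latitude pair.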

Phase 5 – storage

All this information is then stored in a central database for later analysis.
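The source does not say which database LECEN uses or how it is organized. As one possible layout, the sketch below uses SQLite with three tables, one each for registrants, domains, and the hyperlink counts between domains. The schema, table names, and sample rows are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema: one table per kind of record collected in the earlier phases.
SCHEMA = """
CREATE TABLE registrant (
    id      INTEGER PRIMARY KEY,
    name    TEXT UNIQUE NOT NULL,
    address TEXT
);
CREATE TABLE domain (
    name          TEXT PRIMARY KEY,
    registrant_id INTEGER REFERENCES registrant(id),
    ip            TEXT,
    latitude      REAL,
    longitude     REAL
);
CREATE TABLE link (
    source     TEXT NOT NULL REFERENCES domain(name),
    target     TEXT NOT NULL REFERENCES domain(name),
    hyperlinks INTEGER NOT NULL,
    PRIMARY KEY (source, target)
);
"""


def open_store(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


conn = open_store()
# Made-up rows standing in for crawl, WHOIS, and geolocation results.
conn.execute(
    "INSERT INTO registrant (id, name, address) VALUES (?, ?, ?)",
    (1, "Registrant 1", "placeholder address"),
)
conn.executemany(
    "INSERT INTO domain (name, registrant_id, ip, latitude, longitude) VALUES (?, ?, ?, ?, ?)",
    [
        ("a.example", 1, "192.0.2.10", 49.0, -123.0),
        ("b.example", 1, "192.0.2.11", 49.0, -123.0),
    ],
)
conn.execute("INSERT INTO link (source, target, hyperlinks) VALUES (?, ?, ?)",
             ("a.example", "b.example", 2))

# Later analysis: how many domains does each registrant own?
rows = conn.execute(
    "SELECT r.name, COUNT(*) FROM domain d "
    "JOIN registrant r ON r.id = d.registrant_id "
    "GROUP BY r.name"
).fetchall()
```

Keeping registrants in their own table means several domains can point at the same owner, which mirrors the split between the Domain Network and the Registrant Network.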


The tool is especially beneficial to investigators: it provides a way to easily determine the origin of offenders, which can lead to faster investigations and quicker results.


[1] Networking in Child Exploitation https://www.whoisxmlapi.com/sources/networking.in.ce.pdf

[2] Measuring Cybercrime: The Example of Online Child Exploitation https://www.whoisxmlapi.com/sources/child.exploitation.pptx

[3] Dark Crawler https://www.whoisxmlapi.com/sources/dark.crawler.v02.pdf

[4] LECENing Places to Hide: Geo-Mapping Child Exploitation Material https://www.whoisxmlapi.com/sources/cp.geolocation.v14.pdf
