Who's Pulling the Data Strings?
Print Issue: January 2004
Attempts to police the lawless frontiers of cyberspace are hampered by the ease with which individuals are able to create Web sites without leaving a clear trail. At a low cost and with virtually no technical know-how, anyone can create a virtual—and anonymous—presence on the Internet. A cyberinvestigator tasked with untangling the strands of this web must find clues that will identify the parties behind Web sites under investigation. But that’s more challenging than it sounds, as was made clear recently when the author was called in by an investigative firm to look into Web sites suspected of circumventing state and federal regulations.
The investigative firm’s client is a board representing distributors within the United States of products of an international, multibillion-dollar industry. The investigative firm was retained to discover if businesses outside of the United States were using certain Web sites to illegally participate in the marketing and sale of products to customers within the United States. The client provided the investigative firm with a list of suspect companies and the corresponding Web sites.
Armed with this list, the author’s task was to seek out publicly available information sources within the technical structure of the Internet that could help identify the geographical location of the Web servers and the individuals and companies responsible for the suspect Web sites.
The information sources discovered through this investigation are discussed below, along with how investigators can access them for data relevant to any investigation. Because the author’s investigation is ongoing and confidential, Web sites involved in the case will not be cited; instead, the examples use Security Management magazine’s Web site.
Coming to terms. When seeking information on the legal entities that are responsible for the content and conduct of Web sites, it’s necessary to understand the distinction between various Web-related terms such as domain names and IP (Internet protocol) addresses. These terms identify distinct concepts and provide information sources about those legal entities.
Domain name. A domain name is a word or series of words that is registered to a particular person or organization and that identifies that entity’s Web presence. Domain names make the Internet more easily interpretable to humans than IP addresses, which are the Internet’s underlying system of numerical addresses (IP addresses are discussed in greater detail below). With www.securitymanagement.com, the domain name can be inferred simply by dropping the “www” prefix (this prefix identifies a Web page), which leaves securitymanagement.com. The suffix can describe the type of entity; for example, fbi.gov is a government agency; nyu.edu an educational institution; and army.mil a military organization. It can also identify the country of origin of a domain; for example, .uk for the United Kingdom.
URLs identify specific resources within a particular domain name (for example, www.securitymanagement.com/library/001465.html refers to a particular document in the online version of Security Management magazine). Like domain names, URLs are a convenience for Web users. In order to locate the resource, Web browsers must resolve the URL to a numerical address. This numerical address is the IP address.
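The relationship between a URL, its host name, and the domain name can be sketched in a few lines of Python. This is only an illustration: the helper names host_and_path and strip_www are hypothetical, and dropping the “www” prefix is the simple rule described above, not a general method for finding the registered domain of an arbitrary host name.

```python
from urllib.parse import urlsplit


def host_and_path(url):
    """Split a URL into its host name and the path of the resource."""
    # urlsplit only recognizes the host when a scheme or "//" is present,
    # so add one for bare addresses such as "www.example.com/page".
    if "//" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    return parts.hostname, parts.path


def strip_www(host):
    """Drop a leading "www." label (the simple rule from the text)."""
    return host[4:] if host.lower().startswith("www.") else host


host, path = host_and_path("www.securitymanagement.com/library/001465.html")
domain = strip_www(host)
print("host:  ", host)
print("path:  ", path)
print("domain:", domain)
```

The host name is the part that is later resolved to an IP address; the path only tells the Web server which document to return.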
To register a domain name, one must pay a fee to a registrar. Many companies provide this service. One must provide the company with certain information, such as the name of the registered owner, as well as technical and administrative contact information including name, address, and phone/fax numbers. However, the registrars the author dealt with did not verify the information, and the contact information may be changed at any time by the owner through a password-protected interface. Registrars do maintain information on the dates and times of the original registration of the domain name, the last update of information, and the expiration date of the domain name.
Domain name registrars. The Internet Corporation for Assigned Names and Numbers, known as ICANN, accredits the registrars that issue domain names. (Link to ICANN by visiting SM Online.) An initial inquiry about a particular domain name can be done at the Web site of any registrar, by using a service called Whois. The Whois search result will provide some of the registration information available, including the name of the specific registrar that issued the domain name. In the case of securitymanagement.com, for example, a Whois search run through a randomly chosen domain-name registrar reveals that the original registrar is Network Solutions, Inc.
Contact information. Next, inquiring at the Web site of the issuing registrar provides more detailed registration information, including the contact information for the registrant as well as for technical and administrative inquiries. For the domain name securitymanagement.com, the search at the Network Solutions Web site reveals that ASIS International is the registrant, and it provides a mailing address, telephone/fax numbers, and an e-mail address for the (unnamed) administrative and technical contact.
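Whois searches can also be run without a Web form, because Whois is a simple text protocol: a client connects to TCP port 43, sends the query followed by a line break, and reads the reply until the server closes the connection. The sketch below assumes that behavior. The function names, the server argument (the Whois host of whichever registrar or registry is being queried), and the sample reply are all hypothetical; real replies differ in layout from one registrar to another, so the "Key: value" parsing is only a rough starting point.

```python
import socket


def whois_query(domain, server, port=43, timeout=10):
    """Send one Whois query to `server` and return the raw text reply."""
    with socket.create_connection((server, port), timeout=timeout) as sock:
        sock.sendall(domain.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # server closed the connection: reply complete
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_fields(reply):
    """Collect "Key: value" lines into a dict mapping key -> list of values."""
    fields = {}
    for line in reply.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() and value.strip():
            fields.setdefault(key.strip().lower(), []).append(value.strip())
    return fields


# Hypothetical reply, used here so the example runs without network access.
SAMPLE_REPLY = """\
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Name Server: NS1.EXAMPLE.COM
Name Server: NS2.EXAMPLE.COM
"""

fields = parse_fields(SAMPLE_REPLY)
print("Registrar:", fields["registrar"][0])
print("Name servers:", ", ".join(fields["name server"]))

# A live lookup would look like this (requires network access and the
# host name of an actual Whois server):
# print(whois_query("securitymanagement.com", server="<whois host>"))
```

Keeping the raw reply alongside the parsed fields is worthwhile in an investigation, since the parsing may drop lines that do not fit the "Key: value" pattern.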
In some cases, the registrant, administrative contact, and technical contact are different entities. For example, an individual may be listed as the registrant, while an ISP (Internet service provider) or Web-hosting company may be listed for the administrative and technical contact. In other cases, the same individual is listed in all three roles (and, as with SM Online, sometimes no particular individual’s name is listed). The contact information provided varies; sometimes telephone numbers and e-mail addresses are listed for individuals, and Web sites are supplied for ISPs.
Contact information gathered from the registrars gives investigators a lead to follow, though the information may not be accurate, because the registrant could have given false information or could have changed the information at any time. Also, the registrant of a domain name may be only indirectly involved in the business of the Web site (for example, the registrant may be paid by the real site owner, who wishes to remain anonymous). Nevertheless, this contact information is worth checking out. In the investigation mentioned at the beginning of this article, the investigative firm actively pursued background checks on the listed contacts.
IP address. An IP address specifies a connection to the Internet and identifies the computer that is using the connection. So the next step in a Web-site investigation is to determine an IP address from the URL of the Web site in question. The IP address leads to a different set of information and clues on who is behind the Web site.
Generally, IP addresses are four numbers between 0 and 255, separated by periods. For example, www.securitymanagement.com resolves to the IP address 18.104.22.168.
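The four-number format can be checked mechanically before an address is submitted to a registry search. The following is a minimal sketch using Python's standard ipaddress module; the function name is_ipv4 is hypothetical, and the addresses in the usage lines come from a range reserved for documentation, not from the investigation.

```python
import ipaddress


def is_ipv4(text):
    """Return True if `text` is four numbers from 0 to 255 separated by periods."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:  # raised for a wrong count of numbers or a value over 255
        return False


for candidate in ("192.0.2.10", "192.0.2.256", "192.0.2", "www.example.com"):
    print(candidate, "->", is_ipv4(candidate))
```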
Several freely available tools can be used to resolve a URL into its IP address. The author most frequently uses a utility called nslookup. Given the host name from a URL as input (for example, www.securitymanagement.com), the utility returns the IP address associated with that host as output. With the IP address in hand, an investigator can query IP registries for publicly available information.
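The same lookup can be scripted when many sites must be checked. The sketch below does not call nslookup; it substitutes Python's standard socket.gethostbyname, which asks the operating system's resolver for an IPv4 address. The function name resolve is hypothetical, and a live lookup requires network access.

```python
import socket
from urllib.parse import urlsplit


def resolve(url_or_host):
    """Return the IPv4 address for the host part of a URL or bare host name."""
    # urlsplit needs "//" in front of the host to recognize it.
    if "//" not in url_or_host:
        url_or_host = "//" + url_or_host
    host = urlsplit(url_or_host).hostname
    return socket.gethostbyname(host)


# An address literal passes through unchanged, with no lookup needed.
print(resolve("http://127.0.0.1/index.html"))

# A live lookup (requires network access); the result depends on when and
# where it is run:
# print(resolve("www.securitymanagement.com/library/001465.html"))
```

Because a host name may resolve to different addresses at different times, it is prudent to record the date and time of each lookup together with the result.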
Registries. As with the domain name, an IP address has registration information; however, a different group of administrative bodies is responsible for issuing IP addresses. The official registries for IP addresses can provide a geographic location for a block of IP addresses as well as contact information on the party responsible for the administration of that block.
To get this type of information, investigators can start with the Internet Assigned Numbers Authority (IANA), which works with regional registries. Registries for five geographic regions allocate IP addresses to Internet service providers (ISPs), which in turn assign them to specific users. (Link to IANA by visiting SM Online.)
These regional registries are the next stop for the investigator. A search at the American Registry for Internet Numbers (ARIN) for 22.214.171.124 (the IP address for www.securitymanagement.com) shows that the block of IP addresses from 126.96.36.199 to 188.8.131.52 is administered by a Virginia-based ISP. Contact information (including name, address, and phone number) on that provider is available in the search results.
However, information on the specific IP address in question may reside with a third party. For example, the search results mentioned earlier also show that the Virginia ISP allocated a portion of the initial block, including SM Online’s Web site, to a New Jersey web-hosting company. The contact and geographic information that the IP-address registries provide can help the investigator to pinpoint the administrator of the specific IP address and the physical location of the Web server that hosts the Web site.
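Once registry results have been collected, the nesting of blocks can be checked in code: the smallest block that contains an address usually points to the party closest to the Web server. The sketch below is illustrative only. The function names, the holder labels, and the address ranges (taken from ranges reserved for documentation) are hypothetical and do not reproduce the ARIN results described above.

```python
import ipaddress


def in_block(address, first, last):
    """True if `address` lies within the inclusive range first..last."""
    ip = ipaddress.IPv4Address(address)
    return ipaddress.IPv4Address(first) <= ip <= ipaddress.IPv4Address(last)


def narrowest_holder(address, blocks):
    """Return the holder of the smallest block containing `address`, or None.

    `blocks` maps a holder name to a (first, last) pair of addresses.
    """
    matches = []
    for holder, (first, last) in blocks.items():
        if in_block(address, first, last):
            size = int(ipaddress.IPv4Address(last)) - int(ipaddress.IPv4Address(first))
            matches.append((size, holder))
    return min(matches)[1] if matches else None


# Hypothetical registry results: an ISP's block and a portion of it that the
# ISP has reallocated to a Web-hosting company.
BLOCKS = {
    "ISP (hypothetical)": ("198.51.100.0", "198.51.100.255"),
    "Web-hosting company (hypothetical)": ("198.51.100.64", "198.51.100.127"),
}

print(narrowest_holder("198.51.100.80", BLOCKS))
print(narrowest_holder("198.51.100.200", BLOCKS))
```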
More complicated scenarios exist. For example, the IP address may reference a Web server that acts as a forwarding service to the actual Web server of the Web site, similar to how call forwarding transparently transfers a phone call to a new location. Forwarding has a legitimate purpose but may also be used by someone who is trying to obscure his or her location. Even so, the information on the forwarding service obtained from the initial IP address may be a useful step toward locating the Web server. (The investigator can hire technical experts who can trace through the forwarding server to locate the Web server that hosts the site.)
Web site. The administrator of an IP address (or a block of IP addresses) may be a useful source of information for identifying the company that hosts the Web site. (In the case of www.securitymanagement.com, contact information for the New Jersey Web-hosting company was made clear from the ARIN search.) Investigators can now look to the administrator at the Web-hosting company for information on the legal entity behind the site, or to identify the Web-site developer.
Web developers can be valuable sources of information, as they have detailed knowledge of the configuration of the Web site, which may include a contact database of clients and correspondence logs. For a Web site that is a point of sale, the configuration may include electronic-payment facilities. Accessing the traffic at the Web server may complement an investigative strategy by collecting information on volume of business (for example, by monitoring the electronic contact and payment mechanisms); however, these methods require the entity managing the Web server to be cooperative and may involve obtaining explicit legal authorization.
Complications. Locating the legal entities behind a suspicious Web site is not always a straightforward process. There is not always a direct link between domain names, URLs, IP addresses, and Web sites. And the ease with which these can be altered makes it tricky to distinguish suspicious behavior from legitimate business needs.
Flexible relationship. Despite the amount of information that can be derived from domain-name registrars and IP-address registries, it is important to remember that the relationship between domain names and IP addresses is flexible. A domain name and associated URLs may refer to different IP addresses at different points in time. This flexibility allows the legal entity to change the physical location of the Web site if it wishes to switch server-hosting companies, for example, without having to change the domain name and URLs.
At the same time, several URLs may refer to the same IP address, which allows the legal entity behind a Web site to change the domain name without changing the physical location of the Web site. And it may be that none of the publicly available information on the domain name leads directly to the legal entity that is responsible for the content and conduct of the Web site. That person or company may have chosen to remain anonymous with respect to the registration information on the domain name.
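When a list of suspect sites has been resolved, grouping the host names by IP address shows which of them share a server. The following sketch assumes the lookups have already been done and stored as a mapping from host name to address; the function name and all of the host names and addresses are hypothetical placeholders.

```python
from collections import defaultdict


def shared_addresses(resolved):
    """Group host names by IP address and keep addresses used by more than one host.

    `resolved` maps each host name to the IP address it resolved to.
    """
    groups = defaultdict(list)
    for host, ip in resolved.items():
        groups[ip].append(host)
    return {ip: sorted(hosts) for ip, hosts in groups.items() if len(hosts) > 1}


# Hypothetical lookup results.
RESOLVED = {
    "www.example.com": "192.0.2.10",
    "www.example.net": "192.0.2.10",
    "www.example.org": "192.0.2.44",
}

for ip, hosts in shared_addresses(RESOLVED).items():
    print(ip, "is shared by", ", ".join(hosts))
```

A shared address is not suspicious in itself, since hosting companies routinely serve many unrelated sites from one server; it is simply a lead that can tie together sites registered under different names.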
Suspicious behavior. Changing domain names, URLs, or IP addresses is not unusual in the course of maintaining a business on the Web. Companies change domain names and URLs as part of marketing strategies, and businesses may switch Web-hosting services for financial reasons. However, investigating the domain names, URLs, and IP addresses is a useful first step in gaining information on Web sites that are in question and identifying those Web sites that may be engaged in suspect practices.
For example, in the case for which the author consulted, the author monitored the IP addresses during the investigation. Over time, monitoring this information provided some useful insights into the management of the Web site and helped reveal suspicious behavior. In one case, the listed contact information moved from one continent to another over the course of a week. This can be an indication of a site attempting to remain out of the reach of law enforcement, and the client is currently reviewing the information regarding the changing of IP addresses and the companies to which the IP addresses had been allocated.
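Monitoring of this kind amounts to recording each lookup with its date and then comparing successive results for the same host. The sketch below shows one way to do that comparison; it is not the tooling used in the investigation. The function name, host names, dates, and addresses are hypothetical, and dates are written as year-month-day strings so that they sort in chronological order.

```python
def ip_changes(observations):
    """Report every point at which a host's IP address differs from its previous one.

    `observations` is an iterable of (date, host, ip) tuples in any order.
    Returns a list of (host, date, old_ip, new_ip) tuples in date order.
    """
    last_seen = {}
    changes = []
    for when, host, ip in sorted(observations):
        previous = last_seen.get(host)
        if previous is not None and previous != ip:
            changes.append((host, when, previous, ip))
        last_seen[host] = ip
    return changes


# Hypothetical log of lookups for two hosts.
LOG = [
    ("2004-01-05", "www.example.com", "192.0.2.10"),
    ("2004-01-06", "www.example.com", "192.0.2.10"),
    ("2004-01-12", "www.example.com", "203.0.113.7"),
    ("2004-01-05", "www.example.net", "198.51.100.4"),
    ("2004-01-12", "www.example.net", "198.51.100.4"),
]

for host, when, old, new in ip_changes(LOG):
    print(f"{when}: {host} moved from {old} to {new}")
```

Each reported change is a prompt to repeat the registry search for the new address, since the block's administrator and geographic location may differ from those of the old one.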
In addition to monitoring the URLs and associated Web sites for which the IP addresses had changed over time, the author also suggested that the client trace the Web-hosting services of those Web sites that had suspect domain-name registrants and had changed IP addresses. Once these Web-hosting companies were identified, the investigative firm could begin to trace the payment-transaction activity and correspondence traffic at the suspect Web sites (with the cooperation of the hosting company or with legal authorization).
Those seeking to exploit the lawlessness of cyberspace will endeavor to take advantage of the anonymity they take for granted on the Internet. A critical first step in pulling apart this web of deceit lies in investigating domain names, URLs, and IP addresses to gather information on Web sites that may be engaged in suspect practices. Knowledge of how Web sites are constructed and put on the Internet gives investigators an effective tool in the fight against online crime.
Erik Nemeth, Ph.D., has a background in software development and Internet technology. He provides expertise in Internet technology and computing systems for investigations of theft and fraud.