Author: Mark Dixon
Monday, August 3, 2015
Back in May, I wrote a couple of posts about illicit Internet bots.
I recently read a short but interesting report on “scraping,” the process of using bots and similar tools to steal information: The Scraping Threat Report 2015, published by ScrapeSentry. The report includes this definition:
Scraping (also known as web scraping, screen scraping or data scraping) is when large amounts of data from a web site is copied manually or with a script or program. Malicious scraping is the systematic theft of intellectual property in the form of data accessible on a web site.
This theft of intellectual property can be very damaging to businesses. If, for example, a scraper can download airline fares from a legitimate site through illicit means, the stolen data can be exploited to fuel unfair business practices.
Some interesting statistics:
- 17 % increase in scraping attacks in 2014
- 22 % of all site visitors are considered to be scrapers
- 49 % of the total scraping traffic originates from the US, but traffic originating in China has the worst ratio of scraper traffic to total traffic.
- China accounts for 1.40 % of the total traffic but 17.13 % of the scraper traffic, roughly 12 times its share of total traffic.
- Companies in the travel industry remain top targets for scrapers, closely followed by Online Directories and Online Classifieds.
Three types of scrapers:
- Amateur Scrapers: These scrapers utilize a small number of IP addresses and user agent strings, and are blatantly visible in traffic logs.
- Professional Scrapers: These scrapers are much more elusive, and usually redistribute what they scrape to other companies for a profit.
- Advanced Scrapers: These scrapers are extremely dedicated and have a wide range of IP addresses. They change their browsing tactics and user-agents moments after a block.
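To illustrate why amateur scrapers are so visible in traffic logs, here is a minimal Python sketch that counts requests per IP address and user agent pair and flags any pair responsible for a large share of the traffic. It is not taken from the report: the function name `flag_heavy_clients`, the `min_share` threshold, the sample log lines, and the assumption that the server writes Combined Log Format are all hypothetical choices made for the example.

```python
import re
from collections import Counter

# Assumed log layout (Combined Log Format):
# ip ident user [time] "request" status size "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def flag_heavy_clients(lines, min_share=0.5):
    """Return (ip, user_agent) pairs that sent at least min_share of parsed requests.

    min_share is an illustrative threshold, not a recommended value.
    """
    counts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:  # silently skip lines that do not fit the assumed format
            counts[(match.group("ip"), match.group("agent"))] += 1

    total = sum(counts.values())
    if total == 0:
        return []
    return [client for client, n in counts.most_common() if n / total >= min_share]


# Made-up sample traffic using documentation-only IP ranges.
SAMPLE_LOG = [
    '203.0.113.7 - - [03/Aug/2015:10:00:00 +0000] "GET /fares?page=1 HTTP/1.1" 200 512 "-" "python-requests/2.7.0"',
    '203.0.113.7 - - [03/Aug/2015:10:00:01 +0000] "GET /fares?page=2 HTTP/1.1" 200 512 "-" "python-requests/2.7.0"',
    '198.51.100.20 - - [03/Aug/2015:10:00:02 +0000] "GET / HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (Windows NT 6.1) Gecko/20100101 Firefox/39.0"',
    '203.0.113.7 - - [03/Aug/2015:10:00:02 +0000] "GET /fares?page=3 HTTP/1.1" 200 512 "-" "python-requests/2.7.0"',
    '203.0.113.7 - - [03/Aug/2015:10:00:03 +0000] "GET /fares?page=4 HTTP/1.1" 200 512 "-" "python-requests/2.7.0"',
]

if __name__ == "__main__":
    for ip, agent in flag_heavy_clients(SAMPLE_LOG):
        print(f"possible scraper: {ip} ({agent})")
```

A simple count like this only catches the amateur case. Scrapers that rotate through many IP addresses and user agents, as described for the advanced class, spread their requests too thinly for a per-client threshold to notice.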
In short, if you are an Internet user, these scrapers are generating so much traffic that they are undoubtedly impacting the performance of websites you visit. If you are a website operator and your website contains any type of information that could be exploited for nefarious purposes, scrapers have probably already penetrated your defenses or at least have you in their bomb sights.