Exploring the science and magic of Identity and Access Management
Tuesday, October 20, 2020

The Scraping Threat Report 2015

Information Security
Author: Mark Dixon
Monday, August 3, 2015
5:33 pm

Back in May, I wrote a couple of posts about illicit Internet bots.

I recently read a short but interesting report on “Scraping,” the process of using bots and similar tools to steal information: The Scraping Threat Report 2015, published by ScrapeSentry. The report includes this definition:

Scraping (also known as web scraping, screen scraping or data scraping) is when large amounts of data from a web site is copied manually or with a script or program. Malicious scraping is the systematic theft of intellectual property in the form of data accessible on a web site.

This theft of intellectual property can be very damaging to businesses. If, for example, a scraper can download airline fares from a legitimate site through illicit means, the stolen data can be exploited to fuel unfair business practices.

Some interesting statistics:

  • 17 % increase in scraping attacks in 2014
  • 22 % of all site visitors are considered to be scrapers
  • 49 % of the total scraping traffic originates from the US, but the ratio of total traffic to scraper traffic is worst from traffic originating in China.
  • China accounts for 1.40 % of the total traffic but 17.13 % of the scraper traffic.
  • Companies in the travel industry remain top targets for scrapers, closely followed by Online Directories and Online Classifieds.
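
The China figures above can be turned into a single over-representation ratio: the share of scraper traffic divided by the share of total traffic. The sketch below is only an illustration of that arithmetic; the function name `overrepresentation` is my own and does not come from the report.

```python
def overrepresentation(scraper_share_pct: float, total_share_pct: float) -> float:
    """Return how many times larger a source's share of scraper traffic
    is than its share of total traffic (both given as percentages)."""
    if total_share_pct <= 0:
        raise ValueError("total_share_pct must be positive")
    return scraper_share_pct / total_share_pct


# Figures quoted from the report: China is 1.40 % of total traffic
# but 17.13 % of scraper traffic.
china_ratio = overrepresentation(17.13, 1.40)
print(f"China is over-represented by a factor of {china_ratio:.1f}")
```

In other words, traffic from China is roughly twelve times more likely to be scraper traffic than its share of overall traffic would suggest.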

Scrapers are generally categorized into the following areas:
  • Amateur Scrapers: These scrapers utilize a small number of IP addresses and user agent strings, and are blatantly visible in traffic logs.
  • Professional Scrapers: These scrapers are much more elusive, and usually redistribute what they scrape to other companies for a profit.
  • Advanced Scrapers: These scrapers are extremely dedicated and have a wide range of IP addresses. They change their browsing tactics and user-agents moments after a block.
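
To show why the amateur category is "blatantly visible" in traffic logs, here is a minimal sketch that counts requests per IP address and user agent pair and flags the heavy hitters. Everything in it is an assumption for illustration: the `Request` record, the `flag_heavy_clients` function, the threshold, and the sample data (which uses reserved documentation IP addresses). It is not the method used by ScrapeSentry, and it would not catch the professional or advanced scrapers described above, who spread their requests across many addresses and user agents.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Request(NamedTuple):
    """One parsed log entry (hypothetical, simplified shape)."""
    ip: str
    user_agent: str


def flag_heavy_clients(requests: Iterable[Request], threshold: int):
    """Return ((ip, user_agent), count) pairs seen at least `threshold` times,
    ordered from most to least requests."""
    counts = Counter((r.ip, r.user_agent) for r in requests)
    flagged = [(client, n) for client, n in counts.items() if n >= threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Invented sample log: one client hammers the site, the others browse normally.
sample_log = (
    [Request("192.0.2.10", "curl/7.40.0")] * 5
    + [Request("198.51.100.7", "Mozilla/5.0")]
    + [Request("203.0.113.5", "Mozilla/5.0")] * 2
)

flagged = flag_heavy_clients(sample_log, threshold=3)
for (ip, user_agent), count in flagged:
    print(f"{ip} [{user_agent}] made {count} requests")
```

A real deployment would need a time window and a threshold tuned to the site's normal traffic; the fixed value here only keeps the example short.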

In short, if you are an Internet user, these scrapers are generating so much traffic that they are undoubtedly impacting the performance of websites you visit. If you are a website operator and your website contains any type of information that could be exploited for nefarious purposes, scrapers have probably already penetrated your defenses or at least have you in their bomb sights.


Bots Generate a Majority of Internet Traffic

Information Security
Author: Mark Dixon
Friday, May 22, 2015
11:16 am

According to the 2015 Bad Bot Landscape report, published by Distil Networks, only 40% of Internet traffic is generated by humans! Good bots (e.g. Googlebot and Bingbot for search engines) account for 36% of traffic, while bad bots account for 23%.

Bad bots continue to place a huge tax on IT security and web infrastructure teams across the globe. The variety, volume and sophistication of today’s bots wreak havoc across online operations big and small. They’re the key culprits behind web scraping, brute force attacks, competitive data mining, brownouts, account hijacking, unauthorized vulnerability scans, spam, man-in-the-middle attacks, and click fraud.

These are just averages. It’s much worse for some big players.

Bad bots made up 78% of Amazon’s 2014 traffic, not a huge difference from 2013. VerizonBusiness really cleaned up its act, cutting its bad bot traffic by 54% in 2014.

It was surprising to me that the US is the largest source for bad bot traffic.

The United States, with thousands of cheap hosts, dominates the rankings in bad bot origination. Taken in isolation, absolute bad bot volume data can be somewhat misleading. Measuring bad bots per online user yields a country’s “Bad Bot GDP.”

Using this latter “bad bots per online user” statistic, the nations of Singapore, Israel, Slovenia and Maldives are the biggest culprits.
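
To see how a ranking by absolute volume can differ from a ranking by "bad bots per online user," here is a small sketch. It assumes the metric is simply bad bot volume divided by the number of online users, which is my reading of the report rather than its stated formula. The function name, the two placeholder countries, and all of the numbers are invented for illustration and are not taken from the report.

```python
def bad_bots_per_user(bad_bot_requests: int, online_users: int) -> float:
    """Normalize bad bot volume by the size of the online population."""
    if online_users <= 0:
        raise ValueError("online_users must be positive")
    return bad_bot_requests / online_users


# Invented figures for two placeholder countries (not from the report).
countries = {
    "Country A": {"bad_bot_requests": 9_000_000, "online_users": 300_000_000},
    "Country B": {"bad_bot_requests": 500_000, "online_users": 4_000_000},
}

# Rank once by raw volume and once by the per-user rate.
top_by_volume = max(countries, key=lambda name: countries[name]["bad_bot_requests"])
top_by_rate = max(countries, key=lambda name: bad_bots_per_user(**countries[name]))

print("Largest absolute volume:", top_by_volume)
print("Largest per-user rate:", top_by_rate)
```

With these made-up numbers, the large country leads on raw volume while the small one leads once volume is divided by its online population, which is the same effect that pushes small nations to the top of the per-user ranking.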

The report contains more great information for those who are interested in bots. Enjoy!

Copyright © 2005-2016, Mark G. Dixon. All Rights Reserved.