There are several reasons why scrapers create bad bots. One of the key reasons is to steal content, listings, and prices from target websites. Competitors and third-party scrapers can use this data to undermine a rival's competitive advantage in the market.
Many scraping and data-extraction tools and services are available in the market; popular ones include Import.io, Scrapy, Mozenda, Selenium, and wget. Scraped data can be used in several ways. It can be structured for seemingly useful (and harmless) analytical tasks, like comparing weather information across countries over a period of time, or it can serve malicious intents, such as comparing prices from competitor websites and using that information to build a pricing strategy that undercuts the competition. Competitors can even hire agencies that specialize in malicious bots and data scraping to steal proprietary information and gain an unfair advantage.
An eCommerce site may have invested millions of dollars in various resources: in stocking products, in developing the web and mobile shopping portals, and, most importantly, in its pricing strategy. A competitor can send bots to scrape the entire product catalog along with all the pricing information, and list the same items on their own website at reduced prices. This seriously erodes the targeted website's competitive edge, as the competitor attracts both its existing and prospective users by slyly listing products at lower prices.
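To illustrate how little effort this takes, here is a minimal sketch of the parsing step of a price-scraping bot, using only Python's standard library. The page markup (product cards with `name` and `price` classes) is a hypothetical example, not any real site's structure:

```python
from html.parser import HTMLParser

# Hypothetical product-page markup a scraping bot might target.
SAMPLE_PAGE = """
<div class="product"><span class="name">Widget A</span><span class="price">$19.99</span></div>
<div class="product"><span class="name">Widget B</span><span class="price">$24.50</span></div>
"""

class PriceScraper(HTMLParser):
    """Collects (name, price) pairs from product listings."""
    def __init__(self):
        super().__init__()
        self._field = None    # which field we are currently inside, if any
        self._current = {}    # partially assembled product
        self.products = []    # completed (name, price) pairs

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        if self._field:
            self._current[self._field] = data.strip()
            self._field = None
            if len(self._current) == 2:
                self.products.append(
                    (self._current["name"], self._current["price"]))
                self._current = {}

scraper = PriceScraper()
scraper.feed(SAMPLE_PAGE)
print(scraper.products)
# → [('Widget A', '$19.99'), ('Widget B', '$24.50')]
```

A real bot would simply fetch thousands of such pages and feed each one to a parser like this, which is exactly why pricing data is so easy to siphon off at scale.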
A competitor can hire a scraper to send thousands of bots to the target website. These bots can go about adding products to carts, only to abandon them later. Genuine users visiting the website will be frustrated to see 'out of stock' messages on items they wanted to buy, and will simply go elsewhere.
Bots can also severely overload server and bandwidth resources. Genuine users visiting websites like news portals, product review sites, or eCommerce stores get annoyed by slow page loads and simply exit the site.
When unique or proprietary content posted on a website is scraped and immediately republished elsewhere, the original source can get outranked on search engines. This decreases the original website's visibility in search results, leading to fewer users and subscribers, and even a loss in advertising revenue.
In most cases, the damage is done before the website even notices the scraping problem, let alone takes action against the scrapers. An automated bot prevention solution is the best way to detect and act on scrapers in near real time, before it's too late.
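One common building block of such automated detection is rate-based flagging: a client that fires far more requests per second than any human could is likely a bot. The sketch below is an illustrative assumption, not any specific product's method, and the threshold and window values are made up; real solutions combine many more signals (headers, behavior, fingerprints):

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Flags clients exceeding max_requests within a sliding time window.

    Thresholds here are illustrative only; production bot-prevention
    systems use request rate as just one of many detection signals.
    """
    def __init__(self, max_requests=20, window_seconds=10):
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits = defaultdict(deque)  # client_id -> request timestamps

    def is_bot(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[client_id]
        q.append(now)
        # Drop timestamps that have slid out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        return len(q) > self.max_requests

limiter = RateLimiter(max_requests=5, window_seconds=1)
# Simulate a burst of 10 requests within one second from one client.
verdicts = [limiter.is_bot("203.0.113.7", now=0.1 * i) for i in range(10)]
print(verdicts)  # → first 5 requests pass, the remaining 5 are flagged
```

Checks like this can run in the request path, so suspicious clients are challenged or blocked within seconds of their first burst, rather than after the catalog has already been copied.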