You’ve worked hard to create the content on your website. Unfortunately, not everyone wants to work as hard on creating content for their site, so they steal content from other people’s sites. This is called site scraping. It’s a form of plagiarism that's a constant problem for website owners.
If you’re wondering what you can do to battle these content thieves and plagiarizers, read on. We’ve got helpful information for you.
Why Content Scraping Is a Big Problem
Content thieves can do real damage to your website, its rankings, its reputation and its popularity. Some specific ways content theft can hurt your site include:
- Links from scraping sites can hurt your site’s search engine rankings. Google constantly adjusts its ranking methods and has a history of getting stricter about which types of links to your content can hurt your rankings.
- The thieves’ scraped content might outrank your content on a search engine.
- Content theft could damage your ability to license your content to other online or physical publications.
As you can tell, this is a problem that can have serious repercussions. You need to take proactive steps to combat the problem.
5 Steps to Combat Content Scraping
Here are five key steps (and tips) to tackle website scraping and protect your website from content thieves who steal what you have worked hard to create:
1. Find the Sites Scraping Your Content
It’s easy to find websites that are using your content without permission. These sites typically just copy and paste your text and publish it as their own, so their copies closely match your originals.
If you want to quickly check if a website is publishing your content and claiming it as their own, use a plagiarism checker like Copyscape, Plagium, Dupli Checker, or Plagiarisma. All of these allow you to either paste a link to a specific web page or text from one of your articles and then check the web to determine if someone is using your content as their own.
Many writers use the software application Grammarly to check their writing for spelling and grammar mistakes. The app also checks your content for plagiarism to ensure you haven’t inadvertently written text identical to any already published on the web.
You can also use this handy feature to check for scraping of your content by copying and pasting your previously published articles into Grammarly. If you see a large amount of your article matching another website, there’s a good chance they’ve stolen your content.
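If you already have the text of a suspect page, a rough overlap check can be sketched in a few lines of Python. This is only an illustration, not how Copyscape or Grammarly work internally: the function names and the five-word window are assumptions chosen for the example.

```python
import re

def shingles(text, size=5):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def copied_fraction(original, suspect, size=5):
    """Fraction of the original's shingles that also appear in the suspect text."""
    own = shingles(original, size)
    if not own:
        return 0.0
    return len(own & shingles(suspect, size)) / len(own)

if __name__ == "__main__":
    mine = "You have worked hard to create the content on your website"
    theirs = "Welcome to our blog. " + mine + ". Thanks for reading."
    print(f"{copied_fraction(mine, theirs):.0%} of the article appears on the other page")
```

A high fraction only tells you the texts overlap heavily. You still need to confirm which page was published first before accusing anyone.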
2. Identify the Content Thieves
Okay, you’ve found a site that is scraping your content. What do you do next? First, check the website for a “Contact Us” page or some other type of contact information, such as an email address or a name. Sites that steal content don’t often offer up contact info, but you might get lucky.
If there is no obvious contact info on the website, check the domain’s WHOIS records. WHOIS will sometimes tell you who owns a domain. Ideally it would always display the owner’s name, but many owners use the private registration feature offered by their domain registrar or web host to hide this information.
In this case, if you find that the website in question has its domain owner’s name and info hidden, you’ll still usually be able to find out the info about the registrar company, and possibly the web host’s nameservers. If the WHOIS results include the nameserver information, visit WhoIsHostingThis.com to help identify the hosting company.
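WHOIS output is plain text, so the registrar and nameserver lines can be picked out with a short script. The sketch below assumes the common `Registrar:` and `Name Server:` field labels. Real WHOIS servers vary in format, and the sample record here is made up for illustration.

```python
def parse_whois(raw):
    """Pull the registrar and nameserver lines out of raw WHOIS text."""
    info = {"registrar": None, "nameservers": []}
    for line in raw.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Field: value" line
        key, value = key.strip().lower(), value.strip()
        if not value:
            continue
        if key == "registrar" and info["registrar"] is None:
            info["registrar"] = value
        elif key == "name server":
            info["nameservers"].append(value.lower())
    return info

# Hypothetical WHOIS response with the owner's details hidden.
SAMPLE = """\
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Registrant Name: REDACTED FOR PRIVACY
Name Server: NS1.EXAMPLE-HOST.NET
Name Server: NS2.EXAMPLE-HOST.NET
"""

if __name__ == "__main__":
    print(parse_whois(SAMPLE))
```

The nameserver hostnames often hint at the hosting company, which is the party to contact if the site owner cannot be reached.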
3. Contact the Content Thieves, Their Web Host, and Popular Search Engines
Once you’ve found contact info, send a takedown notice to the owner of the website via email. The first email should be a polite, yet firm message requesting that they take down the scraped content. Give the scraper a few business days to respond before sending a second email.
I have found that an initial email is often all it takes to have the issue resolved, because sometimes the offender has no idea that what they’ve done is wrong. If you don’t hear back from them, escalate with a more strongly worded second email that explains again what they need to do and makes clear that, if needed, you’ll take additional action to force them to take down the content.
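If you deal with scrapers regularly, it can help to keep the first notice as a reusable template. The following is a minimal sketch. The wording, the field names, and the five-day default are assumptions for illustration, not legal language.

```python
from string import Template

# Hypothetical wording for a polite but firm first notice.
NOTICE = Template("""\
Subject: Request to remove copied content

Hello,

I am the author of the article at $original_url.
The page at $copy_url reproduces it without my permission.
Please remove the copied content within $days business days.

Thank you,
$author
""")

def build_notice(original_url, copy_url, author, days=5):
    """Fill in the template for one copied article."""
    return NOTICE.substitute(
        original_url=original_url, copy_url=copy_url, author=author, days=days
    )

if __name__ == "__main__":
    print(build_notice("https://example.com/my-post",
                       "https://scraper.example.net/stolen-post",
                       "Jane Author"))
```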
If your emails get no response, visit the web host’s site. You’ll often find an email address or a contact form specifically for reporting abuse by the host’s customers. Fill out the form, or send a takedown request via email. Many web hosts have their own Terms of Service that customers agree to, and there’s an excellent chance that scraping content is a violation of those terms.
Last, but certainly not least, contact all of the major search engines, including Google, Bing, DuckDuckGo, and any others you can think of. File a DMCA request with each of the search engines, which should get the offending material removed from the engines’ search results.
4. Hit Content Thieves Where It Hurts the Most—Their Purse!
There is a good chance the offending site has ads on it. So, find out which ad networks the scraper is using and report the website’s offenses to each one. A reputable ad network doesn’t want its ads running on a site that steals content.
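One way to spot the ad networks is to look at the script and image sources in the scraper’s page HTML. The sketch below checks them against a short list of well-known ad-serving domains. The list is illustrative and far from complete, and the function name is an assumption.

```python
import re
from urllib.parse import urlparse

# Small, illustrative sample of ad-serving domains; extend as needed.
AD_DOMAINS = {
    "googlesyndication.com": "Google AdSense",
    "doubleclick.net": "Google DoubleClick",
    "amazon-adsystem.com": "Amazon Ads",
    "taboola.com": "Taboola",
    "outbrain.com": "Outbrain",
}

def find_ad_networks(html):
    """Return the names of known ad networks referenced by src attributes."""
    found = set()
    for url in re.findall(r'src=["\']([^"\']+)["\']', html, flags=re.IGNORECASE):
        host = (urlparse(url).hostname or "").lower()
        for domain, network in AD_DOMAINS.items():
            if host == domain or host.endswith("." + domain):
                found.add(network)
    return sorted(found)

if __name__ == "__main__":
    page = ('<script src="https://pagead2.googlesyndication.com/pagead/js/'
            'adsbygoogle.js"></script><img src="/logo.png">')
    print(find_ad_networks(page))
```

Ads loaded dynamically by JavaScript will not show up in the raw HTML, so treat an empty result as inconclusive.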
If your content is video or audio, such as is found on YouTube, and you’ve found that someone else is claiming the content as their own, contact the streaming site and request that it either be taken down or that the ad revenue generated by the content be redirected to you.
5. Prevent Image Hotlinking
If a scraper is hotlinking to your articles’ images (which means their pages load the images directly from your website instead of hosting their own copies), they’re stealing your bandwidth in addition to your content.
To prevent this, redirect hotlink requests from outside websites to an image that clearly brands the offending site as a thief, or that offers your site’s URL as the original source of the image.
Ask your web host how to set this up, as different web hosts and content management systems handle it in different ways.
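Whatever the platform, the underlying check is usually the same: look at the request’s Referer header and compare it with your own domain. Here is a minimal sketch of that logic. The function names and placeholder image path are hypothetical, and in practice the rule normally lives in your server or CDN configuration rather than in application code.

```python
from urllib.parse import urlparse

def is_hotlink(referer, own_hosts):
    """True if the Referer header points to a site that is not one of ours."""
    if not referer:
        # Direct visits and some privacy tools send no Referer; allow them.
        return False
    host = (urlparse(referer).hostname or "").lower()
    return not any(host == h or host.endswith("." + h) for h in own_hosts)

def image_to_serve(requested_path, referer, own_hosts,
                   placeholder="/images/hotlink-notice.png"):
    """Swap in a branded placeholder image for requests from other sites."""
    return placeholder if is_hotlink(referer, own_hosts) else requested_path

if __name__ == "__main__":
    mine = ["example.com"]
    print(image_to_serve("/images/chart.png", "https://www.example.com/post", mine))
    print(image_to_serve("/images/chart.png", "https://scraper.example.net/post", mine))
```

Remember to allow the sites you do want embedding your images, such as search engines or social networks, by adding their hosts to the list.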
By following these proven tips, you can foil attempts by the owners of other websites to claim your hard work as their own and to deprive you of your hard-earned credit, as well as any cash that is rightfully yours. Stay vigilant and stay strong; you can beat content thieves!