SEO Tips and Tricks 1

Duplicate Content

This is the first blog entry in the SEO Tips and Tricks series, and today we will discuss duplicate content. Duplicate content refers to large matching blocks of content within the same website or across different websites. Search engines try to keep their results relevant, so they don’t want duplicate content in their databases. There’s no point in storing and showing pages with the same text: it doesn’t help the user and it occupies disk space. Webmasters and SEOs should minimize similar content in order to avoid search engine penalties.

The “penalties” for having duplicate content do not hurt the website as a whole, but they do hurt your position in the SERPs: while the website itself will not be penalized or de-indexed, the pages with duplicate content will be moved to the supplemental index and will not be shown in the search engine results pages. Exactly how the search engines’ duplicate content filters work is not known, but it is assumed that the crawling date, the authority of the website and the links pointing from one website to the other are taken into consideration when determining which website has the “original” content.

Tips and Tricks

Duplicate content within the website.
There are some steps you can take to prevent duplicate content from appearing on your website:

  • Duplicate pages. You decided to create a printer-friendly version of each page so your visitors can print the information in an organized fashion. Great added value! However, the printer-friendly version is considered duplicate content. There are two ways to avoid that:
    1. Add a ‘noindex’ meta tag to your printer-friendly versions of the pages.

      <meta name="robots" content="noindex, nofollow" />

    2. Put all the printer-friendly versions in a specific folder and use robots.txt to disallow search engines’ bots from crawling that folder.
  • Minimize boilerplate repetition. Provide a link to a page that has the text rather than repeating it on every page. However, sometimes you do want to showcase the text on each page (e.g. testimonials), so here’s a nice trick: take a “picture” of the text and display it on each of the pages as an image (.gif, .jpg, etc.).
  • Duplicate URLs. Usually, a website can be reached through both the www version and the non-www version of its URL. For the search engines, the two URLs are “different”, so that’s duplicate content. That’s also bad SEO-wise: you are dividing the incoming links between the two versions. How do you fix it?
    1. Use a .htaccess 301 Permanent Redirect to send all the non-www traffic to the www version. Open the .htaccess file and add the following lines:

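      # Assumes Apache with mod_rewrite enabled; mysite.com is a placeholder domain.
      Options +FollowSymLinks
      RewriteEngine on
      # Match the bare (non-www) host only, case-insensitively.
      RewriteCond %{HTTP_HOST} ^mysite\.com$ [NC]
      # Permanently redirect to the same path on the www host.
      RewriteRule ^(.*)$ http://www.mysite.com/$1 [L,R=301]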

    2. If you have a Google Webmaster Tools account, you can set a preferred domain for indexing (with or without www).
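
For the printer-friendly pages mentioned earlier, here is a minimal Python sketch of the robots.txt option. It assumes, purely for illustration, that every printer-friendly page lives under a /print/ folder (that folder name and the sample URLs are not from any real site), and it uses the standard library's robots.txt parser to check that the rule blocks those pages while leaving the normal ones crawlable.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: assumes all printer-friendly pages live under /print/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URLs: one printer-friendly page and its regular counterpart.
    for url in ("http://www.mysite.com/print/about.html",
                "http://www.mysite.com/about.html"):
        status = "crawlable" if is_crawlable(ROBOTS_TXT, url) else "blocked"
        print(url, "->", status)
```

Running a check like this before uploading the file is a cheap way to make sure a typo in the folder name doesn't block the wrong part of the site.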



Duplicate content across websites.
There are two external SEO techniques used in website promotion that create duplicate content: article publishing and content syndication (e.g. RSS feeds on blogs). While these are useful in promoting your website, you must be careful with the duplicate content issues that may arise:

  • Make sure, if possible, that each site that displays your articles or on which your content is syndicated includes a link back to your original article or blog entry.
  • Create short versions or modified versions of your articles (around 60-70% modified text) and use them for article publishing instead of the original articles.
  • Don’t copy text from others (that’s an easy one!) :)
  • Use duplicate detection tools to find out who’s copying your content, and if they are not allowed to do it, file a complaint with the search engines. For Google, file a DMCA request.
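
To get a rough feel for whether a rewritten article reaches the "60-70% modified" mark, you can compare word sequences between the two versions. The Python sketch below is only one possible yardstick: it counts three-word "shingles" in the rewrite that do not appear in the original. The function names and the shingle size are assumptions made for this example, and nothing here reflects how search engines actually measure similarity.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def modified_fraction(original: str, rewrite: str, size: int = 3) -> float:
    """Fraction of the rewrite's shingles that do not occur in the original."""
    new = shingles(rewrite, size)
    if not new:
        return 0.0
    old = shingles(original, size)
    return len(new - old) / len(new)


if __name__ == "__main__":
    original = "Duplicate content refers to large matching blocks of content."
    rewrite = "Duplicate content means big chunks of text that match elsewhere."
    print(f"modified: {modified_fraction(original, rewrite):.0%}")
```

A value near 0 means the rewrite is almost a copy; a value near 1 means almost every three-word run is new.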

Duplicate Content Tools

  1. CopyScape.com is a well-known website that offers services to detect online plagiarism and protect your content against it. Go to www.copyscape.com and enter the URL of your website or web page. The script will look for and retrieve similar pages.

  2. DuplicateContent.net has another useful tool that compares two web pages and calculates how similar their HTML tags and text are. Enter the two page URLs into its duplicate content detector to run the comparison.
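
If you want to experiment with the same idea offline, here is a rough Python sketch that scores two pages on tag similarity and text similarity separately. It is not the algorithm used by DuplicateContent.net (which isn't documented here); it simply compares the order of opening tags and the order of words using the standard library, and the function and class names are made up for this example.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser


class _PageOutline(HTMLParser):
    """Collects a page's opening tags and its text words, in document order.

    Script and style contents are not filtered out in this sketch.
    """

    def __init__(self):
        super().__init__()
        self.tags = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.words.extend(data.split())


def page_similarity(html_a: str, html_b: str) -> dict:
    """Return similarity ratios (0.0 to 1.0) for the tags and text of two pages."""
    outlines = []
    for html in (html_a, html_b):
        outline = _PageOutline()
        outline.feed(html)
        outline.close()
        outlines.append(outline)
    a, b = outlines
    return {
        "tags": SequenceMatcher(None, a.tags, b.tags).ratio(),
        "text": SequenceMatcher(None, a.words, b.words).ratio(),
    }


if __name__ == "__main__":
    page_1 = "<html><body><h1>Title</h1><p>Same text here</p></body></html>"
    page_2 = "<html><body><h1>Title</h1><p>Other words entirely</p></body></html>"
    print(page_similarity(page_1, page_2))
```

Keeping the two scores separate is useful: two pages built from the same template will score high on tags even when the text is completely different.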




