How to Manage Duplicate Content in Your SEO

May 5th, 2012 by TOPer


This article will guide you through the main reasons why duplicate content is bad for your site, how to avoid it, and most importantly, how to fix it. The first thing to understand is that the duplicate content that counts against you is your own. What other sites do with your content is largely out of your control, just like who links to you, for the most part. Keep that in mind.

How to determine if you have duplicate content

When your content is duplicated, you risk fragmentation of your rankings, anchor text dilution, and other negative effects. But how do you tell in the first place? Use the "value" factor. Ask yourself: Is there additional value in this content, or am I just reproducing it for no reason? Is this version of the page essentially a new one, or just a slight rewrite of the previous one? Make sure you are adding unique value. Am I sending the engines a bad signal? They can identify duplicate content candidates from numerous signals; much like ranking, the most popular version is identified and the rest are marked as duplicates.

How to manage duplicate content versions

Every site can have potential versions of duplicate content, and that is fine. The key is how you manage them. There are legitimate reasons to duplicate content, including:

1) Alternate document formats, such as content that is offered as HTML, Word, and PDF.
2) Legitimate content syndication, such as the use of RSS feeds.
3) The use of common code: CSS, JavaScript, or other boilerplate elements.

In the first case, we may have alternative ways to deliver our content. We need to choose a default format and disallow the engines from the other versions while still allowing users access. We can do this by adding the proper directives to the robots.txt file and making sure we exclude any URLs to these versions from our sitemaps as well (a minimal robots.txt sketch appears at the end of this post). While we are on the subject of URLs, you should also use the nofollow attribute on your own links to these duplicate versions, because other people can still link to them (see the markup example below).

As for the second case, if you have a page that consists of a rendering of an RSS feed from another site, and ten other sites also have pages based on that feed, then this can look like duplicate content to the search engines. The bottom line is that you are probably not at risk for duplication unless a large portion of your site is based on such feeds.

Lastly, you should keep any common code from getting indexed. With your CSS as an external file, place it in a separate folder and exclude that folder from being crawled in your robots.txt, and do the same for your JavaScript or any other common external code.

Additional notes on duplicate content

Any URL has the potential to be counted by search engines. Two URLs referring to the same content will look like duplicates unless you manage them properly. Managing them means, once again, choosing a default URL and 301 redirecting the other ones to it (a redirect sketch is included below).
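As referenced above, here is a minimal robots.txt sketch for keeping crawlers out of the alternate document formats and the folders that hold common code. The folder names are hypothetical; substitute your own paths.

    User-agent: *
    # Block the alternate (non-default) document formats
    Disallow: /downloads/pdf/
    Disallow: /downloads/word/
    # Block the folders that hold shared boilerplate code
    Disallow: /css/
    Disallow: /js/

Keep in mind that robots.txt only controls crawling; a URL that other sites link to heavily can still end up indexed, which is why the internal-link hygiene shown next still matters.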
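Here is how an internal link to one of those duplicate versions might look with the nofollow attribute applied. The file path and anchor text are placeholders.

    <!-- Link to the PDF version of a page whose default format is HTML -->
    <a href="/downloads/pdf/whitepaper.pdf" rel="nofollow">Download this article as a PDF</a>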
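And finally, a sketch of the 301 redirect approach for collapsing duplicate URLs onto the chosen default, written for an Apache .htaccess file; the URL patterns are hypothetical, and other servers have their own equivalents.

    # Collapse a duplicate URL onto the chosen default with a permanent redirect
    RewriteEngine On
    # Example: send the "print" version of every article to the default article URL
    RewriteRule ^articles/print/(.*)$ /articles/$1 [R=301,L]

What matters is that the duplicate URL answers with a 301 (permanent) status, so the engines consolidate their signals onto the default URL instead of splitting them.

By Utah SEO Jose Nunez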