There is a big discussion going on about Google’s duplicate content penalty and how supplemental results affect your site. I don’t want to get too technical here, but at the same time I feel this topic is extremely important for every blogger, and covering our bases in the simplest manner is a must.

But before I get into what you can do to ensure that all your relevant pages get an appropriate rank, I want to share some background information to help you understand what this is all about and why YOU should care!

Perhaps the most important part of this is a comprehensive answer to what it all means. I have found a great answer that avoids all the technical stuff and yet sheds a bright light on the problem, and here is a direct quote:

What is a Supplemental Result?
Supplemental results are generally pages that Google has determined to be secondary to other, more relevant pages that Google has indexed on your website. In effect, supplementary results are actually a secondary database of results that are only called upon when the most obscure queries force Google to check all its indexed resources.

How Do You Find Which Pages From Your Blog Are Seen As Supplemental?

The answer came from the same article, and while I have seen different ways to perform this query, this one seems to be the simplest to remember and it produced the needed results: “site:www.howtospoter.com ***”. Just type the query without the quotes and replace my domain name with yours.
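
If you want to check several domains, the query can be built with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the `***` part is taken from the quoted article as-is, so whether it still returns supplemental pages depends on Google.

```python
from urllib.parse import urlencode


def supplemental_query(domain):
    """Build the 'site:' query described above for a given domain."""
    return f"site:{domain} ***"


def google_search_url(query):
    """Turn a query into a Google search URL with the query percent-encoded."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    query = supplemental_query("www.yourblogaddress.com")
    print(query)                     # paste this into the Google search box
    print(google_search_url(query))  # or open this URL in a browser
```

Swap in your own domain for `www.yourblogaddress.com` when you run it.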

Using the above showed that Google sees 414 pages from my blog as supplemental, and what is even worse, many of those pages I don’t want to be in supplemental simply because they are highly relevant in my personal opinion and I want them to rank well in Google results.

So How Do I Beat Google Into Submission And Get Those Pages Back Into Search Results?

The answer to this is Google Webmaster Tools. If you are still not using these great tools provided by the big G, stop everything you are doing and go sign up. There are multiple reasons to do it, but the main one is some level of control over how Google sees your blog, in addition to some great reporting.

Once you have signed up, submit your sitemap. If you don’t have one yet, I recommend you revisit my WordPress Web 2.0 Guide, as a sitemap plugin is one of the recommended plugins on my list and will generate it for you automatically. It might take a few days after submission before you see any results.

Using the Diagnostics tab in Webmaster Tools is one of the ways you can see and control indexing of your site. But it all starts with YOUR blog! Your robots.txt file has to be properly configured to tell Google which pages you don’t want it to see. Click here to view my file, but please be aware this is a work in progress and might change based on analysis of the results it produces.
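
Before relying on a robots.txt file, it helps to check which URLs it actually blocks. Below is a minimal sketch using Python’s standard `urllib.robotparser` module. The rules and the sample URLs are made up for illustration and are not the contents of the file mentioned above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block tag and category archives, allow everything else.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /tag/
Disallow: /category/
"""


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text forbids for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    urls = [
        "http://www.yourblogaddress.com/tag/seo/",
        "http://www.yourblogaddress.com/category/blogging/",
        "http://www.yourblogaddress.com/2007/05/my-full-post/",
    ]
    for url in blocked_urls(SAMPLE_ROBOTS_TXT, urls):
        print("blocked:", url)
```

Note that Python’s parser does simple prefix matching, so treat its answer as a sanity check rather than a guarantee of how Google itself reads the file.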

So here are your three steps to beat the big G!

  1. Do a search in Google for supplemental results using this query: site:www.yourblogaddress.com ***
  2. Visit Google Webmaster Tools -> Diagnostics tab and look at the details in URLs restricted by robots.txt. Configure your robots.txt file to remove all irrelevant pages from indexing. In my case that is all pages within tags, category, pages, etc. Those are similar to archives: they present only post excerpts and as such add to the duplicate content problem. I want Google to know about the full posts and pages but really don’t care about the ones mentioned above.
  3. Once your robots.txt is properly configured, go to URL Removals in Diagnostics and request removal of the directories that include all the supplemental results.
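
The robots.txt part of step 2 can be sketched in code as well. The small Python helper below writes a robots.txt that disallows a list of directories. The directory names (`tag`, `category`, `page`) are assumptions based on common WordPress permalinks, so adjust them to whatever your blog actually uses.

```python
def build_robots_txt(disallowed_dirs, user_agent="*"):
    """Return robots.txt text that blocks each directory for user_agent."""
    lines = [f"User-agent: {user_agent}"]
    for directory in disallowed_dirs:
        # Normalize to one leading and one trailing slash.
        path = "/" + directory.strip("/") + "/"
        lines.append(f"Disallow: {path}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical archive-style directories that only show post excerpts.
    print(build_robots_txt(["tag", "category", "page"]), end="")
```

Save the printed text as robots.txt in the root of your site, then look at it again in Webmaster Tools before requesting any removals.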

My big hope here is twofold. First, reduce the number of pages in supplemental results and as such create higher relevancy for the real pages. Second, get the pages and full posts that landed in supplemental results completely out of there and make them available for regular search results.

What you are reading here is an experiment in progress, and once I see some results I’ll share my findings and any tweaks I made. But I know one thing for sure: having your good pages in supplemental results, and as such considered irrelevant, is definitely not a good thing.