How to Prevent Duplicate Content with Effective Use of the Robots.txt and Robots Meta Tag
Duplicate content is one of the problems we regularly come across as part of the search engine optimization services we offer. If the search engines determine that your site contains the same or very similar content on more than one page, this may result in penalties and even exclusion from the search engines. Fortunately, it's a problem that is easily rectified.

Your primary weapon against duplicate content can be found within "The Robot Exclusion Protocol", which has now been adopted by all the major search engines. There are two ways to control how the search engine spiders index your site:

1. The Robot Exclusion File or "robots.txt"
2. The Robots <meta> Tag

The Robots Exclusion File (Robots.txt)
This is a simple text file that can be created in Notepad. Once created, you must upload the file to the root directory of your website, e.g. www.yourwebsite.com/robots.txt. Before a search engine spider indexes your website, it looks for this file, which tells it exactly which parts of your site's content it may index. The robots.txt file is best suited to static HTML sites or to excluding certain files in dynamic sites. If the majority of your site is dynamically created, consider using the Robots Tag instead.

Creating your robots.txt file

Example 1

Scenario
Suppose you want the robots.txt file to apply to all search engine spiders and to make the entire site available for indexing. The robots.txt file would look like this:

User-agent: *
Disallow:

Explanation
The asterisk in the "User-agent" line means this robots.txt file applies to all search engine spiders. Leaving "Disallow" blank means all parts of the site are available for indexing.

Example 2

Scenario
Suppose you want the robots.txt file to apply to all search engine spiders and to stop them from indexing the faq, cgi-bin and images directories, plus a specific page called faqs.html in the root directory. The robots.txt file would look like this:

User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html

Explanation
The asterisk in the "User-agent" line means this robots.txt file applies to all search engine spiders. Access to each directory is blocked by naming it, and the specific page is referenced directly. The named files and directories will not be indexed by any search engine spider that honours the file.

Example 3

Scenario
Suppose you want the robots.txt file to apply only to the Google spider, googlebot, and to stop it from indexing the faq, cgi-bin and images directories, plus a specific page called faqs.html in the root directory. The robots.txt file would look like this:

User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html

Explanation

By naming a particular search spider in the "User-agent" line, you prevent that spider from indexing the content you specify. Access to each directory is blocked by naming it, and the specific page is referenced directly. The named files and directories will not be indexed by Google.

That's all there is to it!

As mentioned earlier, the robots.txt file can be difficult to implement on dynamic sites. In that case it is probably necessary to use a combination of the robots.txt file and the Robots tag.

The Robots Tag
This alternative way of telling the search engines what to do with site content appears in the <head> section of a web page. A simple example would be as follows:

<meta name="robots" content="noindex, nofollow">

In this example we are telling all search engines not to index the page or to follow any of the links contained within the page.

<meta name="googlebot" content="noarchive">

In this second example I don't want Google to cache the page, because the site contains time-sensitive information. This can be achieved simply by adding the "noarchive" directive. What could be simpler!

Although there are other ways of preventing duplicate content from appearing in the search engines, this is the simplest to implement, and all websites should operate a robots.txt file, a Robots tag, or a combination of the two.

Should you require further information about our search engine marketing or optimization services, please visit us at http://www.e-prominence.co.uk - The search marketing company
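
Before uploading a robots.txt file, it can help to check that the rules block what you expect. The following is a minimal sketch, not part of the original article, that feeds the rules from Examples 2 and 3 to Python's standard `urllib.robotparser` module. The helper name `allowed`, the spider names `anybot` and `otherbot`, and the test paths such as `/index.html` are hypothetical and chosen only for illustration. Real search engine spiders may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from Example 2: apply to every spider.
EXAMPLE_2 = """\
User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
"""

# Rules copied from Example 3: apply to googlebot only.
EXAMPLE_3 = """\
User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
"""


def allowed(robots_txt: str, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the given rules.

    `allowed` is a hypothetical helper for this sketch; it parses the
    robots.txt text in memory instead of downloading it from a site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # "anybot" and "otherbot" are made-up spider names for illustration.
    checks = [
        (EXAMPLE_2, "anybot", "/faq/index.html"),
        (EXAMPLE_2, "anybot", "/index.html"),
        (EXAMPLE_3, "googlebot", "/faqs.html"),
        (EXAMPLE_3, "otherbot", "/faqs.html"),
    ]
    for rules, agent, path in checks:
        verdict = "allowed" if allowed(rules, agent, path) else "blocked"
        print(f"{agent:10} {path:18} {verdict}")
```

Under Example 3, a spider other than googlebot is not restricted at all, which is worth confirming if the intention was to hide the content from every search engine.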