Monday, February 8, 2016

How to display different content to search engines and visitors

A variety of strategies can be used to segment content delivery. The idea is to serve content that is not meant for search engines in a format that is not spiderable (for example, placing text in images, Flash files, and plugins).

However, don't use these formats for the purpose of cloaking; use them only if they bring substantial benefit to users. If you want to show the search engines content that you don't want visitors to see, you can use CSS formatting (preferably not display:none), as the engines may have filters to detect this.
However, keep in mind that search engines are very wary of webmasters who use such tactics. Use cloaking only if it brings substantial user benefits.



Tactics to show different content to search engines and users

Robots.txt files: This file is located at the root level of your domain (www.domain.com/robots.txt), and you can use it to:
1) Prevent crawlers from accessing non-public parts of your website
2) Block search engines from indexing scripts, utilities, or other types of code
3) Enable auto-discovery of XML sitemaps
A sketch covering all three uses follows.
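For instance, a single robots.txt file might cover all three uses at once. The directory names and sitemap URL below are purely illustrative, not taken from a real site:

User-agent: *
Disallow: /admin/
Disallow: /scripts/

Sitemap: http://www.domain.com/sitemap.xml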

The robots.txt file must reside in the root directory of the domain, and the filename must be entirely lowercase. Any other name or location is not valid for search engines.

Syntax of the robots.txt file: The basic syntax of robots.txt is fairly simple. You specify a robot name, such as Googlebot, and then specify an action. Some of the major actions you can specify are:

Disallow: Use this to tell Googlebot not to crawl certain parts of your website.
Noindex: Use this to tell the bots not to include a page in their SERPs (this might be used when you wish to hide duplicate-content pages on your site).
A short sketch of both directives follows this list.
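Here's a quick sketch of how the two directives might look together. The paths are made up for illustration, and keep in mind that Noindex was only ever an unofficial directive honored by Google; other engines ignore it:

User-agent: Googlebot
Disallow: /tmp/
Noindex: /print-version.html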



Here's an example of a robots.txt file:

User-agent: Googlebot
Disallow:

User-agent: msnbot
Disallow: /

# Block all robots from tmp and logs directories
User-agent: *
Disallow: /tmp/
Disallow: /logs

The hash symbol (#) may be used for comments within a robots.txt file, where everything after the # on that line will be ignored.

One additional problem webmasters run into is when they have SSL installed, so that pages may be served via both HTTP and HTTPS.
However, the search engines will not interpret the robots.txt file at http://www.domain.com/robots.txt as guiding their crawl behavior on https://www.domain.com/.
For this, you need to create an additional robots.txt file served from your HTTPS server. So, if you want to allow crawling of all pages served from your HTTPS server, you would need to implement the following:

For HTTP:
User-agent: *
Disallow:

For HTTPS:
User-agent: *
Disallow:
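Conversely, if you decided the HTTPS versions of your pages should not be crawled at all (for instance, to avoid duplicate content), the robots.txt served over HTTPS would instead disallow everything. This variant is a sketch, not something from the example above:

For HTTPS:
User-agent: *
Disallow: /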
