Sources:
- "Robots Exclusion Protocol: now with even more flexibility" — Google Webmasters blog, last updated February 17, 2012
- "Playing with the X-Robots-Tag HTTP header" — Google Developers
- "Robots.txt" — moz.com > Learn > SEO
- "How to stop search engines from crawling the whole website?"
- Apache HTTP Server Version 2.2 documentation — httpd.apache.org > docs > 2.2 > mod
- Example header: Header set X-Robots-Tag "noindex, noarchive, nosnippet, nofollow, noimageindex, noodp"
- www.intelligentpositioning.com/blog/2009/08/x-robots-tag-control-google-indexing-via-http-headers/
- http://noarchive.net/xrobots/
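As a sketch of how the X-Robots-Tag header quoted above is commonly scoped, Apache's mod_headers can apply it only to certain responses via a FilesMatch block (the `.pdf` pattern here is an illustrative assumption, not from the sources):

```apache
# Assumes mod_headers is enabled; the file pattern is just an example.
# Sends the X-Robots-Tag header only on PDF responses, telling
# compliant crawlers not to index or archive them.
<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, noarchive, nosnippet"
</FilesMatch>
```

Unlike robots.txt, which controls crawling, X-Robots-Tag controls indexing of pages the crawler has already fetched.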
Robots.txt explained and blocking bad bots with htaccess
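As a minimal illustration of the title's first topic, a robots.txt that asks all compliant crawlers to stay away from the entire site looks like this (the second, per-bot group is an assumed example of targeting a single spider):

```
# Block every compliant crawler from the whole site
User-agent: *
Disallow: /

# Or target one bot specifically (example token)
User-agent: Baiduspider
Disallow: /
```

Note that robots.txt is purely advisory: badly behaved bots ignore it, which is why the post falls back to .htaccess blocking below.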
Blocking Baidu and Yandex search spiders

Added the following at the top of the .htaccess file to block the Baidu, Yandex, and Sosospider bots from crawling the website:

SetEnvIfNoCase User-Agent "Baidu" spammer=yes
SetEnvIfNoCase User-Agent "Yandex" spammer=yes
SetEnvIfNoCase User-Agent "Sosospider" spammer=yes
order deny,allow
deny from env=spammer

Source: Robots.txt Specifications — Webmasters — Google Developers […]
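The matching logic behind those directives can be sketched in Python (this is a model of the behavior, not Apache itself): SetEnvIfNoCase performs a case-insensitive regex search against the User-Agent header, and any match sets the environment variable that `deny from env=spammer` keys on.

```python
import re

# Patterns mirroring the SetEnvIfNoCase lines in the .htaccess snippet.
BLOCKED_PATTERNS = ["Baidu", "Yandex", "Sosospider"]

def is_blocked(user_agent: str) -> bool:
    """Return True if the User-Agent matches any blocked pattern,
    case-insensitively, like SetEnvIfNoCase followed by deny from env."""
    return any(re.search(p, user_agent, re.IGNORECASE)
               for p in BLOCKED_PATTERNS)
```

For example, `is_blocked("Mozilla/5.0 (compatible; Baiduspider/2.0)")` is True, while an ordinary browser User-Agent passes through. Because the match is a substring search, keep the patterns specific enough not to catch legitimate visitors.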