Blocking Baidu and Yandex Search Spiders
Implemented the following at the top of the .htaccess file to block the Baidu, Yandex, and Sosospider bots from crawling the website.
SetEnvIfNoCase User-Agent "Baidu" spammer=yes
SetEnvIfNoCase User-Agent "Yandex" spammer=yes
SetEnvIfNoCase User-Agent "Sosospider" spammer=yes
Order Deny,Allow
Deny from env=spammer
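On Apache 2.4 and newer, the Order/Deny directives above are deprecated in favour of the Require syntax from mod_authz_core. A sketch of the equivalent rules, assuming mod_setenvif and mod_authz_core are loaded (untested against any particular host):

```apache
# Flag the unwanted crawlers by User-Agent, case-insensitively
SetEnvIfNoCase User-Agent "Baidu" spammer
SetEnvIfNoCase User-Agent "Yandex" spammer
SetEnvIfNoCase User-Agent "Sosospider" spammer

# Allow everyone except requests carrying the "spammer" flag
<RequireAll>
    Require all granted
    Require not env spammer
</RequireAll>
```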
Sources:
- Robots.txt Specifications > Webmasters > Google Developers
- Frequently Asked Questions > Webmasters > Google Developers
- Block or remove pages using a robots.txt file – Webmaster Tools Help
Example robots.txt
User-agent: *
Disallow: /blog/wp-admin/
Disallow: /blog/wp-includes/
Disallow: /blog/wp-content/plugins/
Disallow: /blog/wp-content/cache/
Disallow: /blog/wp-content/themes/
Disallow: /blog/wp-includes/js
Disallow: /category/*/*
Disallow: */trackback
Disallow: /*?*
Disallow: /*?
Disallow: /*~*
Disallow: /*~

User-agent: Baiduspider
User-agent: Baiduspider-ads
User-agent: Baiduspider-cpro
User-agent: Baiduspider-favo
User-agent: Baiduspider-news
User-agent: Baiduspider-video
User-agent: Baiduspider-image
User-agent: Yandex
Disallow: /

Sitemap: https://saidulhassan.com/sitemap.xml
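One way to sanity-check rules like these is Python's standard urllib.robotparser, which applies robots.txt group matching the same way most crawlers do. The snippet below feeds it a trimmed-down copy of the file above in memory (the domain and paths are just the ones from this example):

```python
from urllib.robotparser import RobotFileParser

# A trimmed copy of the robots.txt above, parsed in-memory.
robots_txt = """\
User-agent: *
Disallow: /blog/wp-admin/

User-agent: Yandex
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Yandex is shut out of the whole site; other crawlers
# only lose the explicitly disallowed paths.
print(rp.can_fetch("Yandex", "https://saidulhassan.com/"))                    # False
print(rp.can_fetch("Googlebot", "https://saidulhassan.com/"))                 # True
print(rp.can_fetch("Googlebot", "https://saidulhassan.com/blog/wp-admin/x"))  # False
```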