
Saidul Hassan

Digital Marketing Evangelist


Robots.txt explained and blocking bad bots with htaccess

26 Aug, 2013 By Saidul Hassan · Filed Under: School

Blocking Baidu and Yandex Search Spiders
Add the following at the top of the .htaccess file to block the Baidu, Yandex, and Sosospider bots from crawling the site:

SetEnvIfNoCase User-Agent "Baidu" spammer=yes
SetEnvIfNoCase User-Agent "Yandex" spammer=yes
SetEnvIfNoCase User-Agent "Sosospider" spammer=yes

Order deny,allow
Deny from env=spammer

Source
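The Order/Deny directives above are Apache 2.2 syntax. On Apache 2.4 and later they are deprecated; a sketch of the equivalent, assuming mod_setenvif and mod_authz_core are enabled:

```apache
# Apache 2.4+ equivalent (assumes mod_setenvif and mod_authz_core)
SetEnvIfNoCase User-Agent "Baidu" spammer
SetEnvIfNoCase User-Agent "Yandex" spammer
SetEnvIfNoCase User-Agent "Sosospider" spammer

<RequireAll>
    # Allow everyone except requests flagged as spammer above
    Require all granted
    Require not env spammer
</RequireAll>
```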


  • Robots.txt Specifications > Webmasters > Google Developers — Source
  • Frequently asked questions > Webmasters > Google Developers — Source
  • Block or remove pages using a robots.txt file — Webmaster Tools Help — Source


Example
User-agent: *
Disallow: /blog/wp-admin/
Disallow: /blog/wp-includes/
Disallow: /blog/wp-content/plugins/
Disallow: /blog/wp-content/cache/
Disallow: /blog/wp-content/themes/
Disallow: /blog/wp-includes/js
Disallow: /category/*/*
Disallow: */trackback
Disallow: /*?*
Disallow: /*?
Disallow: /*~*
Disallow: /*~

User-agent: Baiduspider
User-agent: Baiduspider-ads
User-agent: Baiduspider-cpro
User-agent: Baiduspider-favo
User-agent: Baiduspider-news
User-agent: Baiduspider-video
User-agent: Baiduspider-image
User-agent: Yandex
Disallow: /

Sitemap: https://saidulhassan.com/sitemap.xml
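You can sanity-check rules like these with Python's standard-library urllib.robotparser. Note that it only does simple prefix matching, so the wildcard rules above (e.g. Disallow: /*?) are not evaluated the way Googlebot would evaluate them; this sketch uses a trimmed rule set:

```python
from urllib import robotparser

# A trimmed version of the robots.txt above (wildcard rules omitted,
# since urllib.robotparser only does simple prefix matching).
rules = """\
User-agent: *
Disallow: /blog/wp-admin/
Disallow: /blog/wp-includes/

User-agent: Yandex
Disallow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(rules.splitlines())

# Yandex is blocked from the whole site; other crawlers may fetch
# everything except the disallowed wp-admin/wp-includes paths.
print(rp.can_fetch("Yandex", "https://saidulhassan.com/"))                   # False
print(rp.can_fetch("Googlebot", "https://saidulhassan.com/"))                # True
print(rp.can_fetch("Googlebot", "https://saidulhassan.com/blog/wp-admin/"))  # False
```

This is a quick way to catch typos in a robots.txt before deploying it, without waiting for a crawler to hit the site.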

First published on 26 Aug, 2013 · Last updated 26 Aug, 2013 · Tagged With: bot, crawler, robots.txt, spider

