robots.txt is a file that tells search engine spiders which pages of a website they may or may not crawl. It is normally used to keep search engine spiders away from unfinished pages of a website during its development phase. Many webmasters also use this file to keep unwanted bots away, although obeying it is voluntary and badly behaved bots may simply ignore it. The file must be named robots.txt and placed in the root directory of the site. The creation and uses of the robots.txt file are listed below:
robots.txt Creation:
To keep all robots out
User-agent: *
Disallow: /
To block a page or directory for all crawlers
User-agent: *
Disallow: /page-name/
To block a page or directory for a specific crawler
User-agent: Googlebot
Disallow: /page-name/
To block a specific image crawler from the whole site
User-agent: Googlebot-Image
Disallow: /
To allow all robots
User-agent: *
Disallow:
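The rules above can be checked before they go live. The following is a minimal sketch using Python's standard urllib.robotparser module; the helper name allowed, the example.com URLs, the bot name AnyBot and the /page-name/ path are placeholders chosen for illustration, not part of any real site.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The rule sets from the list above (the path is a placeholder).
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_DIR = "User-agent: *\nDisallow: /page-name/\n"
BLOCK_DIR_FOR_GOOGLEBOT = "User-agent: Googlebot\nDisallow: /page-name/\n"
ALLOW_ALL = "User-agent: *\nDisallow:\n"

if __name__ == "__main__":
    page = "https://example.com/page-name/index.html"
    print(allowed(BLOCK_ALL, "AnyBot", page))                  # False
    print(allowed(BLOCK_DIR, "AnyBot", page))                  # False
    print(allowed(BLOCK_DIR_FOR_GOOGLEBOT, "Googlebot", page)) # False
    print(allowed(BLOCK_DIR_FOR_GOOGLEBOT, "AnyBot", page))    # True
    print(allowed(ALLOW_ALL, "AnyBot", page))                  # True
```

This only reports how Python's parser reads the file. Real crawlers may interpret edge cases differently, so treat it as a sanity check rather than a guarantee.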
Finally, many crawlers, most notably Google's, also support an additional field called "Allow:".
To disallow all crawlers from your site EXCEPT Google:
User-agent: *
Disallow: /
User-agent: Googlebot
Allow: /
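As a rough illustration of the "everyone out except one crawler" pattern, the sketch below builds such a file for a list of permitted crawlers and then checks it with Python's standard urllib.robotparser. The function name robots_txt_allow_only, the bot name SomeOtherBot and the example.com URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def robots_txt_allow_only(allowed_agents):
    """Build robots.txt text that blocks every crawler except the named ones."""
    lines = ["User-agent: *", "Disallow: /"]
    for agent in allowed_agents:
        # One group per permitted crawler, separated by a blank line.
        lines += ["", f"User-agent: {agent}", "Allow: /"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = robots_txt_allow_only(["Googlebot"])
    print(text)

    parser = RobotFileParser()
    parser.parse(text.splitlines())
    print(parser.can_fetch("Googlebot", "https://example.com/"))     # True
    print(parser.can_fetch("SomeOtherBot", "https://example.com/"))  # False
```

A crawler obeys the group that names it and falls back to the "*" group only when no group names it, which is why Googlebot gets through while every other crawler is turned away.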
"robots" meta tag
If you want a page indexed but do not want any of the links on the page to be followed, you can use the following instead:
<meta name="robots" content="index,nofollow"/>
If you don't want a page indexed but want all links on the page to be followed, you can use the following instead:
<meta name="robots" content="noindex,follow"/>
If you want a page indexed and all the links on the page to be followed, you can use the following instead:
<meta name="robots" content="index,follow"/>
If you don't want a page indexed and don't want its links followed either, you can use the following instead:
<meta name="robots" content="noindex,nofollow"/>
To invite robots to index the page and follow all its links (the same as index,follow):
<meta name="robots" content="all"/>
To stop robots from indexing the page and from following its links (the same as noindex,nofollow):
<meta name="robots" content="none"/>
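To see how a crawler might read these tags, here is a small sketch that pulls the "robots" meta tag out of a page with Python's standard html.parser module and reduces it to two yes/no answers. The class name RobotsMetaParser and the function robots_policy are made up for this example, and the default of "index and follow" when no tag is present is an assumption about typical crawler behavior.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots" ...> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def robots_policy(html: str):
    """Return (may_index, may_follow) for a page's HTML."""
    parser = RobotsMetaParser()
    parser.feed(html)
    found = parser.directives
    # "none" is shorthand for noindex,nofollow; no tag means index and follow.
    may_index = not (found & {"noindex", "none"})
    may_follow = not (found & {"nofollow", "none"})
    return may_index, may_follow


if __name__ == "__main__":
    print(robots_policy('<meta name="robots" content="index,nofollow"/>'))  # (True, False)
    print(robots_policy('<meta name="robots" content="noindex,follow"/>'))  # (False, True)
    print(robots_policy('<meta name="robots" content="none"/>'))            # (False, False)
```

Note that a crawler can only see the meta tag if it is allowed to fetch the page in the first place, so a page blocked in robots.txt never gets its meta tag read.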