robots.txt is a file, placed in the root directory of a website, that tells search engine spiders which pages of the site they may or may not crawl. It is normally used to keep search engine spiders away from unfinished pages of a website during its development phase. Many webmasters also use this file to help cut down on spam. The creation and uses of the robots.txt file are listed below:
robots.txt Creation:
To keep all robots out
User-agent: *
Disallow: /
To block a page from all crawlers
User-agent: *
Disallow: /page name/
To block a page from a specific crawler
User-agent: Googlebot
Disallow: /page name/
To block images from a specific crawler
User-agent: Googlebot-Image
Disallow: /
To allow all robots
User-agent: *
Disallow:
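One way to sanity-check rules like these before uploading them is Python's standard urllib.robotparser module. The following is only a minimal sketch: the helper name is_allowed, the path /private/, the host example.com and the crawler name SomeBot are hypothetical placeholders, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(rules: str, user_agent: str, url: str) -> bool:
    """Return True if `rules` (robots.txt text) lets `user_agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rule set: block one directory for every crawler.
BLOCK_ONE_DIRECTORY = """\
User-agent: *
Disallow: /private/
"""

# An empty Disallow value blocks nothing, so every crawler is allowed.
ALLOW_EVERYTHING = """\
User-agent: *
Disallow:
"""

if __name__ == "__main__":
    draft = "https://example.com/private/draft.html"
    home = "https://example.com/index.html"
    print(is_allowed(BLOCK_ONE_DIRECTORY, "SomeBot", draft))
    print(is_allowed(BLOCK_ONE_DIRECTORY, "SomeBot", home))
    print(is_allowed(ALLOW_EVERYTHING, "SomeBot", draft))
```

Keeping the rules as plain strings makes it easy to try a change locally before it goes live on the site.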
Finally, some crawlers, most notably Google's, now support an additional field called "Allow:".
To disallow all crawlers from your site EXCEPT Google:
User-agent: *
Disallow: /
User-agent: Googlebot
Allow: /
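To see how the two groups in this example interact, the rules can be fed to Python's standard urllib.robotparser module. This is a rough sketch rather than a statement of how any particular search engine behaves: SomeOtherBot and example.com are hypothetical placeholders, and the blank line between the two groups is added here only for readability.

```python
from urllib.robotparser import RobotFileParser

# The rules from the example above: everyone is blocked except Googlebot.
RULES = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# A crawler uses the group that names it, and falls back to "*" otherwise.
# "SomeOtherBot" and example.com are hypothetical placeholders.
googlebot_ok = parser.can_fetch("Googlebot", "https://example.com/page.html")
other_ok = parser.can_fetch("SomeOtherBot", "https://example.com/page.html")
print(googlebot_ok, other_ok)
```

With this parser, Googlebot is allowed and the unnamed crawler is refused, which matches the intent of the rules.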
"robots" meta tag
If you want a page indexed but do not want any of the links on the page to be followed, you can use the following:
<meta name="robots" content="index,nofollow"/>
If you don't want a page indexed but want all links on the page to be followed, you can use the following instead:
<meta name="robots" content="noindex,follow"/>
If you want a page indexed and all the links on the page to be followed, you can use the following instead:
<meta name="robots" content="index,follow"/>
If you don't want a page indexed or its links followed, you can use the following instead:
<meta name="robots" content="noindex,nofollow"/>
To invite robots to index the page and follow all of its links (the same as "index,follow"):
<meta name="robots" content="all"/>
To stop robots from indexing the page or following its links (the same as "noindex,nofollow"):
<meta name="robots" content="none"/>
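The content values above boil down to two yes/no decisions: index the page, and follow its links. The sketch below shows one way to read them in Python. The function name parse_robots_meta is hypothetical, it assumes the usual default of "index, follow" when a directive is absent, and it ignores any other directives that individual crawlers may support.

```python
from typing import Tuple


def parse_robots_meta(content: str) -> Tuple[bool, bool]:
    """Return (index, follow) for the `content` value of a robots meta tag."""
    index, follow = True, True  # assumed default when nothing is specified
    for token in content.lower().split(","):
        token = token.strip()
        if token in ("noindex", "none"):
            index = False
        if token in ("nofollow", "none"):
            follow = False
        # "index", "follow" and "all" restate the default; nothing to change.
    return index, follow


if __name__ == "__main__":
    for value in ("index,nofollow", "noindex,follow", "all", "none"):
        print(value, parse_robots_meta(value))
```

Treating "none" as shorthand for both negative directives keeps the function small while covering all six tags listed above.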