robots.txt is a file placed in your website's root directory. It tells search engine crawlers which files or folders they are allowed to crawl and which they are not.
In this demo I will show you how to create a robots.txt file and a few ways to allow or disallow search engine crawlers.
Suppose a robot wants to visit a website URL such as http://www.demo.com/welcome.html.
Before it does so, it first checks for http://www.demo.com/robots.txt and finds:
............................................................................
robots.txt
............................................................................
User-agent: *
Disallow: /
The "User-agent: *" line means this section applies to all robots.
The "Disallow: /" line tells the robot that it should not visit any pages on the website.
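As a rough illustration of that check, here is a minimal Python sketch using the standard library's urllib.robotparser. The rules are parsed from a string instead of being fetched from www.demo.com, and the robot name "MyCrawler" is made up for the example.

```python
from urllib.robotparser import RobotFileParser

# The same two lines shown above, supplied as text so nothing is fetched.
rules = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# "MyCrawler" is a hypothetical robot name used only for this example.
allowed = parser.can_fetch("MyCrawler", "http://www.demo.com/welcome.html")
print(allowed)  # False: every path is disallowed for every robot
```

A well-behaved crawler would run a check like this before requesting each page.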
There are two important points to keep in mind when using /robots.txt:
=>Robots can ignore your /robots.txt, especially malware robots that scan the web for security vulnerabilities.
=>The /robots.txt file is publicly available. Anyone can see which parts of your server you don't want robots to access.
Here are a few examples:
............................................................................
Robots meta
............................................................................
<meta name="robots" content="noindex">
You can also keep robots from indexing a page by adding a robots meta tag to that page's HTML.
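To show how a crawler might read that tag, here is a small sketch built on Python's standard html.parser module. The class name RobotsMetaParser and the helper is_noindex are names invented for this example, not part of any library.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # The content value is a comma-separated list, e.g. "noindex, nofollow".
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def is_noindex(html):
    """Return True if the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


page = '<html><head><meta name="robots" content="noindex"></head></html>'
print(is_noindex(page))  # True
```

Unlike robots.txt, the meta tag is only seen after the page has been fetched, so it controls indexing and not crawling.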
............................................................................
To block all robots from the entire server
............................................................................
User-agent: *
Disallow: /
............................................................................
To allow all robots complete access
............................................................................
User-agent: *
Disallow:
(or just create an empty "/robots.txt" file, or don't use one at all)
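Both variants can be checked with Python's urllib.robotparser. This is only a sketch: the helper allows and the robot name "MyCrawler" are made up for the example, and the rules are passed in as text.

```python
from urllib.robotparser import RobotFileParser


def allows(rules_text, agent, url):
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(agent, url)


url = "http://www.demo.com/welcome.html"

# An empty Disallow value blocks nothing.
empty_disallow = allows("User-agent: *\nDisallow:\n", "MyCrawler", url)
# An empty robots.txt file has no rules at all.
empty_file = allows("", "MyCrawler", url)

print(empty_disallow, empty_file)  # True True
```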
............................................................................
To block all robots from part of the server
............................................................................
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
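A quick way to see which paths these rules cover is to test a few with urllib.robotparser. The file names below and the robot name "MyCrawler" are hypothetical and exist only for this sketch.

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Hypothetical paths: three inside blocked folders, one outside them.
paths = ["/cgi-bin/search.cgi", "/tmp/cache.txt", "/junk/old.html", "/index.html"]
results = {path: parser.can_fetch("MyCrawler", path) for path in paths}

for path, allowed in results.items():
    print(path, "allowed" if allowed else "blocked")
```

Each Disallow value is matched as a prefix of the path, so everything under the listed folders is blocked while the rest of the site stays open.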
............................................................................
To block a single robot
............................................................................
User-agent: BadBot
Disallow: /
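The effect of a per-robot group can be sketched with urllib.robotparser. "GoodBot" is a made-up name standing in for any other crawler.

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: BadBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

url = "http://www.demo.com/welcome.html"
badbot_allowed = parser.can_fetch("BadBot", url)
# "GoodBot" is hypothetical; no group matches it, so nothing is blocked.
goodbot_allowed = parser.can_fetch("GoodBot", url)

print(badbot_allowed, goodbot_allowed)  # False True
```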
............................................................................
To allow a single robot and block all others
............................................................................
User-agent: Google
Disallow:

User-agent: *
Disallow: /
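Allowing one robot only makes a difference when a second group blocks everyone else. The sketch below uses urllib.robotparser with both groups present; "OtherBot" is a made-up name for any robot that is not named in the file.

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: Google
Disallow:

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

url = "http://www.demo.com/welcome.html"
google_allowed = parser.can_fetch("Google", url)
# "OtherBot" is hypothetical; it falls through to the catch-all "*" group.
other_allowed = parser.can_fetch("OtherBot", url)

print(google_allowed, other_allowed)  # True False
```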
............................................................................
To block a specific folder
............................................................................
User-agent: *
Disallow: /admin/
............................................................................
Alternatively, you can explicitly list each page you want to disallow:
............................................................................
User-agent: *
Disallow: /user/subway.html
Disallow: /user/dust.html
Disallow: /admin/ahir.html
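When the list of pages grows, the file can be generated instead of typed by hand. The following sketch assumes a helper named build_robots_txt, which is invented for this example, and then verifies the result with urllib.robotparser. The page "/user/profile.html" and the robot name "MyCrawler" are also hypothetical.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(paths, user_agent="*"):
    """Return robots.txt text with one Disallow line per path."""
    lines = [f"User-agent: {user_agent}"]
    lines += [f"Disallow: {path}" for path in paths]
    return "\n".join(lines) + "\n"


blocked_pages = ["/user/subway.html", "/user/dust.html", "/admin/ahir.html"]
robots_txt = build_robots_txt(blocked_pages)
print(robots_txt)

# Check the generated rules before publishing them.
parser = RobotFileParser()
parser.parse(robots_txt.splitlines())
dust_allowed = parser.can_fetch("MyCrawler", "/user/dust.html")
profile_allowed = parser.can_fetch("MyCrawler", "/user/profile.html")
print(dust_allowed, profile_allowed)  # False True
```

Listing pages one by one keeps the rest of each folder crawlable, at the cost of exposing the exact page names to anyone who reads the file.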
