Robots Meta Tags

Friday, December 3, 2010
What are robots meta tags, what do they look like, and how can you use them?
Webmasters and search engine optimization companies routinely use this meta tag. It tells search engines which pages they should index and which they should not.

You can place the tag anywhere in the metadata section, which is located inside the head section of your HTML code.
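As a rough illustration of where the tag lives, the sketch below uses Python's standard html.parser module to pull the robots meta tag out of a page's head section. The sample page and the class name RobotsMetaFinder are made up for this example, and real crawlers are far more forgiving of broken markup than this is.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the content of <meta name="robots"> tags found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.robots_content = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attr_map = dict(attrs)
            # The name attribute is compared case-insensitively.
            if (attr_map.get("name") or "").lower() == "robots":
                self.robots_content.append(attr_map.get("content") or "")

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


# Hypothetical page used only for illustration.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Example page</title>
    <meta name="description" content="An example page">
    <meta name="robots" content="noindex,follow">
  </head>
  <body><p>Hello</p></body>
</html>
"""

finder = RobotsMetaFinder()
finder.feed(SAMPLE_PAGE)
print(finder.robots_content)
```

For the sample page this prints ['noindex,follow']: the description meta tag is skipped because its name is not "robots".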



Important Robots Tags

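Index everything and follow all links: content="index,follow"
Do not index, but follow all links: content="noindex,follow"
Index everything, but do not follow any links: content="index,nofollow"
Do not index and do not follow anything: content="noindex,nofollow"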
This tag is not valid
rel="nofollow" placed within a hyperlink: do not follow the link




Rel="nofollow" Attribute
A robots meta tag can exclude all outgoing links on a page at once, a technique most often used by tricky SEO companies that do link exchanges. You can also tell search engines not to follow an individual text link by adding the rel="nofollow" attribute to the hyperlink.

When Googlebot sees the rel="nofollow" attribute on a link, for example a text link, the link does not count as a vote when Google ranks websites, and it passes no PageRank.

Here is an example of what some SEOs do. Suppose you exchange links with SEO-WATCH and place a link to our site like the one below:

SEO Company

You could replace it with the same link, this time carrying rel="nofollow":

SEO Company.

In this way your site will not pass any PR to our site, but the link exchange is done in a dishonest way. We will definitely check for this and remove your link from our site.
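A check like the one mentioned above could be sketched roughly as follows, using Python's standard html.parser module. This is only an assumption about how such a check might work, not a description of any particular tool. The class name LinkAuditor is invented, and example.com stands in for the real site address.

```python
from html.parser import HTMLParser


class LinkAuditor(HTMLParser):
    """Records each hyperlink's href and whether it carries rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel may hold several space-separated values, e.g. "nofollow noopener".
        rel_values = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_values))


# Hypothetical partner page; example.com is a placeholder, not a real partner.
PARTNER_PAGE = """
<p><a href="http://www.example.com/">SEO Company</a></p>
<p><a href="http://www.example.com/" rel="nofollow">SEO Company</a></p>
"""

auditor = LinkAuditor()
auditor.feed(PARTNER_PAGE)
for href, is_nofollow in auditor.links:
    print(href, "nofollow" if is_nofollow else "followed")
```

The first link is reported as followed and the second as nofollow, even though both look identical to a visitor of the page.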

Robots.txt File
The content attribute of the robots meta tag contains directives separated by commas.

The currently defined directives are [NO]INDEX and [NO]FOLLOW.
The INDEX directive specifies whether an indexing robot should index the page. The FOLLOW directive specifies whether a robot should follow the links on the page.

The defaults are INDEX and FOLLOW.
The values ALL and NONE set all directives on or off: ALL=INDEX,FOLLOW and NONE=NOINDEX,NOFOLLOW.
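The rules above can be sketched as a small Python function that turns a content string into two yes/no answers. The function name parse_robots_directives is invented for this example. Letting a later directive override an earlier one, and silently ignoring unknown values, are assumptions of this sketch rather than documented behaviour of any search engine.

```python
def parse_robots_directives(content):
    """Return (index, follow) booleans for a robots meta content string."""
    index, follow = True, True  # the defaults are INDEX and FOLLOW
    for raw in content.split(","):
        directive = raw.strip().upper()  # directives are treated case-insensitively
        if directive == "ALL":
            index, follow = True, True
        elif directive == "NONE":
            index, follow = False, False
        elif directive == "INDEX":
            index = True
        elif directive == "NOINDEX":
            index = False
        elif directive == "FOLLOW":
            follow = True
        elif directive == "NOFOLLOW":
            follow = False
        # Assumption: anything else is ignored in this sketch.
    return index, follow


print(parse_robots_directives(""))                 # (True, True)
print(parse_robots_directives("noindex"))          # (False, True)
print(parse_robots_directives("INDEX, NOFOLLOW"))  # (True, False)
print(parse_robots_directives("none"))             # (False, False)
```

An empty or missing content value falls back to the defaults, so a page without any robots meta tag is treated the same as one marked ALL.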

As an alternative, you can place a robots text file in the root of your server to control which of your web pages are listed in search results and which directories are forbidden for crawlers or bots. Anything you do not want publicly displayed should be disallowed for bots, because content that is not restricted will most likely end up public. Save the text file as robots.txt and insert records like the examples below to forbid specific files and directories. The record below describes the default access policy for any robot, symbolized by the *:
  • User-agent: *
    Disallow: /cgi-bin
Disallow
The value of this field specifies a partial URL that is not allowed to be visited. This can be a full path or a partial path; any URL that starts with this value will not be retrieved. For example:
  • Disallow: /help
disallows both /help.html and /help/index.html, whereas
  • Disallow: /help/
would disallow /help/index.html but allow /help.html.
You do not need to allow specific files and directories, because robots may access any file that is not declared as disallowed.
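The difference between the two Disallow values can be tried out with Python's standard urllib.robotparser module. This is a minimal sketch: the helper is_allowed and the agent name ExampleBot are invented for the example, and the module only shows how a well-behaved parser reads the rules, not how any specific search engine behaves.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, path, agent="ExampleBot"):
    """Parse robots.txt text and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# The same record, once without and once with the trailing slash.
WITHOUT_SLASH = "User-agent: *\nDisallow: /help\n"
WITH_SLASH = "User-agent: *\nDisallow: /help/\n"

for path in ("/help.html", "/help/index.html", "/index.html"):
    print(path, is_allowed(WITHOUT_SLASH, path), is_allowed(WITH_SLASH, path))
```

With Disallow: /help, both /help.html and /help/index.html are refused. With Disallow: /help/, only /help/index.html is refused. In both cases /index.html stays allowed, because nothing disallows it.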
Do all bots obey the robots.txt file? No! Some black hat bots will ignore all your efforts, and you can only block them through the .htaccess file.
