Nowadays everyone is praying, “Oh God, let Google index and cache my web pages.” This article is about the reverse of what most people want. Often we need to hide our data from search engines. Sometimes we are shocked to find that information we meant to keep private on our site is present in Google's cache. It can remain there even after I remove it from my web page, so anyone can still view what I removed from my site. To prevent Google or any other major search engine from creating cache links for your site, follow the tips provided below.
Cache links are created when a spider visits a web page and takes a snapshot of it. In Google's words, “This ‘cached’ version allows a web-page to be retrieved for your end users if the original page is ever unavailable (due to temporary failure of the web server)”. The cached page looks exactly as the page looked at the time of the spider's visit. To see the cached copy of a page on your website, use the command below.
In the Google search box, type the command “cache:” followed by the URL you want to see. The same command works in MSN Search as well. The cached page is updated the next time the spider visits your site. To stop Google's spiders from caching your web page, follow the steps below.
Exclusion of a page from cache:
1. Tell spiders not to archive the page using the robots meta tag
To exclude a web page from caching, we need to add the following meta tag to the head section of that page. The tag is:
<meta name="robots" content="noarchive">
This tag asks all spiders not to archive the web page.
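To confirm that a page really carries the noarchive directive, a small check like the one below may help. It is only a sketch using Python's standard html.parser module; the names RobotsMetaParser and has_noarchive are made up for this illustration, and it only looks at meta tags whose name is "robots".

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {key.lower(): (value or "") for key, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for item in attr.get("content", "").split(","):
                self.directives.add(item.strip().lower())


def has_noarchive(html_text):
    """Return True if the HTML contains a robots meta tag with noarchive."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noarchive" in parser.directives


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noarchive"></head></html>'
    print(has_noarchive(page))
```

You could run this against the saved HTML of each page you want to keep out of the cache before the spider's next visit.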
2. Setting access permissions
We can require a user name and password to view a page. Search engine spiders cannot log on to a page with a user name and password, so they cannot index that page, and it will never appear in the cached link list.
The steps provided above help prevent spiders from caching our web pages. But what can we do about a page that is already cached?
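One common way to put a user name and password in front of a page is HTTP Basic authentication. The article does not say which login method to use, so treat this as an assumption. The sketch below shows only the server-side check of the Authorization header; the function name check_basic_auth and the sample credentials are hypothetical, and in practice this is usually configured in the web server rather than written by hand.

```python
import base64
import hmac


def check_basic_auth(header_value, expected_user, expected_password):
    """Return True only if an HTTP Basic Authorization header matches."""
    if not header_value or not header_value.startswith("Basic "):
        return False
    try:
        raw = base64.b64decode(header_value[len("Basic "):], validate=True)
        decoded = raw.decode("utf-8")
    except ValueError:
        # Covers malformed base64 as well as bytes that are not valid UTF-8.
        return False
    user, separator, password = decoded.partition(":")
    if not separator:
        return False
    # compare_digest avoids leaking information through comparison timing.
    user_ok = hmac.compare_digest(user.encode(), expected_user.encode())
    password_ok = hmac.compare_digest(password.encode(), expected_password.encode())
    return user_ok and password_ok


if __name__ == "__main__":
    # A spider sends no Authorization header, so it is refused.
    print(check_basic_auth(None, "admin", "secret"))
```

A request that fails this check would be answered with a 401 status, so the spider never sees the page content.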
Remove a cached link
To remove an already indexed link from the cache, we can send a removal request to Google through Google Webmaster Tools. To do this, follow the steps below.
1. Log on to Google Webmaster Tools
2. Click on Tools
3. Click on remove URL
4. Click on new removal requests
5. Enter the already cached URL that you want to remove.
In a nutshell, the robots.txt file on your web server is very important. If you edit it without understanding its syntax, your site may not be indexed well by search engines. Conversely, the robots.txt file is just as important when you want to un-index a link on your site.
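Before uploading an edited robots.txt, it may be worth testing what the rules actually allow. The sketch below uses Python's standard urllib.robotparser module; the sample rules, the /private/ path, the example.com URLs and the helper name is_allowed are all invented for this illustration. Note that robots.txt controls crawling, which is a separate matter from the noarchive meta tag described earlier.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep every spider out of the /private/ directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/index.html",
                "https://example.com/private/notes.html"):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

Checking a few important URLs this way can catch a rule that accidentally blocks the whole site.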