
Experibot

An experimental web crawler. Important reader key items:

  • "I don't know who you are - stop crawling my domain."

    I have no intention of downloading pages against your will. Just send an email to amirkr@gmail.com and I will promptly add your domains/IPs to the "no crawl" list; they will be ignored immediately.

  • What is this bot? Who are you?

Experibot is an experimental web-crawling bot that I (Amir Krause) am developing as a personal research project. The collected data is not published or used externally.

Several key measures were taken to make sure the crawler behaves politely:

  1. I did my best to adhere to robots.txt directives, especially Disallow rules. Coding mistakes may still exist in the implementation, even though the code has been checked and verified many times. If you see my bot disregarding your robots.txt, I am sorry - it is not on purpose - and please let me know.

  2. The crawler will not download anything from a site whose robots.txt contains the word "experibot", in any combination of upper and lower case and regardless of the crawler's version. I will not take chances here.

  3. The crawler never requests the same host, or the same IP, twice within 60 seconds (a generous crawl delay).

  4. Each cached robots.txt file becomes obsolete after about 12 hours and is then re-downloaded. The new download happens only when a page from that website is next requested, so more than 12 hours can pass before the robots.txt file is re-fetched. If you change your robots.txt file, the changes affect my crawler as follows:

    • If you specifically disallowed my bot ("experibot"), the disallow takes effect as soon as the file is re-fetched.

    • If the change does not mention my bot specifically, it can take up to two days before it affects the crawler.
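For readers curious how rules like these fit together, here is a minimal Python sketch of a "politeness gate" combining the four measures above: robots.txt Disallow checks, the case-insensitive "experibot" opt-out keyword, a 60-second per-host delay, and a 12-hour robots.txt cache. This is an illustration under my own assumptions (class and variable names are hypothetical), not Experibot's actual code.

```python
import time
import urllib.robotparser
from urllib.parse import urlparse
from urllib.request import urlopen

USER_AGENT = "experibot"      # assumed user-agent token
CRAWL_DELAY = 60              # rule 3: seconds between hits to the same host
ROBOTS_TTL = 12 * 60 * 60     # rule 4: robots.txt cache lifetime (12 hours)

class PolitenessGate:
    """Decides whether a URL may be fetched, per the rules above."""

    def __init__(self):
        self._last_hit = {}   # host -> time of last request
        self._robots = {}     # host -> (fetch_time, parser, opted_out)

    def _get_robots(self, host):
        """Return cached robots.txt data, re-fetching after ROBOTS_TTL."""
        entry = self._robots.get(host)
        if entry is None or time.time() - entry[0] > ROBOTS_TTL:
            try:
                text = urlopen(f"http://{host}/robots.txt",
                               timeout=10).read().decode("utf-8", "replace")
            except OSError:
                text = ""     # unreachable robots.txt: treat as empty
            # Rule 2: any mention of "experibot", in any case,
            # opts the whole site out.
            opted_out = "experibot" in text.lower()
            parser = urllib.robotparser.RobotFileParser()
            parser.parse(text.splitlines())
            entry = (time.time(), parser, opted_out)
            self._robots[host] = entry
        return entry

    def may_fetch(self, url):
        host = urlparse(url).netloc
        # Rule 3: never hit the same host twice within CRAWL_DELAY seconds.
        if time.time() - self._last_hit.get(host, 0) < CRAWL_DELAY:
            return False
        _, parser, opted_out = self._get_robots(host)
        # Rules 1 and 2: obey Disallow lines and the keyword opt-out.
        if opted_out or not parser.can_fetch(USER_AGENT, url):
            return False
        self._last_hit[host] = time.time()
        return True
```

A crawler loop would simply call `may_fetch(url)` before every download and requeue the URL for later if it returns `False`.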

Thanks!

Amir Krause

amirkr@gmail.com
