Hi,
We use a crawler to collect URLs of .wrl files for linking in our web 3D search engine.
Its user-agent string is ExitRealitySearchAgent.
This crawler obeys the normal robots.txt rules for search engines, as described at:
http://en.wikipedia.org/wiki/Robots.txt
To address the issue that has been described on this forum, I suggest these entries to
tell the ER spider not to index VRML worlds on your domain:
User-agent: ExitRealitySearchAgent
Disallow: /*.wrl$
Disallow: /*.wrz$
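For anyone who wants to check which of their URLs such entries would cover, here is a minimal Python sketch of how wildcard patterns of this kind are commonly interpreted. It assumes the widely used extensions where `*` matches any run of characters and a trailing `$` anchors the end of the path. The function names are made up for illustration, and this is not a description of how ExitRealitySearchAgent is actually implemented.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the URL path, and everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path, disallow_patterns):
    """Return True if any Disallow pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in disallow_patterns)

# The two Disallow patterns suggested above.
RULES = ["/*.wrl$", "/*.wrz$"]

if __name__ == "__main__":
    for path in ["/worlds/room.wrl", "/a/b/c.wrz", "/index.html"]:
        print(path, is_disallowed(path, RULES))
```

Under these assumed semantics, a path such as /room.wrl.txt would not be blocked, because the `$` requires the path to end in .wrl or .wrz.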
To ensure your site is crawled quickly, I would also suggest submitting your domain
for a priority spidering at the following URL once your robots.txt file has been
updated:
http://www.ER.com/submiturl.php
I also noted Paul Aslin's post on the public web 3D mailing lists regarding VRML
browsers providing a referrer in the HTTP header. We already provide a referrer in the
HTTP headers when our browser makes a request. It contains the referring file
(whether it is HTML or VRML) and was implemented to help prevent deep linking.
Your web server can be configured to disallow deep linking by checking this referrer and
restricting it to the referrers that you allow. How this is configured depends on your
hosting service and the type of HTTP server they use.
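Since the actual configuration differs from server to server, the following is only a rough Python sketch of the check itself: compare the host in the referrer header against a list of hosts you allow. The host names, the function name, and the `allow_missing` option are all hypothetical. A real setup would normally express the same rule in the web server's own configuration.

```python
from urllib.parse import urlsplit

# Hypothetical list of hosts that are allowed to link to your worlds.
ALLOWED_HOSTS = {"example.com", "www.example.com"}

def referrer_is_allowed(referer, allowed_hosts=ALLOWED_HOSTS, allow_missing=False):
    """Decide whether a request should be served, based on its referrer.

    referer       -- value of the HTTP "Referer" header, or None if absent
    allow_missing -- policy choice: serve requests that carry no referrer?
    """
    if not referer:
        return allow_missing
    try:
        host = urlsplit(referer).hostname
    except ValueError:
        # Malformed header value: treat it as not allowed.
        return False
    return host is not None and host.lower() in allowed_hosts

if __name__ == "__main__":
    print(referrer_is_allowed("http://www.example.com/gallery.html"))
    print(referrer_is_allowed("http://other.example.org/page.wrl"))
```

Whether to serve requests with no referrer at all is a judgment call, since some browsers and proxies strip the header. That is why it is left as an explicit option here.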
All this information will be fast-tracked to our knowledge base to assist others who
share these concerns.
We have read your comments on our web 3D search and are discussing your suggestions. I
would like to take this opportunity to state our intentions for the web 3D search. We do
not intend it to pass off others' work as our own; rather, the search is meant to provide
an accessible way for people to find and view 3D content.
Thanks for your interest, and please feel free to contribute feedback.
Rory Hart
Head of Development
ER
http://www.ER.com