E-Mail Protector
Description
This script provides you with two methods of protecting e-mail addresses on your website from being collected by e-mail
address harvesters (spambots). Harvesters assemble "targeted" e-mail lists which ultimately end up being used to send
you "spam" about any number of questionable products, or are even resold to a third party!
Both of our protection methods use subtle manipulation of the e-mail address they are protecting to make it much harder
for any automated agent to find that address on the page. This means that even if the primary protection mechanism
is bypassed, the harvester still needs to do more work to capture the address.
Using an "armouring" filter also has advantages over other related methods of hiding, obfuscating or
munging e-mail addresses - the most obvious being that the output is still perfectly valid HTML and remains clickable
in any browser, even without JavaScript enabled. The reason the first step involves a filter is that most e-mail
harvesters use pattern matching to find e-mail addresses on a page, so if you break up the pattern they are looking
for they can't find the addresses any more unless they use a more time-consuming method.
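The description above does not say which substitution the filter applies, so the following is only a rough sketch of the general idea. It assumes the address is rewritten as decimal character references, which a browser decodes back into a working link but which no longer contain a literal "name@domain" pattern. The function name ArmourAddress is hypothetical and is not part of emailprotect.asp.

```asp
<%
' Illustrative sketch only. ArmourAddress is a hypothetical name, and decimal
' character references are an assumed substitution, not necessarily the one
' that emailprotect.asp applies.
Function ArmourAddress(sAddress)
    Dim i, sOut
    sOut = ""
    For i = 1 To Len(sAddress)
        ' Replace each character with its decimal character reference,
        ' e.g. "@" becomes "&#64;". Intended for plain ASCII addresses.
        sOut = sOut & "&#" & AscW(Mid(sAddress, i, 1)) & ";"
    Next
    ArmourAddress = sOut
End Function
%>
<!-- The mailto: prefix is written by hand here; this sketch does not add it. -->
<a href="mailto:<%= ArmourAddress("test@example.com") %>">contact me</a>
```

A harvester that scans the raw page source for text shaped like an address finds nothing to match, while a visitor's browser resolves the references and opens the mail client as usual.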
An additional advantage is that, because the text manipulation is done on the server side, should you feel the protection
has a weak spot you can simply upgrade a single include and every e-mail address using that include will automatically
have its protection upgraded. Lastly, as this script uses a combination of techniques to ensure the e-mail address
remains protected from harvesting, it is much harder for a malicious spider to cheat an address out
of the page, as it needs to know a way to bypass each of the levels rather than just the one.
On top of the "armour" filter, we make use of pattern matching to identify known "bad" and/or
"questionable" crawlers which are just out to gather data for their own purposes and which return very little
traffic in return. This allows us to stop any of the existing e-mail harvesting products that are recognisable.
In addition to this, we also search for questionable keywords associated with these products to allow a certain degree
of protection from new or updated harvesters.
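As a rough sketch of this kind of check, the following compares the request's user-agent string against a short list of substrings. The function name LooksLikeHarvester and both kinds of list entry are assumptions for illustration: the first three are names that have commonly appeared on harvester blocklists and the rest are generic keywords. None of it is taken from the ruleset that ships in emailprotect.asp.

```asp
<%
' Illustrative sketch only. The function name and the pattern list are
' assumptions, not the ruleset shipped in emailprotect.asp.
Function LooksLikeHarvester(sUserAgent)
    Dim aPatterns, sPattern, sAgent
    ' Product names first, then generic keywords meant to catch new variants.
    aPatterns = Array("emailsiphon", "emailwolf", "extractorpro", _
                      "harvest", "collector")
    sAgent = LCase(sUserAgent)
    LooksLikeHarvester = False
    For Each sPattern In aPatterns
        ' Case-insensitive substring match against the user-agent string.
        If InStr(sAgent, sPattern) > 0 Then
            LooksLikeHarvester = True
            Exit For
        End If
    Next
End Function

Dim bSuspect
bSuspect = LooksLikeHarvester(Request.ServerVariables("HTTP_USER_AGENT"))
%>
```

Keyword matching trades a small risk of false positives for some cover against harvesters that have not been listed by name yet.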
If the request fails any of the above checks then instead of the correct e-mail address we insert a fake e-mail address
with a display name which makes it obvious to a real person that something is not quite right, letting them correct the
problem. If a crawler picks up that address it lacks the ability to see the problem and continues on oblivious - and more
importantly, it leaves the correct e-mail address unharvested!
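The substitution step might look something like the sketch below. PickAddress, its bSuspect parameter and the decoy value are all hypothetical names chosen for illustration; in the real script the replacement address is whatever you configure during setup.

```asp
<%
' Illustrative sketch only. PickAddress and the decoy address are hypothetical
' and are not taken from emailprotect.asp.
Function PickAddress(sRealAddress, bSuspect)
    If bSuspect Then
        ' The display name tells a human visitor that something went wrong;
        ' a harvester simply records a worthless address.
        PickAddress = "Address hidden - request looked automated <blocked@example.invalid>"
    Else
        PickAddress = sRealAddress
    End If
End Function
%>
```

Using a reserved domain such as example.invalid for the decoy means any mail sent to it goes nowhere, rather than landing in somebody else's inbox.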
Simple, yet effective...
On paper these two filters should offer minimal protection, but in reality they are effective enough to stop approximately
90% of the harvesters from getting hold of an address. Why does this work so well? I'm not totally sure, but I would
suggest that harvesters are designed for quantity not quality, and as a result they need to use the fastest extraction
methods possible - even when that means they are not using the best technique!
In multiple tests against live spambots this script managed to stop the majority of them from harvesting any e-mail
addresses from the page. Most were recognised as pre-banned robots and treated accordingly; some faked
their user-agent but were then caught by the remaining checks. Until mainstream spambot technology improves
dramatically, the measures currently in use should be sufficient to continue stopping them for quite a while.
Single Compressed Download
Individual Components
Installation & Setup
- Save the source code file into a directory somewhere within your webroot. The examples assume it is called emailprotect.asp.
- Modify sEMail_Bad in EMail_Protect() to contain the e-mail address you
want to use when the request is clearly coming from a source other than a valid user.
- If you want to use the extended method you will also need to modify sEMail_Unsure
to contain the e-mail address you want to use in case the failure is a false positive.
User Guide
Since all the code lives in an external file, you'll need to include it in your page with something along
the lines of:
<!-- #INCLUDE FILE="emailprotect.asp" -->
It doesn't really matter if this goes before or after any existing includes as it doesn't contain any code which
will automatically run as it is included by the ASP engine.
Once you have set up the include, you then need to protect any e-mail addresses by putting them inside the
wrapper function SafeEMail(), which might leave your links appearing similar to the following example:
<a href="<%= SafeEMail( "test@example.com" ) %>">contact me</a>
or
<a href="<%= SafeEMail( "test user <test@example.com>" ) %>">contact me</a>
Now if you view your ASP page you should see very little difference - this is because you are most likely going to
be a perfectly valid user browsing your website. Since the script detects this, the only change it makes is to
apply the substitution filter to the e-mail address.
If you happened to trip the checks at any point, you would instead see a piece of text giving a very brief explanation
of what went wrong, and the correct e-mail address would not appear.
Related Links
- Stopping unwanted crawlers - an ASP script which uses a smaller version
of the ruleset used here to deny certain crawlers access to your site.