About Spamming Techniques

Very few people want spam, fewer read it, fewer still respond by visiting
a spamvertised website, and far fewer actually buy something. Because spam
is so ineffective, spammers must make every effort to get it into your
inbox: if it never arrives, there is no chance you will open it.

More and more people use filters, such as PrismEmail, to block spam.
Despite this, spammers seem to think that if they can get their spam past
the filters, you are likely to buy from them. This is a dubious assumption,
since the people who deliberately filter spam are probably the least likely
to purchase from a spammer. Nonetheless, spam filtering is essentially an
arms race between spammers and spam filters such as PrismEmail.

This page reviews some of the tactics spammers use to try to get past
filters, and what the filters do in response.

Word Substitution

Years ago, spam filtering was pretty easy. Certain words appeared often in
spam but virtually never in legitimate email. For example, the phrase
"limited time offer" might appear often in spam but virtually never in a
valid email. As a result, spam filters would simply look for certain suspect
phrases and discard the message if they found them. Spam filters grew more
effective by adding more and more phrases and words to their databases.

Of course, spammers just started substituting mangled spellings for the
words they knew would be in many spam filters. That's why you see spam with
text such as "V!agra" instead of "Viagra": the spammers assume that the word
Viagra may be in a spam filter, but V!agra might not be. The immediate
problem for the spammer is that a message filled with misspelled, mangled
words looks a lot less professional and trustworthy, so doing this drives
down their response rate even further.

This approach is still used by spammers and can be somewhat effective,
depending on the spam filter being used. There are so many ways to mangle
the word "Viagra" that some variations will probably get by spam filters
that do nothing more than scan the message for known suspicious words. This
technique is not very useful against modern spam filters, however,
especially Bayesian filters such as those used here at PrismEmail. After
all, an email that contains the word "Viagra" may or may not be spam, but an
email that contains the word "V!agra" is almost definitely spam. So the fact
that a spammer mangles the words in the email can actually make it easier to
recognize the message as spam.
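
The cat-and-mouse game described above can be sketched in a few lines of Python. This is only an illustration, not PrismEmail's implementation: the phrase list, the look-alike table, and both function names are assumptions made up for the example. The second function shows one possible countermeasure (undoing look-alike characters before matching), which the text above does not attribute to any particular filter.

```python
# Hypothetical blocklist; a real phrase filter would carry a far larger one.
SUSPECT_PHRASES = ["viagra", "limited time offer"]

# Assumed look-alike substitutions that a spammer might use to disguise letters.
LOOKALIKES = str.maketrans({"!": "i", "1": "i", "@": "a", "0": "o", "$": "s"})


def naive_is_spam(text: str) -> bool:
    """Flag a message only if a suspect phrase appears verbatim."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in SUSPECT_PHRASES)


def normalized_is_spam(text: str) -> bool:
    """Undo the assumed look-alike substitutions, then match as before."""
    cleaned = text.lower().translate(LOOKALIKES)
    return any(phrase in cleaned for phrase in SUSPECT_PHRASES)


message = "Buy V!agra now"
print(naive_is_spam("Buy Viagra now"))  # the plain spelling is caught
print(naive_is_spam(message))           # the mangled spelling slips past
print(normalized_is_spam(message))      # caught once look-alikes are undone
```

A fixed look-alike table has the same weakness as a fixed phrase list: the spammer can always invent a variation it does not cover, which is part of why statistical filters fare better here.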

HTML Encoding

When word substitution became difficult for the spammers, they turned to
technical means of using the words they wanted without being detected by
spam filters. HTML provides a way to encode characters so that the
characters themselves don't appear in the message source, and spammers used
this to their advantage. For example, Viagra can be written in an HTML email
as the numeric character references &#86;&#105;&#97;&#103;&#114;&#97;. (The
same letters can be hidden inside a link with URL percent-encoding, as
%56%69%61%67%72%61.) When you read the email with your email program it will
be displayed as "Viagra", but a simple spam filter looking for the word
"Viagra" won't find it since it is encoded with numbers.

As it turns out, this was just spammers taking advantage of very simple
spam filters. Because this hadn't been done before, spam filters hadn't been
built to handle number-encoded messages, so the spam got through. As soon as
this type of spam was noticed, the filters were improved to decode the
numbers and then filter normally.

Many spammers still use this approach to try to get by spam filters, but
it's not clear why they bother. The technique is useless against any modern
spam filter, since they are all capable of decoding these messages and
filtering normally.
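
Decoding before filtering takes very little code, which helps explain why this trick stopped working so quickly. The sketch below uses only the Python standard library and handles both HTML numeric character references and URL-style percent-encoding. The function names are made up for the example, and a real filter would decode as part of full HTML parsing rather than on a bare string.

```python
import html
from urllib.parse import unquote


def decode_obfuscation(text: str) -> str:
    """Undo percent-encoding, then HTML character references."""
    return html.unescape(unquote(text))


def contains_word(text: str, word: str) -> bool:
    """Case-insensitive search that looks at the decoded text."""
    return word.lower() in decode_obfuscation(text).lower()


percent_encoded = "%56%69%61%67%72%61"
char_refs = "&#86;&#105;&#97;&#103;&#114;&#97;"

print("viagra" in percent_encoded.lower())   # a plain search finds nothing
print(contains_word(percent_encoded, "Viagra"))
print(contains_word(char_refs, "Viagra"))
```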

HTML Comments

In the spirit of the HTML encoding technique just explained, spammers
found yet another way to use HTML to their advantage: HTML comments. A
comment can be embedded in an HTML page or email without being displayed to
the user. Legitimate developers often use comments to document a web page so
that other developers can understand what the original developer did. For
example, the comment <!-- This is a comment --> can be embedded anywhere in
an HTML page and the user will not normally see it.

Spammers use comments to break up suspicious words. For example, instead
of writing the word "Viagra" they may write "Vi<!-- useless comment -->agra".
As with HTML encoding, the user will see this as "Viagra", but a simple spam
filter that cannot deal with HTML comments will not catch it, since it never
sees "Viagra" as a continuous string of characters.

As with HTML encoding, it's not clear why spammers still bother with this.
All modern spam filters ignore HTML comments completely, so spammers gain
nothing by using them. Unfortunately, spammers often use so many comments
that the size of a spam message can double, just because a comment has been
inserted in the middle of every word. And, as with HTML encoding, the very
presence of HTML comments embedded within words is often a very good
indication that the message is spam. So, once again, the spammers' efforts
to get past the filter actually make them easier to detect.
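
Both halves of that response, ignoring comments and treating comments inside words as a warning sign, can be sketched with regular expressions. This is a simplified illustration that assumes the standard `<!-- ... -->` comment syntax. The function names are hypothetical, and a production filter would use a real HTML parser rather than regexes.

```python
import re

# Matches one HTML comment, including comments that span several lines.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

# Matches a comment wedged directly between two letters, as in Vi<!-- x -->agra.
SPLIT_WORD_RE = re.compile(r"(?<=[A-Za-z])<!--.*?-->(?=[A-Za-z])", re.DOTALL)


def strip_comments(html_text: str) -> str:
    """Return the text as the reader sees it, with comments removed."""
    return COMMENT_RE.sub("", html_text)


def count_word_splitting_comments(html_text: str) -> int:
    """Count comments that break a word in two (a possible spam signal)."""
    return len(SPLIT_WORD_RE.findall(html_text))


spam = "Buy Vi<!-- useless comment -->agra today"
print(strip_comments(spam))
print(count_word_splitting_comments(spam))
```

A comment that sits between whole words, as a developer would normally write it, is not counted by the second function.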

Random Variations

Years ago, some spam filters tried to catalog each and every spam that was
received. When someone reported a spam, the message was saved in a database
so that if anyone else received the same message it would automatically be
discarded as spam. This relied on the assumption (which used to be valid)
that each of the millions of copies of a spam had almost exactly the same
body. That being the case, it was not terribly difficult to check whether a
new message was very similar to a spam that someone else had already
reported.

When these spam filters became popular, spam software evolved to produce
unique messages. The software would insert random garbage words in various
places throughout the message and in the subject. Thus it is very common to
see "Buy Viagra here 3d3fdsas", with random-looking garbage in the subject
and scattered throughout the body. Spammers do this so that software
comparing each message against reported spam will not recognize it: there
are enough random words to make the system think it is an entirely
different message.

This approach to spam filtering isn't very common anymore. Because spammers
make each message different, matching against previously reported spam
catches far less than it used to, and few systems still rely on it. That
makes it strange that spammers still insert random garbage words in the body
or subject. Most spam filters ignore such garbage completely. Others are
smart enough to realize that these garbage words are a pretty good indicator
of spam, and treat their presence as evidence that a message is probably
spam.
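
The sketch below illustrates why random garbage defeats a catalog of reported spam, using an exact-match fingerprint as a stand-in for whatever comparison those older filters really used (the text above does not say). It also includes a crude garbage-word detector. Every name here is hypothetical, and the "letters mixed with digits" rule is just one assumed heuristic that would also flag some legitimate tokens.

```python
import hashlib
import re

# Fingerprints of messages that users have reported as spam.
reported_fingerprints = set()


def fingerprint(message: str) -> str:
    """Hash the message after normalizing case and whitespace."""
    normalized = " ".join(message.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def report_spam(message: str) -> None:
    reported_fingerprints.add(fingerprint(message))


def is_known_spam(message: str) -> bool:
    return fingerprint(message) in reported_fingerprints


def garbage_tokens(message: str) -> list:
    """Find tokens of 6+ characters that mix letters and digits."""
    pattern = r"\b(?=[a-z]*\d)(?=\d*[a-z])[a-z\d]{6,}\b"
    return re.findall(pattern, message.lower())


report_spam("Buy Viagra here")
print(is_known_spam("buy  viagra HERE"))          # same message, still matched
print(is_known_spam("Buy Viagra here 3d3fdsas"))  # one random word breaks the match
print(garbage_tokens("Buy Viagra here 3d3fdsas"))
```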

Dictionary Word Inclusion

The newest approach to filtering spam is the Bayesian filter, which is
described more fully here. It takes a statistical approach to spam
filtering: each word in an email is counted to determine how often it
appears in good email and how often it appears in spam. The word "Viagra"
might appear in 50 spams but only 1 good email, so the presence of the word
"Viagra" is a pretty good indication that the message is spam. When this
information is combined with the probabilities of the other words in the
message being "spammy" or not, it is possible to calculate the probability
that the message as a whole is spam.
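
As a rough illustration of that calculation, the sketch below turns word counts into a per-word spam probability and then combines several word probabilities with the usual naive Bayes product. It is not PrismEmail's formula. The training totals (100 spam and 100 good messages), the clamping bounds, and the function names are all assumptions chosen for the example.

```python
from math import prod


def word_spam_probability(spam_count: int, good_count: int,
                          total_spam: int, total_good: int) -> float:
    """Estimate how strongly one word points to spam, from training counts."""
    spam_rate = spam_count / total_spam
    good_rate = good_count / total_good
    if spam_rate + good_rate == 0:
        return 0.5  # word never seen: no evidence either way
    p = spam_rate / (spam_rate + good_rate)
    # Clamp so that a single word can never be treated as absolute proof.
    return min(0.99, max(0.01, p))


def message_spam_probability(word_probs: list) -> float:
    """Combine per-word probabilities, treating the words as independent."""
    spammy = prod(word_probs)
    innocent = prod(1 - p for p in word_probs)
    return spammy / (spammy + innocent)


# "Viagra": seen in 50 of an assumed 100 spams and 1 of an assumed 100 good emails.
p_viagra = word_spam_probability(50, 1, 100, 100)
print(round(p_viagra, 3))

# A second spammy word strengthens the verdict; a neutral word (0.5) changes nothing.
print(message_spam_probability([p_viagra, 0.9]) > p_viagra)
print(abs(message_spam_probability([p_viagra, 0.5]) - p_viagra) < 1e-12)
```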

This approach is one of the most effective that has ever been used to
fight spam, is available here at PrismEmail, and is the approach to spam
filtering we advocate.

Spammers are just starting to try to get around Bayesian filters.
Unfortunately for them, it is unlikely they will succeed. Please see this
discussion for a full explanation, but the key point is that the statistics
in a Bayesian filter are different for each user. For a message to get
through, the spammer needs to use words that are commonly used in your good
email and not used in your spam. This is very, very difficult if the
spammer doesn't have a large sample of your good email and spam.

Some spammers apparently believe that inserting random dictionary words
that are not usually used in spam, such as "political," "democracy," or
"nation," will give them a better chance of getting past a Bayesian filter.
Presumably the reasoning is that since most spam doesn't use these words,
including them will make the message look legitimate. Fortunately for us,
this doesn't usually work. The spammer would have to use distinctive words
that are very specific to you. For example, if you have a friend named
Thomas, then that word would help a spammer get past a Bayesian filter, but
only your Bayesian filter. The word Thomas wouldn't help them get past
someone else's filter unless that person also talked a lot about someone
named Thomas.

Since spammers don't know which words are truly innocent for you (such as
"Thomas"), it is unlikely that a brute-force dictionary attack will work.
We've received spam with entire sections of the U.S. Constitution embedded
in it to try to get past the Bayesian filter, and even so the Bayesian
filter was able to recognize it as spam.
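
The difference between generic padding and a word that is personal to one user can be sketched as follows. The per-user probability tables are invented for the example, and the scoring rule (use only the few words whose probabilities lie farthest from 0.5) is one design used by some Bayesian filters, assumed here rather than taken from PrismEmail. Under that rule, near-neutral dictionary words never enter the calculation at all.

```python
from math import prod

UNKNOWN = 0.5  # assumed score for a word this user's filter has never seen


def score(words: list, user_probs: dict, top_n: int = 3) -> float:
    """Score a message using only its most decisive words for this user."""
    probs = [user_probs.get(w, UNKNOWN) for w in {w.lower() for w in words}]
    # The most "interesting" words are those farthest from the neutral 0.5.
    decisive = sorted(probs, key=lambda p: abs(p - 0.5), reverse=True)[:top_n]
    spammy = prod(decisive)
    innocent = prod(1 - p for p in decisive)
    return spammy / (spammy + innocent)


# Hypothetical statistics learned from two different users' mail.
common = {"viagra": 0.98, "pharmacy": 0.95, "buy": 0.80,
          "political": 0.45, "democracy": 0.40, "nation": 0.50}
alice = dict(common, thomas=0.02)  # Alice often writes about her friend Thomas
bob = dict(common)                 # Bob's filter has never seen the word

spam = ["buy", "viagra", "pharmacy"]
padded = spam + ["political", "democracy", "nation"]
targeted = spam + ["thomas"]

print(score(padded, alice) == score(spam, alice))   # padding changes nothing
print(score(targeted, alice) < score(spam, alice))  # "thomas" helps against Alice
print(score(targeted, bob) == score(spam, bob))     # but not against Bob
```

Even with the personal word, the message in this toy example still scores as very likely spam for Alice; one innocent word only offsets one spammy word.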