Spam Detection Using IsMail - An Artificial Immune System For Mail
Slavisa Sarafijanovic and Jean-Yves Le Boudec, EPFL
MICS, Neuchatel, August 2-3, 2007.
1/6
IsMail – An Artificial Immune System
For Collaborative Spam Detection
(An artificial immune system is a system based on the principles of the human immune system.)
• One antispam system (IsMail) is added per email server.
• Antispam systems collaborate.
[Figure labels: EPFL, ETHZ, UNIL]
2/6
Let’s first recall what information can be used
for automated spam recognition
1. Spamminess of words
Per user learned database:
P( Spam | Credit_card ) = 0.8
P( Spam | Cent ) = 0.95
…
P( Spam | Picture ) = 0.001
New spam email:
Picture the NewYear with 0
Credit Card Debt and
absolutely not spending an
other cent.
You can contact us at :
1--561-282-9476
and mind, said Zeb, we don't
want to ....
Compute
P( Spam | Picture, NewYear, Credit_card, … )
Per user!
2. Bulkiness of spam
Spam bulk: copies of the same email arrive at User 1, User 2, …, User N.
Detecting bulkiness: Hash( email ) is computed for each user's copy, and a Counter counts the matching hashes.
Not used enough!
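The hash-and-counter idea for detecting bulkiness can be sketched as follows. This is a minimal illustration, not the IsMail design: the exact SHA-256 hash, the whitespace and case normalization, the function names and the `min_users` threshold are all assumptions made for the example.

```python
import hashlib
from collections import Counter

def email_digest(body: str) -> str:
    """Hash a lowercased, whitespace-normalized body (normalization is an assumption)."""
    normalized = " ".join(body.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def bulky_digests(inboxes, min_users):
    """Return the digests that appear in the inboxes of at least `min_users` users."""
    counter = Counter()
    for emails in inboxes.values():
        # Count each digest once per user, so a single user cannot inflate the count.
        for digest in {email_digest(e) for e in emails}:
            counter[digest] += 1
    return {d for d, n in counter.items() if n >= min_users}

# Hypothetical inboxes; the spam text is the sample from the slide.
SPAM = ("Picture the NewYear with 0 Credit Card Debt and "
        "absolutely not spending an other cent.")
inboxes = {
    "user1": [SPAM, "Lunch at noon?"],
    "user2": ["Minutes of the meeting", SPAM],
    "userN": [SPAM.upper()],
}
bulky = bulky_digests(inboxes, min_users=3)
print(email_digest(SPAM) in bulky)
```

An exact hash like this one changes completely if a spammer alters a single word, which is one reason a similarity-preserving hash may be preferable in practice.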
3. Sender information
Sent From: [email protected]
Sent From: [email protected]
Botnets, Nigerian spam…!
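Of the three kinds of information above, the per-user word spamminess of item 1 lends itself to a short worked example. The slides do not say how the per-word values are combined into P( Spam | Picture, NewYear, Credit_card, … ); the sketch below assumes the common naive Bayes style combination (independent words, equal priors), and the function name, the neutral default of 0.5 and the clamping bounds are illustrative assumptions.

```python
import math

# Per-user learned values P(Spam | word); these three numbers are from the slide.
word_spamminess = {"credit_card": 0.8, "cent": 0.95, "picture": 0.001}

def combined_spam_probability(words, table, default=0.5):
    """Combine per-word P(Spam | word), assuming independent words and equal priors."""
    log_spam = 0.0
    log_normal = 0.0
    for word in words:
        p = table.get(word, default)      # unknown words count as neutral (assumption)
        p = min(max(p, 1e-6), 1 - 1e-6)   # clamp to keep the logarithms finite
        log_spam += math.log(p)
        log_normal += math.log(1.0 - p)
    # Equivalent to prod(p) / (prod(p) + prod(1 - p)), evaluated in log space.
    return 1.0 / (1.0 + math.exp(log_normal - log_spam))

score = combined_spam_probability(["credit_card", "cent", "newyear"], word_spamminess)
print(round(score, 3))
```

Because the table is learned per user, the same email can receive different scores for different users.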
3/6
How does our artificial immune system (AIS) detect spam? (1/2)
The AIS produces and uses detectors
- detectors are binary strings able to recognize (using
similarity matching) spammy portions of emails
New email:
Picture the NewYear with 0
Credit Card Debt and
absolutely not spending an
other cent.
You can contact us at :
1--561-282-9476
and mind, said Zeb, we
don't want to ....
Similarity hashing of the new email:
010101101101
111011100001
Set of detectors:
111010101110
111011100001
…
101010111111
Similarity matching of the hashes against the detectors: spam/normal
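A minimal sketch of the matching step, using the binary strings shown in the figure. The slides do not define the similarity measure; Hamming distance with a threshold of 2 is assumed here purely for illustration, as is the reading of which strings are detectors and which are hashes of the new email.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    return sum(x != y for x, y in zip(a, b))

def classify(email_strings, detectors, max_distance=2):
    """Label the email 'spam' if any of its strings is close to any detector."""
    for s in email_strings:
        if any(hamming(s, d) <= max_distance for d in detectors):
            return "spam"
    return "normal"

# Strings taken from the figure; their roles are an interpretation.
detectors = ["111010101110", "111011100001", "101010111111"]
email_strings = ["010101101101", "111011100001"]

label = classify(email_strings, detectors)
print(label)
```

A threshold above zero lets a detector also match strings that differ from it in a few bits, which is what distinguishes similarity matching from exact lookup.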
4/6
How does our artificial immune system detect spam? (2/2)
How the detectors are produced (steps 1-5 in the figure):
1. Random candidate detectors (randomness)
2. Negative Selection (adaptation to the user’s profile)
3. Maturation, linked by collaboration to Maturation in another system (discover new bulky spam)
Further figure labels, for steps 4 and 5: local processing; “delete as spam” feedback from the protected system; active detectors, new and old (memory)
Conclusion: the AIS approach seems well suited to distributed detection problems
Analogy to the human immune system: steps 1-5!
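Steps 1 and 2 of detector production (random candidates, then negative selection against the user's profile) can be sketched in the generic form found in the artificial immune system literature. This is not the patented IsMail procedure: the 12-bit length, the Hamming threshold, the function names and the "self" strings standing in for the user's normal email are all hypothetical.

```python
import random

def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def random_detector(rng, length=12):
    """Step 1: draw a random candidate detector."""
    return "".join(rng.choice("01") for _ in range(length))

def negative_selection(self_strings, n_candidates, rng, length=12, max_distance=2):
    """Step 2: discard candidates that match the user's normal ('self') strings."""
    survivors = []
    for _ in range(n_candidates):
        candidate = random_detector(rng, length)
        if all(hamming(candidate, s) > max_distance for s in self_strings):
            survivors.append(candidate)
    return survivors

rng = random.Random(0)                            # fixed seed for repeatability
self_strings = ["000000111111", "110011001100"]   # hypothetical user profile
candidates = negative_selection(self_strings, n_candidates=50, rng=rng)
print(len(candidates), "candidates survive negative selection")
```

The surviving candidates cannot match the user's normal email within the chosen threshold, which is the sense in which negative selection adapts the detectors to the user's profile.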
5/6
Project Status
Initial evaluation (simulation):
[Figure: True Positive and False Positive, for not obfuscated spam and obfuscated spam]
Preliminary detection results with respect to the number
of collaborating antispam systems: modest
collaboration (small number of neighboring servers)
enables promising detection results; the system is
resistant to the tested obfuscation by spammer.
(Disclaimer: simulation assumptions!)
Patented design:
• “METHOD TO FILTER ELECTRONIC
MESSAGES IN A MESSAGE PROCESSING
SYSTEM”, US patent No 11/515,063, filed on
Sept 5, 2006.
Built a realistic prototyping
and evaluation platform:
• “AntispamLab – A Tool For Realistic Evaluation of
Spam Filters”, accepted for The Fourth Conference
on Email and Antispam, Mountain View, California,
USA, August 2-3, 2007.
6/6