Overview of e-mail SPAM Elimination and its Efficiency
Tomáš Sochor
University of Ostrava, Czech Republic
8th Int. Conf. on Research Challenges in Information Science
Marrakesh, May 2014

Motivation of SPAM Study
• E-mail is still the most frequently used service
  – the highest number of users
  – closely related to other services, incl. social media
    • account registration
    • client verification
    • etc.
• SPAM amount is still increasing
  – the percentage is now stable around 90%
  – but absolute figures still rise!

T. Sochor – RCIS May 2014 Marrakesh

Global SPAM trends
[Figure: global SPAM trends]

SPAM measurement
• What is SPAM?
  – a message containing “Viagra”?
  – YES, BUT NOT if you are a medical doctor (a urologist)!
• the SPAM definition depends on the recipient
  – unlike e.g. a computer virus, malware, etc.
• How to measure SPAM?
  SPAM ratio = (No. of SPAM detected) / (Total No. of messages)

SPAM measurement & detection
• False positive detections are important
  – it “costs” more to “dig out” one legitimate message from the SPAM box
  – than to manually delete a SPAM message that was not detected
  False_pos_ratio = (No. of SPAM non-detected) / (Total No. of messages)
• BUT: how to count non-detected SPAM?

SPAM control history
• simple filtering
  – “Viagra” -> SPAM
  – easy to obfuscate: “\/iagra”
• multifactorial filtering / Bayes heuristics
  – multiple factors are considered
• L4 filtering
  – using TCP connection parameters
• etc.

SPAM control history – phase 2
Non-traditional methods:
• Collaborative filtering
• Sender Policy Framework
• Make sender pay
• Authentication of sender
• Challenge – Response
• Most of these methods try to distinguish SPAM before message delivery
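The simple keyword filter from the SPAM control history slide and the SPAM ratio from the measurement slide can be sketched together. This is only an illustration: the function names `naive_filter` and `spam_ratio` and the sample messages are invented here and are not part of the measured system.

```python
def naive_filter(body: str, keyword: str = "viagra") -> bool:
    """Flag a message as SPAM if it contains the keyword (case-insensitive)."""
    return keyword in body.lower()


def spam_ratio(flags: list[bool]) -> float:
    """SPAM ratio = (No. of SPAM detected) / (Total No. of messages)."""
    if not flags:
        raise ValueError("no messages to evaluate")
    return sum(flags) / len(flags)


# Hypothetical sample messages, invented for illustration only.
messages = [
    "Cheap Viagra, order now",
    r"Cheap \/iagra, order now",      # obfuscated spelling slips through
    "Minutes of Monday's meeting",
    "VIAGRA discount",
]

flags = [naive_filter(m) for m in messages]
print(flags)              # which messages the filter detected
print(spam_ratio(flags))  # detected SPAM / total messages
```

The obfuscated second message is not detected, so the measured ratio only counts what the filter catches, which is the counting problem raised on the measurement slide.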
SPAM control history – continued
• various ways of filtering
  – insufficient
  – applied only to messages already DELIVERED
• modern approaches try to eliminate SPAM during message delivery
  – not applied on the recipient’s computer
  – the mail server must check for SPAM

E-mail Operation Reminder
[Figure: e-mail operation diagram, annotated “anti-SPAM means should be applied HERE”]
• Server under control
• Before distribution to users

Typical SPAM control system
[Figure: typical SPAM control system, annotated “Error message ‘450 Greylisted’ answered here”]

SPAM Totals in long-term, University of Ostrava
[Figure: monthly SPAM totals, November 2007 to July 2012; vertical axis 0 to 1 200 000]

SPAM percentage long-term
[Figure: monthly SPAM percentage, October 2007 to August 2012; vertical axis 0,00% to 100,00%]

SPAM detection before delivery
• Formal check
  – e.g. existence of recipient address
• Blacklisting
  – message coming from a SPAMming server
• Greylisting
  – temporary blocking
  – only applied to unknown SMTP servers
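The three pre-delivery checks listed above can be chained on the receiving server roughly as follows. This is a minimal sketch under stated assumptions: the recipient list, the blacklisted networks, the set of known servers and the reply texts are all hypothetical, and only the 450 reply for greylisting comes from the slides.

```python
import ipaddress

# Hypothetical configuration, invented for illustration only.
KNOWN_RECIPIENTS = {"alice@example.org", "bob@example.org"}
BLACKLIST = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("2001:db8:bad::/48"),   # IPv6 entries are needed as well
]
KNOWN_SERVERS = {ipaddress.ip_address("198.51.100.10")}


def check_before_delivery(client_ip: str, recipient: str) -> tuple[int, str]:
    """Return an SMTP-style (code, text) reply for one delivery attempt."""
    ip = ipaddress.ip_address(client_ip)

    # 1. Formal check: does the recipient address exist?
    if recipient.lower() not in KNOWN_RECIPIENTS:
        return 550, "Unknown recipient"

    # 2. Blacklisting: the sender is identified by IP address, not e-mail address.
    if any(ip in net for net in BLACKLIST):
        return 554, "Sender blacklisted"

    # 3. Greylisting: only unknown SMTP servers are temporarily refused.
    if ip not in KNOWN_SERVERS:
        return 450, "Greylisted"

    return 250, "OK"


print(check_before_delivery("192.0.2.1", "alice@example.org"))
```

The order of the checks is a design choice made for this sketch (cheapest test first); the slides do not prescribe it.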
Blacklisting idea
• verification of the sender
  – against a BLACKlist of SPAMming servers
• known problem:
  – sender e-mail address can be spoofed easily
  – sender identification: IP address
  – at present: even more addresses required
    • IPv6

Blacklisting issues
• Almost nobody is able to maintain their own blacklist
• Third-party database (blacklist): not suitable for every organization
• Legitimate message sender in blacklist:
  – the delivery is usually impossible
  – refusal is announced to the sender
    • reason could be unclear
    • sender has limited tools to ask for exclusion from the blacklist

Blacklisting – error rate
• Errors can happen
  – usually as a result of wrong listing
  – the frequency of such errors provides a metric for blacklist correctness
• Errors are difficult to detect
  – but they occur

Blacklisting potential errors
[Figure: column chart of blocked IP addresses]
• Blue column: No. of IP addresses blocked by blacklisting
• Red column: the same number as of mid 2013
• A drop means IP addresses were removed from the blacklist

Blacklisting error rate
Period          Tot_req   Err1   Err2   Err_ratio
December 2009   45,478    1      4      0.01%
December 2010   47,015    12     6      0.04%
December 2011   47,084    15     1      0.03%
December 2012   44,530    5      1      0.13%
January 2013    2,836     2      9      0.39%
February 2013   3,148     4      1      0.39%

Greylisting principle
• simple idea:
  – operates BEFORE message delivery
  – inserting a short delay in message delivery
    • approx. 5 minutes
  – a SPAMmer does not repeat the attempt
  – in practice only applied to unknown sources

Greylisting weakness
• It is easy to adapt to a greylisting server
  – so far, adapting does not seem efficient for SPAMmers
• A SPAMmer can get into the AWL (auto-whitelist):
  – after several successful deliveries through greylisting, it is considered a legitimate source
    • and not checked any more
  – this behaviour can be eliminated by connection with a SPAM scanner
    • DNSB
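The greylisting principle and its AWL weakness can be sketched as a small state machine. This is a simplified illustration, not the configuration used in the measurements: it keys on the client IP address only, the class name `Greylist` is invented, and `AWL_THRESHOLD = 3` is an assumed stand-in for "several successful deliveries".

```python
from dataclasses import dataclass, field

GREYLIST_DELAY = 5 * 60   # approx. 5 minutes, as on the slide (in seconds)
AWL_THRESHOLD = 3         # assumed value for "several successful deliveries"


@dataclass
class Greylist:
    first_seen: dict = field(default_factory=dict)  # client IP -> time of first attempt
    passed: dict = field(default_factory=dict)      # client IP -> deliveries let through
    awl: set = field(default_factory=set)           # auto-whitelisted client IPs

    def check(self, client_ip: str, now: float) -> int:
        """Return 250 to accept the delivery or 450 to refuse it temporarily."""
        if client_ip in self.awl:
            return 250                       # whitelisted sources are not checked any more
        first = self.first_seen.setdefault(client_ip, now)
        if now - first < GREYLIST_DELAY:
            return 450                       # a legitimate server will retry later
        self.passed[client_ip] = self.passed.get(client_ip, 0) + 1
        if self.passed[client_ip] >= AWL_THRESHOLD:
            self.awl.add(client_ip)          # the weakness: a patient SPAMmer ends up here
        return 250


gl = Greylist()
print(gl.check("192.0.2.7", now=0))     # first attempt from an unknown source
print(gl.check("192.0.2.7", now=300))   # retry after the delay
```

A sender that never retries stays at the 450 reply forever, which is why the delay alone removes much SPAM; a sender that does retry is eventually whitelisted, which is the weakness described above.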
Efficiency of various SPAM control mechanisms
• measurement at 2 independent universities
  – 20,000 – 50,000 attempts/day avg.
  – i.e. 2,000 – 5,000 legitimate messages/day
  – another, smaller SMTP server was studied for a shorter period

Blacklisting and greylisting efficiency
[Figure: blacklisting and greylisting efficiency]

Greylisting and content-search efficiency
[Figure: greylisting and content-search efficiency]

Greylisting efficiency comparison
• 3 SMTP servers as mentioned
• Short-term comparison
  – average for the only period available (March 2012)

Server        GL efficiency avg.
Ostrava Uni   89,3%
Nitra Uni     95,0%
Zebra         92,2%

SPAM Elimination Efficiency
[Figure: SPAM blocked by scanner and SPAM blocked by greylist; vertical axis 0 to 6, x 100000]

Improving the SPAM detection efficiency
• components of the multilevel SPAM protection do not share information
[Figure annotation: IP address of SPAMmer to remove from AWL]

SPAM search + greylisting linkage potential
[Figure: blocked by scanner and potentially blocked by greylisting; vertical axis 0 to 18, thousands]

Conclusions
• Blacklisting is the most efficient
  – but not 100% error-free
• Greylisting efficiency is stable in the long term
• A combination of blacklisting, greylisting and a SPAM scanner is recommended
• Better cooperation between all components will improve the efficiency

Inlet filtering results
[Figure: share filtered by SBL and by other postfix filters, January 2010 to April 2010; vertical axis 0,00% to 70,00%]
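The linkage proposed on the "Improving the SPAM detection efficiency" slide, where the scanner passes the SPAMmer's IP address back so it can be removed from the AWL, could look roughly like this. The function `report_spam` and both data structures are hypothetical; the slides do not describe a concrete interface.

```python
def report_spam(awl: set[str], first_seen: dict[str, float], client_ip: str) -> bool:
    """Hypothetical hook called when the content scanner classifies a message as SPAM.

    Removes the sending IP from the AWL and forgets its greylisting history,
    so its next attempt is greylisted again. Returns True if the IP had been
    whitelisted before the call.
    """
    was_whitelisted = client_ip in awl
    awl.discard(client_ip)
    first_seen.pop(client_ip, None)
    return was_whitelisted


# Invented example state: one whitelisted source that then sends SPAM.
awl = {"192.0.2.7", "198.51.100.10"}
first_seen = {"192.0.2.7": 0.0, "198.51.100.10": 10.0}

print(report_spam(awl, first_seen, "192.0.2.7"))
print(sorted(awl))
```

Dropping the greylisting history as well as the AWL entry is a choice made for this sketch; keeping the history would let the source pass greylisting again immediately.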
Greylisting efficiency rectified
[Figure: blocked deliveries (modified), 02/2007 to 04/2010; vertical axis 0,0% to 100,0%]

Thank you for your attention
• Questions?
• Comments?
• [email protected]
• http://www1.osu.cz/home/sochor