There’s been a rash of really irritating image spam lately, and by its nature it is difficult for spam filters to catch.
For example, if you look at this spam:
and view the HTML source, you see the following:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META content="MSHTML 6.00.2800.1106" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT face=Arial size=2><IMG alt="" hspace=0
src="cid:000001c66dfc$24202377$7a47e8c8@legunj.hqyivu"
align=baseline
border=0></FONT></DIV>
</BODY></HTML>
There is no text in the body, just a single embedded image, so there is not much for a spam filter to go on, unless you simply want to ban HTML emails (not entirely practical for most…).
So we are killing it in our Ninja messaging security product with a regular expression, which looks like this:
^\s*?<!doctype\s+?html\s+?public\s+?"[^"]+?"\s*?>\s*?<html>\s*?<head>\s*?<meta\s+?[^>]*?content\s*?=\s*?(["'])[^\1]*?\1\s*?name\s*?=\s*?["']?GENERATOR["']?\s*?>\s*?<style[^>]*?>.*?</style\s*?>\s*?</head\s*?>\s*?<body\s+?bgColor\s*?=\s*?\S{7,7}\s*?>\s*?<div[^>]*?>.*?<font\s+?face\s*?=\s*?arial\s+?size\s*?=\s*?2*?>[^<]*?<img\s+?alt\s*?=\s*?(["'])\2\s+?hspace\s*?=\s*?0\s+?src\s*?=\s*?(["'])cid:[^@]{30,30}@[^\3]*?\3\s+?align\s*?=\s*?baseline\s+?border\s*?=\s*?0>\s*?</font>\s*?</div>\s*?</body>\s*?</html>\s*?$
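For anyone who wants to try a pattern like this outside of Ninja, here is a minimal Python sketch of how it might be applied to a message body. It is only an illustration under a few assumptions: that the pattern uses the usual `\s`, `\S` and backreference escapes, and that matching is case-insensitive with `.` allowed to span newlines. The post does not say which regex engine or options Ninja uses, and the names `IMAGE_SPAM_RE`, `SAMPLE_BODY` and `looks_like_image_spam` are made up for this example.

```python
import re

# The pattern from the post, split into pieces for readability only.
# Assumption: whitespace is \s, the body colour is \S{7,7}, and \1, \2, \3
# refer back to the quote characters captured earlier in the pattern.
IMAGE_SPAM_PATTERN = (
    r'''^\s*?<!doctype\s+?html\s+?public\s+?"[^"]+?"\s*?>'''
    r'''\s*?<html>\s*?<head>'''
    r'''\s*?<meta\s+?[^>]*?content\s*?=\s*?(["'])[^\1]*?\1'''
    r'''\s*?name\s*?=\s*?["']?GENERATOR["']?\s*?>'''
    r'''\s*?<style[^>]*?>.*?</style\s*?>\s*?</head\s*?>'''
    r'''\s*?<body\s+?bgColor\s*?=\s*?\S{7,7}\s*?>'''
    r'''\s*?<div[^>]*?>.*?<font\s+?face\s*?=\s*?arial\s+?size\s*?=\s*?2*?>'''
    r'''[^<]*?<img\s+?alt\s*?=\s*?(["'])\2\s+?hspace\s*?=\s*?0'''
    r'''\s+?src\s*?=\s*?(["'])cid:[^@]{30,30}@[^\3]*?\3'''
    r'''\s+?align\s*?=\s*?baseline\s+?border\s*?=\s*?0>'''
    r'''\s*?</font>\s*?</div>\s*?</body>\s*?</html>\s*?$'''
)

# Assumption: ignore case (the spam uses upper-case tags) and let "." cross
# line breaks. These flags are a guess, not Ninja's documented settings.
IMAGE_SPAM_RE = re.compile(IMAGE_SPAM_PATTERN, re.IGNORECASE | re.DOTALL)

# The HTML source of the sample spam shown earlier in the post.
SAMPLE_BODY = '''<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META content="MSHTML 6.00.2800.1106" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT face=Arial size=2><IMG alt="" hspace=0
src="cid:000001c66dfc$24202377$7a47e8c8@legunj.hqyivu"
align=baseline
border=0></FONT></DIV>
</BODY></HTML>
'''


def looks_like_image_spam(html_body: str) -> bool:
    """Return True if the whole HTML body fits the image-spam template."""
    return IMAGE_SPAM_RE.search(html_body) is not None


if __name__ == "__main__":
    print(looks_like_image_spam(SAMPLE_BODY))
    print(looks_like_image_spam("<html><body><p>Hello</p></body></html>"))
```

Because the pattern is anchored at both ends and pins down details like the 30-character `cid:` prefix, it only fires on bodies that follow this exact template, so ordinary HTML mail should pass through untouched.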
Alex Eckelberry