(The Wildlist is the basis of in-the-wild virus testing and certification by Virus Bulletin, ICSA, and West Coast Labs.)
Larry Seltzer’s pejorative post about the Wildlist last week sparked some strong discussions in the industry, with arguments on both sides. More has followed since the article, related or not.
The shocker was last Thursday, when it was reported that Trend Micro (following Panda’s lead) has decided to “boycott” the Wildlist.
I believe the Wildlist is an outdated method of determining the efficacy of an antivirus product. Oddly enough, it would be to my benefit to say just the opposite and promote the Wildlist as effective, since it’s a fairly small list of malware to worry about. Once one is “certified” for the Wildlist, one could then be considered a “real” antivirus product. Nothing could be further from the truth, and therein lies the problem: it’s an implicit (and unintentional) form of fraud.
Andreas Marx echoes a fair amount of this sentiment in an email he circulated among some researchers last week:
…there is nothing wrong with the actual testing performed by Virus Bulletin, the problem is related to the samples from the WildList. Indeed, as Larry Seltzer pointed out, there is something seriously wrong with WildList-based testing and certification.
At the Virus Bulletin Conference in 2007, Frank Dessmann and I gave a presentation, “The WildList is Dead, Long Live the WildList!”. What we found (proven with facts and figures), pointed to the actual problem of the WildList: We are mainly speaking about a “list of a small number of irrelevant random malware samples”.
And here we go for the facts (updated to include the most current WildList from April 2008):
1. The threat landscape has changed dramatically, just a few years back, we had to deal with 10 to 20 virus samples per day, now we are up to 21,000 unique new samples per day, but the current April WildList only includes 678 samples — that’s the number of samples we are getting on an average day in under an hour! Besides this, the WildList only covers self-replicating malware such as viruses, but not today’s most common threats, like Trojan Horses or rootkits. By ignoring today’s reality, the list MISSES the really HOT samples and the numbers of samples on the WildList is TOO SMALL.
2. While the WildList shows that currently 86 reporters (mainly working for AV companies) are submitting samples to the organization, most of them are inactive, so that a WildList is usually created by 8 to 10 active reporters — and these 8 to 10 people can “decide” which malware is the most active one world-wide? Therefore, the list is a RANDOM selection of samples.
3. The WildList is usually outdated when published: The April WildList was released on June 1, 2008, so the entire May (or at least 640,000 new malware samples we’re seeing per month) have passed by before the WildList was finally published… so the list contains only OLD malware, not really new samples.
The problems are not new — I’ve written about them one year ago, but (almost) nothing has changed. 🙁
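To put the numbers in that email side by side, here is a quick back-of-the-envelope sketch. It uses only the figures Andreas quotes (21,000 new samples per day, 678 samples on the April WildList); the 30-day month and the variable names are my own assumptions for illustration.

```python
# Figures quoted in the email above.
NEW_SAMPLES_PER_DAY = 21_000
WILDLIST_SIZE = 678
# Assumption: a 30-day month, used only for a rough monthly estimate.
DAYS_PER_MONTH = 30

# How many new samples arrive in one hour on an average day.
samples_per_hour = NEW_SAMPLES_PER_DAY / 24

# How long it takes for new samples to outnumber the whole WildList.
minutes_to_match_wildlist = WILDLIST_SIZE / samples_per_hour * 60

# Rough monthly volume, and the WildList's size relative to it.
samples_per_month = NEW_SAMPLES_PER_DAY * DAYS_PER_MONTH
wildlist_share_of_month = WILDLIST_SIZE / samples_per_month

print(f"New samples per hour: {samples_per_hour:.0f}")
print(f"Minutes to match the WildList: {minutes_to_match_wildlist:.1f}")
print(f"New samples per 30-day month: {samples_per_month:,}")
print(f"WildList as a share of one month: {wildlist_share_of_month:.3%}")
```

At 875 new samples an hour, the whole list is matched in well under an hour, and a 30-day month comes to 630,000 samples, in line with the roughly 640,000 per month quoted above.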
Another source of some contention on the Wildlist issue is the venerable Randy Abrams of ESET. In his words, “Agreement was virtually unanimous that the WildList is no longer useful as a metric of the ability of a product to protect users”.
Mary Landesman, on the other hand, wrote last year that “As much fun as it is to take cheap potshots and sling similes, the fact is the WildList is more pertinent than ever – particularly given today’s threat landscape. By setting a standard, definable bar, the WildList has consistently improved detection across the board. Reputable anti-virus vendors must work (hard) to gain credibility, participating fully in order to engage in the sample sharing necessary to build the library of threats required to score well on the tests. But what WildList testing really offers today is a measure of trust.” Checking back with Mary recently, she still believes the Wildlist is valid, but expanded on her viewpoint:
…I still think the WildList provides a resource to the average user to distinguish between *real* scanners and rogue (or just plain lousy) scanners. It also establishes some consistency between testing organizations so it helps set a minimum bar for the testers as well. Keep in mind, I’ve never believed (even in the mid-90s) that WildList testing alone was a good enough measure of a products overall protection capability. I just believe in the value of having a minimum bar and for that purpose it does well. This doesn’t mean it’s not in need of an overhaul. As an example, I’d love to see some minimum standard measure of behavioral analysis. And it needs a name change because it certainly isn’t reflective of what’s in the wild (indeed, nothing could be and still be manageable!) So when I say I think the WildList is still pertinent, I am referring to the value of setting those minimum bars, not necessarily about the WildList as it exists per se.
So let’s shift gears. I’ll pose the problem as a question: What would you consider the gauge of effectiveness of an antivirus product? (I use the term “antivirus product” to denote the current mode of protecting against malware in general: viruses, spyware, etc.)
If I’m guessing right, your answer would probably be something like:
“A good antivirus program should block the broadest range of malware possible, including new malware as it comes out, and should be able to clean infected systems with a high degree of effectiveness.”
The Wildlist has only about 700 samples. Well, there’s lots and lots of malware out there. We recently ran scans on over 10 million pieces of malware, and found hundreds of thousands of variants. Are they all in the wild? Not necessarily. But you get the picture. There’s a lot of stuff out there, and the Wildlist itself is not “the test”.
And I don’t think anyone really disagrees with that.
The Wildlist itself is peachy, and I think it’s just fine as a test of file-infecting viruses, etc. It’s not an easy test to pass, and I think it should continue to be there. But it’s misleading if consumers rely on the Wildlist (through VB 100 certification or another certification) to judge the validity of an antivirus product. Perhaps there should be a “Basic” certification, and then a “Premium” Wildlist certification.
As the final word on this issue, I hope this will elucidate the problem graphically. Here, again from Andreas, are collection statistics on malware (these are newly added/discovered samples per month, with a total of about 11.45 million samples right now).