Over the past two weeks, Consumer Reports has been slammed by the bulk of professional researchers in the security community for testing antivirus programs using 5,500 “fake” viruses.
Consumer Reports fans and a minority group in the security community, however, fought back. After all, Consumer Reports is seen by many as a competent, independent testing lab, while antivirus companies are widely seen as lazy, self-serving, money-hungry outfits that have been soaking users for years with crappy products and high subscription fees. So even though Consumer Reports was lambasted by professional security researchers with no ties to antivirus companies, the criticism was seen by some as whining by money-hungry antivirus companies.
Well, OK, on to Chapter 2, which is even more damning than the AV test, because I have something so incredible it boggles the mind.
In addition to antivirus programs, Consumer Reports tested antispyware applications. And they have now confirmed that they did not test against any spyware for their antispyware testing. (Feel free to read that sentence again.)
Instead, their entire test of antispyware applications was based on running applications against Spycar, a set of applications written by Intelguardians that mimic spyware behavior — directly against the explicit instructions of the Spycar developers.
The entire test. Blocking. Scan and remove. The works.
From a letter to us:
We assessed the ability of products to detect and block malware that had not yet been explicitly included in definition updates. This required the software to be capable of examining typical behaviors using heuristic methods. In the case of spyware, we used the public suite of Spycar scripts as published by Intelguardians Network Intelligence LLC, at http://www.spycar.org.
For each tested anti-spyware product, installed as the only anti-spyware product in a virtual session, we did a fresh boot and an update check for the product. We then ran each of the Spycar suite’s 17 components, allowing the anti-spyware program to attempt to detect and either warn the user or block the behavior. We then ran the evaluation tool and noted the behaviors that had been allowed. We then refreshed the session (undid the actions), and repeated the “infections”, but this time, prior to evaluation, we ran a scan with the anti-spyware program and allowed it to detect and undo any behaviors it found post-infection. We then ran an evaluation to see how many behaviors still remained.
The results of our two runs formed the basis of the “Blocking” performance in our ratings.
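Boiled down, the letter describes two runs per product. A rough sketch of that harness (the function and class names are my stand-ins, not Consumer Reports' actual tooling):

```python
# Hypothetical sketch of the two-run procedure Consumer Reports describes.
# A "product" object here stands in for whatever real-time hooks and
# on-demand scanner the antispyware product actually exposes.

def run_blocking_test(spycar_components, product):
    """Run 1: let the product try to warn about or block each behavior."""
    allowed = []
    for component in spycar_components:
        if not product.blocks(component):   # warn-or-block check
            allowed.append(component)
    return allowed                          # behaviors that got through

def run_scan_test(spycar_components, product):
    """Run 2: let all behaviors occur, then scan and undo post-infection."""
    infected = list(spycar_components)
    removed = product.scan_and_remove(infected)
    return [c for c in infected if c not in removed]  # behaviors remaining
```

The results of both runs, per the letter, were then folded into a single "Blocking" score, which is exactly where the trouble starts.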
What does Spycar do? Things like installing fake registry keys, changing your start page, and the like. It is specifically designed to test how well antispyware programs block unknown applications, not how well they scan and remove.
Remember that antispyware applications generally should do three things:
a) Scan for spyware.
b) Remove spyware.
c) Block new spyware, hopefully before it infects your system.
Spycar is ONLY designed to be a limited test of the blocking capability of an antispyware program.
As Ed Skoudis, one of the authors of Spycar, pointed out to me:
Spycar is focused on evaluating behavior-based detection mechanisms. That’s labeled very clearly all over the Spycar website. Its only use in testing signature-based scanning products is in showing that they are just that, signature-based scanning products. That is, Spycar can be useful in determining that a product has no real-time behavior-based detection mechanisms. But, it’s not useful beyond that determination in evaluating on-demand signature-based tools or comparing them against each other. Now, it can be used to show that one tool has real-time behavior based defenses, and another doesn’t. That is a useful comparison point, provided that customers understand what it means (and, an article should explain that). But, again, it cannot be used to determine which of two purely signature-based scanners is better [my emphasis].
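Skoudis's distinction between signature-based and behavior-based detection can be illustrated with a toy sketch (my illustration, not any real product's logic): a signature scanner knows only the hashes already in its database, while a behavior monitor flags suspicious actions no matter what binary performs them.

```python
import hashlib

# Toy illustration: a "database" of hashes of known spyware samples.
SIGNATURE_DB = {hashlib.sha256(b"known-spyware-v1").hexdigest()}

# Behaviors a behavior-based monitor would consider suspicious
# (hypothetical labels for Spycar-style actions).
SUSPICIOUS_ACTIONS = {"write_run_key", "change_start_page", "add_bho"}

def signature_scan(sample_bytes):
    """Detects a sample only if its hash is already in the database."""
    return hashlib.sha256(sample_bytes).hexdigest() in SIGNATURE_DB

def behavior_monitor(actions):
    """Flags any sample that performs a suspicious action at runtime."""
    return [a for a in actions if a in SUSPICIOUS_ACTIONS]

# A Spycar-style simulator: brand-new bytes with no signature anywhere,
# but it performs a classic spyware behavior (writing an autorun key).
spycar_like = b"harmless-simulator-2006"
print(signature_scan(spycar_like))          # False: no signature exists
print(behavior_monitor(["write_run_key"]))  # ['write_run_key']: flagged
```

This is exactly Skoudis's point: Spycar can reveal that a product has no behavior-based defenses, but it says nothing about how good that product's signature database is.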
This fact is made clear in section 1 of Spycar’s EULA:
Intelguardians created Spycar so anyone could test the behavior-based defenses of an anti-spyware tool. It is intended to be used to see how anti-spyware tools cope with new spyware for which they didn’t have a signature. It is not intended to provide perfect anti-spyware tests, or to act as a substitute for any other form of evaluation. In particular, it is designed to test solely the ability of anti-spyware products to conduct behavior-based (non-signature based) detection of spyware. It is also not intended to disparage any particular anti-spyware product.
It is also explicitly not to be used as a sole testing method, something the authors of Spycar make very clear on their website.
Is Spycar a Comprehensive Test of Anti-Spyware Tools?
No. Spycar models some behaviors of spyware tools to see if an anti-spyware tool detects and/or blocks it. But, spyware developers are very creative, adding new and clever behaviors all the time. Spycar tests for some of these common behaviors, but not all. Also, with its behavior-based modeling philosophy, Spycar does not evaluate the signature base, the user interface, and other vital aspects of an anti-spyware tool. Thus, Spycar alone cannot be used to determine how good or bad an anti-spyware product is. We’ve used it to find several gaps in anti-spyware product defenses, but Spycar is but one tool for analyzing one set of characteristics of anti-spyware products. A comprehensive review of anti-spyware tools should utilize a whole toolbox, of which Spycar may be one element…
More surprising still: even though Consumer Reports built its entire test on the Spycar methodology, it never contacted the authors of Spycar for advice or feedback.
So, Consumer Reports
a) Ignored the Spycar authors’ instructions and used the simulator as the sole method of testing.
b) Ignored the Spycar authors’ instructions not to use Spycar to test scan-and-remove functionality.
Consumer Reports carelessly and arrogantly didn’t bother to read the documentation for the simulator, and in the process, did not serve the consumer. RTFM.
But let’s add a little more color.
Spycar is a limited test that can only be used to test certain blocking characteristics of antispyware programs (in other words, the ability of an antispyware program to drive you nuts with constant inane warnings).
For example, one of the Spycar test applications, HKLM_Run.exe, tries to insert an autorun value under the HKLM Run registry key.
Now, Consumer Reports tested whether each application could block that registry change. But then it scanned the machine to see whether the antispyware application “caught” this supposed infection!
Absolutely mind-boggling. This is NOT an infection. It’s a harmless registry entry. The entire antispyware scan-and-remove functionality was judged solely on the ability of an antispyware application to remove a harmless entry.
An antispyware application would catch this harmless entry in only one of two ways:
a) The antispyware company cheated and made sure that all the Spycar entries were in their database, or
b) The antispyware product has some type of “snapshot” ability, something not generally thought of as a requirement for an antispyware application (not necessarily a bad idea, but not entirely relevant to a test of scan and remove functionality).
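The “snapshot” approach in (b) can be sketched in a few lines. This is my illustration under a toy model (the registry as a flat dict), not how any shipping product actually works:

```python
# Toy sketch of snapshot-and-diff: record system state before a run,
# diff it afterwards, and offer to undo whatever changed. The registry
# is modeled here as a plain dict of value-path -> data.

def snapshot(registry):
    """Copy the current registry state."""
    return dict(registry)

def diff(before, after):
    """Return entries added or modified since the snapshot was taken."""
    return {k: v for k, v in after.items() if before.get(k) != v}

registry = {r"HKLM\Software\Microsoft\Windows\CurrentVersion\Run\SomeApp":
            "someapp.exe"}
before = snapshot(registry)

# A Spycar-style action: add a benign autorun value.
registry[r"HKLM\Software\Microsoft\Windows\CurrentVersion\Run\Spycar"] = \
    "hklm_run.exe"

changes = diff(before, registry)
print(changes)  # the Spycar autorun entry, and nothing else
```

A diff-based tool flags this entry with no signature at all, which is precisely why only snapshot-style products would “find” Spycar’s harmless key in an after-the-fact scan.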
Spycar can’t even test for some of the really nasty types of spyware out there, which would require a kernel-level driver to detect — malware that is inside a compressed file, unpacks a few kilobytes, hooks into the kernel without even executing an application, and happily installs a rootkit. That’s the nasty crap that truly tests the ability of an antispyware application, contrasted with finding an adware application happily advertising itself in the Run key of the registry.
At any rate, Consumer Reports doesn’t necessarily agree. When presented with an overwhelming amount of evidence as to why they shouldn’t use Spycar, their response was:
Thanks for your insights on the use of behavior simulation to test the performance of anti-spyware programs. We believe we understand your concerns, however we chose this approach because we felt it best captured the flexibility of the software.
We are constantly re-evaluating our test program, and will take these and other considerations into account in future tests.
Brownie, you’re doing a heck of a job.
(More commentary here by Eric Howes.)