You’re going to see a lot more about this over the coming weeks, but a number of reputable publications are being criticized for their testing methodologies. Not all the criticisms may be fair, but we have a debate going, and it’s healthy. As an industry, we need standards for testing all types of security software (something we’re trying to establish in the antispyware testing space).
It started back in June, when the New York Times quoted Microsoft as saying that PC Magazine’s antispyware test method was unfair, “pointing out that the particular spyware programs tested were extremely rare and obscure.” Veteran PC Magazine tester and author Neil Rubenking responded with an article headlined “Our Tests Are Fair” and further elaborated on his testing strategy in the article “Spy vs. AntiSpy”.
Brian Livingston later added fuel to the fire with a full newsletter issue critical of the antivirus testing of PC World.
PC Magazine and PC World are both highly experienced tech publications and know their stuff. So there’s going to be a very active debate, but it will be a healthy one: These publications don’t have their blinders on and they do know technology.
Which brings me to ConsumerReportsGate, involving the publication Consumer Reports, better known for reviewing cars, lawn mowers and appliances. They recently published a review of antispyware, antivirus and antispam applications. We’re as baffled by the results as everyone else, especially with our desktop antispam program, which scored in such a way that I can only speculate that the magazine used some antediluvian version of the product with no updated definitions.
Why the big hullabaloo? Consumer Reports made an incredible error: They “created” 5,500 viruses for their antivirus test. Graham Cluley of the security firm Sophos is reported to have said, “When I read about what Consumer Reports has done I want to bash my head against a brick wall”.
Veteran virus tester and expert Mary Landesman takes Consumer Reports to task as well:
Admittedly, I may know very little about vacuum cleaners, cars, coffee pots, and many of the other things Consumer Reports tests – but I do know security software. The methods used, and the results construed from those methods, cause me to severely question the validity of any of their more mainstream reviews. I’m actually in the market for a new vacuum cleaner and a new coffee pot, and I’m sure of one thing – I won’t be relying on Consumer Reports for buying advice.
Now, a number of people are curious as to why creating viruses for testing is a bad thing – a practice considered taboo in the antivirus industry.
The primary scientific and procedural problems with using simulators and creating new viruses were originally explained and substantiated in an open letter written by Joe Wells (our chief scientist in charge of security research) here. I will quote some relevant passages:
Today’s antivirus products use a variety of sophisticated methods to detect viruses. Such methods include execution analysis, code and data mapping, virtual machine emulation, cryptographic analysis of file sections, etc.
Such advanced antivirus systems make virus simulation for testing virtually impossible. This is because there is no way to know what sections of viral code and/or data are targeted by any given product. That being the case, all of the virus code and data must be in the file and in the correct order for the product to detect it as that virus. If a simulator did create a file with everything possibly needed in place, it would have to create the virus exactly. It would no longer be a simulator and the virus would be real, not simulated. Therefore a virus cannot be reliably simulated.
So simulated viruses cannot reliably take the place of real viruses. This in turn means they are not a measure of an antivirus product’s worth. Think about it. If a product does not report a simulated virus as being infected, it’s right. And if a program does report a simulated virus as being infected, it’s wrong. Thus, using simulated viruses in a product review inverts the test results. It grossly misrepresents the truth of the matter because:
– It rewards the product that incorrectly reports a non-virus as infected.
– It penalizes a product that correctly recognizes the non-virus as not infected.
And then in a section entitled “An Ethical Quandary”:
Most antivirus companies are under some form of self-imposed restrictions that prevent them from knowingly creating new viruses or virus variants. In addition, competent testing and certification bodies such as ICSA, Virus Bulletin, Secure Computing, and AV-Test.org, do not create new viruses or virus variants for testing.
Indeed, the consensus throughout the antivirus development and testing community is that creating a new virus or variant for product testing would be very bad – and totally unnecessary. To do so would undoubtedly raise questions about their ethics.
Yet, as Wells says, another problem involves the verification of created viruses. How were Consumer Reports’ viruses modified, and were they fully functional viruses? For the test to be validated scientifically, the samples must be given to another bona fide testing lab to be verified and tested. Thus the original testing body is not just a virus creation lab, but a virus distributor as well. If they refuse to provide the samples, then their claims cannot be independently validated, and so their test is invalid.
So how do you test heuristics? It’s easy, and again, I quote Joe Wells:
A tester can easily do a meaningful scientifically valid test based on the real and present danger (actually the real and soon-to-be present danger).
To elaborate on the logic, a tester can install products and download signatures on a specific day, and then test the products against current viruses known to be in the wild (see http://www.wildlist.org).
Then the tester waits a month or two and, using those old detection signatures, tests against new viruses that have appeared in the wild after the signatures were downloaded. In this way the unknown viruses being tested are real viruses that pose an actual threat. Such testing is therefore a “reality check” in a literal sense.
Simple and effective. And honest. Joe has done this type of testing successfully in the past. He designed and performed such testing for PC World back in 2000. If you look at the “How We Tested” section you will find the simple and real-world solution.
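To make the retrospective method concrete, here’s a minimal sketch of how such a test could be scored. This is a toy model, not anyone’s actual test harness: the sample and signature names are invented, and the “scanner” is just a set lookup standing in for a real signature-matching engine.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Sample:
    """A stand-in for an in-the-wild virus sample (names are invented)."""
    name: str
    signature: str  # the pattern a signature-based scanner would match on

def scan(sample: Sample, frozen_signatures: set[str]) -> bool:
    """Detect a sample using only definitions frozen on day zero.
    This toy scanner has no heuristics; a real product that catches a
    sample whose signature is absent would be doing so heuristically."""
    return sample.signature in frozen_signatures

def retrospective_detection_rate(new_wild_samples: list[Sample],
                                 frozen_signatures: set[str]) -> float:
    """Fraction of genuinely new in-the-wild samples still caught with
    month-old definitions -- the 'reality check' score."""
    hits = sum(scan(s, frozen_signatures) for s in new_wild_samples)
    return hits / len(new_wild_samples)

# Definitions downloaded on day zero, then frozen:
frozen = {"sig-2006-06-A", "sig-2006-06-B"}
# Samples that appeared in the wild a month or two later:
new_wild = [Sample("W32/OldFamilyVariant", "sig-2006-06-A"),
            Sample("W32/GenuinelyNovel", "sig-2006-08-X")]
print(retrospective_detection_rate(new_wild, frozen))  # 0.5
```

The point of the design is that every sample is real and every miss is meaningful: nothing has to be fabricated, and the gap between the frozen-definition score and a fresh-definition score is exactly what heuristics and generic detection are supposed to close.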
This is turning into a scandal, with only one outcome: Consumer Reports must do a comprehensive re-test. There’s simply no alternative. Otherwise, their reputation for fair and unbiased testing of security software is in the toilet.
Wait — there’s even a disagreement with their toilet reviews as well.