Tuesday, September 30, 2008

Is coverage really enough?

What is coverage?
In this week's assignment, our professor assigned a plethora of reading material on coverage. Yet after reading through all the articles, I found that none of them explained what coverage is in a simple, concise way that an undergraduate student could easily understand. Eventually I worked it out myself, and this is my definition: coverage is the percentage of code that is executed when the program is run. This may seem like a very simple definition, but it's actually quite accurate, and the reason will become obvious soon.
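To make the definition concrete, here is a toy sketch that records by hand which blocks of a small method have run and reports the executed fraction as a percentage. The class CoverageToy, its hit flags, and the abs method are all made up for illustration. Emma itself instruments compiled bytecode rather than relying on hand-placed flags, so this only mirrors the idea, not the tool.

```java
// Toy illustration only: hand-placed flags stand in for what a coverage tool records.
class CoverageToy {
    // One flag per block of code; a flag becomes true once its block has run.
    static final boolean[] hit = new boolean[3];

    // Clears all flags so a fresh measurement can start.
    static void reset() {
        java.util.Arrays.fill(hit, false);
    }

    static int abs(int x) {
        hit[0] = true;          // block 0: method entry
        if (x < 0) {
            hit[1] = true;      // block 1: negative branch
            return -x;
        }
        hit[2] = true;          // block 2: non-negative branch
        return x;
    }

    // Coverage as defined in the post: executed blocks divided by total blocks.
    static double percentCovered() {
        int executed = 0;
        for (boolean h : hit) {
            if (h) {
                executed++;
            }
        }
        return 100.0 * executed / hit.length;
    }

    public static void main(String[] args) {
        reset();
        abs(7);                                 // only blocks 0 and 2 run
        System.out.println(percentCovered());   // 2 of 3 blocks, roughly 66.7
        abs(-7);                                // now block 1 runs as well
        System.out.println(percentCovered());   // 3 of 3 blocks
    }
}
```

Calling abs with only a positive argument leaves the negative branch unexecuted, which is exactly the kind of gap a coverage report highlights.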

How to test coverage?
For this exercise, we were given a Java project that implements a stack using an ArrayList. The project also came with some JUnit test cases. Using Ant and Emma to generate a nicely formatted HTML report, we can see not only the coverage percentage but also which code is executed and which is not. You can see a sample Emma report here: sample report

How to improve coverage?
Our first task was to write additional code so that coverage of the stack project rises from the default 80% to 100%. Going back to our earlier definition, coverage is the percentage of code that is executed at runtime. This means that every method needs to be called and every line needs to be executed, including each side of any branch condition. So what's the easiest way to increase coverage? Write code to run what wasn't executed! For this exercise, the most sensible way of doing this is to write additional JUnit tests. By adding some tests that call the unused methods and trigger the exceptions so that the catch blocks run, the code was easily brought up to 100%.
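As a rough sketch of the kind of test that reaches an otherwise unexecuted catch block, consider the following. The SketchStack class is a hypothetical stand-in, not the assignment's actual Stack: only its push body matches the code shown in this post, and the pop method with its try/catch is an assumption about how such a stack might be written. The checks are plain Java methods rather than JUnit tests so that the example runs without any extra libraries.

```java
import java.util.ArrayList;
import java.util.EmptyStackException;
import java.util.List;

// Hypothetical stand-in for the assignment's Stack class.
class SketchStack {
    private final List<Object> elements = new ArrayList<>();

    void push(Object obj) {
        this.elements.add(obj);
    }

    // Assumed implementation: the catch block runs only when the stack is empty.
    Object pop() {
        try {
            return this.elements.remove(this.elements.size() - 1);
        } catch (IndexOutOfBoundsException e) {
            throw new EmptyStackException();
        }
    }
}

class CoverageChecks {
    // Exercises the catch block in pop() by popping from an empty stack.
    static boolean popOnEmptyThrows() {
        SketchStack stack = new SketchStack();
        try {
            stack.pop();
            return false;   // no exception means the catch block never ran
        } catch (EmptyStackException expected) {
            return true;
        }
    }

    // Exercises the normal path through push() and pop().
    static boolean pushThenPopReturnsSameObject() {
        SketchStack stack = new SketchStack();
        Object item = "a";
        stack.push(item);
        return stack.pop() == item;
    }

    public static void main(String[] args) {
        System.out.println("pop on empty throws: " + popOnEmptyThrows());
        System.out.println("push then pop round-trips: " + pushThenPopReturnsSameObject());
    }
}
```

Without the first check, the catch block in pop would show up as uncovered in the report, because the ordinary push-then-pop path never enters it.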

Screenshot of some of the JUnit test cases used

How does coverage help?
So how does coverage help a developer? It shows which parts of the program are untouched, and therefore untested. Also, if the code is covered and there are no errors at runtime, then nothing went wrong with the execution of the program for that particular run. So does this mean that if code coverage is 100% and there is no problem at runtime, then there is nothing wrong with the code? This is the most misleading part, and as we'll see shortly, code coverage is not the same thing as code robustness.

Why should we not rely solely on coverage?
In the previous scenario, we raised the coverage to 100% and the code ran fine with no problems. It's easy to assume that because the code ran fine with 100% coverage, there is no problem with the code. To disprove this, we'll now purposely introduce a bug that doesn't break the 100% coverage but will blatantly cause problems. The easiest way to do this is to impose a cap on the number of elements the stack can hold. We'll do this by modifying the push method in the Stack class from:
    public void push(Object obj) {
        this.elements.add(obj);
    }
to:
    public void push(Object obj) {
        if (this.elements.size() <= 4) {
            this.elements.add(obj);
        }
    }
As you can see, this caps the stack at 5 items. Because the JUnit tests add at most 3 elements to the stack, they still pass, and Emma still shows 100% coverage. However, if we create a new JUnit test that pushes more than 5 items and then pops them off, the extra pushes are silently dropped and the popped elements will not match what was pushed. This goes to show that high code coverage does not mean the code is free of problems.
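The effect of the cap can be sketched in a few lines. Here CappedStack is a hypothetical class that reuses the capped push from this post. Its pop method and the helper lastPoppedAfterPushing are assumptions added for illustration, and the check is plain Java rather than JUnit so it runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stack using the capped push from the post.
class CappedStack {
    private final List<Object> elements = new ArrayList<>();

    // Every line here runs as soon as one element is pushed,
    // so a small test still reports full coverage of this method.
    void push(Object obj) {
        if (this.elements.size() <= 4) {
            this.elements.add(obj);
        }
    }

    // Assumed implementation: removes and returns the most recently stored element.
    Object pop() {
        return this.elements.remove(this.elements.size() - 1);
    }
}

class CappedStackDemo {
    // Pushes 1..count, then pops once and returns what came off the top.
    static Object lastPoppedAfterPushing(int count) {
        CappedStack stack = new CappedStack();
        for (int i = 1; i <= count; i++) {
            stack.push(i);
        }
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(lastPoppedAfterPushing(3)); // prints 3, as expected
        System.out.println(lastPoppedAfterPushing(6)); // prints 5, not 6: the sixth push was dropped
    }
}
```

A test that pushes only 3 items sees correct behavior and full coverage, while a test that crosses the boundary at 5 items is what actually exposes the bug.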

Why is it not good to solely rely on coverage?
Going back to our earlier definition once more, coverage ONLY tells us which code is executed at runtime and which is not. Just because a line of code executes successfully doesn't mean it's robust and can handle any situation. It is the developer's job to write good test cases that test not only the expected parameters but also the boundary and out-of-bounds situations. Coverage is only one of the tools that developers can use to debug their code, and it should never be the sole means of determining the quality of code. Where coverage comes in handy is in showing the developer which pieces of code have not been exercised yet, but it is in no way a replacement for creating good, robust test cases.

You can get the source code for both versions here:
Stack project without bug
Stack project with bug
