Friday, November 7, 2008

Software ICU - making sure your code doesn't go blue

Code blue, code blue! We need an ICU!
In a hospital, some patients are put in an Intensive Care Unit (ICU), which monitors all their vitals and ensures that they don't end up with something like cardiac arrest, prompting doctors to come running down the hallway yelling "Code blue!" while pushing along a defibrillator machine. Likewise, for software development, we can put our projects into a software ICU that monitors their 'health', preventing us from running down the halls yelling "Code red!" when we see all the compiler errors and unhandled exceptions.

Hackystat, it's in your project, hacking your stats
The software ICU we're using for our project is called Hackystat, written and maintained by our professor for this course. It records the data generated by several Java development tools and graphs it over time:

Hackystat server screenshot showing our project's health

In the screenshot, you can see several columns:

Coverage: code coverage, measured using Emma; higher is better.
Complexity: cyclomatic complexity, a measure of how many branches and loops the code has, measured using JavaNCSS; lower is better.
Coupling: how dependent a class is on other classes (such as how many imports it uses), measured using DependencyFinder; lower is better.
Churn: the number of lines modified in an SVN commit, measured using the SVN changelogs; lower is better (commit early, commit often).
CodeIssue: the number of issues reported by the QA tools, in our case Checkstyle, FindBugs, and PMD; ideally this should be zero.
Size: size of the project in lines of code, measured using SCLC.
DevTime: amount of time spent developing the project, measured using an Eclipse plug-in provided by Hackystat that records, in 5-minute increments, whether any work was done.
Commit: number of SVN commits made, measured using the SVN changelogs.
Build: number of builds made, measured using Ant.
Test: number of unit tests run on the project, measured using JUnit.
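To make the DevTime idea a bit more concrete, here's a minimal sketch of how "was any work done in this 5-minute increment" could be turned into a number. This is not Hackystat's actual implementation; the class name, method name, and the idea of feeding in raw editor-event timestamps are all my own assumptions for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not Hackystat code: estimates dev time by counting
// the distinct 5-minute buckets that contain at least one activity event.
class DevTimeSketch {
    static final long BUCKET_MILLIS = 5 * 60 * 1000L;

    // Takes event timestamps in milliseconds (assumed non-negative) and
    // returns dev time in minutes: 5 minutes per bucket with any activity.
    static long devTimeMinutes(List<Long> eventTimestampsMillis) {
        Set<Long> activeBuckets = new HashSet<>();
        for (long timestamp : eventTimestampsMillis) {
            // Integer division maps every timestamp to its 5-minute bucket.
            activeBuckets.add(timestamp / BUCKET_MILLIS);
        }
        return activeBuckets.size() * 5L;
    }
}
```

With this scheme, a hundred keystrokes inside the same five minutes count the same as one, which is why the metric says whether work happened rather than how much.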

As you can see, some of the columns are colored (green for good, red for bad) and some aren't. The columns that aren't colored are the ones that are hard to use as metrics for evaluating a project's health. For example, how do you determine how healthy a project is based on the number of lines of code it has? A simple program will obviously have fewer lines than a large project, but that's no indicator of how healthy either of them is.

As for our project, you can see that the complexity is steadily falling and churn is going down too, indicating that we're simplifying the code and committing it more often. Unfortunately, our coverage has taken a huge drop, from around 80% to 50%. This is because I had to disable a unit test that was failing half the time. It queried the same web page three times in a row and would occasionally fail to download the page, causing the test to fail. This made me realize that I have to redesign that class, because the unit tests for it are clunky and hard to write. Apart from this issue though, our project is quite healthy - lots of greens, and as long as we get those unit tests written the project status is looking good.
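One possible shape for that redesign, sketched under my own assumptions: hide the download behind a small interface so the unit test can hand in canned HTML instead of hitting the network. The names PageFetcher and TitleExtractor are hypothetical and are not the classes from our project; they just show the idea.

```java
// Hypothetical interface: the only thing that knows how to download a page.
// A real implementation would do the HTTP request and could wrap any
// IOException in an unchecked exception.
interface PageFetcher {
    String fetch(String url);
}

// Hypothetical class under test: it receives the fetcher from outside,
// so a test can substitute a stub that never touches the network.
class TitleExtractor {
    private final PageFetcher fetcher;

    TitleExtractor(PageFetcher fetcher) {
        this.fetcher = fetcher;
    }

    // Returns the text between <title> and </title>, or "" if not found.
    String titleOf(String url) {
        String html = fetcher.fetch(url);
        int start = html.indexOf("<title>");
        int end = html.indexOf("</title>");
        if (start < 0 || end < start) {
            return "";
        }
        return html.substring(start + "<title>".length(), end);
    }
}
```

In a test, the stub can be a one-line lambda such as `url -> "<html><title>Hello</title></html>"`, so the result no longer depends on whether the download happens to succeed.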

I want me a piece of that Hackystat too
So now you've seen what Hackystat can do, and you might be thinking that you want to use it for your projects too. But how do you set it up? Fortunately, the Hackystat website provides a detailed tutorial on how to set up your development environment correctly. Unfortunately, you have to set up each sensor manually by writing several configuration files, installing the tools you want to gather data from, and then setting up the Ant build files to report the sensor data to the Hackystat server. I didn't run into any significant problems while setting up my system, but several of my classmates did. If you're going to do it yourself, especially if you want to set up all the sensors that Hackystat supports, be prepared to spend a good deal of time getting everything to work. You only have to do it once for each computer you want to use with Hackystat, so it's better to just bite the bullet now and reap the benefits of having a software ICU automatically monitor your project.

So why use Hackystat?
So why use Hackystat if it's difficult to set up? After all, the tools that Hackystat monitors can all be run manually without any significant difficulties. Well, as with any tool, it's only effective if it's being used. To expect the developers (or even yourself!) to manually run each tool every day is practically asking them to forget to do it. Also, running the tools manually can only tell you the status of the project at that given moment. Hackystat can display the trend of the project over time, which can reveal information that would otherwise stay hidden. Perhaps the coverage of the project is slowly dropping, indicating that more unit tests need to be written. A graph of the coverage over a week reveals this instantly, whereas a single reading won't make it obvious if the coverage percentage is still high. All in all, having another tool to monitor a project with is a good thing.
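To show what a trend gives you that a single reading doesn't, here's a small sketch that fits a straight line through a series of daily coverage readings and returns its slope. This is just an illustration of the idea, not how Hackystat draws its graphs; the class name, method name, and the one-reading-per-day assumption are mine.

```java
// Hypothetical sketch: least-squares slope of daily coverage readings.
class CoverageTrend {
    // Returns the change in coverage per day (in percentage points),
    // treating the array index as the day number. Returns 0 when there
    // are fewer than two readings, since no trend can be computed.
    static double slopePerDay(double[] dailyCoverage) {
        int n = dailyCoverage.length;
        if (n < 2) {
            return 0.0;
        }
        double meanDay = (n - 1) / 2.0;
        double meanCoverage = 0.0;
        for (double c : dailyCoverage) {
            meanCoverage += c;
        }
        meanCoverage /= n;

        double numerator = 0.0;
        double denominator = 0.0;
        for (int day = 0; day < n; day++) {
            numerator += (day - meanDay) * (dailyCoverage[day] - meanCoverage);
            denominator += (day - meanDay) * (day - meanDay);
        }
        return numerator / denominator;
    }
}
```

A week of readings like 90, 89, 88, 87, 86 still looks healthy on any given day, but the slope comes out negative, which is exactly the kind of slow slide that's easy to miss when you only run the tools by hand.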
