Thursday, January 25, 2007

Four basic ways to avoid annoying bugs

Let's start with a disclaimer: this is not, and doesn't even try to be, a comprehensive list of any sort. Nor am I claiming that these are the most important things. These are simply the first four things that came to mind within the time I was willing to spend on this post. They are all rather obvious, and some may be debatable.

1) Don't use empty catch blocks. Ever.

When you first write a catch block, don't navigate away from it without at least putting in a call to the printStackTrace method.

try {
    ...
} catch (Exception e) {
    e.printStackTrace();
}

The reason this is important has to do with the predictability of a piece of code. When people first start off with a new language (or with programming in general), code that behaves unpredictably often leads them to write silly checks like:

int value = 3;
if (value != 3) {
    System.out.println("Problem with the assignment.");
}

The code doesn't work the way you think it should, and you start doubting the most basic things, like a simple variable assignment. "Well, maybe there's some catch to it", you think. As you grow more comfortable with the language, you get past doubts like this, which in turn speeds up your debugging a lot.

The problem with exceptions is that if you get no feedback whatsoever from the program, the effect of the exception appears to be just that: an assignment that fails. Granted, "int value = 3;" cannot throw an exception, but if the right-hand side of the assignment is a method call, even a simple one, it might.
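To make this concrete, here is a minimal sketch. The class and method names are made up for illustration. The trailing space in the input makes Integer.parseInt throw a NumberFormatException, so the assignment inside the try block never happens. With the empty catch block, all you see is a variable that stubbornly keeps its old value.

```java
class SwallowedExceptionDemo {

    // Hypothetical helper with an empty catch block: the failure is invisible.
    static int parsePortSilently(String text) {
        int port = 0;
        try {
            port = Integer.parseInt(text); // throws on "8080 " (trailing space)
        } catch (NumberFormatException e) {
            // Empty: nothing tells us the assignment above was skipped.
        }
        return port;
    }

    // Same logic, but the catch block at least prints the stack trace.
    static int parsePortLoudly(String text) {
        int port = 0;
        try {
            port = Integer.parseInt(text);
        } catch (NumberFormatException e) {
            e.printStackTrace(); // the trace on stderr points straight at the cause
        }
        return port;
    }

    public static void main(String[] args) {
        // Both calls return 0, but only the second one explains why.
        System.out.println(parsePortSilently("8080 "));
        System.out.println(parsePortLoudly("8080 "));
    }
}
```

Both methods return the same value. The only difference is whether you get a hint about what went wrong.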

I actually like to keep a printStackTrace in place until I've finished a piece of code and tested that it works, even if it's a block catching something very specific and controlled, such as InterruptedException or NumberFormatException.

What if the catch block handles something that should never happen? Or better yet, what if you're absolutely certain that it will never get executed? Surely then it's OK to leave an empty catch block, right? Wrong. If it never gets executed, fine: the printStackTrace never does anything, so it doesn't hurt. And in the eventuality that you break the code and the thing that was "never going to happen" does happen, you won't be totally lost.

For those fairly exceptional situations where you really need to silently discard the exception, I suggest you originally write the catch block with at least the stack trace printing and only remove that once the code block in question is finished and tested. And when you remove it, put a comment in its place explaining the silent discard.
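As a sketch of what such a documented discard might look like, here is a small cleanup helper. The class and method names are hypothetical, and whether ignoring a failed close is acceptable depends on your application, so treat it as one possible example rather than a rule.

```java
import java.io.Closeable;
import java.io.IOException;

class QuietClose {

    // Hypothetical cleanup helper used after the real work has already
    // succeeded or failed on its own terms.
    static void closeQuietly(Closeable resource) {
        if (resource == null) {
            return;
        }
        try {
            resource.close();
        } catch (IOException e) {
            // Deliberately ignored: we are only releasing the resource, and
            // there is nothing useful the caller could do about a failed close.
            // (A printStackTrace lived here until this method was tested.)
        }
    }
}
```

The comment is what turns the empty-looking block from a probable mistake into a visible decision.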


2) Minimize scope

If you only need a variable within a loop, declare it in the loop, not at the start of the method, even though that might seem like a good idea in terms of organization.

If you know your variable's scope is limited to that loop, you know that no code outside the loop can access or modify your variable. It makes reading the code easier. It makes debugging easier. It makes modifying the code easier. It helps you avoid some nasty bugs and makes others easier to spot.
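A small sketch of the idea, with made-up names: total is needed after the loop, so it lives outside it, while length exists only inside the loop body, where it is actually used.

```java
import java.util.List;

class ScopeDemo {

    // Hypothetical example: sums the lengths of all words in the list.
    static int totalLength(List<String> words) {
        int total = 0; // needed after the loop, so declared before it
        for (String word : words) {
            int length = word.length(); // only needed here, so declared here
            total += length;
        }
        // 'length' and 'word' are not visible at this point, so nothing after
        // the loop can accidentally read a stale value from the last iteration.
        return total;
    }
}
```

Had length been declared at the top of the method, the compiler would happily let later code use its leftover value.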


3) Use descriptive names

The bigger the scope of your types/variables, the more important a good name is. If the scope of a variable is 2 lines, the importance of the name is not so great.

If the scope is global - such as the name of a class - a good name is vital.

When your names are descriptive, pieces of code that do illogical things - you know, stuff that you wrote in the morning before actually waking up, stuff that you wrote when your mind was already out to lunch, etc. - stand out more clearly.

Consider, for example:
bankAccountBalance = -7;

4) Don't reuse variables

Recycling is good, but it's also inherently complex. So unless you have a really valid performance reason to do this, you shouldn't. This is somewhat related to items 2 and 3. Allow me to explain:

-If you're reusing a variable instead of having 2 variables with a small scope, you'll have one variable with at least double the scope.

-If your one variable holds the bank account balance, the mail server port, and the width/height ratio of your dialog window, it's fairly hard to come up with a good, descriptive name for it. (In case you were curious: no, in my opinion "balancePortRatio" is not a good name.)
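Here is a rough sketch of both points, with hypothetical names. The reused variable has to be a double to hold the balance, so the port gets dragged along as a double too, and the only name that fits both uses is a meaningless one.

```java
class ReuseDemo {

    // One recycled variable: wide scope, vague name, and a type compromise.
    static String describeReused(double balance, int port) {
        double value = balance;
        String text = "balance=" + value;
        value = port; // same variable, now secretly a port number
        return text + ", port=" + value;
    }

    // Two variables: each has a short life, a fitting type and a real name.
    static String describeSeparate(double balance, int port) {
        double accountBalance = balance;
        int mailServerPort = port;
        return "balance=" + accountBalance + ", port=" + mailServerPort;
    }

    public static void main(String[] args) {
        System.out.println(describeReused(100.5, 25));   // balance=100.5, port=25.0
        System.out.println(describeSeparate(100.5, 25)); // balance=100.5, port=25
    }
}
```

The "25.0" is harmless here, but it is exactly the kind of small oddity that costs time to track down in real code.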

Obfuscating by overloading method and field names

Some time ago, while testing reJ, I came across an interesting form of obfuscation that I hadn't realized was possible.

The obfuscated class file had several fields with the exact same name but different types. It also had several methods with identical names and parameters but different return types.

For example:

public class Example {
    private int a;
    private String a;
    private double[] a;

    public void method() {
    }

    public String method() {
        return null;
    }
}

Obviously, this is illegal in a Java source file. But in the compiled code it is not a problem, because in Java bytecode every instruction that refers to a field or method specifies the full descriptor of that field or method. That is, it includes the field type or, for a method, the parameter types and the return type.
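You don't need an obfuscator to see this in action. As far as I know, javac relies on the same rule when it generates bridge methods for generics. The sketch below (class names are made up) is perfectly legal source, yet the compiled Greeting class should contain two call() methods with no parameters: the declared one returning String and a synthetic bridge returning Object. Reflection lists both.

```java
import java.lang.reflect.Method;
import java.util.concurrent.Callable;

class BridgeMethodDemo {

    // Implementing Callable<String> makes the compiler emit a second,
    // synthetic "Object call()" that simply forwards to "String call()".
    static class Greeting implements Callable<String> {
        @Override
        public String call() {
            return "hello";
        }
    }

    // Counts the parameterless methods named "call" declared by Greeting
    // and prints the return type of each one.
    static int countCallMethods() {
        int count = 0;
        for (Method m : Greeting.class.getDeclaredMethods()) {
            if (m.getName().equals("call") && m.getParameterCount() == 0) {
                System.out.println(m.getReturnType().getSimpleName() + " call()"
                        + (m.isBridge() ? " [bridge]" : ""));
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println("call() methods found: " + countCallMethods());
    }
}
```

This is not what the obfuscated file looked like, of course. It only shows that same-name, same-parameter methods with different return types are ordinary business at the class file level.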

Apparently ProGuard's aggressive overloading option produces this kind of obfuscation.

(http://proguard.sourceforge.net/manual/usage.html#overloadaggressively):
Specifies to apply aggressive overloading while obfuscating. Multiple fields and methods can then get the same names, as long as their arguments and return types are different (not just their arguments). This option can make the output jar even smaller (and less comprehensible). Only applicable when obfuscating.