When we first begin programming in a new language, we often have trouble seeing incorrect code. When I was first learning Perl, I had an extremely difficult time spotting bugs just by looking at my code. I did not know clean from unclean and did not understand the good coding conventions specific to the language. Joel says that eventually you reach a superficial level of cleanliness, and from there you begin to see incorrect code so clearly that you want to reach out and fix it.
When I am coding, especially when I am building on code somebody else wrote, I typically change things as I go along to conform to the coding standard that was established when the project started. I can definitely sympathize with Joel when he talks about the temptation to attack incorrect-looking code. To an employer or manager, modifying code to conform to some static curly brace convention can look like a huge waste of time, but in the end it is the uniformity that trains the eye to spot the bad eggs in the code. It is always best to establish a coding convention at the beginning of a project, and not after thousands of lines of awkwardly connected code have been put in place.
Joel moves on to establish a general rule for making wrong code look wrong. If you can see a variable on the page, you should be able to tell everything you need to know about it right then and there without looking at some other function, a macro, a global in a header file, or scouring through some class hierarchy to find what you are looking for. In the land of C++, this can be especially difficult. If we are using simple variable names like i and j, just looking at the following statement can be utterly unhelpful:
i = j * 5;
In C, we know that j is being multiplied by five and the result is being stored in i. Easy. But in C++, an object-oriented language, we know nothing about what i and j are. We don't even know what this is doing! If i and j are built-in types, then fine, it is probably doing a multiplication of some sort. But the fact of the matter is that i and j could be user-defined types that have operator* overloaded, and we would have to go searching through the class hierarchy of the type to find out what is going on. And we can't know exactly what is going on if some kind of polymorphism is involved, because we need to know the types i and j have at run time, not the types they were declared with. Don't get me wrong, these are the kinds of features that make C++ work beautifully, but with great power comes great responsibility, and I don't want to be the one searching through your code because you thought you were being clever about something.
This brings me to another point that Joel talks about, and one that hits particularly close to home. I used to think that writing clever code was the bee's knees. Everyone would succumb to the power of my witty loops and if statements and automatically think it was the coolest thing they had ever seen. This is BAD. If cleverness brings no performance benefit over the simplest and easiest-to-understand line of code, then why not save yourself the headache later, when you have forgotten what was going on? Let's take a super simple piece of code like the following:
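To make that concrete, here is a minimal sketch of how the same statement can mean two unrelated things. The `Label` type and both helper functions are names I made up for illustration; they do not come from Joel's article.

```cpp
#include <cassert>
#include <string>

// Hypothetical user-defined type, invented for this illustration.
struct Label {
    std::string text;
};

// Overloaded operator*: repeats the text n times rather than multiplying a number.
Label operator*(const Label& l, int n) {
    Label out;
    for (int k = 0; k < n; ++k) out.text += l.text;
    return out;
}

// With built-in ints, the statement is ordinary multiplication.
int times_five_int(int j) {
    int i = 0;
    i = j * 5;
    return i;
}

// With Label, the identical-looking statement repeats a string.
Label times_five_label(const Label& j) {
    Label i;
    i = j * 5;
    return i;
}
```

Calling `times_five_int(3)` gives 15, while `times_five_label(Label{"ab"})` gives a label whose text is "ab" repeated five times. Nothing on the line `i = j * 5;` tells you which of the two you are reading.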
if (status == 1)
    status--;
First of all, for the last time, will you please put the damn decrement operator before the variable? Anyway, if I see this sequence of code, my first thought is that status must be some sort of numeric value, but I can't know for sure. The same questions arise: Is status an integer or other numeric value? Oh clever you, is it a boolean? Or... could it be a user-defined type in which both operator== and operator-- are overloaded? The truth is I simply have no idea without looking somewhere else, and Joel gives a nice solution to making this easier to understand: Hungarian Notation.
Hungarian Notation is a coding notation invented by Microsoft programmer Charles Simonyi. In Simonyi's version of Hungarian Notation, every variable is prefixed with a lowercase tag which indicates the kind of thing the variable contains. The problem is that when Simonyi wrote his paper on Hungarian Notation, he made it very academically dense and made one brutal mistake that led to many misconceptions among the programmers who followed him. He used the word "type" instead of "kind" when referring to the lowercase prefix. While his paper clearly shows that he did not mean the declared type of the variable, many people, including many at Microsoft, took it literally, and soon much of Microsoft's code had the declared type of the variable prepended to every name. This is nice and all, but the compiler already knows the declared type. Isn't it better to say what a variable is for? By prepending the kind of the variable, we not only get a glimpse into what the variable is, but also what in the hell it is doing. Using the intended notation, it is far easier to see errors in lines of code that would otherwise be invisible. Let's say we have an int prefixed with the letters 'ts', representing "thread status", and another prefixed with the letters 'pc', representing "process status". If this convention were understood, it would be apparent that the following line makes absolutely no sense:
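Here is a small sketch of that last possibility. `RetryStatus` and `update` are hypothetical names of my own, invented only to show that the two lines above compile unchanged against a type where decrementing does not mean "subtract one".

```cpp
#include <cassert>

// Hypothetical type invented for this sketch: comparing against an int checks
// the remaining retries, and decrementing wraps around instead of reaching zero.
struct RetryStatus {
    int remaining;
    bool operator==(int n) const { return remaining == n; }
    RetryStatus& operator--() {
        remaining = (remaining <= 1) ? 3 : remaining - 1;  // wraps 1 -> 3
        return *this;
    }
};

// The two lines from the text, written once for any type T.
template <typename T>
void update(T& status) {
    if (status == 1)
        --status;
}
```

With an `int` holding 1, `update` leaves it at 0. With a `RetryStatus{1}`, the same two lines leave `remaining` at 3. The call site looks identical in both cases.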
ts_status = pc_status;
Because the two names carry different status "kinds", we can easily see that the thread status is probably not supposed to take the value of the process status. I wasn't there when they created this imaginary "kind" convention, but the idea is that everyone involved in writing the code understands it, so the preceding line looks incorrect even to the naked eye.
One thing I do find intriguing in Joel's article is his strong dislike and disapproval of exceptions. I can see where he is coming from in the context of the article: with every function and method call, the programmer has close to zero idea whether that call can even complete unless he or she knows whether the function or method throws an exception. This often means searching through inherited classes and other files to see what is really going on, which is exactly the opposite of the standard Joel is trying to get across. Raymond Chen says the following:
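As a rough sketch of the convention in use: the helper functions and the status values below are hypothetical, invented for this example. Only the 'ts' and 'pc' prefixes come from the text above.

```cpp
#include <cassert>

// Hypothetical helpers; prefixes follow the text: ts = thread status, pc = process status.
int ts_from_thread_flags(bool running) { return running ? 1 : 0; }
int pc_from_exit_code(int exit_code) { return exit_code == 0 ? 0 : 2; }

int mixed_up() {
    int ts_status = ts_from_thread_flags(true);
    int pc_status = pc_from_exit_code(1);
    ts_status = pc_status;  // compiles, since both are int; only "ts = pc" looks wrong
    return ts_status;
}

int consistent() {
    int ts_status = ts_from_thread_flags(true);
    int ts_previous = ts_status;  // "ts = ts": the kinds match, so the line reads as right
    return ts_previous;
}
```

The compiler accepts both functions without complaint, because to it everything here is an int. The prefixes are the only thing that flags the assignment in `mixed_up` as suspicious, and they work only if everyone on the project reads them the same way.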
"It is extraordinarily difficult to see the difference between bad exception-based code and not-bad exception-based code... exceptions are too hard and I'm not smart enough to handle them."
Basically, it is almost impossible to tell with the naked eye whether exception-based code is bad. If you're going to code with exceptions, you had better have some pretty serious standards in place, or else risk your end product breaking at extremely inopportune times (and giving all of the other programmers involved a serious headache).
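A minimal sketch of the problem, using made-up functions (`do_something`, `cleanup`, `looks_fine`, and `cleanup_ran` are all hypothetical names for this example):

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical flag recording whether cleanup() has run.
static bool g_cleaned_up = false;

void do_something(bool fail) {
    if (fail) throw std::runtime_error("do_something failed");
}

void cleanup() { g_cleaned_up = true; }

// Looks correct at a glance: nothing here says do_something() can throw,
// yet if it does, cleanup() is skipped.
void looks_fine(bool fail) {
    do_something(fail);
    cleanup();
}

// Returns true if cleanup() ran during a call to looks_fine(fail).
bool cleanup_ran(bool fail) {
    g_cleaned_up = false;
    try {
        looks_fine(fail);
    } catch (const std::runtime_error&) {
        // swallowed here only so the sketch can report what happened
    }
    return g_cleaned_up;
}
```

`cleanup_ran(false)` is true and `cleanup_ran(true)` is false, yet the body of `looks_fine` reads the same either way. You have to go and read `do_something` to find out that the second line might never execute.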
The key is to write simple tools that humans can see and understand, not complex ones with all kinds of crazy hidden features and abstractions that assume all programmers are perfect.