Monday, May 7, 2012

Final Post :(

So, I guess this is the end of the road for me and Object-Oriented Programming, well, at least the class at UT Austin. Honestly though, I cannot stress enough how much I have learned in this class. Dr. Downing is by all means an asset to this University in teaching students about the world of C++ and the subtleties of different object-oriented programming languages.

I took my last exam in the class last Friday, and while I did not do too well, I understand each and every error I made. The truth is that I did not get a whole lot of sleep the night before because I was working on an Operating Systems project that was due at 8 am. The project, however, was in C++, and the subjects we learned about in Object-Oriented Programming proved vital to completing it. I have no idea how anyone who did not already know C++ came up with even a half-decent design when they were too worried about just getting their code to compile.

My only wish for this class is that I had done better on the tests. It really irks me that I made the mistakes I made during test time. The thing is, I did stellar work on all the projects, barely missing any points. I do very well at applying the things I've learned in a development environment, but when it comes time to take the test I get a little anxious and answer questions too quickly.

It is because of this class that I have taken some leaps into C++ and that C++ is now my favorite programming language. It is light years faster than Java, and the amount of power it offers is incredible.

A++. Great Communication. Would take again!!!!


Thursday, May 3, 2012

Making Wrong Code Look Wrong

"Making Wrong Code Look Wrong" is an article by Joel Spolsky that articulates the necessity and benefits of adhering to coding standards that make incorrect code literally stand out to your eyes. For those who do not know, Joel Spolsky is the co-founder of Fog Creek Software and Stack Overflow and has written countless articles on coding semantics and best practices that can be found on his website.

When we first begin programming in a new language, we often times have trouble seeing incorrect code. I know when I was first learning Perl I had an extremely difficult time spotting bugs by just looking at my code. I did not know clean from unclean and did not understand good coding conventions specific to the language. Joel says that ultimately you will understand a superficial level of cleanliness and will begin to see incorrect code to the point that you will want to reach out and fix the code.

When I am coding, especially when I am building on code somebody else wrote, I typically will change things as I go along to conform to the coding standard that was established when the project was issued. I can definitely sympathize with Joel when he talks about the temptation of attacking incorrect-looking code. To the employer or manager it can seem like a ton of time is being wasted modifying code to conform to some kind of static curly brace convention, but in the end it is the uniformity that will train the eye to spot the bad eggs in the code. It is always best to establish a coding convention at the beginning of a project, and not after thousands of lines of awkwardly connected code have been put in place.

Joel moves on to establish a general rule for making wrong code look wrong. If you can see a variable on the page, you should be able to tell everything you need to know about it right then and there without looking at some other function, a macro, a global in a header file, or scouring through some class hierarchy to find what you are looking for. In the land of C++, this can be especially difficult. If we are using simple variable names like i and j, just looking at the following statement can be utterly unhelpful:

    i = j * 5;


In C, we know that j is being multiplied by five and the result is being stored in i. Easy. But in C++, an object-oriented language, we know nothing about what i and j are. We don't even know what this is doing! If i and j are built-in types, then fine, it is probably doing a multiplication of some sort. But the fact of the matter is that i and j could be user-defined types that have operator* overloaded, and we will have to go searching through the class hierarchy of the type to find out what is going on. And really, we can't know exactly what is going on if some kind of polymorphism is involved, because we need to know the types i and j have now, at run time, and not the types they were declared with. Don't get me wrong, these are the kinds of features that make C++ work beautifully, but with great power comes great responsibility, and I don't want to be the one searching through your code because you thought you were being clever about something.
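To make that concrete, here is a minimal sketch of how the same statement can mean two different things. The Label type and the times_five functions are hypothetical names invented for illustration; they are not from Joel's article.

```cpp
#include <string>

// Hypothetical user-defined type: operator* repeats the text, it does not multiply.
struct Label {
    std::string text;
};

Label operator*(const Label& lhs, int times) {
    Label result;
    for (int n = 0; n < times; ++n) {
        result.text += lhs.text;
    }
    return result;
}

// Built-in type: the statement is plain integer multiplication.
int times_five(int j) {
    int i = j * 5;
    return i;
}

// User-defined type: the very same statement calls the overload above.
Label times_five(const Label& j) {
    Label i = j * 5;
    return i;
}
```

Both function bodies contain the same "i = j * 5" text, yet nothing on that line says which meaning applies. You have to go find the declaration of j.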


This brings me to another point that Joel talks about, and one that hits particularly close to home. I used to think that writing clever code was the bee's knees. Everyone would succumb to the power of my witty loops and if statements and automatically think it was the coolest thing they had ever seen. This is BAD. If cleverness offers no performance benefit over simply writing the simplest and easiest-to-understand line of code, then why not save yourself the headache later when you have forgotten what was going on? Let's take a super simple line of code like the following:


    if (status == 1)
        status--;


First of all, for the last time will you please put the damn decrement operator before the variable? Anyways, if I see this sequence of code, my first thought is that status must be some sort of numeric value, but I can't know for sure. The same questions arise: Is status an integer or other numeric value? Oh clever you, is it a boolean? Or... Could it be a user-defined type in which both operator==, and operator-- are overloaded? The truth is I simply have no idea without looking somewhere else, and Joel gives a nice solution to making this easier to understand: Hungarian Notation. 


Hungarian Notation is a coding notation invented by Microsoft programmer Charles Simonyi. In Simonyi's version of Hungarian Notation, every variable is prefixed with a lowercase tag which indicates the kind of thing the variable contains. The problem is that when Simonyi wrote his paper on Hungarian Notation, he made it very academically dense and made a brutal mistake that led to many misconceptions among the programmers who followed. He used the word "type" instead of "kind" when referring to the lowercase prefix. While his paper clearly shows that he did not mean the declared type of the variable, many people, including a large part of Microsoft, took this literally, and soon a great deal of Microsoft's code had the type of the variable prepended to it. This is nice and all, but isn't it better to not only give information about what a variable is, but also what it's doing? By prepending the kind of variable we can not only get a glimpse into what the variable is, but also what in the hell it is doing. Using the intended notation, it is far easier to see errors in lines of code that would otherwise be invisible. Let's say we have an int prefixed with the letters 'ts', representing "thread status", and another prefixed with the letters 'pc', representing "process status". If this convention were understood, it would be apparent that the following line makes absolutely no sense:

    ts_status = pc_status; 


By seeing that they are different status "kinds", we can easily see that the thread status is probably not supposed to have the same value as the process status. I wasn't there when they created the imaginary "kind" convention, but the idea is that it was understood by everyone involved in the creation of the code, and that the preceding line of code is incorrect even to the naked eye.
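As a rough sketch of how the prefixes might be put to work, the conversion between kinds can get a name that carries both prefixes, so every correct line has matching prefixes on both sides of the assignment. The constants and the functions ts_from_pc and ts_initial are hypothetical, made up for this example.

```cpp
// ts = thread status, pc = process status (the prefixes from the example above).
// The concrete values are assumptions chosen only for illustration.
const int TS_BLOCKED = 10;
const int TS_RUNNABLE = 11;
const int PC_STOPPED = 0;
const int PC_RUNNING = 1;

// The only sanctioned way to turn a pc value into a ts value.
int ts_from_pc(int pc_status) {
    return pc_status == PC_RUNNING ? TS_RUNNABLE : TS_BLOCKED;
}

int ts_initial(int pc_parent) {
    int ts_status = ts_from_pc(pc_parent);  // ts = ts_...: prefixes match, looks right
    // int ts_wrong = pc_parent;            // ts = pc: prefixes differ, looks wrong
    return ts_status;
}
```

The compiler is no help here since both kinds are plain ints, which is exactly why the convention has to do the work.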

One thing I do find intriguing in Joel's article is his strong disapproval of exceptions. I can see where he is coming from in regards to the article: with every function and method call, the programmer has close to zero idea whether that function can even complete unless he or she knows whether or not it throws an exception. This often means searching through inherited classes and other files to see what is really going on, which is exactly the opposite of the standard Joel is trying to get across. Joel quotes Raymond Chen on this:

"It is extraordinarily difficult to see the difference between bad exception-based code and not-bad exception-based code... exceptions are too hard and I'm not smart enough to handle them."


Basically, it is almost impossible to tell with the naked eye whether exception-based code is bad. If you're going to code with exceptions, you had better have some pretty serious standards in place, or else risk your end product breaking at extremely inopportune times (and giving all of the other programmers involved a serious headache).

The key is to write simple tools that humans can see and understand, not complex ones with all kinds of crazy hidden features and abstractions that assume all programmers are perfect.

Sunday, April 29, 2012

Multimaps

So recently on an operating systems project I had to come up with a data structure that mapped a set of keys to an arbitrary number of values. To be specific, each key is a username, and each value is a list of sessions that is associated with the user.

My first instinct was to simply use a map< string, list<size_t> >. It is easy to understand and it offers easy access to the values with map's overloaded indexing operator and an iterator over the list. This works perfectly fine, but it did not let me try something I had never done before, so I decided to be a little wild and use something called a multimap. If one looks at the STL reference for multimap, he or she can see the definition is rather simple:

Multimaps are a kind of associative container that stores elements formed by the combination of a key value and a mapped value, much like map containers, but allowing different elements to have the same key value.


Perfect! This is exactly what I needed. It was all good and fun until I began to see the horrible wretched syntax that is involved in using these forsaken data structures. While the initialization looks normal:

multimap< string, size_t> sessions;

In my opinion not a single method in the API is intuitive. You want to insert an element? That goes something like this:

sessions.insert(pair<string, size_t>(username, session_number));

Everything has to be inserted as a pair, which makes inserts very unintuitive. If we had done things the map<string, list<size_t> > way, it would have gone something like this:

sessions[username].push_back(session_number);
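For a side-by-side view, here is a small sketch of both inserts. The typedef names and the add_session helper are hypothetical, and std::make_pair is used as a slightly shorter way to build the pair than spelling out pair<string, size_t>.

```cpp
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <utility>

typedef std::multimap<std::string, std::size_t> session_multimap;
typedef std::map<std::string, std::list<std::size_t> > session_map;

// multimap: every insert is a (key, value) pair; make_pair deduces the types.
void add_session(session_multimap& sessions, const std::string& username,
                 std::size_t session_number) {
    sessions.insert(std::make_pair(username, session_number));
}

// map of lists: operator[] creates an empty list on first use, then we append.
void add_session(session_map& sessions, const std::string& username,
                 std::size_t session_number) {
    sessions[username].push_back(session_number);
}
```

Either way the caller ends up with several sessions stored under one username; the difference is only in how the container spells it.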

So, moving on, I needed a function that would check if a username had a specific live session associated with it. With a multimap we get a hideous function that goes like the following:


/**
 * Utility function to check if a username has a specific live session
 */
bool find_session(const string& username, size_t session) {
        //equal_range returns the range of elements whose key is username
        pair<multimap<string,size_t>::iterator,multimap<string,size_t>::iterator> ret;
        ret = sessions.equal_range(username);

        //ret.first is the lower bound, ret.second is the upper bound of the range
        multimap<string, size_t>::iterator it;
        for (it = ret.first; it != ret.second; ++it) {
                if (it->second == session) return true;
        }
        return false;
}


This particular pair of lines looks horrible:


        pair<multimap<string,size_t>::iterator,multimap<string,size_t>::iterator> ret;
        ret = sessions.equal_range(username);

This is very non-intuitive, and that declaration is extremely messy in my opinion, albeit very explicit about what it is doing.

Now, the bright side is that with C++11, this code can be made much prettier using the auto keyword, which will do some magic to clean this up. Since one of my fellow brogrammers is blogging about this, I will not steal his thunder, but merely request that he rewrite the function using the keyword in the comments below. :)
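In the meantime, here is a minimal sketch of what a C++11 version might look like. One assumption to flag: the multimap is passed in as a parameter here so the sketch stands on its own, whereas the function above reads a sessions variable declared elsewhere.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

/**
 * Utility function to check if a username has a specific live session.
 * Sketch only: sessions is a parameter here, which is an assumption.
 */
bool find_session(const std::multimap<std::string, std::size_t>& sessions,
                  const std::string& username, std::size_t session) {
    // auto deduces the pair-of-iterators type that had to be spelled out before
    auto range = sessions.equal_range(username);

    // range.first is the lower bound, range.second is the upper bound
    for (auto it = range.first; it != range.second; ++it) {
        if (it->second == session) return true;
    }
    return false;
}
```

The logic is unchanged; auto only removes the long declaration, so the explicitness is traded for readability.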



Monday, April 23, 2012

A New Way to Test

In the previous project for Object-Oriented Programming, or "Brogramming", (since we always program with partners :) ) I was trying to conjure up new ways to get access to my private variables from cppunit tests.

The ugliest way is to use getters and setters to play with the variables you want to check and modify. Using getters and setters is typically frowned upon though, and there are much better ways to test your code, especially if you are only creating the getter and setter methods for testing purposes.

One of the ways that was discussed in class was to make the cppunit test structs friends of the classes you want access to. This gives the cppunit tests access to all of the variables, and we can do our testing without the hassle of dealing with getters and setters. The problem with this is that we literally have to declare the test class or struct a friend in each class or struct we want access to, and I would argue that putting these friends in your classes can needlessly clutter source code, even if it is only a few added lines.
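A minimal sketch of the friend approach, without any cppunit machinery so that it stands alone. Counter and CounterTest are hypothetical names; in a real project the test struct would be the cppunit fixture.

```cpp
// Hypothetical class under test.
class Counter {
    // The one extra line each class under test has to carry.
    friend struct CounterTest;

    public:
        Counter() : count(0) {}
        void increment() { ++count; }

    private:
        int count;
};

// Hypothetical test struct: being a friend, it can read the private field directly.
struct CounterTest {
    static int peek(const Counter& c) { return c.count; }
};
```

No getter is added to Counter's public interface, but the friend line still has to be repeated in every class that the tests need to look inside.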

I then got the wonderful idea of using macros to play with the private, public, and protected declarations within the classes, isolating the test file access from the body of the source code. The first option with this is to do something like the following:

//not the test file
#ifdef test_mode
    #define private public
#endif

//testfile (this define has to come before the #include of the file under test)
#define test_mode

This could be considered by some to be a little on the dangerous side (redefining a keyword with a macro is not something the C++ standard permits, even where a compiler accepts it), so at this point I introduce a tradeoff. On one hand, we could simply give anyone who defines "test_mode" access to all of the private variables in the class we are trying to test. On the other, we could do something a little more complex like the following:

//not the test file
#ifdef test_mode
    #define private protected
#endif

//testfile (this define has to come before the #include of the file under test)
#define test_mode

So instead of simply making all of the private variables public, we make them protected. Why? Well, in this case any class that derives from a class in the source file being tested will have access to all of that class's previously private variables. The idea is that now we can simply declare the test class a child of the class(es) we want to test and test away!

Sunday, April 15, 2012

Elegant Copy Constructors and Assignment Operators

So recently I discussed the importance of adhering to the Orthodox Canonical Class Form, and this past week in class I learned of a very elegant way to complete two of its pieces: the copy constructor and the assignment operator. As long as you are careful, these two methods can often be very simple to write and do not typically require a large amount of thought. Most problems arise when there are pointer variables as fields in a class. Using the default assignment operator will simply copy the address held by the pointer, which results in two objects pointing to the same address. Honestly, I am having trouble recalling how I used to do this, because the new method I have learned for completing these two tasks has nullified all of my old practices.

So let's say we have this class, my_object, and in this class we have only an int and an int* as private data fields for the class, like so: (Yes, it is stupid, but I am simply trying to show a point).

class my_object {
    private:
        int x;
        int* y;
};

Now, we do not want the default copy constructor or assignment operator, because they will both result in some nasty pointer bugs. We want each instance of my_object to have its own value of y, and more importantly a unique address for y! So, we will continue on to create the copy constructor using C++'s member initializer list, the interesting syntax for initializing fields before the constructor body runs:

my_object::my_object(const my_object& that) : x(that.x), y(new int) {
    *y = *that.y;
}

Notice that I accepted that as a const reference, that I simply initialized this->x to that.x, and that I created a new int on the heap and then, in the method body, set its value to the value that.y points to. How can we write an assignment operator that takes advantage of the copy constructor we just created? It can be done precisely the way Professor Downing showed in class. I will show it here first; see if you can figure it out:

my_object& my_object::operator = (my_object that) {
    std::swap(x, that.x);
    std::swap(y, that.y);
    return *this;
}

In the words of Dr. Downing, "Beautiful!". Really, I truly believe this is a beautiful solution. We created the copy constructor that took care of the pointer copy issue, and simply passed the my_object by value in the assignment operator such that a correct copy is created, and we can swap the variables out of the object to get the proper assignment.
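Putting the pieces together, here is a minimal complete sketch of the class. The two-argument constructor, the destructor, and the helpers value_of_y and shares_y_with are additions that were not shown above; they are assumptions made so the example can be compiled and checked on its own.

```cpp
#include <algorithm>  // std::swap before C++11
#include <utility>    // std::swap from C++11 on

class my_object {
    public:
        // Assumed constructor, not shown above, so objects can be created.
        my_object(int x_value, int y_value) : x(x_value), y(new int(y_value)) {}

        // Copy constructor: allocate a fresh int and copy the pointed-to value.
        my_object(const my_object& that) : x(that.x), y(new int) {
            *y = *that.y;
        }

        // Assignment: that is already a copy (made by the copy constructor),
        // so swap our fields with it and let its destructor free our old y.
        my_object& operator = (my_object that) {
            std::swap(x, that.x);
            std::swap(y, that.y);
            return *this;
        }

        // Assumed destructor: without it the swapped-out int would leak.
        ~my_object() { delete y; }

        // Hypothetical helpers, present only so the behavior can be checked.
        int value_of_y() const { return *y; }
        bool shares_y_with(const my_object& other) const { return y == other.y; }

    private:
        int x;
        int* y;
};
```

After b = a, both objects report the same value for y but hold different addresses, and assigning an object to itself is harmless because the swap happens against a temporary copy.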

Monday, April 9, 2012

Interview With ARM

I had an interview with ARM last week, and I truly believe that if it were not for Dr. Downing's wonderful Object-Oriented Programming class, I would not have felt nearly as confident going into and during the interview. While ARM really is not the type of company that requires an army of computer scientists, it was interesting to see that they are currently trying to find out how computer scientists adapt to different environments and how they apply an abstract view to different problems.

One of the recurring themes of the interview was the subject of a GUI they were interested in designing. Even though I do not have much experience with GUIs, I am still confident that I would be able to create one without a tremendous amount of difficulty. After all, it is just object manipulation and message passing. Regardless of whether an object is representing a frame on the monitor or some abstract data structure, it is still just an object that can be asked to do things and/or produce some result given a message or the state the object is in.

Pertaining to experiences in this particular OOP class, the interviewers were also interested in experience with version control software, issue trackers, and most importantly, proper documentation. The engineers loved to see that I have experience with git and svn, and loved the fact that I keep almost all homework under version control even more. Two of the code samples I sent were in fact projects from OOP, and they thoroughly enjoyed looking at code in which each function and method was documented well and easy to read.

While I do not have the job yet, I am anxious to see how it will play out if I am hired. As of right now it seems I would be working mostly with engineers, and I am very curious about how my abstract brain can apply itself to the lower levels of the computer world.


Sunday, April 1, 2012

Avoiding Setters and Getters

For the novice programmer who is just beginning to delve into the world of object-oriented programming, setters and getters are typically the go-to method of manipulating data fields of objects. Sometimes it is difficult to program without using setters and getters, specifically getters. It is often necessary to ask an object what kind of state it is in, or the value of some data field it holds. This is nice and all for testing and quick/dirty class creation, but it fails to shield the programmer and other objects from the class's implementation. Ideally a class should hide all of its dirty details and instead perform the computation that another class would have used the value for in the first place.

In the current project in OOP, this has been a real chore, but this is one of those projects that can be a disaster if designed poorly or a lot of fun if designed with good object-oriented design practices in mind. The project basically consists of a grid that holds creatures of different or identical species. These creatures can turn, move around, and infect other creatures so that they become the same species. From this picture, it is easy to see that there should be a grid, creature, and species class, in which the creature holds some form of a species.

The question is: How does the grid communicate with the different creatures on the board without simply asking for different data fields? How will the grid know which direction they are facing? How will the instructions of the different creatures be passed around? The easy approach would be to have the grid simply ask for different data fields of the creatures and do all of the computation itself, but this is ugly and can lend itself to debugging disasters. It is much cleaner to give the creature some information based on what is in front of it, have it do some computations based on its own data to figure out what to do next, and then relay that information to the grid.
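Here is one rough sketch of that idea. Every name in it (Cell, Front, Action, Creature, cell_ahead, next_action) and the direction encoding are hypothetical; this is not the actual project design, just one way the grid could work with a creature without ever reading its fields.

```cpp
// Hypothetical vocabulary shared by the grid and the creatures.
struct Cell { int row; int col; };
enum Front { WALL, EMPTY, ENEMY };       // what the grid finds ahead of a creature
enum Action { HOP, TURN_LEFT, INFECT };  // what the creature tells the grid to do

class Creature {
    public:
        // Assumed encoding: 0 = west, 1 = north, 2 = east, 3 = south.
        explicit Creature(int start_direction) : direction(start_direction) {}

        // The grid says where the creature stands; the creature answers with
        // the cell it is facing, so the direction itself never leaves the class.
        Cell cell_ahead(Cell here) const {
            Cell ahead = here;
            switch (direction) {
                case 0:  --ahead.col; break;
                case 1:  --ahead.row; break;
                case 2:  ++ahead.col; break;
                default: ++ahead.row; break;
            }
            return ahead;
        }

        // The grid reports what is in that cell; the creature decides what to
        // do using its own data and relays the decision back to the grid.
        Action next_action(Front front) {
            if (front == ENEMY) return INFECT;
            if (front == EMPTY) return HOP;
            direction = (direction + 3) % 4;  // blocked: turn left
            return TURN_LEFT;
        }

    private:
        int direction;
};
```

The grid only ever sees cells and actions. If the creature later stores its direction differently, or starts consulting a species program, the grid code does not change.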

It is just lately that I have been forced to program without the use of getters and setters. While they do have their purposes in some cases, I am beginning to see how objects can communicate in a much more elegant and purposeful fashion if they are avoided.