Sunday, April 29, 2012

Multimaps

So recently on an operating systems project I had to come up with a data structure that mapped each key to an arbitrary number of values. To be specific, each key is a username, and the values are the sessions associated with that user.

My first instinct was to simply use a map< string, list<size_t> >. It is easy to understand, and it offers easy access to the values through map's overloaded indexing operator and an iterator over the list. This works perfectly fine, but it did not let me try something I had never done before, so I decided to be a little wild and use something called a multimap. The STL reference for multimap gives a rather simple definition:

Multimaps are a kind of associative container that stores elements formed by the combination of a key value and a mapped value, much like map containers, but allowing different elements to have the same key value.


Perfect! This is exactly what I needed. It was all good and fun until I began to see the horrible wretched syntax that is involved in using these forsaken data structures. While the initialization looks normal:

multimap< string, size_t> sessions;

In my opinion not a single method in the API is intuitive. You want to insert an element? That goes something like this:

sessions.insert(pair<string, size_t>(username, session_number));

Everything has to be inserted as pairs, which makes inserts very unintuitive. If we had done things the map<string, list<size_t> > way, it would have gone something like this:

sessions[username].push_back(session_number);
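To put the two inserts side by side, here is a minimal sketch. The container names and the add_session helper are made up for illustration and are not from the project. It also uses make_pair, which deduces the pair's types so they do not have to be spelled out.

```cpp
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <utility>

using namespace std;

// Approach 1: one key per user, mapped to a list of that user's sessions.
map<string, list<size_t> > sessions_by_list;

// Approach 2: one (username, session) entry per session.
multimap<string, size_t> sessions_by_multimap;

// Hypothetical helper that records a session in both containers.
void add_session(const string& username, size_t session_number) {
    // operator[] creates an empty list the first time a username is seen
    sessions_by_list[username].push_back(session_number);

    // make_pair deduces pair<string, size_t> from its arguments
    sessions_by_multimap.insert(make_pair(username, session_number));
}
```

After a few calls, sessions_by_multimap.count(username) reports how many sessions a user has, which matches the size of that user's list in the first container.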

So, moving on, I needed a function that would check if a username had a specific live session associated with it. With a multimap we get a hideous function that goes like the following:


/**
 * Utility function to check if a username has a specific live session
 */
bool find_session(const string& username, size_t session) {
        //iterator used to walk the entries that share this key
        multimap<string, size_t>::iterator it;

        //use equal_range to get the range of elements with the same key
        pair<multimap<string,size_t>::iterator,multimap<string,size_t>::iterator> ret;
        ret = sessions.equal_range(username);

        //ret.first is the lower bound, ret.second is the upper bound of the range
        for (it = ret.first; it != ret.second; ++it) {
                if (it->second == session) return true;
        }
        return false;
}


This particular pair of lines looks horrible:


pair<multimap<string,size_t>::iterator,multimap<string,size_t>::iterator> ret;
ret = sessions.equal_range(username);

This is very non-intuitive, and that declaration is extremely messy in my opinion, albeit very explicit about what it is doing.

Now, the bright side is that with C++11, this code can be made much prettier using the auto keyword, which lets the compiler deduce that messy type for us. Since one of my fellow brogrammers is blogging about this, I will not steal his thunder, but merely request that he rewrite the function above in the comments using the keyword. :)
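For the impatient, here is a rough sketch of how the function might look with auto. This is only my guess at the cleanup, assuming a C++11 compiler and a global sessions multimap like the one declared earlier.

```cpp
#include <cstddef>
#include <map>
#include <string>

using namespace std;

multimap<string, size_t> sessions;

// Same check as before, with auto standing in for the iterator types.
bool find_session(const string& username, size_t session) {
    // range is a pair of iterators: [range.first, range.second) holds
    // every entry whose key equals username
    auto range = sessions.equal_range(username);
    for (auto it = range.first; it != range.second; ++it) {
        if (it->second == session) return true;
    }
    return false;
}
```

The logic is unchanged; only the two long type names are gone.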



Monday, April 23, 2012

A New Way to Test

In the previous project for Object-Oriented Programming (or "Brogramming", since we always program with partners :) ), I was trying to conjure up new ways to get access to my private variables from cppunit tests.

The ugliest way is to use getters and setters to play with the variables you want to check and modify. Using getters and setters is typically frowned upon though, and there are much better ways to test your code, especially if you are only creating the getter and setter methods for testing purposes.

One of the ways that was discussed in class was to make the cppunit test structs friends of the classes you want access to. This gives the cppunit tests access to all of the variables, and we can do our testing without the hassle of dealing with getters and setters. The problem with this is that we literally have to declare the test class or struct a friend in each class or struct we want access to, and I would argue that putting these friends in your classes can needlessly clutter source code, even if it is only a few added lines.
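A minimal sketch of the friend approach, assuming a made-up Counter class and a plain TestCounter struct standing in for a real cppunit fixture (cppunit itself is left out to keep the example self-contained):

```cpp
// Forward declaration of the hypothetical test struct.
struct TestCounter;

class Counter {
    // This is the extra line every tested class has to carry.
    friend struct TestCounter;

    private:
        int count;

    public:
        Counter() : count(0) {}
        void increment() { ++count; }
};

struct TestCounter {
    // In a real cppunit fixture this would be a test method with asserts;
    // here it just shows that the private field is reachable.
    static int peek(const Counter& c) { return c.count; }
};
```

The test struct reads count directly, with no getter anywhere in Counter.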

I then got the wonderful idea of using macros to play with the private, public, and protected declarations within the classes, isolating the test file access from the body of the source code. The first option with this is to do something like the following:

//not the test file
#ifdef test_mode
    #define private public
#endif

//testfile (the define has to come before the header is included)
#define test_mode

This could be considered by some to be a little on the dangerous side, so at this point I introduce a tradeoff. On one hand, we could simply give anyone who defines "test_mode" access to all of the private variables in the class we are trying to test. On the other, we could do something a little more complex like the following:

//not the test file
#ifdef test_mode
    #define private protected
#endif

//testfile (the define has to come before the header is included)
#define test_mode

So instead of simply making all of the private variables public, we make them protected. Why? Well, in this case only a class that derives from the class being tested will have access to the previously private variables. The idea is that now we can simply declare the test class a child of the class(es) we want to test and test away!
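Here is the whole trick squeezed into one file as a sketch. The Account class and TestAccount struct are invented for illustration, and the #undef is my own addition to keep the macro from leaking into later code. One caveat, as far as I know: the C++ standard does not formally allow macros named after keywords, so treat this as a testing hack that mainstream compilers happen to accept.

```cpp
// testfile: define the flag before pulling in the class under test
#define test_mode

// ---- what would live in the (hypothetical) header being tested ----
#ifdef test_mode
    #define private protected
#endif

class Account {
    private:   // becomes "protected" when test_mode is defined
        int balance;

    public:
        Account() : balance(0) {}
        void deposit(int amount) { balance += amount; }
};

#ifdef test_mode
    #undef private   // assumption: restore the keyword for later code
#endif
// ---- end of header ----

// The test struct inherits from the class, so it can read balance directly.
struct TestAccount : Account {
    int peek_balance() const { return balance; }
};
```

Without test_mode defined, TestAccount would fail to compile, which is exactly the isolation we are after.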

Sunday, April 15, 2012

Elegant Copy Constructors and Assignment Operators

So recently I discussed the importance of adhering to the Orthodox Canonical Class Form, and this past week in class I learned of a very elegant way to complete two of its pieces: the copy constructor and the assignment operator. As long as you are careful, these two methods can often be very simple to write and do not typically require a large amount of thought. Most problems arise when a class has pointer fields. The default assignment operator simply copies the address stored in the pointer, which results in two objects pointing to the same memory. Honestly, I am having trouble recalling how I used to do this, because the new method I have learned for completing these two tasks has nullified all of my old practices.

So let's say we have this class, my_object, and in this class we have only an int and an int* as private data fields for the class, like so: (Yes, it is stupid, but I am simply trying to show a point).

class my_object {
    private:
        int x;
        int* y;
};

Now, we do not want the default copy constructor or assignment operator, because they will both result in some nasty pointer bugs. We want each instance of my_object to have its own value of y, and more importantly a unique address for y! So, we will continue on to create the copy constructor using C++'s member initializer list, the interesting syntax for initializing fields right after the constructor's parameter list:

my_object::my_object(const my_object& that) : x(that.x), y(new int) {
    //that is a reference, so use . and dereference its pointer to copy the value
    *y = *that.y;
}

Notice that I accepted that as a const reference, that I simply initialized this->x to that.x, and that I created a new int on the heap and, in the method body, set its value to the value that's y points to. How can we write an assignment operator that takes advantage of the copy constructor we just created? It can be done exactly the way Professor Downing showed us in class. First I will show it here; see if you can figure it out:

my_object& my_object::operator = (my_object that) {
    std::swap(x, that.x);
    std::swap(y, that.y);
    return *this;
}

In the words of Dr. Downing, "Beautiful!". Really, I truly believe this is a beautiful solution. We created the copy constructor that took care of the pointer copy issue, and simply passed the my_object by value in the assignment operator so that a correct copy is created, and then we swap that copy's fields with our own to complete the assignment. As a bonus, our old values end up in that, which is destroyed when the function returns, so a destructor that deletes y (not shown above) cleans up the old pointer for us.
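Putting the pieces together, here is a sketch of the complete class. The two-argument constructor, the destructor, and the y_value and set_y members are my own additions so the example can be exercised; they are not part of the class as shown in lecture.

```cpp
#include <algorithm>  // std::swap before C++11
#include <utility>    // std::swap since C++11

class my_object {
    private:
        int x;
        int* y;

    public:
        // Assumed for the demo: a way to build an object with known values.
        my_object(int x_val, int y_val) : x(x_val), y(new int(y_val)) {}

        // Copy constructor: a fresh int on the heap holding that's value.
        my_object(const my_object& that) : x(that.x), y(new int(*that.y)) {}

        // Assignment: that is already a copy, so just trade fields with it.
        my_object& operator = (my_object that) {
            std::swap(x, that.x);
            std::swap(y, that.y);
            return *this;
        }   // that is destroyed here and frees our old y

        // Assumed for the demo: release the heap int.
        ~my_object() { delete y; }

        // Assumed for the demo only, to observe that the copies are independent.
        int y_value() const { return *y; }
        void set_y(int value) { *y = value; }
};
```

Assigning one object to another and then changing the original's y leaves the assigned copy untouched, since each object owns its own int.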

Monday, April 9, 2012

Interview With ARM

I had an interview with ARM last week, and I truly believe that if it were not for Dr. Downing's wonderful Object-Oriented Programming class, I would not have felt nearly as confident going into and during the interview. While ARM really is not the type of company that requires an army of computer scientists, it is interesting that they are currently looking at how computer scientists adapt to different environments and how they can apply an abstract view to different problems.

One of the recurring themes of the interview was the subject of a GUI they were interested in designing. Even though I do not have much experience with GUIs, I am still confident that I would be able to create one without a tremendous amount of difficulty. After all, it is just object manipulation and message passing. Regardless of whether an object is representing a frame on the monitor or some abstract data structure, it is still just an object that can be asked to do things and/or produce some result given a message or the state the object is in.

Pertaining to experiences in this particular OOP class, the interviewers were also interested in experience with version control software, issue trackers, and most importantly, proper documentation. The engineers loved to see that I have experience with git and svn, and loved the fact that I keep almost all homework under version control even more. Two of the code samples I sent were in fact projects from OOP, and they thoroughly enjoyed looking at code in which each function and method was documented well and easy to read.

While I do not have the job yet, I am anxious to see how it will play out if I am hired. As of right now it seems I would be working mostly with engineers, and I am very curious about how my abstract brain can apply itself to the lower levels of the computer world.


Sunday, April 1, 2012

Avoiding Setters and Getters

For the novice programmer who is just beginning to delve into the world of object-oriented programming, setters and getters are typically the go-to method of manipulating data fields of objects. Sometimes it is difficult to program without using setters and getters, specifically getters. It is often necessary to ask an object what kind of state it is in, or the value of some data field it holds. This is nice and all for testing and quick/dirty class creation, but it fails to shield the programmer and other objects from the class's implementation. Ideally a class should hide all of its dirty details and instead perform the computation that another class would have used the value for in the first place.

In the current project in OOP, this has been a real chore, but this is one of those projects that can be a disaster if designed poorly or a lot of fun if designed with good object-oriented design practices in mind. The project basically consists of a grid that holds creatures of the same or different species. These creatures can turn, move around, and infect other creatures so that they become the same species. In this picture, it is easy to see that there should be a grid, creature, and species class, in which the creature holds some form of a species.

The question is: How does the grid communicate with the different creatures on the board without simply asking for different data fields? How will the grid know which direction they are facing? How will the instructions of the different creatures be passed around? The easy approach would be to have the grid simply ask for different data fields of the creatures and do all of the computation itself, but this is ugly and can lend itself to debugging disasters. It is much cleaner to give the creature some information based on what is in front of it, have it do some computation on its own data to figure out what to do next, and then relay that information to the grid.
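As a rough sketch of that idea (one possible design, not the project's actual interface), the grid below hands a creature its four neighboring cells and gets back a decision, without ever asking for the creature's direction. Every name here (Cell, Action, Decision, take_turn) is invented for illustration, and the species logic is reduced to a hard-coded rule.

```cpp
enum Cell { EMPTY, WALL, ENEMY };        // what the grid sees around a creature
enum Action { HOP, TURN_LEFT, INFECT };  // what the creature wants done

// What the creature relays back to the grid.
struct Decision {
    Action action;
    int toward;  // index 0..3 of the neighbor the creature now faces
};

class Creature {
    private:
        int direction;  // 0..3, never handed out through a getter

    public:
        Creature() : direction(0) {}

        // The grid passes in the four neighbors; the creature uses its own
        // direction to pick the one ahead and decides what to do.
        Decision take_turn(const Cell neighbors[4]) {
            Cell ahead = neighbors[direction];
            Decision d;
            if (ahead == ENEMY) {
                d.action = INFECT;
            } else if (ahead == WALL) {
                direction = (direction + 3) % 4;  // rotate left
                d.action = TURN_LEFT;
            } else {
                d.action = HOP;
            }
            d.toward = direction;
            return d;
        }
};
```

The grid only carries out whatever Decision comes back, so all of the reasoning about direction stays inside Creature.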

It is just lately that I have been forced to program without the use of getters and setters. While they do have their purposes in some cases, I am beginning to see how objects can communicate in a much more elegant and purposeful fashion if they are avoided.