Sunday, April 29, 2012

Multimaps

So recently on an operating systems project I had to come up with a data structure that mapped a set of keys to an arbitrary number of values. To be specific, each key is a username, and each value is a list of sessions that is associated with the user.

My first instinct was to simply use a map< string, list<size_t> >. It is easy to understand and it offers easy access to the values with map's overloaded indexing operator and an iterator over the list. This works perfectly fine, but it did not allow me to do something I had never done before, so I decided to be a little wild and use something called a multimap. If one looks at the stl reference for multimap, he or she can see the definition is rather simple:

Multimaps are a kind of associative container that stores elements formed by the combination of a key value and a mapped value, much like map containers, but allowing different elements to have the same key value.


Perfect! This is exactly what I needed. It was all good and fun until I began to see the horrible wretched syntax that is involved in using these forsaken data structures. While the initialization looks normal:

multimap< string, size_t> sessions;

In my opinion not a single method in the API is intuitive. You want to insert an element? That goes something like this:

sessions.insert(pair<string, size_t>(username, session_number));

Everything has to be inserted as pairs, which makes inserts very non intuitive. If we had done things the map<string, list<size_t> > way, it would have gone something like this:

sessions[username].push_back(session_number);

So, moving on, I needed a function that would check if a username had a specific live session associated with it. With a multimap we get a hideous function that goes like the following:


/**
 * Utility function to check if a username has a specific live session
 */
bool find_session(const string& username, size_t session) {
        //find returns an iterator
        multimap<string, size_t>::iterator it = sessions.find(username);

        //use equal_range to get the range of elements with the same key
        pair<multimap<string,size_t>::iterator,multimap<string,size_t>::iterator> ret;
        ret = sessions.equal_range(username);

        //ret.first is the lower bound, ret.second is the upper bound of the range
        for (it = ret.first; it != ret.second; ++it) {
                if (it->second == session) return true;
        }
        return false;
}


This particular pair of lines looks horrible:


 pair<multimap<string,size_t>::iterator,multimap<string,size_t>::iterator> ret;
        ret = sessions.equal_range(username);

This is very non-intuitive, and that declaration is extremely messy in my opinion, albeit very explicit about what it is doing.

Now, the bright side is that with C++11, this code can be made much prettier using the auto keyword that will do some magic to clean this up. Since one of my fellow brogrammers is blogging about this, I will not still his thunder, but merely request that he rewrite the function below in the comments using the keyword. :)



1 comment:

  1. :) You can read a pretty version of that function here:
    http://ryanrheecs371p.wordpress.com/2012/04/30/c11-the-auto-keyword/

    ReplyDelete