"One more thing..."

I recently started to toy around with shiny new C++11, and one of the features that immediately caught my attention is the new container type unordered_set. Unordered sets are containers that store unique elements in no particular order and allow for fast retrieval of individual elements.
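For anyone who has not played with the container yet, here is a minimal sketch of how it behaves (nothing problem-specific, just the container in isolation):

    #include <iostream>
    #include <string>
    #include <unordered_set>

    int main()
    {
        // Duplicates are ignored on construction and insertion;
        // the iteration order is unspecified.
        std::unordered_set<std::string> animals{"goat", "lion", "wolf", "goat"};

        animals.insert("lion");              // already present, no effect
        std::cout << animals.size() << '\n'; // prints 3

        // Lookup is expected constant time.
        if (animals.find("wolf") != animals.end())
            std::cout << "wolf is in the set\n";
    }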

Luckily, the ubiquitous goats-lions-wolves problem that has been haunting us UnRisk-ers for a couple of weeks is a nice testing ground for such containers. In his post, Sascha promised that he would write no more on this problem. I never made that promise :) One of the more time-consuming parts of the problem is the calculation of a set of unique "next forests". Sascha's solution handles that by stuffing the next forests into a vector, sorting that vector, and then removing the duplicates. At first sight, it seems a bit wasteful to first generate all possible solutions, then sort them (an order N*log(N) operation in the worst case), and only then throw away the duplicates.
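Schematically, the vector-sort-unique idiom looks like this; the Forest type below is just a stand-in triple of animal counts for illustration, not Sascha's actual representation:

    #include <algorithm>
    #include <tuple>
    #include <vector>

    // Stand-in forest state: (goats, lions, wolves).
    using Forest = std::tuple<int, int, int>;

    std::vector<Forest> unique_next_forests(std::vector<Forest> candidates)
    {
        // Sort all candidates (worst case N*log(N))...
        std::sort(candidates.begin(), candidates.end());
        // ...then squeeze out adjacent duplicates.
        candidates.erase(std::unique(candidates.begin(), candidates.end()),
                         candidates.end());
        return candidates;
    }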

An alternative solution would be to insert the next forests into a set or an unordered_set, where inserting takes care of the duplicates as we go: each insert is an order log(N) operation for a set, and expected constant time for an unordered_set. In theory, the container with the lower complexity should lead to faster code, at least once N is large enough. In practice, it is not all that clear how large "large enough" is.
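The set-based alternative then looks roughly as follows; again, the Forest triple and the little ForestHash functor are only illustrative stand-ins (unordered_set needs a hash function for user-defined key types), not the exact code behind the timings below:

    #include <cstddef>
    #include <tuple>
    #include <unordered_set>
    #include <vector>

    // Same stand-in forest state as above: (goats, lions, wolves).
    using Forest = std::tuple<int, int, int>;

    // A naive hash combiner for the triple.
    struct ForestHash
    {
        std::size_t operator()(const Forest& f) const
        {
            std::size_t h = static_cast<std::size_t>(std::get<0>(f));
            h = h * 31 + static_cast<std::size_t>(std::get<1>(f));
            h = h * 31 + static_cast<std::size_t>(std::get<2>(f));
            return h;
        }
    };

    std::vector<Forest> unique_next_forests(const std::vector<Forest>& candidates)
    {
        // Inserting into the set de-duplicates as we go.
        std::unordered_set<Forest, ForestHash> seen(candidates.begin(),
                                                    candidates.end());
        return std::vector<Forest>(seen.begin(), seen.end()); // unspecified order
    }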

Enough talking: here are the measured timings in seconds as a function of the number of animals, in a log-log plot (blue: Sascha's vector-sort version, red: the unordered_set version, yellow: the set version).

To be honest, I am a bit surprised by the horrible performance of the set-based versions (they are only a bit more than two times faster than the Fortran version!). On closer inspection, the unordered_set version catches up with the vector-sort version for higher values of N (it is about a factor of 3 slower for small numbers, and only a factor of 2 slower for the larger ones). One would have to go to impractically large numbers of animals, however, to reach break-even, if break-even occurs at all. Two quick checks with the profiler reveal that:

  • 82% of the consumed time really does go into (unordered_)set.insert
  • for very large sets (many millions of elements), insertion really does scale like log(N)
  • however, our sets never get that large in this problem

One explanation for the relative speed of the vector-sort version could be that the vectors are not really random, so sorting runs a lot faster than its N*log(N) worst case - but I have to investigate that in more detail. Or maybe Sascha will break his promise after all ;)
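If one wants to check the "not really random" hypothesis, a crude first measurement could be the fraction of adjacent pairs that are already in order before sorting (again only a sketch, not part of the benchmarked code):

    #include <cstddef>
    #include <vector>

    // Fraction of adjacent pairs already in order: roughly 0.5 for random
    // data, 1.0 for an already sorted vector.
    template <typename T>
    double sortedness(const std::vector<T>& v)
    {
        if (v.size() < 2)
            return 1.0;
        std::size_t in_order = 0;
        for (std::size_t i = 0; i + 1 < v.size(); ++i)
            if (!(v[i + 1] < v[i]))
                ++in_order;
        return static_cast<double>(in_order) / (v.size() - 1);
    }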