30 March 2013

Quick benchmark on the bounded_buffer<T> example from Boost Templated Circular Buffer

Boost's Templated Circular Buffer documentation claims that some modest Boost.Thread wrapping of a boost::circular_buffer<T> makes a bounded producer-consumer queue. Happily, they provide a benchmark. I believe 'em, of course, but I'd like to get an idea of the variability behind the claims.

Rusty Russell, of CCAN fame, put out a nice pipe-oriented processor for gathering statistics. This can directly be fed the output of the benchmark as a pretty cute one-liner:

gcc -Wall -O3 -pthread bounded_buffer_comparison.cpp -lboost_thread-mt
for i in seq 1 100; do ./a.out; done | ./stats


With a touch of reformatting and sorting the columns by increasing mean, here are the results on a circa-2007 T61 stinkpad of mine:

bounded_buffer<int>                         0.45 -- 0.98  (0.7588 +/- 0.17 ) s
bounded_buffer<std::string>                 0.58 -- 0.91  (0.7678 +/- 0.095) s
bounded_buffer_deque_based<std::string>     0.43 -- 0.94  (0.8291 +/- 0.095) s
bounded_buffer_deque_based<int>             0.48 -- 1.03  (0.8374 +/- 0.12 ) s
bounded_buffer_space_optimized<int>         0.55 -- 1.10  (0.8459 +/- 0.1  ) s
bounded_buffer_space_optimized<std::string> 0.46 -- 1.21  (0.8852 +/- 0.24 ) s
bounded_buffer_list_based<int>              2.84 -- 6.65  (4.2641 +/- 0.74 ) s
bounded_buffer_list_based<std::string>      2.22 -- 5.77  (4.2663 +/- 0.66 ) s

Building -O3 makes a big difference as you might expected and I've not bothered showing -O2. On average, the sample bounded_buffer<T> does slightly outperform the deque-based logic as the Boost Templated Circular Buffer documentation suggests. The variability is, uhhh, more variable than I'd expected.

19 March 2013

Mergeable accumulation of the running min, mean, max, and variance

Boost Accumulators are pretty slick, but they do have the occasional shortcoming. To my knowledge, one cannot merge data from independent accumulator sets. Merging is a nice operation to have when you want to compute statistics in parallel and then roll up the results into a single summary.

In his article Accurately computing running variance, John D. Cook presents a small class for computing a running mean and variance using an algorithm with favorable numerical properties reported by Knuth in TAOCP. Departing from Cook's sample, my rewrite below includes

• templating on the floating point type,
• permitting accumulating multiple statistics simultaneously without requiring multiple counters,
• additionally tracking the minimum and maximum while holding storage overhead constant relative to Cook's version,
• providing sane (i.e. NaN) behavior when no data has been processed,
• permitting merging information from multiple instances,
• permitting clearing an instance for re-use,
• assertions to catch when one screws up,

Please let me know if you find this useful. If anyone is interested, I will also post the associated regression tests.

12 March 2013

bash parameter substitution quick reference

I can never remember the syntax for all the various substitution possibilities in bash. Though many places detail them all (e.g. man bash, Advanced Bash Scripting Guide), the syntax is threaded through paragraphs of text. Here is my distilled quick reference.

Many thanks to snipt for housing this and the many other miscellaneous code snippets I can never keep in my head but constantly need.