25 January 2011

Converting to and from human-readable byte counts

Converting a number of bytes (say 1024) into a human-readable byte count (say "1K") seems to be a common problem (just search or ask on Stack Overflow). I also needed to go the other direction and turn things like "1.5 MB" into 1,572,864. Here's what I cooked up in C99:

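#include <ctype.h>   // isspace, toupper
#include <errno.h>   // errno
#include <math.h>    // log, pow
#include <stdlib.h>  // strtod

// Adapted from http://stackoverflow.com/questions/3758606/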
// how-to-convert-byte-size-into-human-readable-format-in-java
void to_human_readable_byte_count(long bytes,
                                  int si,
                                  double *coeff,
                                  const char **units)
{
    // Static lookup table of byte-based units:
    // column 0 holds SI (powers of 1000), column 1 holds IEC (powers of 1024)
    static const char *suffix[][2] = { { "B",  "B"   },
                                       { "kB", "KiB" },
                                       { "MB", "MiB" },
                                       { "GB", "GiB" },
                                       { "TB", "TiB" },
                                       { "PB", "PiB" },
                                       { "EB", "EiB" },
                                       { "ZB", "ZiB" },
                                       { "YB", "YiB" } };
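    const int unit    = si ? 1000 : 1024;
    const int max_exp = (int) (sizeof(suffix) / sizeof(suffix[0])) - 1;
    int exp = 0;
    if (bytes > 0) {
        // Clamp so very large counts reuse the largest known suffix
        exp = (int) (log(bytes) / log(unit));
        if (exp > max_exp) exp = max_exp;
    }
    *coeff = bytes / pow(unit, exp);
    *units = suffix[exp][!si];  // SI names in column 0, IEC names in column 1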
}

// Convert strings like the following into byte counts
//    5MB, 5 MB, 5M, 3.7GB, 123b, 456kB
// with some amount of forgiveness baked into the parsing.
long from_human_readable_byte_count(const char *str)
{
    // Parse leading numeric factor, rejecting input that has no number
    char *endptr;
    errno = 0;
    const double coeff = strtod(str, &endptr);
    if (errno || endptr == str) return -1;

    // Skip any intermediate white space
    while (isspace((unsigned char) *endptr)) ++endptr;

    // Read off first character which should be an SI prefix
    int exp  = 0;
    int unit = 1024;
    switch (toupper((unsigned char) *endptr)) {
        case 'B':  exp =  0; break;
        case 'K':  exp =  3; break;
        case 'M':  exp =  6; break;
        case 'G':  exp =  9; break;
        case 'T':  exp = 12; break;
        case 'P':  exp = 15; break;
        case 'E':  exp = 18; break;
        case 'Z':  exp = 21; break;
        case 'Y':  exp = 24; break;

        case ' ':
        case '\t':
        case '\0': exp =  0; goto done;

        default:   return -1;
    }
    ++endptr;

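    // An optional 'i' or 'I' marks an IEC binary prefix (KiB, MiB, ...).
    // Plain prefixes are already read as powers of 1024, so the factor
    // stays the same either way.
    if (toupper((unsigned char) *endptr) == 'I') {
        ++endptr;
        unit = 1024;
    }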

    // Next character must be one of B/empty/whitespace
    switch (toupper((unsigned char) *endptr)) {
        case 'B':
        case ' ':
        case '\t': ++endptr;  break;

        case '\0': goto done;

        default:   return -1;
    }

    // Skip any remaining white space
    while (isspace((unsigned char) *endptr)) ++endptr;

    // Parse error on anything but a null terminator
    if (*endptr) return -1;

done:
    return exp ? coeff * pow(unit, exp / 3) : coeff;
}
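
One thing to watch in to_human_readable_byte_count is the log(bytes) / log(unit) step: floating-point rounding may leave the ratio a hair below a whole number for some exact powers of the unit, which would pick a suffix one step too small. Here is a rough sketch of the same lookup using repeated division instead. The name scale_byte_count is made up for illustration, and the table stops at EB/EiB on the assumption that a long is at most 64 bits.

```c
// Hypothetical variant of to_human_readable_byte_count that finds the
// exponent by repeated division rather than log(bytes) / log(unit).
void scale_byte_count(long bytes, int si, double *coeff, const char **units)
{
    // Column 0 holds SI names, column 1 holds IEC names
    static const char *suffix[][2] = { { "B",  "B"   },
                                       { "kB", "KiB" },
                                       { "MB", "MiB" },
                                       { "GB", "GiB" },
                                       { "TB", "TiB" },
                                       { "PB", "PiB" },
                                       { "EB", "EiB" } };
    const int    max_exp = (int) (sizeof(suffix) / sizeof(suffix[0])) - 1;
    const double unit    = si ? 1000.0 : 1024.0;

    // Peel off one factor of unit at a time until the value fits
    double value = (double) bytes;
    int    exp   = 0;
    while (value >= unit && exp < max_exp) {
        value /= unit;
        ++exp;
    }

    *coeff = value;
    *units = suffix[exp][!si];
}
```

Calling scale_byte_count(1536, 0, &c, &u) should leave c at 1.5 and u pointing at "KiB". Zero and negative counts skip the loop and come back unchanged with a plain "B" suffix.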

15 January 2011

Using Argp with MPI-based applications

Argp is a great, great parser for command line options. When using it in MPI-based applications, there's a catch in that you want only one MPI rank to print --help information, usage warnings, error messages, etc. Otherwise, you get a whole mess of repeated, jumbled output as each MPI rank squawks about the same problem.

Here's a wrapper for Argp for use in MPI-based applications that solves just this nuisance:

#include <stdio.h>
#include <unistd.h>
#include "argp.h"

/**
 * Call <a href="http://www.gnu.org/s/libc/manual/html_node/Argp.html"
 * >Argp</a>'s \c argp_parse in an MPI-friendly way.  Processes
 * with nonzero rank will have their \c stdout and \c stderr redirected
 * to <tt>/dev/null</tt> during \c argp_parse.
 *
 * @param rank MPI rank of this process.  Output from \c argp_parse
 *             will only be observable from rank zero.
 * @param argp      Per \c argp_parse semantics.
 * @param argc      Per \c argp_parse semantics.
 * @param argv      Per \c argp_parse semantics.
 * @param flags     Per \c argp_parse semantics.
 * @param arg_index Per \c argp_parse semantics.
 * @param input     Per \c argp_parse semantics.
 *
 * @return Per \c argp_parse semantics.
 */
error_t mpi_argp_parse(const int rank,
                       const struct argp *argp,
                       int argc,
                       char **argv,
                       unsigned flags,
                       int *arg_index,
                       void *input);

error_t mpi_argp_parse(const int rank,
                       const struct argp *argp,
                       int argc,
                       char **argv,
                       unsigned flags,
                       int *arg_index,
                       void *input)
{
    // Flush stdout, stderr
    if (fflush(stdout))
        perror("mpi_argp_parse error flushing stdout prior to redirect");
    if (fflush(stderr))
        perror("mpi_argp_parse error flushing stderr prior to redirect");

    // Save stdout, stderr so we may restore them later
    int stdout_copy, stderr_copy;
    if ((stdout_copy = dup(fileno(stdout))) < 0)
        perror("mpi_argp_parse error duplicating stdout");
    if ((stderr_copy = dup(fileno(stderr))) < 0)
        perror("mpi_argp_parse error duplicating stderr");

    // On non-root processes redirect stdout, stderr to /dev/null
    if (rank) {
        if (!freopen("/dev/null", "a", stdout))
            perror("mpi_argp_parse error redirecting stdout");
        if (!freopen("/dev/null", "a", stderr))
            perror("mpi_argp_parse error redirecting stderr");
    }

    // Invoke argp per http://www.gnu.org/s/libc/manual/html_node/Argp.html
    error_t retval = argp_parse(argp, argc, argv, flags, arg_index, input);

    // Flush stdout, stderr again
    if (fflush(stdout))
        perror("mpi_argp_parse error flushing stdout after redirect");
    if (fflush(stderr))
        perror("mpi_argp_parse error flushing stderr after redirect");

    // Restore stdout, stderr
    if (dup2(stdout_copy, fileno(stdout)) < 0)
        perror("mpi_argp_parse error reopening stdout");
    if (dup2(stderr_copy, fileno(stderr)) < 0)
        perror("mpi_argp_parse error reopening stderr");

    // Close saved versions of stdout, stderr
    if (close(stdout_copy))
        perror("mpi_argp_parse error closing stdout_copy");
    if (close(stderr_copy))
        perror("mpi_argp_parse error closing stderr_copy");

    // Clear any errors that may have occurred on stdout, stderr
    clearerr(stdout);
    clearerr(stderr);

    // Return what argp_parse returned
    return retval;
}
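
The save, redirect, and restore steps inside mpi_argp_parse are useful on their own, so here is a minimal sketch of just that pattern for stdout. It is a variation, not the code above: it uses open and dup2 on the descriptor instead of freopen, and the helper names silence_stdout and restore_stdout are made up for illustration. It assumes a POSIX system where /dev/null exists.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: point stdout at /dev/null.
// Returns a descriptor for the original stdout, or -1 on failure.
int silence_stdout(void)
{
    fflush(stdout);                      // drain anything already buffered
    int saved = dup(STDOUT_FILENO);      // keep the real stdout alive
    if (saved < 0) return -1;

    int devnull = open("/dev/null", O_WRONLY);
    if (devnull < 0 || dup2(devnull, STDOUT_FILENO) < 0) {
        if (devnull >= 0) close(devnull);
        close(saved);
        return -1;
    }
    close(devnull);                      // descriptor 1 now refers to /dev/null
    return saved;
}

// Hypothetical helper: undo silence_stdout() given its return value.
// Returns 0 on success, -1 on failure.
int restore_stdout(int saved)
{
    fflush(stdout);                      // buffered text is discarded
    int rc = dup2(saved, STDOUT_FILENO) < 0 ? -1 : 0;
    close(saved);
    return rc;
}
```

A nonzero rank would call silence_stdout() before parsing and restore_stdout() afterwards; the same two helpers could be duplicated for stderr. Flushing before each switch matters, since text still sitting in the stdio buffer would otherwise end up on the wrong side of the redirect.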

04 January 2011

C header-only unit testing with FCTX

Just a quick hat tip to FCTX, a library I've found invaluable these past few months. FCTX provides header-only unit testing for C. Sure, if you're in C++ land there's a ton of xUnit-like frameworks available (with Boost.Test being my favorite), but for vanilla C projects FCTX wins hands down.

As an example, here's something I put together for a Stack Overflow response: The logic isn't rocket science, of course. But testing it in C without resorting to external libraries and complicated makefiles shouldn't be rocket science either. Provided that fct.h is in the same directory, this source will compile and run.
