26 June 2011

Using Boost Spirit 2.1+ to evaluate constant arithmetic expressions

I used Boost Program_options for my research code's input parsing requirements. As expected, it works well. For floating point quantities, I've found having to specify 4.18879020478639 for 4π/3 is a bit cumbersome. When specifying arguments on the command line, I could use any one of a number of hacks to perform floating point arithmetic in bash. But that didn't help me when taking input from response files. I wanted an arithmetic expression utility that could process a C-like grammar. Long double accuracy was a requirement. I looked around a bit. GNU libmatheval and muParser appeared to be good fits but looked to be double precision-only. Though straightforward, I decided against combining the Shunting-yard algorithm with some in-place RPN logic because I wasn't enthusiastic about either finding or writing an appropriate (and admittedly toy simple) lexer.

Already having a Boost pre-requisite in my code, I decided to try out Boost Spirit since it could provide a header-only, precision-agnostic solution. Spirit has gone through a lot of significant revisions and many articles seemed to be written for the older Spirit Classic. The latest official reference manual gave several excellent examples that almost fit. I adapted Spirit's calc3.cpp and roman.cpp sample grammars to accomplish what I wanted. I added ** as an operator for exponentiation and pi as a built-in constant. I also added a smattering of special functions (e.g. exp). The task required about 150 lines of Spirit-based code. Further extensions should be fairly straightforward.
My sample code does the following:

  1. Declares X-macros for the unary and binary functions built into the grammar.
  2. Declares Boost Phoenix-ready functors for the unary and binary functions.
  3. Declares expression::grammar consisting of the rules expression, term, factor, and primary.
  4. Declares a helper called parse to hide needing boost::spirit::ascii::space when performing whitespace-agnostic parsing.
  5. Declares a small test macro called EXPRTEST to reduce test case boilerplate.
  6. Runs a collection of tests to ensure the parser is behaving as expected.
Compared to the samples on which I built, the biggest changes to the grammar were adding primary and reworking factor. I could have shortened up some details with a few using boost::spirit::qi; declarations, but I wanted to keep track of exactly which parts of Spirit I used. Feedback and suggestions are most welcome.

Updated 9 Jan 2012: During a second pass motivated by reducing compilation times, I removed the X-macro nastiness and replaced it with some qi::symbols-based logic. The code in the anonymous namespace could likely be removed if I could figure out the right boost::phoenix::bind hoodoo to use to replace lazy_ufunc_ and lazy_bfunc_ (any suggestions?). I have not yet measured the difference in either memory footprint or runtime performance. Compilation time and memory is greatly reduced for both GNU and Intel compilers.

Updated 31 Mar 2013: I have moved the code into a gist to facilitate both maintenance and gathering feedback. I have added the constant e and slightly reordered the primary grammar per the comments.

Updated 17-19 Sept 2013: I have added an MPL2 header to the gist. In the comments is a suggestion for how to use standalone MPL2 logic building atop this grammar from my project Suzerain to incorporate such parsing into your own, possibly non-MPL2, source tree.

//--------------------------------------------------------------------------
//
// Copyright (C) 2011, 2012, 2013 Rhys Ulerich
// Copyright (C) 2012, 2013 The PECOS Development Team
// Please see http://pecos.ices.utexas.edu for more information on PECOS.
//
// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at http://mozilla.org/MPL/2.0/.
//
//--------------------------------------------------------------------------
#include <cmath>
#include <limits>
#include <iterator>
#include <boost/spirit/version.hpp>
#if !defined(SPIRIT_VERSION) || SPIRIT_VERSION < 0x2010
#error "At least Spirit version 2.1 required"
#endif
#include <boost/math/constants/constants.hpp>
#include <boost/spirit/include/phoenix.hpp>
#include <boost/spirit/include/qi.hpp>
namespace expression {
namespace { // anonymous
struct lazy_pow_
{
template <typename X, typename Y>
struct result { typedef X type; };
template <typename X, typename Y>
X operator()(X x, Y y) const
{
return std::pow(x, y);
}
};
struct lazy_ufunc_
{
template <typename F, typename A1>
struct result { typedef A1 type; };
template <typename F, typename A1>
A1 operator()(F f, A1 a1) const
{
return f(a1);
}
};
struct lazy_bfunc_
{
template <typename F, typename A1, typename A2>
struct result { typedef A1 type; };
template <typename F, typename A1, typename A2>
A1 operator()(F f, A1 a1, A2 a2) const
{
return f(a1, a2);
}
};
template <class T>
T max_by_value ( const T a, const T b ) {
return std::max(a, b);
}
template <class T>
T min_by_value ( const T a, const T b ) {
return std::min(a, b);
}
} // end namespace anonymous
template <typename FPT, typename Iterator>
struct grammar
: boost::spirit::qi::grammar<
Iterator, FPT(), boost::spirit::ascii::space_type
>
{
// symbol table for constants like "pi"
struct constant_
: boost::spirit::qi::symbols<
typename std::iterator_traits<Iterator>::value_type,
FPT
>
{
constant_()
{
this->add
("digits", std::numeric_limits<FPT>::digits )
("digits10", std::numeric_limits<FPT>::digits10 )
("e" , boost::math::constants::e<FPT>() )
("epsilon", std::numeric_limits<FPT>::epsilon() )
("pi", boost::math::constants::pi<FPT>() )
;
}
} constant;
// symbol table for unary functions like "abs"
struct ufunc_
: boost::spirit::qi::symbols<
typename std::iterator_traits<Iterator>::value_type,
FPT (*)(FPT)
>
{
ufunc_()
{
this->add
("abs" , static_cast<FPT (*)(FPT)>(&std::abs ))
("acos" , static_cast<FPT (*)(FPT)>(&std::acos ))
("asin" , static_cast<FPT (*)(FPT)>(&std::asin ))
("atan" , static_cast<FPT (*)(FPT)>(&std::atan ))
("ceil" , static_cast<FPT (*)(FPT)>(&std::ceil ))
("cos" , static_cast<FPT (*)(FPT)>(&std::cos ))
("cosh" , static_cast<FPT (*)(FPT)>(&std::cosh ))
("exp" , static_cast<FPT (*)(FPT)>(&std::exp ))
("floor", static_cast<FPT (*)(FPT)>(&std::floor))
("log10", static_cast<FPT (*)(FPT)>(&std::log10))
("log" , static_cast<FPT (*)(FPT)>(&std::log ))
("sin" , static_cast<FPT (*)(FPT)>(&std::sin ))
("sinh" , static_cast<FPT (*)(FPT)>(&std::sinh ))
("sqrt" , static_cast<FPT (*)(FPT)>(&std::sqrt ))
("tan" , static_cast<FPT (*)(FPT)>(&std::tan ))
("tanh" , static_cast<FPT (*)(FPT)>(&std::tanh ))
;
}
} ufunc;
// symbol table for binary functions like "pow"
struct bfunc_
: boost::spirit::qi::symbols<
typename std::iterator_traits<Iterator>::value_type,
FPT (*)(FPT, FPT)
>
{
bfunc_()
{
this->add
("atan2", static_cast<FPT (*)(FPT, FPT)>(&std::atan2 ))
("max" , static_cast<FPT (*)(FPT, FPT)>(&max_by_value))
("min" , static_cast<FPT (*)(FPT, FPT)>(&min_by_value))
("pow" , static_cast<FPT (*)(FPT, FPT)>(&std::pow ))
;
}
} bfunc;
boost::spirit::qi::rule<
Iterator, FPT(), boost::spirit::ascii::space_type
> expression, term, factor, primary;
grammar() : grammar::base_type(expression)
{
using boost::spirit::qi::real_parser;
using boost::spirit::qi::real_policies;
real_parser<FPT,real_policies<FPT> > real;
using boost::spirit::qi::_1;
using boost::spirit::qi::_2;
using boost::spirit::qi::_3;
using boost::spirit::qi::no_case;
using boost::spirit::qi::_val;
boost::phoenix::function<lazy_pow_> lazy_pow;
boost::phoenix::function<lazy_ufunc_> lazy_ufunc;
boost::phoenix::function<lazy_bfunc_> lazy_bfunc;
expression =
term [_val = _1]
>> *( ('+' >> term [_val += _1])
| ('-' >> term [_val -= _1])
)
;
term =
factor [_val = _1]
>> *( ('*' >> factor [_val *= _1])
| ('/' >> factor [_val /= _1])
)
;
factor =
primary [_val = _1]
>> *( ("**" >> factor [_val = lazy_pow(_val, _1)])
)
;
primary =
real [_val = _1]
| '(' >> expression [_val = _1] >> ')'
| ('-' >> primary [_val = -_1])
| ('+' >> primary [_val = _1])
| (no_case[ufunc] >> '(' >> expression >> ')')
[_val = lazy_ufunc(_1, _2)]
| (no_case[bfunc] >> '(' >> expression >> ','
>> expression >> ')')
[_val = lazy_bfunc(_1, _2, _3)]
| no_case[constant] [_val = _1]
;
}
};
template <typename FPT, typename Iterator>
bool parse(Iterator &iter,
Iterator end,
const grammar<FPT,Iterator> &g,
FPT &result)
{
return boost::spirit::qi::phrase_parse(
iter, end, g, boost::spirit::ascii::space, result);
}
} // end namespace expression
/////////////
// Test cases
/////////////
#include <string>
#include <boost/math/special_functions/fpclassify.hpp>
#define BOOST_TEST_MODULE expression
#include <boost/test/included/unit_test.hpp>
const expression::grammar<double,std::string::const_iterator> eg;
const double close_enough = std::numeric_limits<double>::epsilon();
#define EXPRTEST(casename, expr, expected) \
BOOST_AUTO_TEST_CASE( casename ) \
{ \
const std::string s = expr; \
std::string::const_iterator iter = s.begin(); \
std::string::const_iterator end = s.end(); \
double result; \
BOOST_REQUIRE(parse(iter, end, eg, result)); \
BOOST_CHECK(iter == end); \
BOOST_CHECK_CLOSE(result, (expected), close_enough); \
}
EXPRTEST(literal1, "1.234", 1.234)
EXPRTEST(literal2, "4.2e2", 420)
EXPRTEST(literal3, "5e-01", 0.5)
EXPRTEST(literal4, "-3", -3)
EXPRTEST(literal5, "pi", boost::math::constants::pi<double>())
EXPRTEST(literal6, "epsilon", std::numeric_limits<double>::epsilon())
EXPRTEST(literal7, "digits", std::numeric_limits<double>::digits)
EXPRTEST(literal8, "digits10", std::numeric_limits<double>::digits10)
EXPRTEST(literal9, "e", boost::math::constants::e<double>())
EXPRTEST(basicop1, " 2 +\t3\n", 5) // Whitespace ignored
EXPRTEST(basicop2, " 2 -\t3\n", -1)
EXPRTEST(basicop3, " 2 *\t3\n", 6)
EXPRTEST(basicop4, " 2 /\t3\n", 2./3.) // Double division
EXPRTEST(basicop5, " 2 ** 3\n", 8)
EXPRTEST(pemdas1, "2*3+4*5", 2*3+4*5)
EXPRTEST(pemdas2, "2*(3+4)*5", 2*(3+4)*5)
EXPRTEST(pemdas3, "2**3+4", std::pow(2,3)+4)
EXPRTEST(pemdas4, "2**2**-3", std::pow(2.,std::pow(2.,-3.)))
EXPRTEST(unary1, "-(2)", -2)
EXPRTEST(unary2, "-(-2)", 2)
EXPRTEST(unary3, "+(-2)", -2)
EXPRTEST(unary4, "+(+2)", 2)
EXPRTEST(func_abs, "abs (-1.0)", std::abs(-1.0))
EXPRTEST(func_acos, "acos( 1.0)", std::acos(1.0))
EXPRTEST(func_asin, "asin( 1.0)", std::asin(1.0))
EXPRTEST(func_atan, "atan( 1.0)", std::atan(1.0))
EXPRTEST(func_ceil, "ceil( 0.5)", std::ceil(0.5))
EXPRTEST(func_cos, "cos ( 1.0)", std::cos(1.0))
EXPRTEST(func_cosh, "cosh( 1.0)", std::cosh(1.0))
EXPRTEST(func_exp, "exp ( 1.0)", std::exp(1.0))
EXPRTEST(func_floor, "floor(0.5)", std::floor(0.5))
EXPRTEST(func_log, "log ( 1.0)", std::log(1.0))
EXPRTEST(func_log10, "log10(0.5)", std::log10(0.5))
EXPRTEST(func_sin, "sin ( 1.0)", std::sin(1.0))
EXPRTEST(func_sinh, "sinh( 1.0)", std::sinh(1.0))
EXPRTEST(func_sqrt, "sqrt( 1.0)", std::sqrt(1.0))
EXPRTEST(func_tan, "tan ( 1.0)", std::tan(1.0))
EXPRTEST(func_tanh, "tanh( 1.0)", std::tanh(1.0))
EXPRTEST(func_pow, "pow (2.0, 3.0)", std::pow(2.0,3.0))
EXPRTEST(func_max, "max (2.0, 3.0)", std::max(2.0,3.0))
EXPRTEST(func_min, "min (2.0, 3.0)", std::min(2.0,3.0))
EXPRTEST(func_atan2, "atan2(2.0, 3.0)", std::atan2(2.0,3.0))
BOOST_AUTO_TEST_CASE( parse_inf )
{
const std::string s = "inf";
std::string::const_iterator iter = s.begin();
std::string::const_iterator end = s.end();
double result;
BOOST_REQUIRE(parse(iter, end, eg, result));
BOOST_CHECK(iter == end);
BOOST_CHECK((boost::math::isinf)(result));
}
BOOST_AUTO_TEST_CASE( parse_infinity )
{
const std::string s = "infinity";
std::string::const_iterator iter = s.begin();
std::string::const_iterator end = s.end();
double result;
BOOST_REQUIRE(parse(iter, end, eg, result));
BOOST_CHECK(iter == end);
BOOST_CHECK((boost::math::isinf)(result));
}
BOOST_AUTO_TEST_CASE( parse_nan )
{
const std::string s = "nan";
std::string::const_iterator iter = s.begin();
std::string::const_iterator end = s.end();
double result;
BOOST_REQUIRE(parse(iter, end, eg, result));
BOOST_CHECK(iter == end);
BOOST_CHECK((boost::math::isnan)(result));
}
view raw gistfile1.cpp hosted with ❤ by GitHub

Subscribe Subscribe to The Return of Agent Zlerich