Commit 7058a408 authored by Francis Dupont

[master] Merged trac4088 (client classification expression parser)

parents d719ccdf ac00ace4
......@@ -1212,6 +1212,49 @@ AC_SUBST(PERL)
AC_PATH_PROGS(AWK, gawk awk)
AC_SUBST(AWK)
AC_ARG_ENABLE(generate_parser, [AC_HELP_STRING([--enable-generate-parser],
[indicates that the parsers will be regenerated. This implies that
bison and flex are required [default=no]])],
enable_generate_parser=$enableval, enable_generate_parser=no)
# Check if flex is available. Flex is not needed for building Kea sources,
# unless you want to regenerate grammar in src/lib/eval
AC_PROG_LEX
# Check if bison is available. Bison is not needed for building Kea sources,
# unless you want to regenerate grammar in src/lib/eval
AC_PROG_YACC
if test "x$enable_generate_parser" != xno; then
if test "x$LEX" = "x"; then
AC_MSG_ERROR("Flex is required for enable-generate-parser, but was not found")
fi
if test "x$YACC" = "x"; then
AC_MSG_ERROR("Bison is required for enable-generate-parser, but was not found")
fi
# Ok, let's check if we have at least version 3.0.0 of bison. The code used
# to generate the src/lib/eval parser is roughly based on bison 3.0 examples.
cat > bisontest.y << EOF
%require "3.0.0"
%token X
%%
%start Y;
Y: X;
EOF
# Try to compile.
$YACC bisontest.y -o bisontest.cc
if test $? -ne 0 ; then
$YACC -V
$RM -f bisontest.y bisontest.cc
AC_MSG_ERROR("Error with $YACC. Possibly incorrect version? At least 3.0.0 is required.")
fi
$RM -f bisontest.y bisontest.cc
fi
AC_ARG_ENABLE(generate_docs, [AC_HELP_STRING([--enable-generate-docs],
[regenerate documentation using Docbook [default=no]])],
enable_generate_docs=$enableval, enable_generate_docs=no)
......@@ -1527,6 +1570,10 @@ Log4cplus:
Kea config backend:
CONFIG_BACKEND: ${CONFIG_BACKEND}
Flex/bison:
FLEX: ${LEX}
BISON: ${YACC}
END
# Avoid confusion on DNS/DHCP and only mention MySQL if it
......@@ -1587,6 +1634,7 @@ Developer:
C++ Code Coverage: $USE_LCOV
Logger checks: $enable_logger_checks
Generate Documentation: $enable_generate_docs
Parser Generation: $enable_generate_parser
END
......
......@@ -15,6 +15,10 @@ libkea_eval_la_SOURCES =
libkea_eval_la_SOURCES += eval_log.cc eval_log.h
libkea_eval_la_SOURCES += token.cc token.h
libkea_eval_la_SOURCES += parser.cc parser.h
libkea_eval_la_SOURCES += lexer.cc
libkea_eval_la_SOURCES += eval_context.cc eval_context.h eval_context_decl.h
nodist_libkea_eval_la_SOURCES = eval_messages.h eval_messages.cc
libkea_eval_la_CXXFLAGS = $(AM_CXXFLAGS)
......@@ -48,3 +52,31 @@ s-messages: eval_messages.mes
BUILT_SOURCES = eval_messages.h eval_messages.cc
CLEANFILES = eval_messages.h eval_messages.cc s-messages
# If we want to get rid of all flex/bison generated files, we need to use
# make maintainer-clean. The proper way to introduce custom commands for
# that operation is to define maintainer-clean-local target. However,
# make maintainer-clean also removes Makefile, so running configure script
# is required. To make it easy to rebuild flex/bison without going through
# reconfigure, a new target parser-clean has been added.
maintainer-clean-local:
rm -f location.hh lexer.cc parser.cc parser.h position.hh stack.hh
# To regenerate flex/bison files, one can do:
#
# make parser-clean
# make parser
#
# This is needed only when the lexer.ll or parser.yy files are modified.
# Make sure you have both flex and bison installed.
parser-clean: maintainer-clean-local
parser: lexer.cc location.hh position.hh stack.hh parser.cc parser.h
@echo "Flex/bison files regenerated"
# --- Flex/Bison stuff below --------------------------------------------------
location.hh position.hh stack.hh parser.cc parser.h: parser.yy
$(YACC) --defines=parser.h -o parser.cc parser.yy
lexer.cc: lexer.ll
$(LEX) -o lexer.cc lexer.ll
......@@ -13,8 +13,118 @@
// PERFORMANCE OF THIS SOFTWARE.
/**
@page dhcpEval libeval - Expression evaluation and client classification
@section dhcpEvalIntroduction Introduction
The core of the libeval library is a parser that is able to parse an
expression (e.g. option[123] == 'APC'). This is currently used for client
classification, but may also be used for other applications in the future.
The external interface to the library is the @ref isc::eval::EvalContext
class. Once instantiated, its main method is
@ref isc::eval::EvalContext::parseString, which parses the specified
string. Once the expression is parsed, it is converted to a collection of
tokens that are stored in Reverse Polish Notation in
EvalContext::expression.
Internally, the parser code is generated by flex and bison. These two
tools convert lexer.ll and parser.yy files into a number of .cc and .hh files.
To avoid a build of Kea depending on the presence of flex and bison, the
result of the generation is checked into the github repository and is
distributed in the tarballs.
@section dhcpEvalLexer Lexer generation using flex
Flex is used to generate the lexer, a piece of code that converts input
data into a series of tokens. It contains a small number of directives,
but the majority of the code consists of the definitions of tokens. These
definitions are regular expressions that define various tokens, e.g. strings,
numbers, parentheses, etc. Once the expression is matched, the associated
action is executed. In the majority of the cases a generator method from
@ref isc::eval::EvalParser is called, which returns a newly created
bison token. The purpose of the lexer is to generate a stream
of tokens that are consumed by the parser.
lexer.cc and lexer.hh must not be edited. If there is a need
to introduce changes, lexer.ll must be updated and the .cc and .hh files
regenerated.
@section dhcpEvalParser Parser generation using bison
Bison is used to generate the parser, a piece of code that consumes a
stream of tokens and attempts to match it against a defined grammar.
The bison parser is created from parser.yy. It contains
a number of directives, but the two most important sections are:
a list of tokens (for each token defined here, bison will generate the
make_NAMEOFTOKEN method in the @ref isc::eval::EvalParser class) and
the grammar. The grammar is a tree-like structure with possible loops.
Here is an over-simplified version of the grammar:
@code
01. %start expression;
02.
03. expression : token EQUAL token
04. | token
05. ;
06.
07. token : STRING
08. {
09. TokenPtr str(new TokenString($1));
10. ctx.expression.push_back(str);
11. }
12. | HEXSTRING
13. {
14. TokenPtr hex(new TokenHexString($1));
15. ctx.expression.push_back(hex);
16. }
17. | OPTION '[' INTEGER ']'
18. {
19. TokenPtr opt(new TokenOption($3));
20. ctx.expression.push_back(opt);
21. }
22. ;
@endcode
This code determines that the grammar starts from expression (line 1).
The actual definition of expression (lines 3-5) may either be a
single token or an expression "token == token" (EQUAL has been defined as
"==" elsewhere). Token is further
defined in lines 7-22: it may either be a string (lines 7-11),
a hex string (lines 12-16) or option (lines 17-21).
When the actual case is determined, the respective C++ action
is executed. For example, if the token is a string, the TokenString class is
instantiated with the appropriate value and put onto the expression vector.
@section dhcpEvalMakefile Generating parser files
In the general case, we want to avoid generating parser files, so an
average user interested in just compiling Kea would not need flex or
bison. Therefore the generated files are already included in the
git repository and will be included in the tarball releases.
However, there will be cases when one of the developers would want
to tweak the lexer.ll and parser.yy files and then regenerate
the code. For this purpose, two makefile targets are defined:
@code
make parser
@endcode
will generate the parsers and
@code
make parser-clean
@endcode
will remove the files. Generated files removal was also hooked
into the maintainer-clean target.
@section dhcpEvalConfigure Configure options
Since the flex/bison tools are not necessary for a regular compilation,
checks for them are conducted during configure, but the lack of flex or
bison does not stop the configure process. There is a flag
(--enable-generate-parser) that tells the configure script that the
parser will be generated. With this flag, the checks for flex/bison
are mandatory. If either tool is missing, or bison is too old (at least
3.0.0 is required), the configure process will terminate with an error.
*/
// Copyright (C) 2015 Internet Systems Consortium, Inc. ("ISC")
//
// Permission to use, copy, modify, and/or distribute this software for any
// purpose with or without fee is hereby granted, provided that the above
// copyright notice and this permission notice appear in all copies.
//
// THE SOFTWARE IS PROVIDED "AS IS" AND ISC DISCLAIMS ALL WARRANTIES WITH
// REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
// AND FITNESS. IN NO EVENT SHALL ISC BE LIABLE FOR ANY SPECIAL, DIRECT,
// INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
// LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
// OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
// PERFORMANCE OF THIS SOFTWARE.
#include <eval/eval_context.h>
#include <eval/parser.h>
#include <exceptions/exceptions.h>
#include <fstream>
EvalContext::EvalContext()
: trace_scanning_(false), trace_parsing_(false)
{
}
EvalContext::~EvalContext()
{
}
bool
EvalContext::parseString(const std::string& str)
{
file_ = "<string>";
string_ = str;
scanStringBegin();
isc::eval::EvalParser parser(*this);
parser.set_debug_level(trace_parsing_);
int res = parser.parse();
scanStringEnd();
return (res == 0);
}
void
EvalContext::error(const isc::eval::location& loc, const std::string& what)
{
isc_throw(EvalParseError, loc << ": " << what);
}
void
EvalContext::error (const std::string& what)
{
isc_throw(EvalParseError, what);
}
// Copyright (C) 2015 Internet Systems Consortium, Inc. ("ISC")
//
// Permission to use, copy, modify, and/or distribute this software for any
// purpose with or without fee is hereby granted, provided that the above
// copyright notice and this permission notice appear in all copies.
//
// THE SOFTWARE IS PROVIDED "AS IS" AND ISC DISCLAIMS ALL WARRANTIES WITH
// REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
// AND FITNESS. IN NO EVENT SHALL ISC BE LIABLE FOR ANY SPECIAL, DIRECT,
// INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
// LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
// OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
// PERFORMANCE OF THIS SOFTWARE.
#ifndef EVAL_CONTEXT_H
#define EVAL_CONTEXT_H
#include <string>
#include <map>
#include <eval/parser.h>
#include <eval/eval_context_decl.h>
#include <exceptions/exceptions.h>
// Tell Flex the lexer's prototype ...
#define YY_DECL isc::eval::EvalParser::symbol_type yylex (EvalContext& driver)
// ... and declare it for the parser's sake.
YY_DECL;
namespace isc {
namespace eval {
/// @brief Evaluation error exception raised when trying to parse an expression.
class EvalParseError : public isc::Exception {
public:
EvalParseError(const char* file, size_t line, const char* what) :
isc::Exception(file, line, what) { };
};
/// @brief Evaluation context, an interface to the expression evaluation.
class EvalContext
{
public:
/// @brief Default constructor.
EvalContext();
/// @brief destructor
virtual ~EvalContext();
/// @brief Parsed expression (output tokens are stored here)
isc::dhcp::Expression expression;
/// @brief Method called before scanning starts on a string.
void scanStringBegin();
/// @brief Method called after the last tokens are scanned from a string.
void scanStringEnd();
/// @brief Run the parser on the string specified.
///
/// @param str string to be parsed
/// @return true on success.
bool parseString(const std::string& str);
/// @brief The name of the file being parsed.
/// Used later to pass the file name to the location tracker.
std::string file_;
/// @brief The string being parsed.
std::string string_;
/// @brief Error handler
///
/// @param loc location within the parsed file where the problem was experienced.
/// @param what string explaining the nature of the error.
void error(const isc::eval::location& loc, const std::string& what);
/// @brief Error handler
///
/// This is a simplified error reporting tool for possible future
/// cases when the EvalParser is not able to handle the packet.
void error(const std::string& what);
private:
/// @brief Flag determining scanner debugging.
bool trace_scanning_;
/// @brief Flag determining parser debugging.
bool trace_parsing_;
};
}; // end of isc::eval namespace
}; // end of isc namespace
#endif
// Copyright (C) 2015 Internet Systems Consortium, Inc. ("ISC")
//
// Permission to use, copy, modify, and/or distribute this software for any
// purpose with or without fee is hereby granted, provided that the above
// copyright notice and this permission notice appear in all copies.
//
// THE SOFTWARE IS PROVIDED "AS IS" AND ISC DISCLAIMS ALL WARRANTIES WITH
// REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
// AND FITNESS. IN NO EVENT SHALL ISC BE LIABLE FOR ANY SPECIAL, DIRECT,
// INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
// LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
// OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
// PERFORMANCE OF THIS SOFTWARE.
#ifndef EVAL_CONTEXT_DECL_H
#define EVAL_CONTEXT_DECL_H
/// @file eval_context_decl.h Forward declaration of the EvalContext class
namespace isc {
namespace eval {
class EvalContext;
}; // end of isc::eval namespace
}; // end of isc namespace
#endif
......@@ -18,8 +18,3 @@ $NAMESPACE isc::dhcp
This debug message indicates that the expression has been evaluated
to said value. This message is mostly useful during debugging of the
client classification expressions.
% EVAL_SUBSTRING_BAD_PARAM_CONVERSION starting %1, length %2
This debug message indicates that the parameter for the starting position
or length of the substring couldn't be converted to an integer. In this
case the substring routine returns an empty string.
/* Copyright (C) 2015 Internet Systems Consortium, Inc. ("ISC")
Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted, provided that the above
copyright notice and this permission notice appear in all copies.
THE SOFTWARE IS PROVIDED "AS IS" AND ISC DISCLAIMS ALL WARRANTIES WITH
REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY
AND FITNESS. IN NO EVENT SHALL ISC BE LIABLE FOR ANY SPECIAL, DIRECT,
INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM
LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE
OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
PERFORMANCE OF THIS SOFTWARE. */
%{ /* -*- C++ -*- */
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>
#include <eval/eval_context.h>
#include <eval/parser.h>
#include <boost/lexical_cast.hpp>
// Work around an incompatibility in flex (at least versions
// 2.5.31 through 2.5.33): it generates code that does
// not conform to C89. See Debian bug 333231
// <http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=333231>.
# undef yywrap
# define yywrap() 1
// The location of the current token. The lexer will keep updating it. This
// variable will be useful for logging errors.
static isc::eval::location loc;
%}
/* noyywrap disables automatic rewinding for the next file to parse. Since we
always parse only a single string, there's no need to do any wraps. And
using yywrap requires linking with -lfl, which provides the default yywrap
implementation that always returns 1 anyway. */
%option noyywrap
/* nounput simplifies the lexer, by removing support for putting a character
back into the input stream. We never use such capability anyway. */
%option nounput
/* batch means that we'll never use the generated lexer interactively. */
%option batch
/* Enables debug mode. To see the debug messages, one needs to also set
yy_flex_debug to 1, then the debug messages will be printed on stderr. */
%option debug
/* noinput tells flex not to generate the input() function, which is never
used here. It was specified in the bison examples and Postgres folks added
it to remove gcc 4.3 warnings. Let's be on the safe side and keep it. */
%option noinput
/* This line tells flex to track the line numbers. It's not really that
useful for client classes, which typically are one-liners, but it may be
useful in more complex cases. */
%option yylineno
/* These are not token expressions yet, just convenience expressions that
can be used during actual token definitions. */
int \-?[0-9]+
hex [0-9a-fA-F]+
blank [ \t]
%{
// This code is run each time a pattern is matched. It updates the location
// by moving it ahead by yyleng bytes. yyleng specifies the length of the
// currently matched token.
#define YY_USER_ACTION loc.columns(yyleng);
%}
%%
%{
// Code run each time yylex is called.
loc.step();
%}
{blank}+ {
// Ok, we found white space. Let's ignore it and update the loc variable.
loc.step();
}
[\n]+ {
// Newline found. Let's update the location and continue.
loc.lines(yyleng);
loc.step();
}
\'[^\'\n]*\' {
// A string has been matched. It contains the actual string and single quotes.
// We need to get those quotes out of the way and just use its content, e.g.
// for 'foo' we should get foo
std::string tmp(yytext+1);
tmp.resize(tmp.size() - 1);
return isc::eval::EvalParser::make_STRING(tmp, loc);
}
0[xX]{hex} {
// A hex string has been matched. It contains the '0x' or '0X' header
// followed by at least one hexadecimal digit.
return isc::eval::EvalParser::make_HEXSTRING(yytext, loc);
}
{int} {
// An integer was found.
std::string tmp(yytext);
try {
static_cast<void>(boost::lexical_cast<int>(tmp));
} catch (const boost::bad_lexical_cast &) {
driver.error(loc, "Failed to convert " + tmp + " to an integer.");
}
// The parser needs the string form as double conversion is not lossless
return isc::eval::EvalParser::make_INTEGER(tmp, loc);
}
"==" return isc::eval::EvalParser::make_EQUAL(loc);
"option" return isc::eval::EvalParser::make_OPTION(loc);
"substring" return isc::eval::EvalParser::make_SUBSTRING(loc);
"all" return isc::eval::EvalParser::make_ALL(loc);
"(" return isc::eval::EvalParser::make_LPAREN(loc);
")" return isc::eval::EvalParser::make_RPAREN(loc);
"[" return isc::eval::EvalParser::make_LBRACKET(loc);
"]" return isc::eval::EvalParser::make_RBRACKET(loc);
"," return isc::eval::EvalParser::make_COMA(loc);
. driver.error (loc, "Invalid character: " + std::string(yytext));
<<EOF>> return isc::eval::EvalParser::make_END(loc);
%%
using namespace isc::eval;
void
EvalContext::scanStringBegin()
{
loc.initialize(&file_);
yy_flex_debug = trace_scanning_;
YY_BUFFER_STATE buffer;
buffer = yy_scan_bytes(string_.c_str(), string_.size());
if (!buffer) {
error("cannot scan string");
exit(EXIT_FAILURE);
}
}
void
EvalContext::scanStringEnd()
{
yy_delete_buffer(YY_CURRENT_BUFFER);
}
// Generated 2015114
// A Bison parser, made by GNU Bison 3.0.4.
// Locations for Bison parsers in C++
// Copyright (C) 2002-2015 Free Software Foundation, Inc.
// This program is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
// This program is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
// You should have received a copy of the GNU General Public License
// along with this program. If not, see <http://www.gnu.org/licenses/>.
// As a special exception, you may create a larger work that contains
// part or all of the Bison parser skeleton and distribute that work
// under terms of your choice, so long as that work isn't itself a
// parser generator using the skeleton or a modified version thereof
// as a parser skeleton. Alternatively, if you modify or redistribute
// the parser skeleton itself, you may (at your option) remove this
// special exception, which will cause the skeleton and the resulting
// Bison output files to be licensed under the GNU General Public
// License without this special exception.
// This special exception was added by the Free Software Foundation in
// version 2.2 of Bison.
/**
** \file location.hh
** Define the isc::eval::location class.
*/
#ifndef YY_YY_LOCATION_HH_INCLUDED
# define YY_YY_LOCATION_HH_INCLUDED
# include "position.hh"