Commit 4c5902da authored by Tomek Mrugalski

[master] Merge branch 'trac5036' (Dhcp6 bison parser)

parents 8d9f4d2e 5b7aaae7
......@@ -35,6 +35,7 @@ config.h.in~
/py-compile
/stamp-h1
/test-driver
/ylwrap
/all.info
/coverage-cpp-html
......
......@@ -66,6 +66,20 @@
"subnet6": [
{
"pools": [ { "pool": "2001:db8:1::/80" } ],
# This defines PD (prefix delegation) pools. In this case
# we have only one pool. That consists of /64 prefixes
# being delegated out of large /48 pool. Each delegated
# prefix will contain an excluded-prefix option.
"pd-pools": [
{
"prefix": "2001:db8:abcd::",
"prefix-len": 48,
"delegated-len": 64,
"excluded-prefix": "2001:db8:abcd:1234::",
"excluded-prefix-len": 62
}
],
"subnet": "2001:db8:1::/64",
"interface": "ethX"
}
......
......@@ -3,7 +3,7 @@
{ "Dhcp6":
{
{
# Kea is told to listen on ethX interface only.
"interfaces-config": {
"interfaces": [ "ethX" ]
......@@ -26,8 +26,8 @@
"name": "lab",
"test": "pkt.iface == 'ethX'",
"option-data": [{
"name": "dns-servers",
"data": "2001:db8::1"
"name": "dns-servers",
"data": "2001:db8::1"
}]
},
......@@ -40,36 +40,36 @@
# Let's pick cable modems. In this simple example we'll assume the device
# is a cable modem if it sends a vendor option with enterprise-id equal
# to 4491.
# to 4491.
{
"name": "cable-modems",
"test": "vendor.enterprise == 4491"
},
}
],
# The following list defines subnets. Each subnet consists of at
# least subnet and pool entries.
"subnet6": [
"subnet6": [
{
"pools": [ { "pool": "2001:db8:1::/80" } ],
"subnet": "2001:db8:1::/64",
"client-class": "cable-modems",
"interface": "ethX"
"pools": [ { "pool": "2001:db8:1::/80" } ],
"subnet": "2001:db8:1::/64",
"client-class": "cable-modems",
"interface": "ethX"
},
# The following subnet contains a class reservation for a client using
# DUID 01:02:03:04:05:0A:0B:0C:0D:0E. This client will always be assigned
# to this class.
{
"pools": [ { "pool": "2001:db8:2::/80" } ],
"subnet": "2001:db8:2::/64",
"reservations": [
{
"duid": "01:02:03:04:05:0A:0B:0C:0D:0E",
"client-classes": [ "cable-modems" ]
} ],
"interface": "ethX"
"pools": [ { "pool": "2001:db8:2::/80" } ],
"subnet": "2001:db8:2::/64",
"reservations": [
{
"duid": "01:02:03:04:05:0A:0B:0C:0D:0E",
"client-classes": [ "cable-modems" ]
} ],
"interface": "ethX"
}
]
},
......@@ -78,18 +78,17 @@
# informational level (info, warn, error and fatal) should be logged to stdout.
"Logging": {
"loggers": [
{
"name": "kea-dhcp6",
"output_options": [
{
"output": "stdout"
}
],
"debuglevel": 0,
"severity": "INFO"
}
{
"name": "kea-dhcp6",
"output_options": [
{
"output": "stdout"
}
],
"debuglevel": 0,
"severity": "INFO"
}
]
}
}
......@@ -7,7 +7,7 @@
"Dhcp6":
{
"interfaces-config": {
# Enable unicast
# Enable unicast
"interfaces": [ "eno33554984/2001:db8:1:1::1" ]
},
......@@ -27,16 +27,16 @@
"pools": [ { "pool": "2001:db8:1:1::1:0/112" } ] }
],
# This enables DHCPv4-over-DHCPv6 support
# This enables DHCPv4-over-DHCPv6 support
"dhcp4o6-port": 6767,
# Required by DHCPv4-over-DHCPv6 clients
# Required by DHCPv4-over-DHCPv6 clients
"option-data": [
{ "name": "dhcp4o6-server-addr",
"code": 88,
"space": "dhcp6",
"csv-format": true,
# Put the server address here
# Put the server address here
"data": "2001:db8:1:1::1" }
]
},
......
......@@ -41,7 +41,11 @@
"library": "/opt/lib/security.so"
},
{
"library": "/opt/lib/charging.so"
"library": "/opt/lib/charging.so",
"parameters": {
"path": "/var/kea/var",
"base-name": "kea-forensic6"
}
}
]
}
......
......@@ -49,7 +49,7 @@
{
"code": 12,
"data": "2001:db8:1:0:ff00::1"
},
}
],
"pools": [
{
......@@ -66,7 +66,7 @@
}
],
"subnet": "2001:db8:1::/64",
"interface": "ethX",
"interface": "ethX"
}
]
},
......
......@@ -45,7 +45,8 @@
"name": "kea",
"user": "kea",
"password": "kea",
"host": "localhost"
"host": "localhost",
"readonly": true
},
# Define a subnet with a pool of dynamic addresses and a pool of dynamic
......
......@@ -81,7 +81,8 @@
{
"name": "nis-servers",
"data": "3000:1::234"
}]
}],
"client-classes": [ "special_snowflake", "office" ]
},
# This is a bit more advanced reservation. The client with the specified
# DUID will get a reserved address, a reserved prefix and a hostname.
......
......@@ -6,15 +6,18 @@
<chapter id="kea-config">
<title>Kea Configuration</title>
<para>Kea is designed to allow different methods by which it can be
configured, each method being implemented by a component known as a
configuration backend. At present, only one such backend is
available, that allowing configuration by means of a JSON file.</para>
<section id="json-backend">
<title>JSON Configuration Backend</title>
<para>JSON is the default configuration backend.
It assumes that the servers are started from the command line
<para>Kea uses JSON structures to handle configuration. Previously
there was a concept of other configuration backends, but it was never
implemented and the idea was abandoned.</para>
<section id="json">
<title>JSON Configuration</title>
<para>JSON is the notation used throughout the Kea project. The most obvious
usage is for the configuration file, but it is also used for sending commands
over the Management API (see <xref linkend="ctrl-channel"/>) and for
communicating between the DHCP servers and the DDNS update daemon.</para>
<para>Typical usage assumes that the servers are started from the command line
(either directly or using a script, e.g. <filename>keactrl</filename>).
The JSON backend uses certain signals to influence Kea. The
configuration file is specified upon startup using the -c parameter.</para>
......@@ -23,10 +26,42 @@
<title>JSON Syntax</title>
<para>Configuration files for DHCPv4, DHCPv6 and DDNS modules are defined
in an extended JSON format. Basic JSON is defined in <ulink
url="http://tools.ietf.org/html/rfc4627">RFC 4627</ulink>. Kea components
use a slightly modified form of JSON in that they allow shell-style
comments in the file: lines with the hash (#) character in the first column
are comment lines and are ignored.</para>
url="http://tools.ietf.org/html/rfc7159">RFC 7159</ulink>. Note that Kea
1.2 introduces a new parser that is better at following the JSON spec. In
particular, the only values allowed for boolean are true or false (all
lowercase). The capitalized versions (True or False) are not accepted.
</para>
<para>Kea components use an extended JSON with additional features
allowed:
<itemizedlist>
<listitem>
<simpara>shell comments: any text after the hash (#)
character is ignored. Dhcp6 allows # in any column, while
Dhcp4 and Ddns require hash to be in the first
column.</simpara>
</listitem>
<listitem>
<simpara>C comments: any text after the double slash (//)
is ignored. We're in the process of
migrating the configuration parsers and currently only Dhcp6
supports this feature.</simpara>
</listitem>
<listitem>
<simpara>Multiline comments: any text between /* and */ is
ignored. This commenting can span multiple lines. We're in the
process of migrating the configuration parsers and currently
only Dhcp6 supports this feature.</simpara>
</listitem>
<listitem>
<simpara>File inclusion: JSON files can include other JSON
files. This can be done by using &lt;?include
"file.json"?&gt;. We're in the process of migrating the
configuration parsers and currently only Dhcp6 supports this
feature.</simpara>
</listitem>
</itemizedlist>
</para>
<para>The configuration file consists of a single object (often colloquially
called a map) started with a curly bracket. It comprises the "Dhcp4", "Dhcp6",
......@@ -89,7 +124,7 @@
# The whole configuration structure ends here.
}
</screen>
</para>
</para>
<para>More examples are available in the installed
<filename>share/doc/kea/examples</filename> directory.</para>
......@@ -113,7 +148,7 @@
valid-lifetime in the Dhcp6 component can be referred to as
Dhcp6/valid-lifetime and the pool in the first subnet defined in the DHCPv6
configuration as Dhcp6/subnet6[0]/pool.</para>
<!-- @todo Add a reference here after #3422 is done -->
</section>
......
......@@ -2978,7 +2978,7 @@ It is merely echoed by the server
<para>
It is possible to store host reservations in MySQL or PostgreSQL database. See
<xref linkend="hosts-storage4"/> for information on how to configure Kea to use
<xref linkend="hosts4-storage"/> for information on how to configure Kea to use
reservations stored in MySQL or PostgreSQL. Kea does not provide any dedicated
tools for managing reservations in a database. The Kea wiki <ulink
url="http://kea.isc.org/wiki/HostReservationsHowTo" /> provides detailed
......
......@@ -4,7 +4,7 @@
<!ENTITY mdash "&#x2014;" >
]>
<chapter id="dhcp6">
<chapter id="dhcp6">
<title>The DHCPv6 Server</title>
<section id="dhcp6-start-stop">
......@@ -3496,7 +3496,8 @@ If not specified, the default value is:
provided by the clients.
</para>
<para>
The configuration is controlled by the <command>mac-sources</command>parameter as follows:
The configuration is controlled by the <command>mac-sources</command>
parameter as follows:
<screen>
"Dhcp6": {
<userinput>"mac-sources": [ "method1", "method2", "method3", ... ]</userinput>,
......@@ -3543,7 +3544,7 @@ If not specified, the default value is:
that those addresses are based on EUI-64, which contains MAC address. This
method is not completely reliable, as clients may use other link-local address
types. In particular, privacy extensions, defined in
<ulink utl="http://tools.ietf.org/html/rfc4941">RFC 4941</ulink>, do not use
<ulink url="http://tools.ietf.org/html/rfc4941">RFC 4941</ulink>, do not use
MAC addresses. Also note that successful extraction requires that the
address's u-bit must be set to 1 and its g-bit set to 0, indicating that it
is an interface identifier as per
......@@ -3600,7 +3601,11 @@ If not specified, the default value is:
</simpara>
</listitem>
</itemizedlist>
</para>
</para>
<para>An empty mac-sources list is not allowed. If you do not want to specify it,
either simply omit the mac-sources definition or specify it with the "any" value,
which is the default.</para>
</section>
<section id="dhcp6-decline">
......
......@@ -5,3 +5,4 @@
/spec_config.h
/spec_config.h.pre
/s-messages
/dhcp6_parser.report
......@@ -64,6 +64,10 @@ libdhcp6_la_SOURCES += ctrl_dhcp6_srv.cc ctrl_dhcp6_srv.h
libdhcp6_la_SOURCES += json_config_parser.cc json_config_parser.h
libdhcp6_la_SOURCES += dhcp6to4_ipc.cc dhcp6to4_ipc.h
libdhcp6_la_SOURCES += dhcp6_lexer.ll location.hh position.hh stack.hh
libdhcp6_la_SOURCES += dhcp6_parser.cc dhcp6_parser.h
libdhcp6_la_SOURCES += parser_context.cc parser_context.h
libdhcp6_la_SOURCES += kea_controller.cc
nodist_libdhcp6_la_SOURCES = dhcp6_messages.h dhcp6_messages.cc
......@@ -105,3 +109,30 @@ endif
kea_dhcp6dir = $(pkgdatadir)
kea_dhcp6_DATA = dhcp6.spec
if GENERATE_PARSER
parser: dhcp6_lexer.cc location.hh position.hh stack.hh dhcp6_parser.cc dhcp6_parser.h
@echo "Flex/bison files regenerated"
# --- Flex/Bison stuff below --------------------------------------------------
# When debugging grammar issues, it's useful to add -v to bison parameters.
# bison will generate parser.output file that explains the whole grammar.
# It can be used to manually follow what's going on in the parser.
# This is especially useful if yydebug_ is set to 1 as that variable
# will cause parser to print out its internal state.
# Call flex with -s to check that the default rule can be suppressed
# Call bison with -W to get warnings like unmarked empty rules
# Note C++11 deprecated register still used by flex < 2.6.0
location.hh position.hh stack.hh dhcp6_parser.cc dhcp6_parser.h: dhcp6_parser.yy
$(YACC) --defines=dhcp6_parser.h --report=all --report-file=dhcp6_parser.report -o dhcp6_parser.cc dhcp6_parser.yy
dhcp6_lexer.cc: dhcp6_lexer.ll
$(LEX) --prefix parser6_ -o dhcp6_lexer.cc dhcp6_lexer.ll
else
parser location.hh position.hh stack.hh dhcp6_parser.cc dhcp6_parser.h dhcp6_lexer.cc:
@echo Parser generation disabled. Configure with --enable-generate-parser to enable it.
endif
......@@ -24,6 +24,18 @@ component implementation.
@section dhcpv6ConfigParser Configuration Parsers in DHCPv6
Three-minute overview: if you are here only to learn the absolute minimum about
the new parser, here's how you use it:
@code
// The following code:
json = isc::data::Element::fromJSONFile(file_name, true);
// can be replaced with this:
Parser6Context parser;
json = parser.parseFile(file_name, Parser6Context::PARSER_DHCP6);
@endcode
The common configuration parsers for the DHCP servers are located in the
src/lib/dhcpsrv/parsers/ directory. Parsers specific to the DHCPv6 component
are located in the src/bin/dhcp6/json_config_parser.cc. These parsers derive
......@@ -41,6 +53,291 @@ all configuration parsers. All DHCPv6 parsers deriving from this class
directly have their entire implementation in the
src/bin/dhcp6/json_config_parser.cc.
@section dhcpv6ConfigParserBison Configuration Parser for DHCPv6 (bison)
During the 1.2 milestone it was decided to significantly refactor the
parsers as their old implementation had become unsustainable. For a brief overview
of the problems, see ticket 5014 (http://kea.isc.org/ticket/5014). In
general, the following issues of the existing code were noted:
-# Parsers are overwhelmingly complex. Even though each parser is a relatively
simple class, the complexity comes from the large number of interacting parsers.
-# The code is disorganized, i.e. spread out between multiple directories
(src/bin/dhcp6 and src/lib/dhcpsrv/parsers).
-# The split into build/commit never worked well. In particular, it is not
trivial to revert configuration change. This split originates from BIND10
days and was inherited from DNS auth that did receive only changes in
the configuration, rather than the full configuration. As a result,
the split was abused and many of the parsers have commit() being a
no-op operation.
-# There is no way to generate a list of all directives. We do have .spec files,
but they're not actually used by the code. The code has the directives
spread out in multiple places in multiple files in multiple directories.
Answering a simple question ("can I do X in the scope Y?") requires
a short session of reverse engineering. What's worse, we have the .spec
files that are kinda kept up to date. This is actually more damaging that
way, because there's no strict correlation between the code and .spec file.
So there may be parameters that are supported, but are not in .spec files.
The opposite is also true - .spec files can be buggy and have different
parameters. This is particularly true for default values.
-# It's exceedingly complex to add comments that don't start at the first
column or span several lines. Both Tomek and Marcin tried to implement
it, but failed miserably. The same is true for including files (having an
include statement in the config that includes other files).
-# The current parsers don't handle the default values, i.e. if there's no
directive, the parser is not created at all. We have kludgy workarounds
for that, but the code for it is in different place than the parser,
which leads to the logic being spread out in different places.
-# syntax checking is poor. The existing JSON parser allowed things like
empty option-data entries:
@code
"option-data": [ {} ]
@endcode
having trailing commas:
@code
"option-data": [
{
"code": 12,
"data": "2001:db8:1:0:ff00::1"
},
]
@endcode
or having incorrect types, e.g. specifying timer values as strings.
To solve those issues a two phase approach was proposed:
PHASE 1: replace isc::data::fromJSON with a bison-based parser. This makes it
possible to have a single file that defines the actual syntax, gives much better
syntax checking, and provides more flexibility, like various comment types and
file inclusions. The parser still returns JSON structures, which are now
guaranteed to be correct from the grammar perspective. Sticking with
the JSON structures also allows us to continue using existing parsers.
Furthermore, it is possible to implement default values at this level
by simply inserting extra JSON structures in the places where they are necessary.
This part is covered by ticket 5036.
PHASE 2: simplify existing parsers by getting rid of the build/commit split.
Get rid of the inheritance contexts. Essentially the parser should
take JSON structure as a parameter and return the configuration structure.
For example, for options this should essentially look like this:
@code
CfgOptionPtr parse(ConstElementPtr options)
@endcode
The whole complexity behind inheriting contexts should be removed
from the existing parsers and implemented in the bison parser.
It should return extra JSON elements. The details are TBD, but there is
one example for setting up a renew-timer value on the subnet level that
is inherited from the global ("Dhcp6") level. This phase is covered by
ticket 5039.
The code change for 5036 introduces a flex/bison based parser. It is
essentially defined in two files. The first, dhcp6_lexer.ll, defines
regular expressions that are used on the input (be it a file or a
string in memory). In essence, this code is called repeatedly
and each time it returns a token. This repeats until either the
parsing is complete or a syntax error is encountered. For example, for
the following text:
@code
{
"Dhcp6":
{
"renew-timer": 100
}
}
@endcode
this code would return the following sequence of tokens: LCURLY_BRACKET,
DHCP6, COLON, LCURLY_BRACKET, RENEW_TIMER, COLON, INTEGER
(a token with a value of 100), RCURLY_BRACKET, RCURLY_BRACKET, END.
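The tokenization step can be illustrated with a small hand-written function. This is only a sketch: the real lexer is generated by flex from dhcp6_lexer.ll, and the function name tokenize, the string representation of tokens, and the STRING and ERROR fallbacks are assumptions made for this illustration. It recognizes only the two keywords used in the example above.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the flex-generated lexer (not the Kea code).
// Returns token names as strings; only "Dhcp6" and "renew-timer" are
// treated as keywords, any other quoted text becomes STRING.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isspace(c)) {
            ++i;                                  // whitespace is skipped
        } else if (c == '{') {
            tokens.push_back("LCURLY_BRACKET"); ++i;
        } else if (c == '}') {
            tokens.push_back("RCURLY_BRACKET"); ++i;
        } else if (c == ':') {
            tokens.push_back("COLON"); ++i;
        } else if (c == ',') {
            tokens.push_back("COMMA"); ++i;
        } else if (c == '"') {
            // Quoted text: either a known keyword or a generic string.
            const std::size_t end = text.find('"', i + 1);
            if (end == std::string::npos) {
                tokens.push_back("ERROR");        // unterminated string
                return tokens;
            }
            const std::string word = text.substr(i + 1, end - i - 1);
            if (word == "Dhcp6") {
                tokens.push_back("DHCP6");
            } else if (word == "renew-timer") {
                tokens.push_back("RENEW_TIMER");
            } else {
                tokens.push_back("STRING");
            }
            i = end + 1;
        } else if (std::isdigit(c)) {
            // Consume the whole run of digits as one INTEGER token.
            while (i < text.size() &&
                   std::isdigit(static_cast<unsigned char>(text[i]))) {
                ++i;
            }
            tokens.push_back("INTEGER");
        } else {
            tokens.push_back("ERROR");            // unexpected character
            return tokens;
        }
    }
    tokens.push_back("END");
    return tokens;
}
```

Calling tokenize() on the text above yields the ten token names listed in the preceding paragraph. A real lexer also carries the token value (here 100) and its location, which this sketch omits.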
This stream of tokens is consumed by the parser that is defined
in the second file, dhcp6_parser.yy. This file defines a grammar. Here's a very simplified
version of the Dhcp6 grammar:
@code
dhcp6_object: DHCP6 COLON LCURLY_BRACKET global_params RCURLY_BRACKET;
global_params: global_param
| global_params COMMA global_param
;
// These are the parameters that are allowed in the top-level for
// Dhcp6.
global_param: preferred_lifetime
| valid_lifetime
| renew_timer
| rebind_timer
| subnet6_list
| interfaces_config
| lease_database
| hosts_database
| mac_sources
| relay_supplied_options
| host_reservation_identifiers
| client_classes
| option_data_list
| hooks_libraries
| expired_leases_processing
| server_id
| dhcp4o6_port
;
renew_timer: RENEW_TIMER COLON INTEGER;
@endcode
This may be slightly difficult to read at the beginning, but after getting used
to the notation, it's very powerful and easy to extend. The first line defines
that dhcp6_object consists of certain tokens (DHCP6, COLON and LCURLY_BRACKET)
followed by 'global_params' expression, followed by RCURLY_BRACKET.
The 'global_params' is defined recursively. It can either be a single
'global_param' expression, or (a shorter) global_params followed by a
comma and global_param. Bison will apply this and will be able to
parse comma separated lists of arbitrary lengths.
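To make the effect of the recursive rule concrete, here is a hand-written check that accepts the same token sequences as global_params. It is a sketch, not what bison generates: the function name matchesGlobalParams is hypothetical, tokens are plain strings, and every token other than COMMA is assumed to stand for one complete global_param.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical check mirroring the rule
//   global_params: global_param
//                | global_params COMMA global_param
// Any token other than "COMMA" stands for a complete global_param.
bool matchesGlobalParams(const std::vector<std::string>& tokens) {
    // The list must start with a parameter, never with a comma.
    if (tokens.empty() || tokens[0] == "COMMA") {
        return false;
    }
    // After the first parameter, only "COMMA global_param" pairs may follow.
    std::size_t i = 1;
    while (i < tokens.size()) {
        if (tokens[i] != "COMMA") {
            return false;                 // two parameters without a comma
        }
        if (i + 1 >= tokens.size() || tokens[i + 1] == "COMMA") {
            return false;                 // trailing or doubled comma
        }
        i += 2;
    }
    return true;
}
```

Because the rule always requires a global_param after each COMMA, a list of any length is accepted, while an empty list or a trailing comma is a syntax error.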
A single parameter is defined by 'global_param' expression. This
represents any parameter that may appear in the global scope of Dhcp6
object. The complete definition for all of them is complex, but the
example above includes renew_timer definition. It is defined as a
series of RENEW_TIMER, COLON, INTEGER tokens.
The above is a simplified version of the actual grammar. If used in the form
shown above, it would parse the whole file, but would do nothing with that information.
To build actual structures, bison allows C++ code to be injected at any point of
the parsing. For example, when the parser detects the Dhcp6 object, it wants to
create a new MapElement. When the whole object is parsed, we can perform
some sanity checks, inject defaults for parameters that were not defined,
log the result and perform any other wrap-up actions.
@code
dhcp6_object: DHCP6 COLON LCURLY_BRACKET {
// This code is executed when we're about to start parsing
// the content of the map
ElementPtr m(new MapElement());
ctx.stack_.back()->set("Dhcp6", m);
ctx.stack_.push_back(m);
} global_params RCURLY_BRACKET {
// Whole Dhcp6 parsing completed. If we ever want to do any wrap up
// (maybe some sanity checking, insert defaults if not specified),
// this would be the best place for it.
ctx.stack_.pop_back();
};
@endcode
The above will do the following in order: consume the DHCP6 token, consume the COLON token,
consume LCURLY_BRACKET, execute the code in the first { ... }, parse global_params
and do whatever the code for it tells, consume RCURLY_BRACKET, and execute the code
in the second { ... }.
There is a simple stack defined in ctx.stack_, where ctx is the isc::dhcp::Parser6Context
defined in src/bin/dhcp6/parser_context.h. When walking through the config file, each
time a new context is entered (e.g. Dhcp6, Subnet6, Pool), a new Element is added
to the end of the stack. Once the parsing of a given context is complete, it
is removed from the stack. At the end of parsing, there should be a single
element on the stack as the top-level parsing (syntax_map) only inserts the
MapElement object, but does not remove it.
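The stack discipline described above can be sketched without any Kea dependencies. The types Node and ContextSketch and the methods enter() and leave() are hypothetical stand-ins invented for this illustration. The real code uses isc::data::Element objects and performs the push and pop inside the grammar actions.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for isc::data::MapElement.
struct Node {
    std::map<std::string, std::shared_ptr<Node> > children;
};
typedef std::shared_ptr<Node> NodePtr;

// Hypothetical stand-in for the element stack kept in the parser context.
struct ContextSketch {
    std::vector<NodePtr> stack_;

    // The top-level map is pushed once and never popped, so it is the
    // single element left on the stack when parsing ends.
    ContextSketch() {
        stack_.push_back(std::make_shared<Node>());
    }

    // Entering a scope: create a map, attach it to the current map
    // under the given name, and make it the new current map.
    void enter(const std::string& name) {
        NodePtr m = std::make_shared<Node>();
        stack_.back()->children[name] = m;
        stack_.push_back(m);
    }

    // Leaving a scope: the enclosing map becomes current again.
    void leave() {
        stack_.pop_back();
    }
};
```

After a sequence such as enter("Dhcp6"), enter("subnet6"), leave(), leave(), the stack again holds only the top-level map, and that map contains the nested structure built along the way.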
@section dhcpv6ConfigSubParser Parsing Partial Configuration in DHCPv6
Another important capability required is the ability to parse not only the
whole configuration, but also a subset of it. This is done by introducing artificial
tokens (e.g. TOPLEVEL_GENERIC_JSON and TOPLEVEL_DHCP6). For the complete list of available
starting contexts, see @ref isc::dhcp::Parser6Context::ParserType. The
Parser6Context::parse() method takes one parameter that specifies whether the
data to be parsed is expected to be generic JSON or the whole configuration
(DHCP6). This feature is currently mostly used by unit-tests (which often skip
the '{ "Dhcp6": {' preamble), but it is expected to be used soon for parsing
subnets only, host reservations only, options or basically any other elements.
For example, to add the ability to parse only pools, the following could be added:
@code
start: TOPLEVEL_GENERIC_JSON sub_json
| TOPLEVEL_DHCP6 sub_dhcp6
| TOPLEVEL_POOL6 sub_pool6
;
@endcode
The code change contains the parser definition, and kea-dhcp6 has been updated
to use the new parser. That parser is able to load all examples
from doc/examples/kea6. It is also able to parse # comments (bash
style, starting at the beginning or middle of the line), // comments
(C++ style, can start anywhere) or / * * / comments (C style, can span
multiple lines).
This parser is currently used in production code. See configure()
method in kea_controller.cc.
There are several new unit-tests written, but the code mostly reuses the existing
ones to verify that existing functionality was not compromised. There are
several interesting new ones, though. ParserTest.file loads all the
config file examples we have in doc/examples/kea6. This ensures that the
parser is able to load them and also checks that our examples are sane.
@section dhcp6ParserIncludes Config File Includes
The new parser provides an ability to include files. The syntax was chosen
to look similar to how Apache includes PHP scripts in HTML code. This
particular syntax was chosen to emphasize that the inclusion directive
is an additional feature and not really a part of JSON syntax.
To include one file from another, use the following syntax:
@code
{
"Dhcp6": {
"interfaces-config": {
"interfaces": [ "*" ]},
"preferred-lifetime": 3000,
"rebind-timer": 2000,
"renew-timer": 1000,
<?include "subnets.json"?>
"valid-lifetime": 4000
}
}
@endcode
The inclusion is implemented as a stack of files. Typically, when a
single file is parsed, the files_ (a vector of strings) and sfiles_ (a
vector of FILE*) both contain a single entry. However, when the lexer
detects &lt;?include "filename.json"?&gt;, it calls the
@ref isc::dhcp::Parser6Context::includeFile method. Up to ten nesting
levels are supported. This arbitrarily chosen limit is a protection
against recursive inclusions.
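The nesting limit can be sketched as a bounded stack of file names. This is an illustration only: the class IncludeStackSketch, its methods endOfFile() and depth(), and the constant kMaxNesting are hypothetical, no files are actually opened, and the sketch assumes the limit counts included files on top of the top-level file. The exact boundary used by the real includeFile method may differ.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

namespace {
// Assumed meaning of the limit: at most ten files included on top of
// the top-level file.
const std::size_t kMaxNesting = 10;
}

// Hypothetical model of the include stack (tracks names only).
class IncludeStackSketch {
public:
    explicit IncludeStackSketch(const std::string& top_level) {
        files_.push_back(top_level);
    }

    // Called when an include directive is seen: refuse to nest deeper
    // than the limit, which also stops a file that includes itself.
    void includeFile(const std::string& filename) {
        if (files_.size() > kMaxNesting) {
            throw std::runtime_error("Too many nested includes");
        }
        files_.push_back(filename);
    }

    // Called when the current included file ends: resume the parent.
    void endOfFile() {
        if (files_.size() > 1) {
            files_.pop_back();
        }
    }

    // Number of included files currently open above the top-level file.
    std::size_t depth() const {
        return files_.size() - 1;
    }

private:
    std::vector<std::string> files_;
};
```

With this model, a file that includes itself fails on the eleventh nested include instead of recursing until resources run out.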
@section dhcp6ParserConflicts Avoiding syntactical conflicts in parsers
The syntactic parser has a powerful ability to not only parse the string and
check if it's valid JSON syntax, but also check that the resulting structures
match the expected syntax (if subnet6 is really an array, not a map, if
timers are expressed as integers, not as strings etc.).
However, there are times when we need to parse a string as arbitrary
JSON. For example, if we're in Dhcp6 and the config contains entries
for DhcpDdns or Dhcp4. If we were to use a naive approach, the lexer
would go through that content and most likely find some tokens that
are also used in Dhcp6. For example, 'renew-timer' would be detected
and the parser would complain that it was not expected. To avoid this
problem, a syntactic context was introduced. When the syntactic parser
expects certain type of content, it calls @ref
isc::dhcp::Parser6Context::enter() method to indicate what type of
content is expected. For example, when Dhcp6 parser discovers
uninteresting content like Dhcp4, it enters NO_KEYWORD mode. In this