Commit 297f23fb authored by Stephen Morris

[#640] Add fuzzing documentation

Add a doxygen file describing the fuzzing - how to build and
run it, and a brief description of the changes made to the code.
parent 08381e67
// Copyright (C) 2017-2018 Internet Systems Consortium, Inc. ("ISC")
//
// This Source Code Form is subject to the terms of the Mozilla Public
// License, v. 2.0. If a copy of the MPL was not distributed with this
// file, You can obtain one at http://mozilla.org/MPL/2.0/.
/**
@page fuzzer Fuzzing Kea
@section fuzzIntro Introduction
Fuzzing is a software-testing technique whereby a program is presented with a
variety of generated data as input and is monitored for abnormal conditions
such as crashes or hangs.
Fuzz testing of Kea uses the AFL (American Fuzzy Lop) program. In this, Kea is
built using an AFL-supplied program that not only compiles the software but
also instruments it. When run, AFL generates test cases and monitors the
execution of Kea as it processes them. AFL will adjust the input based on
these measurements, seeking to discover and test new execution paths.
@section fuzzTypes Types of Kea Fuzzing
@subsection fuzzTypeNetwork Fuzzing with Network Packets
In this mode, AFL will start an instance of Kea and send it a packet of data.
Kea reads this packet and processes it in the normal way. AFL monitors code
paths taken by Kea and, based on this, will vary the data sent in subsequent
packets.
@subsection fuzzTypeConfig Fuzzing with Configuration Files
Kea has a configuration file check mode whereby it will read a configuration
file, report whether the file is valid, then immediately exit. Operation of
the configuration parsing code can be tested with AFL by fuzzing the
configuration file: AFL generates example configuration files based on a
dictionary of valid keywords and runs Kea in configuration file check mode on
them. As with network packet fuzzing, the behaviour of Kea is monitored and
the content of subsequent files adjusted accordingly.
@section fuzzBuild Building Kea for Fuzzing
Whatever tests are done, Kea needs to be built with fuzzing in mind. The steps
for this are:
-# Install AFL on the system on which you plan to build Kea and do the fuzzing.
AFL may be downloaded from http://lcamtuf.coredump.cx/afl. At the time of
writing (August 2019), the latest version is 2.52b. AFL should be built as
per the instructions in the README file in the distribution. The LLVM-based
instrumentation programs should also be built, as per the instructions in
the file llvm_mode/README.llvm (also in the distribution). Note that this
requires that LLVM be installed on the machine used for the fuzzing.
-# Build Kea. Kea should be compiled and built as usual, although the
following additional steps should be observed:
- Set the environment variable CXX to point to the afl-clang-fast++
compiler.
- Specify a value of "--prefix" on the command line to set the directory
into which Kea is installed.
- Add the "--enable-fuzz" switch to the "configure" command line.
.
For example:
@code
CXX=/opt/afl/afl-clang-fast++ ./configure --enable-fuzz --prefix=$HOME/installed
make
@endcode
-# Install Kea to the directory specified by "--prefix":
@code
make install
@endcode
This step is not strictly necessary, but makes running AFL easier.
"libtool", used by the Kea build procedure to build executable images, puts
the executable in a hidden ".libs" subdirectory of the target directory and
creates a shell script in the target directory for running it. The wrapper
script handles the fact that the Kea libraries on which the executable depends
are not installed by fixing up the LD_LIBRARY_PATH environment variable to
point to them. It is possible to set the variable appropriately and use AFL
to run the image from the ".libs" directory; in practice, it is a lot
simpler to install the programs in the directories set by "--prefix" and run
them from there.
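As an aid to the first two steps, the following shell sketch checks that the
AFL programs named above can be found on the PATH before "configure" is run.
The function name "require_tools" is purely illustrative and is not part of
Kea or AFL.

```shell
# Return success only if every named program can be found on the PATH.
# Names of missing programs are reported on stderr.
require_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            status=1
        fi
    done
    return "$status"
}

# Example (commented out so that sourcing this file has no side effects):
# require_tools afl-fuzz afl-clang-fast++ || echo "add the AFL directory to PATH"
```

If AFL was not installed into a directory on the PATH, give the full path to
the compiler in CXX instead, as in the "configure" example above.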
@section fuzzRun Running the Fuzzer
@subsection fuzzRunNetwork Fuzzing with Network Packets
-# In this type of fuzzing, Kea is processing packets from the fuzzer over a
network interface. This interface could be a physical interface or it could
be the loopback interface. Either way, it needs to be configured with a
suitable IPv4 or IPv6 address depending on whether kea-dhcp4 or kea-dhcp6 is
being fuzzed.
-# Once the interface and address have been decided, they need to be set in the
configuration file used for the test. For example, to fuzz kea-dhcp4
using the loopback interface "lo" and IPv4 address 10.53.0.1, the
configuration file would contain the following snippet:
@code
"Dhcp4": {
:
"interfaces-config": {
"interfaces": ["lo/10.53.0.1"]
},
"subnet4": [
{
:
"interface": "lo",
:
}
],
:
}
@endcode
-# The specification of the interface and address in the configuration file
is used by the main Kea code. Owing to the way that the fuzzing interface
between Kea and AFL is implemented, the address and interface also need to
be specified by the environment variables KEA_AFL_INTERFACE and
KEA_AFL_ADDRESS. With a configuration file containing statements listed
above, the relevant commands are:
@code
export KEA_AFL_INTERFACE="lo"
export KEA_AFL_ADDRESS="10.53.0.1"
@endcode
(If kea-dhcp6 is being fuzzed, then KEA_AFL_ADDRESS should specify an IPv6
address.)
-# The fuzzer can now be run. A suitable command line is:
@code
afl-fuzz -m 4096 -i seeds -o fuzz-out -- ./kea-dhcp6 -c kea.conf -p 9001 -P 9002
@endcode
In the above:
- It is assumed that the directory holding the "afl-fuzz" program is in
the path; otherwise, include the path name when invoking it.
- "-m 4096" allows Kea to take up to 4096 MB of memory. (Use "ulimit" to
check and optionally modify the amount of virtual memory that can be used.)
- The "-i" switch specifies a directory (in this example, one named "seeds")
holding "seed" files. These are binary files that AFL will use as its
source for generating new packets. They can be generated from a real packet
stream with Wireshark: right-click on a packet, then export it as binary
data. Ensure that only the payload of the UDP packet is exported.
- The "-o" switch specifies a directory (in this example called "fuzz-out")
that AFL will use to hold packets it has generated and packets that it has
found to cause crashes or hangs.
- "--" separates the AFL command line from that of Kea.
- "./kea-dhcp6" is the program being fuzzed. As mentioned above, this
should be an executable image, and it will be simpler to fuzz one
that has been installed.
- The "-c" switch sets the configuration file Kea should use while being
fuzzed.
- "-p 9001 -P 9002" sets the port on which Kea should listen and the port to
which it should send replies. If omitted, Kea will try to use the default
DHCP ports, which are in the privileged range. Unless run with "sudo",
Kea will fail to open the port and will exit early, so no useful
information will be obtained from the fuzzer.
-# Check that the fuzzer is working. If run from a terminal (with a black
background - AFL is particular about this), AFL will bring up a curses-style
interface showing the progress of the fuzzing. A good indication that
everything is working is to look at the "total paths" figure. Initially,
this should increase reasonably rapidly. If not, it is likely that Kea is
failing to start or initialize properly and the logging output (assuming
this has been configured) should be examined. Some sample seed packets are
provided in the "src/bin/dhcp4/tests/fuzz-data" and
"src/bin/dhcp6/tests/fuzz-data" directories.
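Two small shell helpers may be useful around the run described in the steps
above; both are sketches under stated assumptions, not part of Kea or AFL.
The first checks that the two environment variables are set before AFL is
started. The second reads a counter from the "fuzzer_stats" file that AFL
2.x is assumed to write into the output directory, with lines of the form
"name : value" and a "paths_total" field corresponding to the "total paths"
figure. The function names are illustrative.

```shell
# Return success only if both variables read by the Kea fuzzing code are set.
check_fuzz_env() {
    missing=0
    for var in KEA_AFL_INTERFACE KEA_AFL_ADDRESS; do
        # Indirect lookup of the variable whose name is held in $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "error: $var is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Print the value of one field from an AFL fuzzer_stats file.
# Usage: afl_stat <stats-file> <field-name>
# Assumes lines of the form "paths_total       : 42".
afl_stat() {
    stats_file=$1
    key=$2
    awk -F: -v key="$key" '
        {
            name = $1
            gsub(/[ \t]+$/, "", name)          # trim padding after the name
            if (name == key) {
                value = substr($0, index($0, ":") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                print value
                exit
            }
        }' "$stats_file"
}

# Example (commented out so that sourcing this file has no side effects):
# check_fuzz_env && afl_stat fuzz-out/fuzzer_stats paths_total
```

Running the second helper a few times shortly after start-up gives a
non-interactive way to confirm that the path count is rising.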
@subsection fuzzRunConfig Fuzzing with Configuration Files
AFL can be used to check the parsing of the configuration files. In this type
of fuzzing, AFL generates configuration files which it passes to Kea to check.
Steps for this fuzzing are:
-# Build Kea as described above.
-# Create a dictionary of keywords. Although AFL will mutate the files by
byte swaps, bit flips and the like, better results are obtained if it can
create new files based on keywords that could appear in the file. The
dictionary is described in the AFL documentation, but in brief, the file
contains successive lines of the form 'variable="keyword"', e.g.
@code
PD_POOLS="pd-pools"
PEERADDR="peeraddr"
PERSIST="persist"
PKT="pkt"
PKT4="pkt4"
@endcode
"variable" can be anything, as its name is ignored by AFL. However, all the
variable names in the file must be different. "keyword" is a valid keyword
that could appear in the configuration file. The convention adopted in the
example above seems to work well - variables have the same name as keywords,
but are in uppercase and have hyphens replaced by underscores.
-# Run AFL with a command line of the form:
@code
afl-fuzz -m 4096 -i seeds -o fuzz-out -x dict.dat -- ./kea-dhcp4 -t @@
@endcode
In the above command line:
- Everything up to and including the "--" is the AFL command. The switches
are as described in the previous section apart from the "-x" switch: this
specifies the dictionary file ("dict.dat" in this example) described
above.
- The Kea command line uses the "-t" switch to specify the configuration
file to check. This is specified by two consecutive "@" signs: AFL
will replace these with the name of a file it has created when starting
Kea.
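The naming convention described for the dictionary (variable names in
uppercase with hyphens replaced by underscores) can be applied mechanically.
The following shell sketch shows one way to do so; the function names
"make_dict_line" and "make_dict" are illustrative only.

```shell
# Print one dictionary line for a keyword, e.g. pd-pools -> PD_POOLS="pd-pools".
make_dict_line() {
    keyword=$1
    # Variable name: the keyword in uppercase with hyphens replaced by underscores.
    name=$(printf '%s' "$keyword" | tr '[:lower:]' '[:upper:]' | tr '-' '_')
    printf '%s="%s"\n' "$name" "$keyword"
}

# Read keywords (one per line) from stdin and write dictionary lines to stdout.
# Duplicate keywords are dropped and empty lines are skipped.
make_dict() {
    sort -u | while IFS= read -r keyword; do
        if [ -n "$keyword" ]; then
            make_dict_line "$keyword"
        fi
    done
}
```

For example, given a hypothetical file "keywords.txt" holding one keyword per
line, "make_dict < keywords.txt > dict.dat" would produce a dictionary in the
format shown in the second step. The sketch does not detect two different
keywords that map to the same variable name (for instance, names differing
only in a hyphen versus an underscore); such cases would need to be renamed
by hand.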
@section fuzzInternal Fuzzing Internals
@subsection fuzzInternalNetwork Fuzzing with Network Packets
The AFL fuzzer delivers packets to Kea's stdin. Although the part of Kea
concerning the reception of packets could have been modified to accept input
from stdin and have Kea pick them up in the normal way, a less-intrusive method
was adopted.
The packet loop in the main server code for kea-dhcp4 and kea-dhcp6 is
essentially:
@code{.unparsed}
while (not shutting down) {
Read and process one packet
}
@endcode
When --enable-fuzz is specified, this is conceptually modified to:
@code{.unparsed}
while (not shutting down) {
Read stdin and copy data to address/port on which Kea is listening
Read and process one packet
}
@endcode
Implementation is via an object of class "Fuzz". When created, it identifies
an interface, address and port on which Kea is listening and creates the
appropriate address structures for these. The port is passed as an argument to
the constructor because at the point at which the object is constructed, that
information is readily available. The interface and address are picked up from
the environment variables mentioned above. Consideration was given to
extracting the interface and address information from the configuration file,
but it was decided not to do this:
-# The configuration file can contain the definition of multiple interfaces;
if this is the case, the one being used for fuzzing is unclear.
-# The code is much simpler if the data is extracted from environment
variables.
Every time through the loop, the object reads the data from stdin and writes it
to the identified address/port. Control then returns to the main Kea code,
which finds data available on the address/port on which it is listening and
handles the data in the normal way.
In practice, the "while" line is actually:
@code{.unparsed}
while (__AFL_LOOP(count)) {
@endcode
__AFL_LOOP is a token recognized and expanded by the AFL compiler (so no need
to "#include" a file defining it) that implements the logic for the fuzzing.
Each time through the loop (apart from the first), it raises a SIGSTOP signal
telling AFL that the packet has been processed and instructing it to provide
more data. The "count" value is the number of times through the loop before
the loop terminates and the process is allowed to exit normally. When this
happens, AFL will start the process anew. The purpose of periodically shutting
down the process is to avoid issues raised by the fuzzing being confused with
any issues associated with the process running for a long time (e.g. memory
leaks).
@subsection fuzzInternalConfig Fuzzing with Configuration Files
No changes were required to Kea source code to fuzz configuration files. In
fact, other than compiling with afl-clang-fast++ and installing the resultant
executable, no other steps are required. In particular, there is no need to
use the "--enable-fuzz" switch in the configuration command line (although
doing so will not cause any problems).
@subsection fuzzThreads Changes Required for Multi-Threaded Kea
The early versions of the fuzzing code used a separate thread to receive the
packets from AFL and to write them to the socket on which Kea is listening.
The lack of synchronization proved a problem, with Kea hanging in some
instances. Although some experiments with thread synchronization were
successful, in the end the far simpler single-threaded implementation described
above was adopted for the single-threaded Kea 1.6. Should Kea be modified to
become multi-threaded, the fuzzing code will need to be changed back to reading
the AFL input in the background.
*/
......@@ -141,6 +141,7 @@
* - @subpage logNotes
* - @subpage LoggingApi
* - @subpage SocketSessionUtility
* - @subpage fuzzer
* - @subpage docs
* - <a href="./doxygen-error.log">Documentation warnings and errors</a>
*
......
This file documents the initial trial runs of the AFL fuzzer with Kea.
Currently only kea-dhcp6 is extended with this capability. Once we get
more experience with it, we should implement this capability for
kea-dhcp4.
I have used Ubuntu 16.04 for this. I read somewhere that FreeBSD is
ok for fuzzing, but Mac OS is not.
1. Download AFL
Homepage: http://lcamtuf.coredump.cx/afl/
Version used: 2.35b (afl-latest.tgz)
2. Compile AFL
cd afl-2.35b
make
cd llvm_mode
make
The last step requires LLVM to be installed. On
Ubuntu 16.04 I had to do this:
sudo apt-get install llvm
3. Set up path to AFL binaries
export AFL_PATH=/home/thomson/devel/afl-2.35b
export PATH=$PATH:/home/thomson/devel/afl-2.35b
4. Build Kea using AFL
cd kea
git pull
git checkout experiments/fuzz
autoreconf -i
CXX=afl-clang-fast++ ./configure --enable-fuzz --enable-static-link
make
Note: no unit-tests needed. We will be fuzzing the
production code only.
5. Configure destination address
The defaults (see src/bin/dhcp6/fuzz.cc) are:
interface: eth0
dest address: ff02::1:2
dest port: 547
Those can be changed with the following env. variables:
KEA_AFL_INTERFACE
KEA_AFL_ADDR
KEA_AFL_PORT
E.g.
export KEA_AFL_INTERFACE=eth1
Overriding the parameters with variables has not been tested.
6. Run fuzzer
Set the maximum amount of virtual memory allowed to 4GB:
ulimit -v 4096000
You may be asked by AFL to tweak your kernel. In my case (Ubuntu
16.04), I had to tweak the scaling_governor. The instructions AFL
gives are very easy to follow.
Instruct AFL to allow 4096MB of virtual memory and run AFL:
afl-fuzz -m 4096 -i tests/fuzz-data -o fuzz-out ./kea-dhcp6 -c tests/fuzz-config/fuzz.json
Here's what the switches do:
-m 4096 - allow Kea to take up to 4GB memory
-i tests/fuzz-data - Input seeds. These are the packet files used
to initiate the packet randomization. Several examples are in
src/bin/dhcp6/tests/fuzz-data. You can extract them using wireshark,
right click on a packet, then export as binary data. Make sure you
export only the payload of the UDP packet. The first exported byte
should be the message-type.
-o dir - that's the output directory. It doesn't have to exist.
7. Checking that the fuzzer is really working
a) The harness prints out a line to /tmp/kea-fuzz-harness.txt every
time a new packet is sent. This generated 4.5MB of entries in 20
minutes. Obviously, this has to be disabled for production fuzzing,
but it's good for initial trials.
b) I have my fuzz.json (which is a renamed doc/examples/kea6/simple.json)
that tells Kea to use logging on level INFO and write output to a
file. This file keeps growing. That's around 3.8MB after 20 minutes.
8. Tweak Kea harness if needed
There are several variables in src/bin/dhcp6/fuzz.cc that you can
tweak. By default, it will write the log to /tmp/kea-fuzz-harness.txt
every 5 packets and will terminate after 100,000 packets have been
processed. This mechanism avoids cases where Kea gets stuck: technically
running, but not processing packets. AFL should be able to restart
Kea and continue running.