Libunbin (a library for "unbinarying") is a library and toolkit for safely reading binary structures. Its primary use so far has been reading partitions and filesystems in Xen disk images, but it could be applied in other areas, for instance as a safe way to snoop on and decode network traffic.
The primary goals are security and ease of use. For security, we want to ensure that malicious data structures cannot compromise the reading program (a frequent cause of vulnerabilities in Ethereal / Wireshark, for example). Coupled with the security goal, we want to keep descriptions of binary structures easy to audit and separate from the other code which just uses them. For ease of use, we want to be able to import descriptions of binary structures easily: C structures can be imported directly from C code and C header files, including common annotations for bitfields, signedness and endianness. In general, libunbin ought to be as easy as, or easier than, using C structures directly.
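To make the security goal concrete, here is a minimal hand-written sketch of the check a safe reader has to make before touching any field: the field must lie entirely inside the input buffer. This is an illustration only, not code generated by libunbin, and the name safe_copy_field is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of libunbin: copy field_len bytes starting
 * at offset out of an untrusted buffer, refusing any read that would run
 * past the end. Returns 0 on success, -1 if the field does not fit. */
int safe_copy_field(const uint8_t *buf, size_t buf_len,
                    size_t offset, void *out, size_t field_len)
{
    assert(buf != NULL && out != NULL);  /* caller bugs, not bad input */

    /* Written this way so that offset + field_len can never overflow. */
    if (offset > buf_len || field_len > buf_len - offset)
        return -1;

    memcpy(out, buf + offset, field_len);
    return 0;
}
```

The point of the sketch is that a malformed offset or length taken from the input is rejected instead of causing an out-of-bounds read.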
This web page is all the documentation which exists so far, but we hope to have much more comprehensive documentation in future.
Libunbin is controlled through a master program called
libunbin, which has various subcommands for tasks such as
importing C, printing the auditable structures, generating stubs, and
so on.
There are also various files, either ones which you must write, or ones which are generated:
| Filename | Made? | Description |
|---|---|---|
| file.ubd | Supplied by user | Libunbin description file for importing C structures. This file is basically C code with some additional macros. You can place #include statements here to include ordinary C header files, or you can write C structures directly. The format is described in detail in the next sections. |
| file.ubb | Generated | Libunbin file containing the description of the binary structures (often called the "UBB file"). This is the intermediate file format between the various import formats and the various outputs like code stubs. Although this file is a binary file, you can dump out its contents using libunbin -print-ubb file.ubb. |
| file.ubm | Supplied by user, optional | This file contains metadata which can be applied to UBB files to add or change the binary structure. The primary use of this is to annotate imported C structures. For example if the C struct doesn't contain information about endianness, then you can annotate fields from here. |
| file_stubs.c file_stubs.h | Generated | The generated C code for accessing binary structures. |
| file_stubs.ml | Generated | The generated OCaml code for accessing binary structures. |
| file.xml | Supplied by user, optional | An alternate way to describe binary structures is to write an XML description. Useful if describing a binary structure where you have no suitable C header file, and it also allows you much more control. |
| file.ml | Supplied by user or generated | An alternate way to describe binary structures is to dump out the UBB file and edit it by hand, then reimport it. This gives you ultimate flexibility, but is not very portable (ie. will break between libunbin releases). |
UBB files are not compatible between libunbin releases. For each new release you should regenerate them from source.
The diagram below shows how the various files are related. In this case we have written code to decode the EXT2/3 superblock. You can also find this example in the libunbin source distribution.
To import a C struct you must write a .ubd file
which contains C code, #include directives and some special macros.
Here is ext3.ubd for importing the ext3 superblock
and a constant:
#include <linux/magic.h>
#include <linux/ext3_fs.h>
typedef struct ext3_super_block LIBUNBIN_EXPORT(ext3_super_block);
LIBUNBIN_CONSTANT_INT32 (ext3_super_magic, EXT3_SUPER_MAGIC);
Any type (struct or union) which is to be exported must
be written as a typedef whose new type name is wrapped in the
LIBUNBIN_EXPORT(name) macro, where name is the
type name as it should be exported.
Any constant should use one of the LIBUNBIN_CONSTANT_*
macros. These are defined in libunbin-ubd-prefix.h
which is included automatically in any UBD file.
For Linux kernel header files, it is often useful to define the following symbols in your UBD file:
#define __CHECKER__ 1
#define __CHECK_ENDIAN__ 1
Defining these lets libunbin recognise the special
__be32, __le32, etc. types
as indicating endianness.
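Endianness matters because the same four bytes decode to different integers depending on byte order. The sketch below shows, in plain C, what decoding a field marked __le32 or __be32 amounts to, independent of the byte order of the host. It is an illustration only, not code generated by libunbin; decode_le32 and decode_be32 are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers, not part of libunbin. Each decodes four bytes
 * using shifts, so the result is the same on any host byte order. */

/* Little-endian: the least significant byte is stored first. */
uint32_t decode_le32(const uint8_t *p)
{
    assert(p != NULL);  /* caller bug, not bad input */
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Big-endian: the most significant byte is stored first. */
uint32_t decode_be32(const uint8_t *p)
{
    assert(p != NULL);  /* caller bug, not bad input */
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}
```

For the byte sequence 12 34 56 78 (hexadecimal), decode_le32 yields 0x78563412 and decode_be32 yields 0x12345678, which is why a struct without endianness information cannot be decoded unambiguously.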
To compile a UBD file to UBB, do:
libunbin -ubd file.ubd
You can write a .ubm file (or several) to
transform or annotate C structures. This is needed because C structures
typically don't contain enough information to generate
stubs properly (e.g. they are missing endianness information,
or they have no information about dependencies
between parts of the struct).
For example in the ext3 superblock there is a large s_reserved
field at the end of the structure which serves no purpose
other than to pad the C structure out to the right size.
Unless told otherwise libunbin will generate stubs for
reading this field, but that is not really useful and
just makes reading slower. One solution to this would be
to edit the Linux system header file, but that is quite intrusive.
A better solution is to write a metadata file instructing
libunbin to remove the unwanted field:
delete ext3_super_block(s_reserved);
To transform a UBB file using metadata, do:
libunbin -transform file.ubb file.ubm
(This updates file.ubb).
Further documentation for metadata files to follow ...
You can describe structures directly in XML (no C required). This gives you a portable way to describe structures, accessing the full power of libunbin.
Further documentation for XML files to follow ...
To generate C stubs from a UBB file, do:
libunbin -stubs file.ubb
Pay careful attention to any warnings, since they often indicate security issues. If you see warnings then read the section on auditing below.
The above command generates file_stubs.c
and file_stubs.h. You can examine the
header file to find out what functions are available.
To generate OCaml stubs, do:
libunbin -ml-stubs file.ubb
This generates file_stubs.ml.
The UBB file format is not portable, meaning that it can change between releases of libunbin. Nevertheless, it is possible to modify UBB files by hand. First, dump out the UBB file as text:
libunbin -print-ubb file.ubb > file.ml
Now you can make the required edits to file.ml
(refer to libunbin_ubb.ml in the libunbin source
to find out how it works). Then reimport the edited file
to UBB:
libunbin -ml file.ml
(This generates or overwrites file.ubb).
Libunbin is designed to generate secure stub code. However, this does not remove the need to audit structures. For example, if a binary file contains an integer field indicating that an array of a billion structures follows, libunbin will try to read it (a warning is printed when the stubs are generated, but the programmer may have ignored it).
It is important therefore to audit the binary structures and if necessary to add assertions about the valid range of numbers in the file, to avoid situations like the above.
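As an illustration of the kind of range assertion meant here, the following hand-written C sketch validates an untrusted element count before any array is read. It is not libunbin output; check_array_count, MAX_ELEMENTS and the value 4096 are assumptions chosen for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy limit chosen for this example; a real limit should
 * come from auditing the format in question. */
#define MAX_ELEMENTS 4096u

/* Hypothetical helper, not part of libunbin. Returns 0 if an array of
 * count elements of elem_size bytes each fits in the remaining bytes of
 * input and respects the policy limit, -1 otherwise. */
int check_array_count(uint32_t count, size_t elem_size, size_t remaining)
{
    assert(elem_size > 0);  /* caller bug, not bad input */

    if (count > MAX_ELEMENTS)
        return -1;
    /* Dividing avoids any overflow in count * elem_size. */
    if (count > remaining / elem_size)
        return -1;
    return 0;
}
```

With a check like this in place, a count field claiming a billion structures is rejected up front, both because it exceeds the policy limit and because the input cannot possibly contain that many elements.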
You should audit the UBB file after it has been imported from source and any metadata transforms have been applied. Once you have such a UBB file, do:
libunbin -print-ubb file.ubb | less
More about auditing to follow ...
$Id: index.html,v 1.5 2007/10/30 17:11:05 rjones Exp $