DWARF module versioning

1. Introduction

When CONFIG_MODVERSIONS is enabled, symbol versions for modules are typically calculated from preprocessed source code using the genksyms tool. However, this is incompatible with languages such as Rust, where the source code has insufficient information about the resulting ABI. With CONFIG_GENDWARFKSYMS (and CONFIG_DEBUG_INFO) selected, gendwarfksyms is used instead to calculate symbol versions from the DWARF debugging information, which contains the necessary details about the final module ABI.

1.1. Usage

gendwarfksyms accepts a list of object files on the command line, and a list of symbol names (one per line) in standard input:

Usage: gendwarfksyms [options] elf-object-file ... < symbol-list

Options:
  -d, --debug          Print debugging information
      --dump-dies      Dump DWARF DIE contents
      --dump-die-map   Print debugging information about die_map changes
      --dump-types     Dump type strings
      --dump-versions  Dump expanded type strings used for symbol versions
  -s, --stable         Support kABI stability features
  -T, --symtypes file  Write a symtypes file
  -h, --help           Print this message

2. Type information availability

While symbols are typically exported in the same translation unit (TU) where they’re defined, it’s also perfectly fine for a TU to export external symbols. For example, this is done when calculating symbol versions for exports in stand-alone assembly code.

To ensure the compiler emits the necessary DWARF type information in the TU where symbols are actually exported, gendwarfksyms adds a pointer to exported symbols in the EXPORT_SYMBOL() macro using the following macro:

#define __GENDWARFKSYMS_EXPORT(sym)                             \
        static typeof(sym) *__gendwarfksyms_ptr_##sym __used    \
                __section(".discard.gendwarfksyms") = &sym;

When a symbol pointer is found in DWARF, gendwarfksyms can use its type for calculating symbol versions even if the symbol is defined elsewhere. The name of the symbol pointer is expected to start with __gendwarfksyms_ptr_, followed by the name of the exported symbol.

3. Symtypes output format

Similarly to genksyms, gendwarfksyms supports writing a symtypes file for each processed object that contain types for exported symbols and each referenced type that was used in calculating symbol versions. These files can be useful when trying to determine what exactly caused symbol versions to change between builds. To generate symtypes files during a kernel build, set KBUILD_SYMTYPES=1.

Matching the existing format, the first column of each line contains either a type reference or a symbol name. Type references have a one-letter prefix followed by “#” and the name of the type. Four reference types are supported:

e#<type> = enum
s#<type> = struct
t#<type> = typedef
u#<type> = union

Type names with spaces in them are wrapped in single quotes, e.g.:

s#'core::result::Result<u8, core::num::error::ParseIntError>'

The rest of the line contains a type string. Unlike with genksyms that produces C-style type strings, gendwarfksyms uses the same simple parsed DWARF format produced by --dump-dies, but with type references instead of fully expanded strings.

4. Maintaining a stable kABI

Distribution maintainers often need the ability to make ABI compatible changes to kernel data structures due to LTS updates or backports. Using the traditional #ifndef __GENKSYMS__ to hide these changes from symbol versioning won’t work when processing object files. To support this use case, gendwarfksyms provides kABI stability features designed to hide changes that won’t affect the ABI when calculating versions. These features are all gated behind the --stable command line flag and are not used in the mainline kernel. To use stable features during a kernel build, set KBUILD_GENDWARFKSYMS_STABLE=1.

Examples for using these features are provided in the scripts/gendwarfksyms/examples directory, including helper macros for source code annotation. Note that as these features are only used to transform the inputs for symbol versioning, the user is responsible for ensuring that their changes actually won’t break the ABI.

4.1. kABI rules

kABI rules allow distributions to fine-tune certain parts of gendwarfksyms output and thus control how symbol versions are calculated. These rules are defined in the .discard.gendwarfksyms.kabi_rules section of the object file and consist of simple null-terminated strings with the following structure:

version\0type\0target\0value\0

This string sequence is repeated as many times as needed to express all the rules. The fields are as follows:

  • version: Ensures backward compatibility for future changes to the structure. Currently expected to be “1”.

  • type: Indicates the type of rule being applied.

  • target: Specifies the target of the rule, typically the fully qualified name of the DWARF Debugging Information Entry (DIE).

  • value: Provides rule-specific data.

The following helper macro, for example, can be used to specify rules in the source code:

#define __KABI_RULE(hint, target, value)                             \
        static const char __PASTE(__gendwarfksyms_rule_,             \
                                  __COUNTER__)[] __used __aligned(1) \
                __section(".discard.gendwarfksyms.kabi_rules") =     \
                        "1\0" #hint "\0" #target "\0" #value

Currently, only the rules discussed in this section are supported, but the format is extensible enough to allow further rules to be added as need arises.

4.1.1. Managing definition visibility

A declaration can change into a full definition when additional includes are pulled into the translation unit. This changes the versions of any symbol that references the type even if the ABI remains unchanged. As it may not be possible to drop includes without breaking the build, the declonly rule can be used to specify a type as declaration-only, even if the debugging information contains the full definition.

The rule fields are expected to be as follows:

  • type: “declonly”

  • target: The fully qualified name of the target data structure (as shown in --dump-dies output).

  • value: This field is ignored.

Using the __KABI_RULE macro, this rule can be defined as:

#define KABI_DECLONLY(fqn) __KABI_RULE(declonly, fqn, )

Example usage:

struct s {
        /* definition */
};

KABI_DECLONLY(s);

4.1.2. Adding enumerators

For enums, all enumerators and their values are included in calculating symbol versions, which becomes a problem if we later need to add more enumerators without changing symbol versions. The enumerator_ignore rule allows us to hide named enumerators from the input.

The rule fields are expected to be as follows:

  • type: “enumerator_ignore”

  • target: The fully qualified name of the target enum (as shown in --dump-dies output) and the name of the enumerator field separated by a space.

  • value: This field is ignored.

Using the __KABI_RULE macro, this rule can be defined as:

#define KABI_ENUMERATOR_IGNORE(fqn, field) \
        __KABI_RULE(enumerator_ignore, fqn field, )

Example usage:

enum e {
        A, B, C, D,
};

KABI_ENUMERATOR_IGNORE(e, B);
KABI_ENUMERATOR_IGNORE(e, C);

If the enum additionally includes an end marker and new values must be added in the middle, we may need to use the old value for the last enumerator when calculating versions. The enumerator_value rule allows us to override the value of an enumerator for version calculation:

  • type: “enumerator_value”

  • target: The fully qualified name of the target enum (as shown in --dump-dies output) and the name of the enumerator field separated by a space.

  • value: Integer value used for the field.

Using the __KABI_RULE macro, this rule can be defined as:

#define KABI_ENUMERATOR_VALUE(fqn, field, value) \
        __KABI_RULE(enumerator_value, fqn field, value)

Example usage:

enum e {
        A, B, C, LAST,
};

KABI_ENUMERATOR_IGNORE(e, C);
KABI_ENUMERATOR_VALUE(e, LAST, 2);

4.3. Adding structure members

Perhaps the most common ABI compatible change is adding a member to a kernel data structure. When changes to a structure are anticipated, distribution maintainers can pre-emptively reserve space in the structure and take it into use later without breaking the ABI. If changes are needed to data structures without reserved space, existing alignment holes can potentially be used instead. While kABI rules could be added for these type of changes, using unions is typically a more natural method. This section describes gendwarfksyms support for using reserved space in data structures and hiding members that don’t change the ABI when calculating symbol versions.

4.3.1. Reserving space and replacing members

Space is typically reserved for later use by appending integer types, or arrays, to the end of the data structure, but any type can be used. Each reserved member needs a unique name, but as the actual purpose is usually not known at the time the space is reserved, for convenience, names that start with __kabi_ are left out when calculating symbol versions:

struct s {
        long a;
        long __kabi_reserved_0; /* reserved for future use */
};

The reserved space can be taken into use by wrapping the member in a union, which includes the original type and the replacement member:

struct s {
        long a;
        union {
                long __kabi_reserved_0; /* original type */
                struct b b; /* replaced field */
        };
};

If the __kabi_ naming scheme was used when reserving space, the name of the first member of the union must start with __kabi_reserved. This ensures the original type is used when calculating versions, but the name is again left out. The rest of the union is ignored.

If we’re replacing a member that doesn’t follow this naming convention, we also need to preserve the original name to avoid changing versions, which we can do by changing the first union member’s name to start with __kabi_renamed followed by the original name.

The examples include KABI_(RESERVE|USE|REPLACE)* macros that help simplify the process and also ensure the replacement member is correctly aligned and its size won’t exceed the reserved space.

4.3.2. Hiding members

Predicting which structures will require changes during the support timeframe isn’t always possible, in which case one might have to resort to placing new members into existing alignment holes:

struct s {
        int a;
        /* a 4-byte alignment hole */
        unsigned long b;
};

While this won’t change the size of the data structure, one needs to be able to hide the added members from symbol versioning. Similarly to reserved fields, this can be accomplished by wrapping the added member to a union where one of the fields has a name starting with __kabi_ignored:

struct s {
        int a;
        union {
                char __kabi_ignored_0;
                int n;
        };
        unsigned long b;
};

With --stable, both versions produce the same symbol version.