ABI, SONAME? What’s that?

Contents

  1. Intro
  2. ABI
  3. Why shared libraries should be versioned?
  4. ELF
  5. ABI evolution
  6. SONAME ABI versioning
  7. Advanced: versioned symbols

Intro

An application app uses the StdGreeter class (through the IHello interface) from a handy libhello library:

#include <memory>
#include "libhello/hello.h"
int main(int argc, char** argv) {
    std::unique_ptr<IHello> x(new StdGreeter);
    for (int i = 1; i < argc; i++) {
        x->sayHi(argv[i]);
    }
    return 0;
}
  1. Compile the library and the app

     make VARIANT=wrong/v0
    
  2. Run the app

    ./bin/app a b c
    Hi, a!
    Hi, b!
    Hi, c!
    
  3. Make a copy of app, just in case:

    cp -a bin/app bin/app_v0
    
  4. Upgrade the library to v1

    make VARIANT=wrong/v1
    
  5. Run the app linked with the v0 version of the library

    ./bin/app_v0 a b c
    Hi, a!
    Hi, b!
    Hi, c!
    
  6. Upgrade the library to v2

    make VARIANT=wrong/v2
    
  7. Run the v0 version of app again

    ./bin/app_v0 a b c
    ./bin/app_v0: Symbol `_ZTV10StdGreeter' has different size in shared object, consider re-linking
    Hi, a!
    Hi, b!
    Hi, c!
    
  8. Upgrade the library to v3

    make VARIANT=wrong/v3
    
  9. Run the v0 version of app again

    ./bin/app_v0 a b c
    ./bin/app_v0: Symbol `_ZTV10StdGreeter' has different size in shared object, consider re-linking
    Hi, a!
    Hi, b!
    Hi, c!
    
  10. Upgrade the library to v4

    make VARIANT=wrong/v4
    
  11. Run the v0 version of app again

    ./bin/app_v0 a b c
    ./bin/app_v0: Symbol `_ZTV10StdGreeter' has different size in shared object, consider re-linking
    Segmentation fault
    

Oops!

Note: all versions of libhello have a compatible API: the app source is the same, and compiles with every version of the library.

But app_v0 (linked with the v0 version of libhello) cannot run with the v4 version of libhello. Why? Because the ABI has changed.
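
The failure mode can be illustrated without building the library. The sketch below models a vtable as a plain array of function pointers. The slot layout and all names in it (`vtable_v0`, `vtable_v4`, `kSayHiSlotV0`, `say_bye`) are hypothetical and are not taken from the libhello sources; the real v4 may differ in other ways.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// A hand-rolled stand-in for a vtable: an ordered array of function pointers.
using Slot = std::string (*)(const std::string&);

std::string say_hi(const std::string& who) { return "Hi, " + who + "!"; }
std::string say_bye(const std::string& who) { return "Bye, " + who + "!"; }

// Hypothetical layout in v0 of the library: sayHi occupies slot 0.
const std::array<Slot, 1> vtable_v0 = {say_hi};

// Hypothetical layout in a later version: a new virtual method was declared
// before sayHi, so sayHi moved to slot 1.
const std::array<Slot, 2> vtable_v4 = {say_bye, say_hi};

// The slot index is baked into the app when it is compiled against v0 headers.
constexpr std::size_t kSayHiSlotV0 = 0;

// Calls "sayHi" the way an app compiled against v0 would: by slot index.
template <std::size_t N>
std::string call_say_hi_as_v0_app(const std::array<Slot, N>& vtable,
                                  const std::string& who) {
    return vtable[kSayHiSlotV0](who);
}
```

With `vtable_v0` the call returns the expected greeting; with `vtable_v4` the same compiled-in index silently reaches a different function. In a real binary the mismatch can equally end in a crash, as in step 11.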

ABI

Application Binary Interface: set of interfaces available at the run time including, but not limited to:

  • Formats of executables and shared libraries
  • Calling conventions (which arguments are passed via registers/stack, scratch registers versus saved registers)
  • List of public symbols: functions, methods, data
  • Data layout: member pointers, alignment, size of structures
  • Name mangling scheme
  • vtable location and layout
  • typeinfo pointer(s) location and layout

And so on; see the Itanium C++ ABI and the x86 psABI.
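
As an illustration of the data layout item, here is a minimal sketch with a hypothetical `Header` struct in two versions (neither is part of libhello). A client compiled against v0 keeps reading `length` at the old offset, so after a field is inserted it picks up the wrong member.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

namespace v0 {
struct Header {              // layout the client was compiled against
    std::uint32_t id;
    std::uint32_t length;
};
}  // namespace v0

namespace v1 {
struct Header {              // a field was inserted in the middle
    std::uint32_t id;
    std::uint32_t flags;
    std::uint32_t length;
};
}  // namespace v1

static_assert(offsetof(v0::Header, length) == 4, "v0: length at offset 4");
static_assert(offsetof(v1::Header, length) == 8, "v1: length moved to offset 8");
static_assert(sizeof(v0::Header) != sizeof(v1::Header), "the size changed too");

// Reads "length" the way a client compiled against v0 would: at the v0 offset.
std::uint32_t read_length_as_v0_client(const void* header) {
    std::uint32_t length;
    std::memcpy(&length,
                static_cast<const unsigned char*>(header) +
                    offsetof(v0::Header, length),
                sizeof length);
    return length;
}
```

Passing a `v1::Header` to `read_length_as_v0_client` returns the value of `flags`, not `length`.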

Why shared libraries should be versioned

Library interfaces (API, ABI) are protocols; they must be versioned just like any other protocol (think of network protocols, DB schemas, etc).

Unlike network protocols, dynamic libraries have two interfaces:

  1. API: interface available at the compile time. Used by
    • programmer when writing the code
    • compiler when compiling the code
  2. ABI: interface available at the run time. Used by
    • the OS when starting/running the binary
    • the dynamic linker when resolving undefined functions

The pitfall: a compatible change in the API (such that the app code can be recompiled with a new version of the library without any changes) might yield an INCOMPATIBLE change of the ABI (so a binary linked with a previous version of the library won’t work with the new version of the dynamic library).

Shared library versioning == ABI versioning. The ABI version has NOTHING TO DO with the software release number (as in apache version 2.4.18 supports HTTP version 1.1).
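
A minimal sketch of this pitfall, using a hypothetical `greet` function in two library versions: the same call expression compiles against both, yet the function types differ. Under the Itanium C++ ABI the parameter list is part of the mangled symbol name, so a binary linked against the v0 symbol would not find it in a v1 library.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace v0 {
inline std::string greet(const std::string& who) {
    return "Hi, " + who + "!";
}
}  // namespace v0

namespace v1 {
// API compatible: existing calls greet("a") still compile unchanged.
inline std::string greet(const std::string& who, int times = 1) {
    std::string out;
    for (int i = 0; i < times; ++i) out += "Hi, " + who + "!";
    return out;
}
}  // namespace v1

// ABI incompatible: the function types (and hence the mangled names) differ.
static_assert(!std::is_same<decltype(&v0::greet), decltype(&v1::greet)>::value,
              "adding a defaulted parameter changes the function signature");
```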

ELF

Executable and Linkable Format

  • Consists of a header and an arbitrary number of sections
  • Two tables:
    • program header table: describes program segments (required for executables and shared libraries)
    • section header table: describes the file sections (required for relocatable object files)

Segment: contiguous region of the process address space. Section: contiguous region of the ELF file.

Typical sections:

  • .text the program code
  • .rodata string constants
  • .data global variables
  • .bss zero-initialized variables (arrays)
  • .interp path to the run time linker (ELF interpreter)

See Executable and Linkable Format for more details.
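
Every ELF file starts with an identification block (`e_ident`): the magic bytes `0x7f 'E' 'L' 'F'`, followed by the class byte (1 for 32-bit, 2 for 64-bit) and the data encoding byte (1 for little endian, 2 for big endian). A minimal sketch of decoding it from a byte buffer; the `ElfIdent` struct and `parse_elf_ident` function are names made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Decoded view of the first bytes (e_ident) of an ELF file.
struct ElfIdent {
    bool is_elf;         // magic bytes matched
    int bits;            // 32, 64, or 0 if the class byte is unknown
    bool little_endian;  // true if the data encoding byte is 1
};

ElfIdent parse_elf_ident(const unsigned char* data, std::size_t size) {
    ElfIdent id{false, 0, false};
    if (size < 6) return id;  // need magic + class + data encoding
    if (data[0] != 0x7f || data[1] != 'E' || data[2] != 'L' || data[3] != 'F')
        return id;
    id.is_elf = true;
    if (data[4] == 1) id.bits = 32;       // ELFCLASS32
    else if (data[4] == 2) id.bits = 64;  // ELFCLASS64
    id.little_endian = (data[5] == 1);    // ELFDATA2LSB
    return id;
}
```

To try it on a real file, read the first six bytes of, say, /bin/bash into a buffer and pass it to `parse_elf_ident`.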

Tools for examining ELF: objdump, nm

  • Dump all headers

    objdump -x /bin/bash
    
  • Which shared libraries are required for a binary/library

    objdump -p /bin/bash | grep NEEDED
      NEEDED               libreadline.so.7
      NEEDED               libdl.so.2
      NEEDED               libc.so.6
    
  • Which dynamic symbols are exported/referenced by a shared library

    nm -B -D -C /usr/lib64/libstdc++.so.6
    
    • T exported symbol from the .text section – function, method
    • U undefined symbols (presumably should be defined in NEEDED DSOs)
    • W weak exported symbols
    • V weak objects

(see man nm for more info)

ABI evolution

Breaking ABI is easy

  • Remove or unexport exported class(es).
  • Change type hierarchy in any way (add, remove, or reorder base classes).
  • For template classes: change the template arguments (add, remove, reorder).
  • For virtual methods:

    • Add a virtual method to a class which has no other virtual methods or virtual bases.
    • Add a new virtual method to a non-leaf class (in particular to a class which is designed to be derived from by library clients).
    • Change the order of virtual methods in the class declaration.
    • Override an existing virtual method which is not in the primary base class.
    • Remove a virtual method, even if it’s a reimplementation of a virtual method from the base class.
    • Override an existing virtual function if the overriding function has a covariant return type for which the more-derived type has a pointer address different from the less-derived one (usually happens when, between the less-derived and the more-derived ones, there’s multiple inheritance or virtual inheritance).
  • Changing a method/function signature:

    • changing any types of the arguments in the parameter list, including changing const/volatile qualifiers of existing parameters
    • changing const/volatile qualifiers of the method/function
    • extending a method with another parameter, even if it has a default value
    • changing access rights (say, from private to public)
    • changing the return type in any way

Backward compatible ABI changes

  • Add new class(es).
  • Add or remove friend declarations to classes.
  • Add new non-virtual methods (including constructors).
  • Add a new enum to a class.
  • Remove private non-virtual functions if they are not called by any inline functions (and have never been).
  • Reimplement virtual functions defined in the primary base class (first non-virtual base class, or first non-virtual parent of the base class, etc) IF it’s safe for prior versions to call implementation in the base class rather than in derived ones.
  • Override a virtual method with a covariant return type, as long as the more-derived type has the same pointer address as the less-derived one.

For a more detailed list see KDE ABI policy
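
For contrast, a sketch of a backward compatible change on a hypothetical `Counter` class: v1 adds a constructor and a non-virtual method, and the object size and alignment stay the same, so code compiled against v0 keeps working with v1 objects.

```cpp
#include <cassert>

namespace v0 {
class Counter {
public:
    void inc() { ++n_; }
    int value() const { return n_; }
private:
    int n_ = 0;
};
}  // namespace v0

namespace v1 {
class Counter {
public:
    Counter() = default;
    explicit Counter(int start) : n_(start) {}  // new constructor
    void inc() { ++n_; }
    int value() const { return n_; }
    void reset() { n_ = 0; }                    // new non-virtual method
private:
    int n_ = 0;
};
}  // namespace v1

// New non-virtual members do not affect the object layout.
static_assert(sizeof(v0::Counter) == sizeof(v1::Counter), "same size");
static_assert(alignof(v0::Counter) == alignof(v1::Counter), "same alignment");
```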

SONAME ABI versioning

Every library, no matter how carefully designed, breaks its ABI at some point. How to properly inform users (programs, as opposed to humans) about an incompatible ABI change?

Goals:

  • Avoid relinking client apps/libraries on compatible changes
  • Clearly mark incompatible changes
  • Applications which need incompatible versions of a library can coexist

Idea: decouple the protocol/library name (SONAME) from the file name. Example: apache supports HTTP 1.1. Just because a new version of apache has been released doesn’t mean the (HTTP) protocol has changed. When a binary is linked with a shared library, it’s the SONAME of the library which gets recorded as a dependency:

objdump -p /usr/bin/vim | grep NEEDED | grep libX11
  NEEDED               libX11.so.6

SONAME is similar to a protocol name (“HTTP”, “X11”) plus a version; in general it does NOT match the library filename (libX11.so.6.4.0):

objdump -p /usr/lib64/libX11.so.6.4.0 | grep SONAME
  SONAME               libX11.so.6
$ ls -1 -l /usr/lib64/libX11.so.6
  lrwxrwxrwx 1 root root 15 Jun  8  2021 /usr/lib64/libX11.so.6 -> libX11.so.6.4.0
$ ls -1 -l /usr/lib64/libX11.so*
  lrwxrwxrwx 1 root root      15 Jun  8  2021 /usr/lib64/libX11.so   -> libX11.so.6
  lrwxrwxrwx 1 root root      15 Jun  8  2021 /usr/lib64/libX11.so.6 -> libX11.so.6.4.0
  -rw-r--r-- 1 root root 1318584 Jun  8  2021 /usr/lib64/libX11.so.6.4.0
  • libX11.so – used by the compile time linker only (-lX11); usually this symlink points to the latest available SONAME version of the library (libX11.so.6).
  • libX11.so.6 – the SONAME symlink, used by the dynamic linker; points to the latest COMPATIBLE version of the library.
  • libX11.so.6.4.0 – the actual DSO (shared library); its revision is 4, and its patchlevel version is 0.
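
The three names follow a simple pattern, sketched below. The `LibraryNames` struct and the `library_names` function are hypothetical helpers written for this example, not part of any tool.

```cpp
#include <cassert>
#include <string>

// The three file names under which one shared library is usually installed.
struct LibraryNames {
    std::string linker_name;  // used at link time (-lX11), e.g. libX11.so
    std::string soname;       // recorded in NEEDED, e.g. libX11.so.6
    std::string real_name;    // the actual file, e.g. libX11.so.6.4.0
};

LibraryNames library_names(const std::string& base, int abi_version,
                           int revision, int patchlevel) {
    LibraryNames n;
    n.linker_name = "lib" + base + ".so";
    n.soname = n.linker_name + "." + std::to_string(abi_version);
    n.real_name = n.soname + "." + std::to_string(revision) + "." +
                  std::to_string(patchlevel);
    return n;
}
```

For example, `library_names("X11", 6, 4, 0)` yields the three names shown in the listing above.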

Why such indirection? Historically, UNIX systems had trouble overwriting files which running processes had mapped read-only, hence the upgrade procedure was to

  • install newer version into a different file (named after the revision and the patchlevel version)
  • change the SONAME symlink to point to the newly installed file

This way the processes which use the previous version of the library can continue uninterrupted, and the newly started processes will use the upgraded library.

Rules of the game

  • When making a change which does not affect the ABI:
    • patchlevel++;
  • When making a backward compatible ABI change:
    • revision++;
    • patchlevel = 0;
  • When making an incompatible ABI change:
    • SONAME++;
    • revision = patchlevel = 0;

Note: changing SONAME for no good reason is a bad practice and is not appreciated by users.

Practical implementation: CMake

set_target_properties(libhello PROPERTIES SOVERSION X VERSION X.Y.Z)

Practical implementation: libtool

In an attempt to be portable, libtool makes things even more confusing:

  • LT_CURRENT: the most recent ABI version supported by the library
  • LT_REVISION: sort of patchlevel version
  • LT_AGE: number of compatible ABIs, that is, LT_CURRENT-LT_AGE is the oldest backward compatible ABI version supported by the library
libhello_la_LDFLAGS = -version-info $(LT_CURRENT):$(LT_REVISION):$(LT_AGE)
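
On Linux, libtool maps `-version-info CURRENT:REVISION:AGE` to the file name suffix `(CURRENT-AGE).AGE.REVISION`, and the SONAME carries `CURRENT-AGE`. The sketch below only reproduces that arithmetic; the function names are made up for the example, and other platforms use different mappings.

```cpp
#include <cassert>
#include <string>

// Major number that ends up in the SONAME on Linux: libfoo.so.<current - age>
int libtool_linux_soname_major(int current, int age) {
    return current - age;
}

// File name suffix on Linux: libfoo.so.<current - age>.<age>.<revision>
std::string libtool_linux_suffix(int current, int revision, int age) {
    return std::to_string(libtool_linux_soname_major(current, age)) + "." +
           std::to_string(age) + "." + std::to_string(revision);
}
```

For instance, `-version-info 3:2:1` results in libhello.so.2.1.2 with the SONAME libhello.so.2.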

Tools for ABI checks

Note: the output should be taken with a grain of salt, there are both false positives and false negatives!

sudo apt-get install vtable-dumper abi-dumper abi-compliance-checker
make VARIANT=ok/v0
abi-dumper -o ABI-0.dump -lver 0 lib/libhello.so.0.0.0 
make VARIANT=ok/v4
abi-dumper -o ABI-4.dump -lver 4 lib/libhello.so.4.0.0 
abi-compliance-checker -l foo -old ABI-0.dump -new ABI-4.dump
xdg-open file://`pwd`/compat_reports/foo/0_to_4/compat_report.html

Notice that the tool hasn’t caught the ABI breakage, although examining the vtables reveals the incompatibility:

vtable-dumper lib/libhello.so.0.0.0 > v0_vtbl.txt
vtable-dumper lib/libhello.so.4.0.0 > v4_vtbl.txt
vimdiff v0_vtbl.txt v4_vtbl.txt

Advanced: versioned symbols

SONAME ABI versioning is inconvenient: a single incompatible change requires a SONAME bump, which forces re-linking the client apps (to use the new version of the library), even if the app in question hasn’t been using the class (function) which has changed in an incompatible manner.

Just like a server can support multiple versions of a protocol, a shared library can support several ABI versions. Linux’ and Solaris’ linkers support versioning of individual symbols:

objdump -p /bin/bash | sed -rne '/^Version References:/,$ { p }'
Version References:
  required from libdl.so.2:
    0x09691a75 0x00 09 GLIBC_2.2.5
  required from libc.so.6:
    0x06969191 0x00 10 GLIBC_2.11
    0x06969194 0x00 08 GLIBC_2.14
    0x0d696918 0x00 07 GLIBC_2.8
    0x06969195 0x00 06 GLIBC_2.15
    0x0d696914 0x00 05 GLIBC_2.4
    0x09691974 0x00 04 GLIBC_2.3.4
    0x0d696913 0x00 03 GLIBC_2.3
    0x09691a75 0x00 02 GLIBC_2.2.5
  • When adding a new function, mark it with a new version
  • When changing an existing function foo in an incompatible manner:
    • rename existing function to foo_old
    • write the new code into foo_new
    • export foo_new as foo version N+1, where N is a previous version of foo
    • export foo_old as foo version N
    • set the default version of foo to N+1
  • Increment the patchlevel version of the library
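// Both implementations live side by side under different source-level names.
extern "C" int foo_new(int a, int b, int c);
extern "C" int foo_old(int a, int b);

// foo@LIBFOO_0: non-default version, used only by binaries already linked against it.
__asm__(".symver foo_old,foo@LIBFOO_0");
// foo@@LIBFOO_1: default version, picked up by newly linked binaries.
__asm__(".symver foo_new,foo@@LIBFOO_1");
// With GNU ld, the LIBFOO_0 and LIBFOO_1 version nodes must also be declared in a version script.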

Advantages:

  • dependency on specific compatible version of the ABI can be recorded
  • backward compatibility can be maintained over a long time

Disadvantages:

  • It’s tricky, especially for C++ libraries with non-trivial class hierarchies (few C++ libraries use ELF symbol versioning; GCC’s libstdc++ is the best-known one)

Example: GNU libstdc++

WARNING: this information might be obsolete

Myth: in order to be compatible with GCC version X.Y.Z a shared library needs to be built with exactly same version of GCC.

Fact: GCC’s libstdc++ is backward compatible from GCC 3.4.x to very recent GCC versions, see the GCC ABI policy. Thus

  • A binary compiled with GCC X and linked with libstdc++6 will run with GCC Y’s libstdc++6, where 3.4.x <= X <= Y <= 6.x
  • A library compiled with GCC X and linked with libstdc++6 can be used to build binaries/libraries with GCC Y, where 3.4.x <= X <= Y <= 6.x
  • When a library compiled with GCC <= 4.9.x is linked with code built with GCC >= 5.0, that code might need an additional compile time option: -D_GLIBCXX_USE_CXX11_ABI=0
  • The libstdc++6 of GCC >= 5.0 is bi-ABI: it supports both the C++98 ABI and the C++11 ABI. It’s possible to pick the ABI version at compile time with -D_GLIBCXX_USE_CXX11_ABI=0 independently of the language standard version.

To re-iterate: a C++ library compiled with GCC 4.4.x can be used with code compiled with GCC >= 5.0 as long as that code is either C++98-only, or compiled with the -D_GLIBCXX_USE_CXX11_ABI=0 option (some C++11 features might be unavailable, though).
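
Which of the two ABIs a translation unit was compiled for can be checked from the code itself. A small sketch: with libstdc++, the `_GLIBCXX_USE_CXX11_ABI` macro is nonzero for the C++11 ABI, and in that case `std::string` lives in the inline namespace `std::__cxx11`, which shows up in its mangled type name. The two helper functions are names invented for the example; with other standard libraries the macro is simply undefined.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <typeinfo>

// True if this translation unit is compiled for the libstdc++ C++11 ABI.
bool built_with_cxx11_abi() {
#if defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
    return true;
#else
    return false;
#endif
}

// True if the mangled name of std::string carries the __cxx11 ABI tag.
bool string_type_is_tagged_cxx11() {
    return std::strstr(typeid(std::string).name(), "__cxx11") != nullptr;
}
```

Building the same file with and without -D_GLIBCXX_USE_CXX11_ABI=0 flips both results together, which is why object files built with different settings cannot exchange `std::string` arguments.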
