ABI, SONAME? What’s that?
Contents
- Intro
- ABI
- Why shared libraries should be versioned?
- ELF
- ABI evolution
- SONAME ABI versioning
- Advanced: versioned symbols
Intro
An application app uses a greeter class (StdGreeter) from a handy libhello library:
#include <memory>
#include "libhello/hello.h"

int main(int argc, char** argv) {
    std::unique_ptr<IHello> x(new StdGreeter);
    for (int i = 1; i < argc; i++) {
        x->sayHi(argv[i]);
    }
    return 0;
}
- Compile the library and the app:
    make VARIANT=wrong/v0
- Run the app:
    ./bin/app a b c
    Hi, a!
    Hi, b!
    Hi, c!
- Make a copy of app, just in case:
    cp -a bin/app bin/app_v0
- Upgrade the library to v1:
    make VARIANT=wrong/v1
- Run the app linked with the v0 version of the library:
    ./bin/app_v0 a b c
    Hi, a!
    Hi, b!
    Hi, c!
- Upgrade the library to v2:
    make VARIANT=wrong/v2
- Run the v0 version of app again:
    ./bin/app_v0 a b c
    ./bin/app_v0: Symbol `_ZTV10StdGreeter' has different size in shared object, consider re-linking
    Hi, a!
    Hi, b!
    Hi, c!
- Upgrade the library to v3:
    make VARIANT=wrong/v3
- Run the v0 version of app again:
    ./bin/app_v0 a b c
    ./bin/app_v0: Symbol `_ZTV10StdGreeter' has different size in shared object, consider re-linking
    Hi, a!
    Hi, b!
    Hi, c!
- Upgrade the library to v4:
    make VARIANT=wrong/v4
- Run the v0 version of app again:
    ./bin/app_v0 a b c
    ./bin/app_v0: Symbol `_ZTV10StdGreeter' has different size in shared object, consider re-linking
    Segmentation fault
Oops!
Note: all versions of libhello have a compatible API: the app source is the same, and it compiles with every version of the library. But app_v0 (linked with the v0 version of libhello) cannot run with the v4 version of libhello. Why? Because the ABI has changed.
ABI
Application Binary Interface: the set of interfaces available at run time, including, but not limited to:
- Formats of executables and shared libraries
- Calling conventions (which arguments are passed via registers or the stack, scratch registers versus saved registers)
- List of public symbols: functions, methods, data
- Data layout: member pointers, alignment, size of structures
- Name mangling scheme
- vtable location and layout
- typeinfo pointer(s) location and layout
and so on, see Itanium C++ ABI, x86 psABI
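To make the "data layout" item concrete, here is a minimal sketch. The Point structures and the v0/v1 namespaces are hypothetical (they are not part of libhello); the namespaces only stand in for two releases of the same header.

```cpp
#include <cstddef>

// Hypothetical structure as published in version 0 of a library header.
namespace v0 {
struct Point {
    int x;
    int y;
};
}

// The same structure in version 1: a field was inserted at the front.
// Client source code that only touches .x and .y still compiles unchanged.
namespace v1 {
struct Point {
    int id;
    int x;
    int y;
};
}

// sizeof and member offsets are baked into the client binary as constants,
// so a client built against v0 would read the wrong bytes of a v1 object.
constexpr bool sizeChanged() {
    return sizeof(v0::Point) != sizeof(v1::Point);
}

constexpr bool offsetOfXChanged() {
    return offsetof(v0::Point, x) != offsetof(v1::Point, x);
}

static_assert(sizeChanged(), "adding a member changes the size");
static_assert(offsetOfXChanged(), "inserting a member shifts the later ones");
```

The API is unchanged for such a client, yet every allocation and member access it compiled in is now wrong.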
Why shared libraries should be versioned
Library interfaces (API, ABI) are protocols; they must be versioned just like any other protocol (think of network protocols, DB schemas, etc.).
Unlike network protocols, dynamic libraries have two interfaces:
- API: the interface available at compile time. Used by
  - the programmer when writing the code
  - the compiler when compiling the code
- ABI: the interface available at run time. Used by
  - the OS when starting/running the binary
  - the dynamic linker when resolving undefined functions
The pitfall: a compatible change in the API (such that the app code can be recompiled with a new version of the library without any changes) might yield an INCOMPATIBLE change of the ABI (so a binary linked with a previous version of the library won’t work with the new version of the dynamic library).
Shared library versioning == ABI versioning. The ABI version has NOTHING TO DO with the software release number (as in apache version 2.4.18 supports HTTP version 1.1).
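A minimal sketch of that pitfall, using a hypothetical area function (not from libhello): the two namespaces stand in for two releases of the same header. The client call compiles against both, but the function the binary asks for is a different one.

```cpp
#include <type_traits>

// Release 0 of a hypothetical library header.
namespace v0 {
int area(int w, int h) { return w * h; }
}

// Release 1 adds a parameter with a default value.
namespace v1 {
int area(int w, int h, int scale = 1) { return w * h * scale; }
}

// API compatible: the very same call expression compiles against both.
int clientBuiltAgainstV0() { return v0::area(2, 3); }
int clientBuiltAgainstV1() { return v1::area(2, 3); }

// ABI incompatible: the function types differ, and so do the mangled names.
// With the Itanium C++ ABI an un-namespaced area(int, int) is _Z4areaii,
// while area(int, int, int) is _Z4areaiii. A binary linked against release 0
// looks up the first name, which release 1 no longer exports.
static_assert(!std::is_same<decltype(&v0::area), decltype(&v1::area)>::value,
              "the default argument does not preserve the signature");
```

Default arguments are substituted by the compiler at the call site, which is why they help the API but not the ABI.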
ELF
Executable and Linkable Format
- Consists of a header and an arbitrary number of sections
- Two tables describe the contents:
  - program header table: describes the program segments (required for executables and shared libraries)
  - section header table: describes the file sections (required for files used in linking)
Segment: a contiguous region of the process address space. Section: a contiguous region of the ELF file.
Typical sections:
- .text: the program code
- .rodata: read-only data such as string constants
- .data: initialized global variables
- .bss: zero-initialized variables (arrays)
- .interp: path to the run time linker (ELF interpreter)
See Executable and Linkable Format for more details.
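As a small illustration of the ELF header, the sketch below classifies the first bytes of a file. It relies only on the fixed start of the header: the magic bytes 0x7f 'E' 'L' 'F', followed by the class byte (1 for 32-bit, 2 for 64-bit). The function name describeElfIdent is made up for this example.

```cpp
#include <cstddef>
#include <string>

// Classify a buffer holding the beginning of a file by looking at the
// identification bytes at the very start of the ELF header.
std::string describeElfIdent(const unsigned char* bytes, std::size_t size) {
    const bool hasMagic = size >= 5 &&
                          bytes[0] == 0x7f && bytes[1] == 'E' &&
                          bytes[2] == 'L' && bytes[3] == 'F';
    if (!hasMagic) {
        return "not ELF";
    }
    switch (bytes[4]) {  // the class byte (EI_CLASS)
        case 1:  return "ELF32";
        case 2:  return "ELF64";
        default: return "ELF (invalid class)";
    }
}
```

Feeding it the first five bytes of /bin/bash (read with std::ifstream in binary mode, for instance) should report the same word size that objdump -x prints in its file format line.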
Tools for examining ELF: objdump, nm
- Dump all headers:
    objdump -x /bin/bash
- Which shared libraries are required for a binary/library:
    objdump -p /bin/bash | grep NEEDED
    NEEDED libreadline.so.7
    NEEDED libdl.so.2
    NEEDED libc.so.6
- Which dynamic symbols are exported/referenced by a shared library:
    nm -B -D -C /usr/lib64/libstdc++.so.6
  - T: exported symbol from the .text section (function, method)
  - U: undefined symbol (presumably defined in one of the NEEDED DSOs)
  - W: weak exported symbol
  - V: weak object
  (see man nm for more info)
ABI evolution
Breaking ABI is easy
- Remove or unexport exported class(es).
- Change type hierarchy in any way (add, remove, or reorder base classes).
- For template classes: change the template arguments (add, remove, reorder).
- For virtual methods:
  - Add a virtual method to a class which has no other virtual methods or virtual bases.
  - Add a new virtual method to a non-leaf class (in particular to a class which is designed to be derived from by library clients).
  - Change the order of virtual methods in the class declaration.
  - Override an existing virtual method which is not in the primary base class.
  - Remove a virtual method, even if it’s a reimplementation of a virtual method from the base class.
  - Override an existing virtual function if the overriding function has a covariant return type for which the more-derived type has a pointer address different from the less-derived one (usually happens when, between the less-derived and the more-derived ones, there’s multiple inheritance or virtual inheritance).
- Changing a method/function signature:
  - changing any types of the arguments in the parameter list, including changing const/volatile qualifiers of existing parameters
  - changing const/volatile qualifiers of the method/function
  - extending a method with another parameter, even if it has a default value
  - changing access rights (say, from private to public)
  - changing the return type in any way
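Why the type hierarchy and the "pointer address" of a base matter can be seen in a small sketch. The classes below are hypothetical (not from libhello). With multiple inheritance, converting a pointer to the second base moves it by a fixed number of bytes, and that number is compiled into the client.

```cpp
#include <cstddef>

// Two hypothetical polymorphic base classes.
struct Named {
    virtual ~Named() = default;
    int nameId = 0;
};

struct Logged {
    virtual ~Logged() = default;
    int logLevel = 0;
};

// Hypothetical library class: Logged is the second base here.
struct Greeter : Named, Logged {};

// Byte distance between a Greeter object and its Logged subobject.
// A client performing a Greeter* -> Logged* conversion has this distance
// hard-coded by the compiler.
std::ptrdiff_t loggedOffset(Greeter& g) {
    const char* whole = reinterpret_cast<const char*>(&g);
    const char* part = reinterpret_cast<const char*>(static_cast<Logged*>(&g));
    return part - whole;
}
```

If a later release reordered the bases to Logged, Named, the Logged subobject would move to the start of the object, while already compiled clients would keep adding the old distance. The same adjustment is what makes some covariant return types unsafe to introduce.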
Backward compatible ABI changes
- Add new class(es).
- Add or remove friend declarations to classes.
- Add new non-virtual methods (including constructors).
- Add a new enum to a class.
- Remove private non-virtual functions if they are not called by any inline functions (and have never been).
- Reimplement virtual functions defined in the primary base class (first non-virtual base class, or first non-virtual parent of the base class, etc) IF it’s safe for prior versions to call implementation in the base class rather than in derived ones.
- Override a virtual method with a covariant return type, provided the more-derived type has the same pointer address as the less-derived one.
For a more detailed list see KDE ABI policy
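A minimal sketch of a few compatible changes, again with hypothetical classes and with namespaces standing in for two releases: a new non-virtual method, a new nested enum, and a new friend declaration leave the object size and the set of virtual methods untouched.

```cpp
// Release 0 of a hypothetical class.
namespace v0 {
class Greeter {
public:
    virtual ~Greeter() = default;
    virtual int sayHi(int times) { return count_ += times; }
private:
    int count_ = 0;
};
}

// Release 1: only changes from the "backward compatible" list.
namespace v1 {
class Greeter {
public:
    enum Mood { Calm, Cheerful };              // new enum in the class
    virtual ~Greeter() = default;
    virtual int sayHi(int times) { return count_ += times; }
    int sayBye(int times) { return -times; }   // new non-virtual method
private:
    friend class GreeterTest;                  // new friend declaration
    int count_ = 0;
};
}

// Equal size and alignment is a necessary (not a sufficient) sign that
// binaries built against release 0 still see the layout they expect.
static_assert(sizeof(v0::Greeter) == sizeof(v1::Greeter), "size is unchanged");
static_assert(alignof(v0::Greeter) == alignof(v1::Greeter), "alignment is unchanged");
```

Old binaries never reference sayBye or Mood, so they cannot notice the additions.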
SONAME ABI versioning
Every library, no matter how carefully designed, breaks its ABI at some point. How can users (programs, as opposed to humans) be properly informed about an incompatible ABI change?
Goals:
- Avoid relinking client apps/libraries on compatible changes
- Clearly mark incompatible changes
- Applications which need incompatible versions of a library can coexist
Idea: decouple the protocol/library name (SONAME) from the file name. Example: apache supports HTTP 1.1. Just because a new version of apache has been released doesn’t mean the (HTTP) protocol has changed. When a binary is linked with a shared library, it’s the SONAME of the library which gets recorded as a dependency:
    objdump -p /usr/bin/vim | grep NEEDED | grep libX11
    NEEDED libX11.so.6
SONAME is similar to a protocol name (“HTTP”, “X11”) plus a version; in general it does NOT match the library filename (libX11.so.6.4.0):
    objdump -p /usr/lib64/libX11.so.6.4.0 | grep SONAME
    SONAME libX11.so.6
$ ls -1 -l /usr/lib64/libX11.so.6
lrwxrwxrwx 1 root root 15 Jun 8 2021 /usr/lib64/libX11.so.6 -> libX11.so.6.4.0
$ ls -1 -l /usr/lib64/libX11.so*
lrwxrwxrwx 1 root root 15 Jun 8 2021 /usr/lib64/libX11.so -> libX11.so.6
lrwxrwxrwx 1 root root 15 Jun 8 2021 /usr/lib64/libX11.so.6 -> libX11.so.6.4.0
-rw-r--r-- 1 root root 1318584 Jun 8 2021 /usr/lib64/libX11.so.6.4.0
- libX11.so: used by the compile time linker only (-lX11); usually this symlink points to the latest available SONAME version of the library (libX11.so.6).
- libX11.so.6: the SONAME symlink, used by the dynamic linker; it points to the latest COMPATIBLE version of the library.
- libX11.so.6.4.0: the actual DSO (shared library); its revision is 4, and its patchlevel version is 0.
Why such indirection? Historically, UNIXes had trouble writing to files with public read-only mappings, hence the upgrade procedure was to
- install the newer version into a different file (named after the revision and the patchlevel version)
- change the SONAME symlink to point to the newly installed file
This way the processes which use the previous version of the library can continue uninterrupted, and the newly started processes will use the upgraded library.
Rules of the game
- When making a change which does not affect the ABI:
  - patchlevel++;
- When making a backward compatible ABI change:
  - revision++;
  - patchlevel = 0;
- When making an incompatible ABI change:
  - SONAME++;
  - revision = patchlevel = 0;
Note: changing SONAME for no good reason is a bad practice and is not appreciated by users.
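The rules can be written down as a tiny function. This is only a sketch of the bookkeeping: LibVersion, Change, bump, soName and realName are names invented for this example, and the file naming follows the libX11 layout shown earlier.

```cpp
#include <string>

// SONAME version, revision and patchlevel, e.g. 6.4.0 for libX11.so.6.4.0.
struct LibVersion {
    int soname;
    int revision;
    int patchlevel;
};

enum class Change { NoAbiChange, CompatibleAbiChange, IncompatibleAbiChange };

// Apply the "rules of the game" to a version triple.
LibVersion bump(LibVersion v, Change change) {
    switch (change) {
        case Change::NoAbiChange:
            ++v.patchlevel;
            break;
        case Change::CompatibleAbiChange:
            ++v.revision;
            v.patchlevel = 0;
            break;
        case Change::IncompatibleAbiChange:
            ++v.soname;
            v.revision = 0;
            v.patchlevel = 0;
            break;
    }
    return v;
}

// Name recorded as NEEDED in client binaries, e.g. libX11.so.6.
std::string soName(const std::string& lib, LibVersion v) {
    return "lib" + lib + ".so." + std::to_string(v.soname);
}

// Name of the actual file on disk, e.g. libX11.so.6.4.0.
std::string realName(const std::string& lib, LibVersion v) {
    return soName(lib, v) + "." + std::to_string(v.revision) + "." +
           std::to_string(v.patchlevel);
}
```

Only the last kind of change alters soName, which is exactly what lets old and new clients coexist on one system.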
Practical implementation: CMake
set_target_properties(libhello PROPERTIES SOVERSION X VERSION X.Y.Z)
Practical implementation: libtool
In an attempt to be portable, libtool makes things even more confusing:
- LT_CURRENT: the most recent ABI version supported by the library
- LT_REVISION: sort of a patchlevel version
- LT_AGE: the number of compatible ABIs, that is, LT_CURRENT-LT_AGE is the oldest backward compatible ABI version supported by the library
libhello_la_LDFLAGS = -version-info $(LT_CURRENT):$(LT_REVISION):$(LT_AGE)
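To relate the libtool triple to the SONAME scheme above, here is a small sketch of the mapping libtool applies on Linux, where the file is named after (current - age).(age).(revision). Other platforms map the triple differently, so treat this as Linux-specific; the struct and function names are invented for the example.

```cpp
#include <string>

// The three numbers passed to -version-info current:revision:age.
struct LibtoolVersion {
    int current;
    int revision;
    int age;
};

// The number that ends up in the SONAME on Linux: the oldest ABI version
// the library is still compatible with.
int sonameMajor(LibtoolVersion v) {
    return v.current - v.age;
}

// The version suffix of the installed file on Linux.
std::string linuxFileSuffix(LibtoolVersion v) {
    return std::to_string(sonameMajor(v)) + "." + std::to_string(v.age) +
           "." + std::to_string(v.revision);
}
```

For instance, -version-info 7:2:1 would give libhello.so.6.1.2 with the SONAME libhello.so.6, so LT_AGE plays the role of the revision and LT_REVISION the role of the patchlevel.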
Tools for ABI checks
Note: the output should be taken with a grain of salt; there are both false positives and false negatives!
sudo apt-get install vtable-dumper abi-dumper abi-compliance-checker
make VARIANT=ok/v0
abi-dumper -o ABI-0.dump -lver 0 lib/libhello.so.0.0.0
make VARIANT=ok/v4
abi-dumper -o ABI-4.dump -lver 4 lib/libhello.so.4.0.0
abi-compliance-checker -l foo -old ABI-0.dump -new ABI-4.dump
xdg-open file://`pwd`/compat_reports/foo/0_to_4/compat_report.html
Notice that the tool hasn’t caught the ABI breakage, although examining the vtables reveals the incompatibility:
vtable-dumper lib/libhello.so.0.0.0 > v0_vtbl.txt
vtable-dumper lib/libhello.so.4.0.0 > v4_vtbl.txt
vimdiff v0_vtbl.txt v4_vtbl.txt
Advanced: versioned symbols
SONAME ABI versioning is inconvenient: a single incompatible change requires a SONAME bump, which forces re-linking the client apps (to use the new version of the library), even if the app in question hasn’t been using the class (function) which has changed in an incompatible manner.
Just like a server can support multiple versions of a protocol, a shared library can support several ABI versions. The Linux and Solaris linkers support versioning of individual symbols:
objdump -p /bin/bash | sed -rne '/^Version References:/,$ { p }'
Version References:
required from libdl.so.2:
0x09691a75 0x00 09 GLIBC_2.2.5
required from libc.so.6:
0x06969191 0x00 10 GLIBC_2.11
0x06969194 0x00 08 GLIBC_2.14
0x0d696918 0x00 07 GLIBC_2.8
0x06969195 0x00 06 GLIBC_2.15
0x0d696914 0x00 05 GLIBC_2.4
0x09691974 0x00 04 GLIBC_2.3.4
0x0d696913 0x00 03 GLIBC_2.3
0x09691a75 0x00 02 GLIBC_2.2.5
- When adding a new function, mark it with a new version
- When changing an existing function foo in an incompatible manner:
  - rename the existing function to foo_old
  - write the new code into foo_new
  - export foo_new as foo version N+1, where N is the previous version of foo
  - export foo_old as foo version N
  - set the default version of foo to N+1
- Increment the patchlevel version of the library
extern "C" int foo_new(int a, int b, int c);
extern "C" int foo_old(int a, int b);
__asm__(".symver foo_old,foo@LIBFOO_0");
__asm__(".symver foo_new,foo@@LIBFOO_1");
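For completeness, a sketch of what the two implementations might look like. The bodies are invented for illustration (the original only declares the functions): here the old entry point simply forwards to the new one with a neutral third argument. The .symver lines and a matching GNU ld version script are shown as comments, because they only make sense when building the shared library itself.

```cpp
// New implementation, exported as the default foo (version LIBFOO_1).
// The arithmetic is a placeholder, not the behaviour of any real library.
extern "C" int foo_new(int a, int b, int c) {
    return a + b + c;
}

// Old entry point, kept for binaries linked against LIBFOO_0.
// It preserves the old two-argument contract by forwarding.
extern "C" int foo_old(int a, int b) {
    return foo_new(a, b, 0);
}

// When building the shared library, bind both to the public name foo:
//   __asm__(".symver foo_old,foo@LIBFOO_0");
//   __asm__(".symver foo_new,foo@@LIBFOO_1");
//
// and pass a version script to the linker (-Wl,--version-script=FILE)
// which declares both version nodes, for example:
//   LIBFOO_0 { global: foo; local: *; };
//   LIBFOO_1 { global: foo; } LIBFOO_0;
```

Binaries linked earlier keep resolving foo@LIBFOO_0 and see the old behaviour; newly linked ones get foo@@LIBFOO_1.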
Advantages:
- dependency on specific compatible version of the ABI can be recorded
- backward compatibility can be maintained over a long time
Disadvantages:
- It’s tricky, especially for C++ libraries with non-trivial class hierarchies (GCC’s libstdc++ is one of the few C++ libraries which use ELF symbol versioning)
Example: GNU libstdc++
WARNING: this information might be obsolete
Myth: in order to be compatible with GCC version X.Y.Z, a shared library needs to be built with exactly the same version of GCC.
Fact: GCC’s libstdc++ is backward compatible from GCC 3.4.x to very recent GCC versions, see GCC ABI policy. Thus
- A binary compiled with GCC X and linked with libstdc++6 will run with GCC Y’s libstdc++6, where 3.4.x <= X <= Y <= 6.x
- A library compiled with GCC X and linked with libstdc++6 can be used to build binaries/libraries with GCC Y, where 3.4.x <= X <= Y <= 6.x
- When using a library compiled with GCC <= 4.9.x to link with code built with GCC >= 5.0, that code might need an additional compile time option: -D_GLIBCXX_USE_CXX11_ABI=0
- The libstdc++6 of GCC >= 5.0 is bi-ABI: it supports both the C++98 ABI and the C++11 ABI. It’s possible to pick the ABI version at compile time with -D_GLIBCXX_USE_CXX11_ABI=0, independently of the language standard version.
To reiterate: a C++ library compiled with GCC 4.4.x can be used with code compiled with GCC >= 5.0 as long as that code is either C++98-only, or compiled with the -D_GLIBCXX_USE_CXX11_ABI=0 option (some C++11 features might be unavailable, though).
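A quick way to see which of the two ABIs a translation unit was compiled for is to look at the _GLIBCXX_USE_CXX11_ABI macro, which libstdc++ headers define. The function name below is made up for this sketch; with other standard libraries the macro is absent and the function says so.

```cpp
#include <string>

// Report which std::string ABI this translation unit was compiled for.
// __GLIBCXX__ is defined by libstdc++ headers; _GLIBCXX_USE_CXX11_ABI is
// 1 for the C++11 ABI and 0 for the old one.
std::string libstdcxxStringAbi() {
#if !defined(__GLIBCXX__)
    return "not libstdc++";
#elif defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
    return "C++11 ABI (std::__cxx11::basic_string)";
#else
    return "old ABI (pre-GCC 5 std::string)";
#endif
}
```

Compiling the same file once normally and once with -D_GLIBCXX_USE_CXX11_ABI=0 should flip the answer, and the demangled names of std::string symbols in nm -C output change accordingly.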
Description
What ABI and SONAME are, or versioning of shared libraries in Linux