Making a portable CPython interpreter
In the Scrypted home video platform, Python is one of the scripting languages used to implement Scrypted plugins, predominantly those that provide motion detection, object detection, and other AI inference services (a byproduct of the machine learning community's heavy reliance on Python). As such a ubiquitous language, Python has interpreters available for nearly every conceivable operating system and hardware architecture, though these distributions can vary in Python version, implementation (e.g. CPython vs PyPy), installation method, compatibility with Scrypted plugins, and even the permissions required to install them on the host. Being able to easily bundle a vetted Python distribution with a Scrypted installation would reduce the variability of this critical dependency.
Enter portable-python. This project is my attempt at configuring and packaging a distribution of CPython for Linux, Windows, and MacOS, across different architectures, that fulfills the needs of Scrypted's growing set of Python plugins while remaining suitable for general-purpose use. CPython was chosen because it is the implementation used by existing Scrypted plugins. The rest of this blog documents my experience building this portable distribution, the technical challenges encountered, and the design choices made to address them. This is by no means a claim to the most "correct" or "superior" way of achieving this goal - if there are better approaches, please let me know by raising an issue in the repo.
Evaluating static linking
Since Scrypted is primarily a nodejs program, I was initially inspired by the ffmpeg-static project's method of using an npm install hook to detect, at install time, the target OS and architecture so that the correct ffmpeg build can be selected and downloaded. A statically linked ffmpeg makes this straightforward, since a static binary removes any dependency on the host's shared libraries and, more importantly, makes the binary relocatable. In other words, the ffmpeg binary can be installed anywhere on the filesystem and still function properly. This is essential for an installer like ffmpeg-static - a nodejs package could exist anywhere on the filesystem and could be installed by an unprivileged user account, so the included ffmpeg must not depend on hardcoded file paths or require installing anything into privileged system paths such as /usr.
(Technically, static linking is not enough to solve the problem for more complex
applications - more on that in a later section.)
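For illustration, here is a rough Python sketch of the same install-time selection idea (ffmpeg-static itself does this in a nodejs install hook; the build names below are hypothetical):

```python
import platform
import sys

# Hypothetical mapping from (OS, architecture) to a prebuilt artifact name.
BUILDS = {
    ("linux", "x86_64"): "ffmpeg-linux-x64",
    ("linux", "aarch64"): "ffmpeg-linux-arm64",
    ("darwin", "x86_64"): "ffmpeg-darwin-x64",
    ("darwin", "arm64"): "ffmpeg-darwin-arm64",
    ("win32", "AMD64"): "ffmpeg-win32-x64",
}

def select_build() -> str:
    """Pick the prebuilt binary matching the machine running the installer."""
    key = (sys.platform, platform.machine())
    try:
        return BUILDS[key]
    except KeyError:
        raise RuntimeError(f"no prebuilt binary for {key}")
```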
It seems natural, therefore, to consider building a statically linked CPython interpreter. To do this, I chose python-cmake-buildsystem, leveraging its simple configuration options to build all standard Python extensions and link dependencies statically on Linux, Windows, and MacOS. While it was easy to get static CPython built with this buildsystem, a significant downside is that statically linking libc on Linux renders dlopen error-prone and no longer portable. dlopen is an important dynamic loader function used by the Python ctypes module, as well as by any Python module with native extensions, so ensuring it works properly is essential. Static linking is therefore not a suitable approach for portable-python - a dynamically linked executable is required.
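To see why this matters, here is a small example of the stdlib ctypes module loading a shared library at runtime, which goes through dlopen under the hood:

```python
import ctypes
import ctypes.util

# find_library locates the C math library (e.g. "libm.so.6" on Linux);
# CDLL then loads it via dlopen. With a statically linked libc, this
# machinery can misbehave or fail outright.
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]
print(libm.cos(0.0))  # 1.0
```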
Loading shared libraries
Binary executable files are stored on disk in standardized formats, which tell the operating system how the executable is laid out in memory and how it should be executed: ELF on Linux, Mach-O on MacOS, and PE on Windows.
A dynamically linked CPython interpreter, especially one built with the full set of standard library modules, depends on many shared libraries. Some examples are libssl and libcrypto for the ssl module, and liblzma for the lzma module. A compiled executable contains references to these shared libraries in its headers, and when the executable starts running, the operating system's dynamic linker (the program responsible for loading the executable and resolving symbols at runtime) reads these headers to determine which libraries to load.
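On Linux, these header entries can be inspected with standard tooling. Below is a small sketch that shells out to binutils' readelf to list an ELF executable's DT_NEEDED entries (on MacOS, otool -L serves a similar purpose):

```python
import subprocess

def needed_libraries(elf_path: str) -> list[str]:
    """List the DT_NEEDED entries from an ELF binary's dynamic section."""
    out = subprocess.run(
        ["readelf", "-d", elf_path],
        check=True, capture_output=True, text=True,
    ).stdout
    needed = []
    for line in out.splitlines():
        # Lines look like: " 0x... (NEEDED)  Shared library: [libssl.so.3]"
        if "(NEEDED)" in line and "[" in line:
            needed.append(line.split("[", 1)[1].rstrip("]"))
    return needed

print(needed_libraries("/usr/bin/python3"))
```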
For PE executables on Windows, the DLL search path includes the directory containing the executable. This is convenient for portable-python, since any dependencies simply need to be copied into the same directory as the CPython interpreter.
Both ELF and Mach-O executables search for shared libraries in a fixed set of default system directories (see docs for Linux and MacOS). This won't work for portable-python, since there's no guarantee that the shared library dependencies will exist on the host. There are environment variables (LD_LIBRARY_PATH on Linux, DYLD_LIBRARY_PATH on MacOS) that instruct the dynamic linker to search other paths, but expecting the end user to set those variables before using portable-python is bad UX.
Borrowing a technique used by auditwheel (a tool that creates Python wheels with all shared library dependencies copied into the wheel itself), I considered replacing all references to shared libraries with paths relative to the executable. Conveniently, both ELF and Mach-O support constructing relative paths: $ORIGIN in ELF and @executable_path in Mach-O both expand, at load time, to the location of the executable file. Replacing the references can then be done with patchelf on Linux and install_name_tool on MacOS. Solving this should be as simple as a script that enumerates all shared libraries and patches them, right?
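A sketch of what that per-reference patching could look like, assuming patchelf and install_name_tool are available on the PATH (the library names below are illustrative; real tooling like auditwheel also copies the libraries in and fixes their install names):

```python
import subprocess
import sys

def patch_reference(binary: str, old: str, new: str) -> None:
    """Rewrite one shared-library reference inside a binary."""
    if sys.platform == "darwin":
        # Mach-O: rewrite the load command for `old` to point at `new`.
        subprocess.run(
            ["install_name_tool", "-change", old, new, binary], check=True)
    else:
        # ELF: rewrite the DT_NEEDED entry for `old`.
        subprocess.run(
            ["patchelf", "--replace-needed", old, new, binary], check=True)

# Illustrative only: point the interpreter at a bundled copy of libssl.
patch_reference("python", "libssl.so.3", "$ORIGIN/../lib/libssl.so.3")
```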
Unfortunately, in my testing, patchelf proved to be unreliable and buggy. Patching each shared library individually sometimes resulted in a broken executable that the dynamic linker could not load. Behavior also varied across architectures: the same version of patchelf might work on x86_64 but break the executable on arm64.
Readers familiar with dynamic linking may be aware that there is a simple and obvious solution to this, one that I initially overlooked: rpath. rpath, or "run-path," is a way to specify, as part of the executable, additional directories in which to look for shared libraries - without the need to export environment variables! Better yet, rpath also supports $ORIGIN and @executable_path, meaning it can reference relative directories! This proved to be the key to bundling shared libraries in portable-python: all dependencies can simply be installed with the interpreter under the same folder structure, and rpath tells the dynamic linker to look for libraries in the installation directory, no matter where it is on the filesystem.
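As a sketch, setting such an rpath after the build could look like the following, assuming a layout where the interpreter sits in bin/ next to a lib/ directory holding the bundled dependencies:

```python
import subprocess
import sys

def set_relative_rpath(binary: str) -> None:
    """Point a binary's rpath at a lib/ directory shipped alongside it."""
    if sys.platform == "darwin":
        # @executable_path expands to the directory of the running binary.
        subprocess.run(
            ["install_name_tool", "-add_rpath",
             "@executable_path/../lib", binary], check=True)
    else:
        # $ORIGIN expands to the directory containing the binary itself.
        subprocess.run(
            ["patchelf", "--set-rpath", "$ORIGIN/../lib", binary],
            check=True)
```

The same effect can be achieved at link time by passing the equivalent linker flags, but patching after the fact keeps the approach uniform across dependencies built by different buildsystems.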
Relaxing the application prefix
It turns out, pointing an executable to its shared libraries is only half the battle. Inside the program binary, there could be hardcoded references to the path where it expects to be installed. This is where prefix comes into play.
When compiling applications on Unix systems, many build script generators (like autoconf and cmake) take in a prefix flag. This parameter tells the build scripts where the application will eventually be installed - for example, a prefix set to /usr/local will install executables under /usr/local/bin, libraries under /usr/local/lib, and so on.
As previously discussed, setting lookup paths for shared libraries is easily done with rpath. What about the other resources and assets required by CPython, such as the Python files that make up the standard library? prefix is used to resolve these. When CPython is compiled, the prefix value is stored as part of the sys module, with additional references to the path inside sysconfig. This prefix is also used by Python wheel builders to point native extension compilers at headers shipped with the Python distribution, through static pkgconfig scripts.
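You can see the baked-in prefix surface at runtime from a stock interpreter:

```python
import sys
import sysconfig

# The configure-time prefix is compiled into the interpreter and shows
# up in several places that downstream tooling relies on.
print(sys.prefix)                          # e.g. /usr/local
print(sysconfig.get_config_var("prefix"))  # same value, via build config
print(sysconfig.get_paths()["include"])    # e.g. /usr/local/include/python3.x
```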
To solve this, I made minor modifications to the CPython source code and Python standard library to replace hardcoded prefix values with dynamic runtime resolution. I also made modifications to generate pkgconfig scripts dynamically, allowing native extension compilers to find headers properly. Additionally, pip, the standard Python package installer, comes with a CLI script that references the path to the Python interpreter - the shebang of this script also needed to be patched.
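The gist of the runtime resolution is simple; here is a minimal sketch of the idea (not the actual patch), assuming the standard <prefix>/bin/python layout:

```python
import os
import sys

def runtime_prefix() -> str:
    """Derive the installation prefix from the interpreter's own location.

    With a layout of <prefix>/bin/python, the prefix is the executable's
    grandparent directory - wherever the distribution was unpacked.
    """
    exe = os.path.realpath(sys.executable)
    return os.path.dirname(os.path.dirname(exe))
```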
Finally, one dependency on prefix that could not be easily patched is OpenSSL, which backs CPython's ssl module. When installed, OpenSSL places a certificate bundle in a path under its prefix. This bundle contains the certificates of many well-known, trusted public certificate authorities, and is used by OpenSSL to verify server certificates for any SSL connection (e.g. when connecting to a site over HTTPS). As a workaround, I chose to include certifi (a Python package containing Mozilla's trusted certificate bundle) with the portable-python install, and patch the ssl module to look for trusted certificates where certifi is installed. The result works well and allows programs to connect to HTTPS sites without manually supplying trusted certificates.
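For reference, the unpatched equivalent of what the patched ssl module does automatically looks like this, using certifi's real API:

```python
import ssl
import urllib.request

import certifi

# Point certificate verification at certifi's bundled CA file instead of
# the path baked in under OpenSSL's compile-time prefix.
ctx = ssl.create_default_context(cafile=certifi.where())
with urllib.request.urlopen("https://example.com", context=ctx) as resp:
    print(resp.status)  # 200 if verification succeeded
```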
Compiling for different architectures
Scrypted is supported on x86_64 and arm64 for both Linux and MacOS, and on x86_64 for Windows. As such, portable-python needs to provide binaries that work on each of these platforms. Builds are produced on GitHub Actions, which, at the start of the portable-python project, only provided x86_64 runners (though as of this writing, MacOS arm64 runners are now available). The project therefore needed a way to produce arm64 distributions from x86_64 build machines.
On MacOS, I wanted to produce universal binaries, which contain both x86_64 and arm64 code. This means a single distribution can run on both architectures without Rosetta translation. For the most part, this was straightforward: add -arch x86_64 -arch arm64 to CFLAGS, or set CMAKE_OSX_ARCHITECTURES=arm64;x86_64 in cmake. Some dependencies, such as libffi, were less compliant with these flags, so I had to build them separately for x86_64 and arm64, then use lipo to merge the results. Finally, to ensure that the result works on older MacOS versions, I set the environment variable MACOSX_DEPLOYMENT_TARGET=10.9 to tell the compiler to produce binaries compatible with OS X Mavericks.
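A sketch of that two-pass merge, with hypothetical build output paths:

```python
import subprocess

def merge_universal(x86_64_lib: str, arm64_lib: str, output: str) -> None:
    """Merge two single-architecture Mach-O files into a universal binary."""
    subprocess.run(
        ["lipo", "-create", x86_64_lib, arm64_lib, "-output", output],
        check=True)
    # Sanity check: should report both architectures.
    subprocess.run(["lipo", "-info", output], check=True)

merge_universal("build-x86_64/libffi.dylib",
                "build-arm64/libffi.dylib",
                "universal/libffi.dylib")
```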
On Linux, there are two common approaches to compiling for alternative architectures: cross compilation and Docker with QEMU emulation. Cross compilation is fast since the compiler runs natively on the build host, but all libraries and dependencies required to build the program must also exist for the target architecture. These libraries and dependencies are typically laid out in a sysroot, a directory structure that mirrors the one found in a host running the target architecture.
Docker, on the other hand, is able to run containers of a different architecture than the host through QEMU user-mode emulation. This is great for compilation, since the compiler running within Docker sees a proper host filesystem (no sysroot needed) and can emit binary code that it thinks is “native.” However, as can be expected of emulation, this is quite slow.
As on MacOS, I wanted the Linux CPython builds to be compatible with older Linux distributions. Generally, this backward compatibility is handled at the glibc level, since glibc is required by most programs. The library's symbols are versioned, and its authors take great care to maintain backward compatibility, so a program built against one glibc release will work with any glibc released afterwards. Taking inspiration from Python's manylinux2014 platform tag, which specifies that binary Python wheels should target glibc 2.17 (as of this writing, 65% of packages on PyPI target this), portable-python is also compiled for glibc 2.17.
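A handy way to verify that compatibility floor is to scan a built binary for the highest versioned glibc symbol it references; here is a rough sketch using binutils' objdump:

```python
import re
import subprocess

def max_glibc_version(binary: str) -> tuple[int, ...]:
    """Return the highest GLIBC_x.y symbol version a binary references.

    A result above (2, 17) would break the glibc 2.17 compatibility target.
    """
    out = subprocess.run(
        ["objdump", "-T", binary],
        check=True, capture_output=True, text=True,
    ).stdout
    versions = set(re.findall(r"GLIBC_([0-9.]+)", out))
    return max(
        (tuple(int(p) for p in v.strip(".").split(".")) for v in versions),
        default=(0,))

print(max_glibc_version("python"))  # e.g. (2, 17)
```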
The easiest way to ensure glibc compatibility is to use Docker and compile inside a CentOS 7 container. This is the approach that manylinux and cibuildwheel take. Early iterations of portable-python also took this approach.
Even though using a CentOS 7 container is convenient, I do not believe it is a sustainable solution for building programs compatible with older Linux distributions. One reason is the upcoming CentOS 7 EOL date. Another is the difficulty of compiling modern programs - CentOS 7 ships with an ancient gcc 4.8.5, and only some architectures get gcc 10 through Red Hat’s devtoolset-10 package. Modern programs may use newer C++ standards, requiring newer compilers. Though newer compilers can be built for CentOS 7 (I’ve built devtoolset-10 on armv7l before), the cost of maintaining these compilers will grow over time.
The solution currently employed by portable-python is to use the zig compiler. Built on LLVM, zig exposes two commands, zig cc and zig c++, which are drop-in replacements for C and C++ compilers. Additionally, zig's use of LLVM allows it to target different architectures and cross compile with ease. Finally - and this is the key feature - zig can target a particular glibc release version while compiling. With this combination, CPython and its dependencies can be cross compiled for a variety of architectures, targeting glibc 2.17, without maintaining any extra containers or sysroot filesystems. Check out this awesome writeup by Andrew Kelley that expands on the capabilities of zig cc.
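As a sketch, a typical autoconf-style dependency could be cross compiled like so (the configure flags are illustrative, not portable-python's actual build driver):

```python
import os
import subprocess

# Use zig as a drop-in C/C++ compiler, cross compiling for arm64 while
# pinning the glibc compatibility floor. The ".2.17" suffix on the target
# triple is zig's syntax for selecting a glibc version.
env = dict(
    os.environ,
    CC="zig cc -target aarch64-linux-gnu.2.17",
    CXX="zig c++ -target aarch64-linux-gnu.2.17",
)
subprocess.run(["./configure", "--host=aarch64-linux-gnu"],
               env=env, check=True)
subprocess.run(["make"], env=env, check=True)
```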
Conclusion
Building portable-python has been an interesting experience, and I’ve learned much from the process. If you use the Scrypted NVR desktop app, chances are that some of your Scrypted plugins are running on portable-python builds. To get a copy of portable CPython, check out the project repo, where you can download zips from the releases or install via the nodejs installers.