
In the Scrypted home video platform, Python is one of the scripting languages used for implementing Scrypted plugins, predominantly by plugins that provide motion detection, object detection, and other AI inference services (a byproduct of the heavy reliance on Python in existing machine learning communities). As such a ubiquitous language, Python has interpreters available for nearly every conceivable operating system and hardware architecture, though these distributions can vary in Python version, Python implementation (e.g. CPython vs PyPy), installation method, compatibility with Scrypted plugins, or even the permissions required to install on the host. Being able to easily bundle a vetted Python distribution with a Scrypted installation would reduce the variability this critical dependency introduces into the platform.

Enter portable-python. This project is my attempt at configuring and packaging a distribution of CPython for Linux, Windows, and MacOS, across different architectures, that fulfills the needs of Scrypted’s growing set of Python plugins while still being suitable for general-purpose use. CPython was chosen as it is the implementation used by existing Scrypted plugins. The rest of this blog documents my experiences with building this portable distribution, as well as the technical challenges and design choices made to address them. This is by no means a claim to the most “correct” or “superior” method of achieving this goal - if there are better ways, please let me know by raising an issue in the repo.

Evaluating static linking

Since Scrypted is primarily a nodejs program, I was initially inspired by the ffmpeg-static project’s method of using an npm install hook and dynamically detecting, at install time, the target OS and architecture so the correct ffmpeg build can be selected and downloaded. A statically linked ffmpeg makes this straightforward to do, since a static binary removes any dependency on the host’s shared libraries and, more importantly, makes the binary relocatable. In other words, the ffmpeg binary can be installed anywhere on the filesystem, and it will still function properly. This is important for an installer like ffmpeg-static to work - a nodejs package could exist anywhere on the filesystem, and could be installed by an unprivileged user account, so the included ffmpeg must not depend on hardcoded file paths or require installing anything into privileged system paths such as /usr. (Technically, static linking is not enough to solve the problem for more complex applications - more on that in a later section.)

It seems natural, therefore, to consider building a statically linked CPython interpreter. To do this, I chose to use python-cmake-buildsystem, leveraging its simple configuration options to build all standard Python extensions and link dependencies statically on Linux, Windows, and MacOS. While it was easy to build a static CPython with this buildsystem, a significant downside is that statically linking libc on Linux renders dlopen error-prone and no longer portable (glibc itself warns that calling dlopen from a statically linked program requires the exact glibc version used at link time to be present at runtime). dlopen is an important dynamic loader function used by the Python ctypes module as well as any Python modules with native extensions, so ensuring it works properly is essential. Therefore, for portable-python, static linking is not a suitable approach - a dynamically linked executable is required.

Loading shared libraries

Binary executable files are stored on disk in certain standardized formats, which grant the operating system a way to determine how the executable is laid out in memory and how it should be executed. Executables are in ELF format on Linux, Mach-O on MacOS, and PE on Windows.

A dynamically linked CPython interpreter, especially one built with the full set of standard library modules, depends on many shared libraries. Some examples of such libraries are libssl and libcrypto for the ssl module and liblzma for the lzma module. A compiled executable contains references to its shared library dependencies in its headers, and when the executable starts running, the operating system’s dynamic linker (a program responsible for loading the executable and resolving symbols at runtime) reads these headers to determine which libraries to load.
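These header entries can be inspected with standard tools. A quick sketch (the binary paths here are just examples):

```shell
# Linux: list the DT_NEEDED entries (declared shared library dependencies)
# recorded in an ELF binary's dynamic section, then show what the dynamic
# linker actually resolves them to on this host.
readelf -d /usr/bin/python3 | grep NEEDED
ldd /usr/bin/python3

# MacOS equivalent for Mach-O binaries:
#   otool -L /usr/local/bin/python3
```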

For PE executables on Windows, the DLL search path includes the directory containing the executable itself. This is convenient for portable-python, since any dependencies just need to be copied into the same directory as the CPython interpreter.

Both ELF and Mach-O executables search for shared libraries in a fixed number of default system directories (see docs for Linux and MacOS). This won’t work for portable-python, since there’s no guarantee that the shared library dependencies will exist on the host. There are environment variables that could be set to instruct the dynamic linker to search other paths, but expecting the end user to set those variables before using portable-python is bad UX.

Borrowing a technique used by auditwheel (a tool that creates Python wheels with all dependency shared libraries copied into the wheel itself), I considered replacing all references to shared libraries with paths relative to the executable. Conveniently, both ELF and Mach-O support constructing relative paths: $ORIGIN in ELF and @executable_path in Mach-O resolve to the directory containing the executable file. Replacing the references can then be done with patchelf on Linux and install_name_tool on MacOS. Solving this should be simple with a script that enumerates all shared libraries and patches them, right?
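As a sketch of that per-library rewriting approach (library names, versions, and paths here are illustrative):

```shell
# Linux: rewrite one dependency entry in the ELF dynamic section so that
# libssl is looked up relative to the executable instead of system paths.
patchelf --replace-needed libssl.so.1.1 '$ORIGIN/../lib/libssl.so.1.1' bin/python3

# MacOS: the same idea, rewriting a Mach-O load command.
install_name_tool -change /usr/lib/libssl.dylib \
    '@executable_path/../lib/libssl.dylib' bin/python3
```

Every shared library in the dependency tree needs the same treatment, which is why a small enumeration script seems like the obvious plan.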

Unfortunately, in my testing, patchelf proved to be unreliable and buggy. Patching each shared library individually sometimes results in a broken executable, unable to be loaded by the dynamic linker. Behavior across different architectures also varied, as the same version of patchelf may work for x86_64, but then break the executable on arm64.

Readers familiar with dynamic linking may be aware that there is a simple and obvious solution to this, one that I initially overlooked: rpath. rpath, or “run-path,” is a way to specify, as part of the executable, additional directories in which to look for shared libraries - without the need to export environment variables! Better yet, rpath also supports $ORIGIN and @executable_path, meaning it can be used to reference directories relative to the executable! This proved to be the key to bundling shared libraries in portable-python: all dependencies can simply be installed alongside the interpreter under the same folder structure, and rpath tells the dynamic linker to look for libraries in the installation directory, no matter where it is on the filesystem.
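Concretely, the rpath fix boils down to a single entry per executable, rather than one rewrite per dependency (paths illustrative):

```shell
# Linux: tell the dynamic linker to also search ../lib, resolved relative
# to the directory containing the executable, for shared libraries.
patchelf --set-rpath '$ORIGIN/../lib' bin/python3

# MacOS equivalent:
install_name_tool -add_rpath '@executable_path/../lib' bin/python3

# Verify the result:
#   readelf -d bin/python3 | grep -E 'RPATH|RUNPATH'   (Linux)
#   otool -l bin/python3 | grep -A2 LC_RPATH           (MacOS)
```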

Relaxing the application prefix

It turns out that pointing an executable to its shared libraries is only half the battle. Inside the program binary, there may be hardcoded references to the path where the program expects to be installed. This is where prefix comes into play.

When compiling applications on Unix systems, many build script generators (like autoconf and cmake) take in a prefix flag. This parameter tells the build scripts where the application will eventually be installed - for example, a prefix set to /usr/local will install executables under /usr/local/bin, libraries under /usr/local/lib, and so on.
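For example, with autoconf- and cmake-based projects respectively (install location is illustrative):

```shell
# autoconf-style build: everything installs under /opt/myapp
./configure --prefix=/opt/myapp
make && make install     # -> /opt/myapp/bin, /opt/myapp/lib, /opt/myapp/share, ...

# cmake equivalent:
cmake -DCMAKE_INSTALL_PREFIX=/opt/myapp -B build -S .
cmake --build build && cmake --install build
```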

As previously discussed, setting lookup paths for shared libraries can be done easily with rpath. What about other resources and assets required by CPython, such as the Python files that make up the standard library? prefix is used to resolve these. When CPython is compiled, the prefix value is stored as part of the sys module, with additional references to the path inside sysconfig. This prefix is also used by Python wheel builders to point native extension compilers to headers shipped with the Python distribution, through static pkgconfig scripts.
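The baked-in value is easy to observe on any CPython install:

```shell
# Print the compiled-in prefix, and a couple of paths derived from it.
python3 -c 'import sys; print(sys.prefix)'
python3 -c 'import sysconfig; print(sysconfig.get_path("stdlib"))'
python3 -c 'import sysconfig; print(sysconfig.get_config_var("prefix"))'
```

If the interpreter is simply copied somewhere else on the filesystem, these values keep pointing at the original build-time location, which is exactly the problem a portable distribution has to solve.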

To solve this, I made minor modifications to the CPython source code and Python standard library to rewrite hardcoded prefix values with dynamic runtime resolution. I also made modifications to dynamically generate pkgconfig scripts, allowing native extension compilers to find headers properly. Additionally, pip, the standard Python module installer, comes with a CLI script that references the path to the Python interpreter - the shebang of this script also needed to be patched.
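As a hypothetical sketch of the shebang fix (the file name and paths are invented for illustration; the actual packaging scripts may do this differently), a hardcoded interpreter path can be rewritten like so:

```shell
# Create a stand-in for pip's CLI script, whose shebang points at the
# build-time interpreter path.
printf '#!/build/prefix/bin/python3\nprint("hello")\n' > pip-demo

# Rewrite the shebang so the script runs whichever python3 is on PATH.
# (GNU sed shown; MacOS sed needs `sed -i ''`.)
sed -i '1s|^#!.*python3$|#!/usr/bin/env python3|' pip-demo
head -1 pip-demo
```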

Finally, one dependency on prefix that could not be easily patched is OpenSSL, a dependency for CPython’s ssl module. When installed, OpenSSL places a certificate bundle in a path under its prefix. This certificate bundle contains many well-known certificates for trusted public certificate authorities, and is used by OpenSSL to verify server certificates for any SSL connection (e.g. when connecting to a site over HTTPS). As a workaround, I chose to include certifi (a Python package containing Mozilla’s trusted certificate bundle) with the portable-python install, and patch the ssl module to look for trusted certificates in the directory where certifi is installed. The result works well and allows programs to connect to HTTPS sites without manually supplying trusted certificates.
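portable-python patches the ssl module internally, but a similar effect can be approximated from the outside using an environment variable OpenSSL already honors (this assumes the certifi package is installed):

```shell
# certifi prints the path of its bundled CA file when run as a module.
export SSL_CERT_FILE="$(python3 -m certifi)"

# The ssl module now reports the certifi bundle as its verify location.
python3 -c 'import ssl; print(ssl.get_default_verify_paths())'
```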

Compiling for different architectures

Scrypted is supported on x86_64 and arm64 for both Linux and MacOS, and on x86_64 for Windows. As such, portable-python needs to provide binaries that work on each of these platforms. Builds are produced on GitHub Actions, which, at the start of the portable-python project, only provided x86_64 runners (though as of this writing, MacOS arm64 runners are now available). The project therefore needed some way of producing arm64 distributions on x86_64 runners.

On MacOS, I wanted to produce universal binaries, which are binaries that contain both x86_64 and arm64 code. This means that a single distribution would be able to run on both architectures without using Rosetta translation. For the most part, this was straightforward to do by adding -arch x86_64 -arch arm64 to CFLAGS or setting CMAKE_OSX_ARCHITECTURES=arm64;x86_64 in cmake. Some dependencies, such as libffi, were less compliant with these flags, and I had to build separately for x86_64 and arm64, then use lipo to merge them together. Finally, to ensure that the result will work with older MacOS versions, I set the environment variable MACOSX_DEPLOYMENT_TARGET=10.9 to tell the compiler to produce binaries compatible with OS X Mavericks.
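The lipo merge for the uncooperative dependencies looks roughly like this (directory layout illustrative):

```shell
# Build libffi once per architecture, then fuse the two single-arch
# builds into one universal ("fat") Mach-O library.
lipo -create \
    build-x86_64/lib/libffi.dylib \
    build-arm64/lib/libffi.dylib \
    -output universal/lib/libffi.dylib

# Confirm both slices are present:
lipo -archs universal/lib/libffi.dylib
```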

On Linux, there are two common approaches to compiling for alternative architectures: cross compilation and Docker with QEMU emulation. Cross compilation is fast since the compiler runs natively on the build host, but all libraries and dependencies required to build the program must also exist for the target architecture. These libraries and dependencies are typically laid out in a sysroot, a directory structure that mirrors the one found in a host running the target architecture.

Docker, on the other hand, is able to run containers of a different architecture than the host through QEMU user-mode emulation. This is great for compilation, since the compiler running within Docker sees a proper host filesystem (no sysroot needed) and can emit binary code that it thinks is “native.” However, as can be expected of emulation, this is quite slow.
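With QEMU’s binfmt handlers registered, running a foreign-architecture container is a one-liner (image names are just examples):

```shell
# Register QEMU user-mode emulators for foreign architectures
# (tonistiigi/binfmt is a commonly used helper image for this).
docker run --privileged --rm tonistiigi/binfmt --install arm64

# Everything inside this container now runs under emulation, so the
# compiler believes it is on a native arm64 host.
docker run --rm --platform linux/arm64 ubuntu:22.04 uname -m   # aarch64
```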

As with MacOS, I wanted to ensure that the Linux CPython builds are compatible with older Linux distributions. Generally, this backward compatibility is handled at the glibc level, since glibc is required by most programs. The library’s symbols are versioned, and great care is taken by its authors to ensure backward compatibility, so a program built against one glibc release will work with any glibc released afterwards. Taking inspiration from Python’s manylinux2014 platform tag, which specifies that binary Python wheels should target glibc 2.17 (as of this writing, 65% of packages on PyPI target this), portable-python is also compiled against glibc 2.17.
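Whether a binary actually stays within the glibc 2.17 surface can be checked by dumping the versioned symbols it references (a quick sketch; the binary path is illustrative):

```shell
# List the glibc symbol versions a binary depends on. The build is only
# as portable as the newest version that appears in this list.
objdump -T bin/python3 | grep -o 'GLIBC_[0-9.]*' | sort -Vu | tail -3
```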

The easiest way to ensure glibc compatibility is to use Docker and compile inside a CentOS 7 container. This is the approach that manylinux and cibuildwheel take. Early iterations of portable-python also took this approach.

Even though using a CentOS 7 container is convenient, I do not believe it is a sustainable solution for building programs compatible with older Linux distributions. One reason is the upcoming CentOS 7 EOL date. Another is the difficulty of compiling modern programs - CentOS 7 ships with an ancient gcc 4.8.5, and only some architectures get gcc 10 through Red Hat’s devtoolset-10 package. Modern programs may use newer C++ standards, requiring newer compilers. Though newer compilers can be built for CentOS 7 (I’ve built devtoolset-10 on armv7l before), the cost of maintaining these compilers will grow over time.

The solution currently employed by portable-python is to use the zig compiler. Built on LLVM, the zig compiler exposes two commands, zig cc and zig c++, which are drop-in replacements for C and C++ compilers. Additionally, zig’s use of LLVM allows it to target different architectures and cross compile with ease. Finally - and this is the key feature - zig is able to target a particular glibc release version while compiling. With this combination, CPython and its dependencies can be cross compiled for a variety of architectures, targeting glibc 2.17 without maintaining any extra containers or sysroot filesystems. Check out this awesome writeup by Andrew Kelley that expands more on the capabilities of zig cc.
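The glibc baseline is simply appended to the target triple. A minimal sketch, assuming zig is installed:

```shell
# Cross compile a trivial C program for arm64 Linux while pinning the
# glibc compatibility baseline to 2.17.
cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) { puts("hello"); return 0; }
EOF
zig cc -target aarch64-linux-gnu.2.17 -o hello hello.c

# The same mechanism plugs into autoconf/cmake builds, e.g.:
#   CC="zig cc -target aarch64-linux-gnu.2.17" ./configure ...
```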

Conclusion

Building portable-python has been an interesting experience, and I’ve learned much from the process. If you use the Scrypted NVR desktop app, chances are that some of your Scrypted plugins are running on portable-python builds. To get a copy of portable CPython, check out the project repo, where you can download zips from the releases or install via the nodejs installers.