One of the most important early decisions when building a Linux distribution is the scope of supported hardware. The distribution's default compiler flags are significant for hardware-platform compatibility. Programs that use newer CPU instructions might not run on older CPUs. In this article, I discuss a new approach to building the x86-64 variant of Red Hat Enterprise Linux (RHEL) 9 and share Red Hat's recommendation for that build.
Background of the x86-64 microarchitecture levels
The GNU C Library (glibc) offers a way to load optimized libraries that use additional hardware features that might not be present on all systems. Originally, this mechanism was designed to support perhaps one or two alternative library implementations, in addition to the default (fallback) implementation that is usually installed in the /usr/lib64 directory. However, the power-set construction involved in the library lookup mechanism poorly matches current platforms with a long list of optional CPU features. We see this especially on the x86 architecture, where many optional features have been added over the years (see the Wikipedia article for the CPUID instruction for a list). The plethora of choices poses a problem not only for the dynamic linker but also for programmers. Until recently, there has been little guidance on what CPU features to assume in optimized libraries. GCC and glibc disagree on the definition of feature sets, and the glibc selection mechanism is vendor-specific.
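To see why a power-set construction scales badly, consider the following minimal sketch. It is a simplified model, not the actual glibc lookup code: the feature names are hypothetical, and the real loader's directory naming and ordering differ. It only illustrates that the number of candidate subdirectories doubles with each optional feature.

```python
from itertools import combinations


def power_set_dirs(features):
    """Return every non-empty feature combination as a candidate subdirectory.

    Simplified model of a power-set lookup; names and ordering are
    illustrative assumptions, not the real glibc behavior.
    """
    dirs = []
    # Try the largest combinations first so the most specialized build wins.
    for size in range(len(features), 0, -1):
        for combo in combinations(features, size):
            dirs.append("/".join(combo))
    return dirs


# Hypothetical feature names, for illustration only.
features = ["feat_a", "feat_b", "feat_c"]
candidates = power_set_dirs(features)
print(len(candidates))  # 2**3 - 1 non-empty subsets
for path in candidates:
    print(path)
```

With 3 features the model yields 7 candidates; with 10 features it would already yield 1,023, which is why a short, ordered list of levels is easier to handle.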
In the summer of 2020, AMD, Intel, Red Hat, and SUSE collaborated to define three x86-64 microarchitecture levels on top of the x86-64 baseline. The three levels group CPU features roughly by hardware release date:
- x86-64-v2 brings support (among other things) for vector instructions up to Streaming SIMD Extensions 4.2 (SSE4.2) and Supplemental Streaming SIMD Extensions 3 (SSSE3), the POPCNT instruction (useful for data analysis and bit-fiddling in some data structures), and CMPXCHG16B (a two-word compare-and-swap instruction useful for concurrent algorithms).
- x86-64-v3 adds vector instructions up to AVX2, MOVBE (for big-endian data access), and additional bit-manipulation instructions.
- x86-64-v4 includes vector instructions from some of the AVX-512 variants.
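As a rough illustration of how the levels stack, the sketch below maps a set of CPU flags to the highest level it satisfies. It assumes the flag spellings that Linux reports in /proc/cpuinfo (for example, pni for SSE3 and abm for LZCNT), and the per-level flag lists are my reading of the psABI supplement rather than an authoritative copy, so check them against the specification before relying on the result.

```python
# Flag names as spelled in /proc/cpuinfo on Linux (an assumption of this sketch).
LEVEL_FLAGS = [
    (1, {"lm", "cmov", "cx8", "fpu", "fxsr", "mmx", "syscall", "sse", "sse2"}),
    (2, {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    (3, {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    (4, {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]


def microarch_level(flags):
    """Return 0 (below baseline), 1 (baseline), or 2-4 (x86-64-v2 to -v4).

    Levels are cumulative: a level counts only if all lower levels are met.
    """
    level = 0
    for number, required in LEVEL_FLAGS:
        if not required <= flags:
            break
        level = number
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the flag set of the first CPU, or an empty set if unavailable."""
    try:
        with open(path) as cpuinfo:
            for line in cpuinfo:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass  # Not Linux, or /proc is not mounted.
    return set()


if __name__ == "__main__":
    print("Detected level:", microarch_level(read_cpu_flags()))
```

On a non-x86 or non-Linux system the sketch reports level 0, because no matching flags line is found.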
We've documented the three levels in detail in the x86-64 psABI supplement. The upcoming GCC version 11 and LLVM version 12 releases will support them in -march= arguments. Patches to augment the glibc dynamic loader with a new mechanism (without the power-set construction) have been incorporated into glibc under the glibc-hwcaps moniker. These changes are expected to be part of the upcoming 2.33 release of glibc.
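The following sketch shows the kind of flat, ordered search that replaces the power-set construction. It assumes that optimized libraries live in one subdirectory per level under a glibc-hwcaps directory inside the library directory, and that higher levels are tried first; treat the exact paths as an assumption and consult the glibc documentation for the authoritative layout.

```python
# Level names above the baseline, lowest first.
LEVELS = ["x86-64-v2", "x86-64-v3", "x86-64-v4"]


def hwcaps_search_dirs(cpu_level, libdir="/usr/lib64"):
    """List directories to search for a CPU at the given level.

    cpu_level: 1 for the baseline, 2 to 4 for x86-64-v2 to x86-64-v4.
    The directory layout is an assumption of this sketch.
    """
    supported = LEVELS[: max(cpu_level - 1, 0)]
    # Most capable level first, then the default (fallback) directory.
    dirs = [f"{libdir}/glibc-hwcaps/{name}" for name in reversed(supported)]
    dirs.append(libdir)
    return dirs


for directory in hwcaps_search_dirs(3):
    print(directory)
```

The list grows by one entry per level instead of doubling per feature, and a baseline-only CPU simply falls through to the default directory.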
Architectural considerations for RHEL 9
Historically, the x86_64 Red Hat Enterprise Linux userspace has been built to match the original AMD K8 baseline minus the AMD-specific 3DNow! parts. That decision has held up to and including the latest version of Red Hat Enterprise Linux 8. However, due to kernel-driver removals, old hardware (such as systems with first-generation Opteron CPUs) is unlikely to run Red Hat Enterprise Linux in any useful fashion. Older hardware also has significant power requirements.
So far, we've been able to utilize new CPU features via mechanisms like IFUNCs, function multi-versioning, or loading alternative implementations via dlopen, which could be automated with the ongoing glibc-hwcaps work. Each of these approaches applies only to specifically designated blocks of code. The remainder of the distribution still does not employ additional CPU features, so those parts of the CPU are essentially dormant.
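The idea behind IFUNC-style dispatch can be sketched conceptually as follows. This is only an analogue in Python, not how IFUNCs are implemented: in a real system the dynamic linker calls a resolver function in C, and the "fast" variant here is merely a stand-in for a build that would use the POPCNT instruction. All function names are hypothetical.

```python
def popcount_generic(x):
    """Portable fallback: count set bits of a non-negative integer."""
    count = 0
    while x:
        x &= x - 1  # Clear the lowest set bit.
        count += 1
    return count


def popcount_fast(x):
    """Stand-in for a variant built to use the POPCNT instruction."""
    return bin(x).count("1")


def resolve_popcount(cpu_flags):
    """Choose one implementation, as an IFUNC resolver would at load time."""
    return popcount_fast if "popcnt" in cpu_flags else popcount_generic


# The choice is made once; later calls go straight to the selected function.
popcount = resolve_popcount({"popcnt"})
print(popcount(0b1011))
```

The sketch also shows the limitation described above: only the function that was explicitly given two variants benefits, while all other code runs the same instructions regardless of the CPU.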
As a welcome side-effect of defining the x86-64 microarchitecture levels, we now have a convenient language for discussing the architectural baseline for Linux distributions: We can stay the course and use the original K8 baseline or we can apply one of the three later levels.
Recommendations for RHEL 9
We believe that x86-64-v2 is the appropriate choice for Red Hat Enterprise Linux 9. Our recommendation is based on the following observations:
- Virtual machine models that artificially mask x86-64-v2 CPU features despite host support are comparatively easy to adjust.
- The next level, x86-64-v3, is not available because we intend to build one unified distribution for the x86-64 architecture.
- Some new server-class CPUs released in 2020 do not implement the AVX instruction set.
- AVX instructions are also unavailable in certain software implementations (although the valgrind tool supports them). The lack of emulation support could constrain developers targeting Red Hat Enterprise Linux 9.
As in previous Red Hat Enterprise Linux versions, we will continue to support other CPU features (beyond x86-64-v2) via IFUNCs and function multi-versioning. We might also use the glibc-hwcaps mechanism to load optimized versions of libraries. As usual, these plans could change at any time before the release of Red Hat Enterprise Linux 9.
Currently, the changes do not apply to Fedora outside of Fedora ELN.
Support for other architectures
Together with its hardware partners, Red Hat regularly reviews the architectural baselines for all architectures. For IBM POWER and IBM Z, we have historically updated the baseline for every major release. For example, Red Hat Enterprise Linux 8 requires POWER little-endian (ppc64le) and—for the s390x architecture—a z13 or later CPU.
Conclusion
This article described the criteria guiding Red Hat’s approach to choosing an x86-64 microarchitecture level for Red Hat Enterprise Linux 9. Our recommendation, x86-64-v2, will support additional vector instructions (up to SSE4.2 and SSSE3), the POPCNT instruction for data analysis, and the CMPXCHG16B instruction for concurrent algorithms.
Last updated: February 5, 2024