This is part 2 of a 3-part series on Clang. Read part 1 here.
A popular technique modern compilers use to improve the runtime performance of compiled code is to perform computations at compile time instead of at runtime. On top of that, the language itself requires constant expressions to be evaluated at compile time in a variety of contexts. To help with this, I've been working on improving Clang's constant interpreter. Here's a look at just how much work has been done since my previous article in November 2022:
$ git log --after=2022-11-23 | grep -i "\[clang\]\[Interp\]" | wc -l
308
A good chunk of those are NFC (no functional change) commits, which I like to split out from functional changes. All of those patches contain tons of small changes and refactorings, so in this article I concentrate on the largest and most important changes. The new constant expression interpreter is still experimental and under development. To use it, you must pass -fexperimental-new-constant-interpreter to Clang.
Uninitialized local variables
This change feels like it landed an eternity ago, but it was in December 2022. Before C++20, local variables in constexpr functions had to have an initializer.
In C++20, the constant interpreter needs to handle uninitialized local variables and diagnose if the program tries to read from them.
In the new interpreter, this information is handled via an InlineDescriptor struct, which simply contains one bit for the initialization state of the variable. Every local variable is now preceded by an InlineDescriptor, so checking whether a local variable is initialized is as simple as reading that one bit:
┌─────────────────┬─────────┐
│InlineDescriptor │ Data... │
└─────────────────┴─────────┘
This sounds simple enough, but as you can see from the Phabricator review linked above, it required quite a few changes in a lot of places, because the assumption until then had been that local variables are always initialized.
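To illustrate what has to work now, here is a small example of my own (not from the patch or test suite):
constexpr int f(bool b) {
  int a;        // OK since C++20: no initializer required
  if (b)
    a = 10;
  return a;     // diagnosed if a is still uninitialized here
}
static_assert(f(true) == 10);  // fine: a was written before the read
// static_assert(f(false) == 1); // error: read of uninitialized variable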
Support for floating-point numbers
This change also feels like it has been in for much longer, but support for floating-point numbers in the new interpreter is similarly recent.
To support this, there is now a PrimType called PT_Float, which is used for all floating types, most typically float and double. It is backed by a new Floating class, which represents one such value. A Floating variable is basically a wrapper around LLVM's APFloat class, which does exactly the same thing.
For integer primitives, the PrimType fully defines the type. It specifies both the bit width as well as the signedness; e.g., PT_Sint32 is a signed 32-bit integer. For floating-point values, that is not the case, so we need more data when working with them. Typically, that means we need to pass the fltSemantics around so we know whether we have a double or a float value (or any of the many other floating-point types). In other cases, we need to pass on the RoundingMode. If you've worked with LLVM's APFloat before, both of those are probably well known to you.
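To illustrate those two pieces of information outside of the interpreter, here is a minimal, hypothetical use of APFloat directly (not code from the interpreter):
#include "llvm/ADT/APFloat.h"

void demo() {
  // The fltSemantics pick the format (here: 32-bit IEEE float)...
  llvm::APFloat A(llvm::APFloat::IEEEsingle(), "1.5");
  llvm::APFloat B(llvm::APFloat::IEEEsingle(), "2.25");
  // ...and every arithmetic operation needs a rounding mode.
  A.add(B, llvm::APFloat::rmNearestTiesToEven);
}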
In practice, the new interpreter gained new opcodes for floating-point operations, such as Addf and Subf:
class FloatOpcode : Opcode {
  let Types = [];                // no per-type variants; float-only opcodes
  let Args = [ArgRoundingMode];  // each float opcode carries a rounding mode
}

def Addf : FloatOpcode;
def Subf : FloatOpcode;
And when generating bytecode, we need to check whether we're dealing with a floating-point type or not.
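The end result is that code like this (a trivial example of my own) now evaluates in the new interpreter:
constexpr double add(double a, double b) {
  return a + b;  // lowered to the float addition opcode
}
static_assert(add(1.5, 2.25) == 3.75);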
The "AP" in APFloat
means "arbitrary precision" and to support this use case, each APFloat
variable may heap-allocate memory to save excessively large floating-point values. This poses a particular problem for the new interpreter, since values are allocated in a stack or into a char array, the byte code. So without special support, this results in either uninitialized reads or memory leaks. To support this, the new interpreter has special (de)serialization code to handle Floating
variables.
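The actual code lives in the interpreter, but the underlying idea can be sketched roughly like this (hypothetical helper names, not the real implementation):
#include "llvm/ADT/APFloat.h"
#include "llvm/ADT/APInt.h"
#include <cstdint>
#include <vector>

// Sketch: copy an APFloat's bits out into plain words, so the bytecode
// never holds a pointer into APFloat-owned heap memory...
std::vector<uint64_t> serialize(const llvm::APFloat &F) {
  llvm::APInt Bits = F.bitcastToAPInt();
  return {Bits.getRawData(), Bits.getRawData() + Bits.getNumWords()};
}

// ...and reconstruct the value from those words when it is read back.
llvm::APFloat deserialize(const llvm::fltSemantics &Sem,
                          const std::vector<uint64_t> &Words) {
  unsigned BitWidth = llvm::APFloat::semanticsSizeInBits(Sem);
  return llvm::APFloat(Sem, llvm::APInt(BitWidth, Words));
}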
Handling floating-point values correctly was an important step forward, since the parts that make them special will also apply to other types that are yet to come, like arbitrary-bitwidth integers (think _BitInt, or 128-bit integers on hosts that don't support them).
Initializer rework
One of the larger changes I implemented since the last article is reworking initializers.
Previously, we had visitRecordInitializer() and visitArrayInitializer() functions, which initialized the pointer on top of the stack. For _Complex support, I've added an additional visitComplexInitializer() function, but that never got merged. These functions all handled a few types of expressions differently than the normal visit() function. In short, the difference was that visit() created a new value, while visit*Initializer() initialized an already existing pointer with the values from the provided expression.
However, this caused problems in some cases, when the AST contained an expression of a composite type that was not initializing an already existing pointer. We had no way of differentiating these cases when generating byte code.
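For illustration, here is the kind of code (my own example, not from the patches) where a composite expression isn't initializing a pre-existing pointer:
struct S { int a, b; };

constexpr int f() {
  S{1, 2};           // discarded: there is no Pointer to initialize
  return S{3, 4}.b;  // composite subexpression of a member access
}
static_assert(f() == 4);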
In the new world, the byte code generator contains more fine-grained functions to control code generation:
- visitInitializer(): Sets an internal Initializing flag to true. When generating bytecode, we can check that flag and act accordingly. If it is true, we can assume that the top of the stack is a Pointer, which we can initialize.
- discard(): Evaluates the given expression for its side effects, but does not leave any value behind on the stack.
- visit(): The old visit() function is still being used, but if the given expression is of composite type (and a few other restrictions hold), it will now automatically create a temporary variable and call visitInitializer() to initialize it instead. This ensures that visit() always pushes a valid PrimType to the stack.
- delegate(): Simply passes the expression on, keeping all the internal flags intact. This is a replacement for the previous pattern of return DiscardResult ? this->discard(E) : this->visit(E).
Invalid expressions
Even though every new C++ version allows more and more constructs in constant contexts, some constructs still aren't supported. For those, we've added a new Invalid opcode that simply reports an error when interpreted.
Such an opcode is necessary because we can't reject a constexpr function outright when we encounter such an expression while generating bytecode for it. For example, the following function can be executed just fine in a constant context, even though the throw statement is not supported there:
constexpr int f(bool b) {
if (b)
throw;
return 1;
}
static_assert(f(false) == 1); // Works
static_assert(f(true) == 1); // Doesn't
Builtin functions
Clang has tons of builtin functions (starting with __builtin), many of which are also supported at compile time. Since the last article, the new interpreter has gained support for quite a few of them, mostly floating-point builtins like __builtin_fmin():
static_assert(__builtin_fmin(1.0, 2.0) == 1.0);
Most of the builtin functions are not hard to implement, but they go a bit against what the new interpreter does: generate (target-dependent) byte code. Instead, we have to do the computations on target-independent values and then convert them to target-dependent values again. This is most interesting for the size of types (e.g., int isn't always 4 bytes).
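A few of the builtins listed later in this article, used in constant expressions:
static_assert(__builtin_strlen("clang") == 5);
static_assert(__builtin_isinf(__builtin_inf()));
static_assert(__builtin_fmax(1.0, 2.0) == 2.0);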
Support for complex numbers
C and C++ have a commonly supported language extension called "complex numbers." You might remember them from math class. For our purposes, the most interesting part is that they consist of two components: real and imaginary.
Here's a small demo in case you've never seen them:
constexpr _Complex float F = {1.0, 2.0};
static_assert(__real(F) == 1.0);
static_assert(__imag(F) == 2.0);
Because they always consist of exactly two elements, we model them as a two-element array and don't create a special PrimType. As an example, implementing the __real unary operator from the example above can be done by simply returning the first element of the array:
case UO_Real: { // __real x
  // Visit the subexpression; this leaves the complex value on the stack.
  if (!this->visit(SubExpr))
    return false;
  // Push the index of the real component (element 0)...
  if (!this->emitConstUint8(0, E))
    return false;
  // ...and select that array element, popping the base pointer.
  return this->emitArrayElemPtrPopUint8(E);
}
This leaves the first element of the array, i.e., the real part, on top of the stack.
Of course, these are the simple operations that need to be supported for complex types. Arithmetic operations are still a work in progress. I have a series of patches for complex types that are already finished and approved to be pushed, but I'm trying to hold them back until I'm sure the design works out. This is important because the design carries over to the implementation of vector types and fixed-point types.
Google Summer of Code
As a side note, I have also been busy this past year mentoring a GSoC student, Takuya Shimizu, who improved Clang's diagnostic output.
You can read more about his changes and the improvements in Clang 17's diagnostic output in general in his blog post.
Smaller additions and future work
The remaining changes aren't as interesting, but here are a few (a small combined example follows the list):
- Global variables of record and array type are now (recursively) checked for initialization.
- Implemented the missing mul, div, and rem compound assignment operators.
- Implemented switch statements.
- Implemented builtin functions: __builtin_is_constant_evaluated(), __builtin_assume(), __builtin_strcmp(), __builtin_strlen(), __builtin_nan(), __builtin_nans(), __builtin_huge_val(), __builtin_inf(), __builtin_copysign(), __builtin_fmin(), __builtin_fmax(), __builtin_isnan(), __builtin_isinf(), __builtin_isinf_sign(), __builtin_isfinite(), __builtin_isfpclass(), __builtin_fpclassify(), __builtin_fabs(), and __builtin_offsetof.
- Support for logical and/or operators.
- Support for C++ range-for loops.
- Support for destructors.
- Support for function pointers (and calling them).
- Track frame depth (including diagnostics).
- Support for virtual function calls.
- Support for lambda expressions.
- Support for SourceLocExprs.
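Several of the items above can be seen together in one small, contrived example of my own:
constexpr int classify(int x) {
  switch (x) {           // switch statements
  case 0:
    return 0;
  default:
    break;
  }
  int sum = 0;
  int arr[] = {1, 2, 3};
  for (int v : arr)      // range-based for loops
    sum += v;
  auto twice = [](int n) { return 2 * n; };  // lambdas
  return twice(sum) + (x > 0 && sum > 0);    // logical operators
}
static_assert(classify(5) == 13);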
My work in the following months will concentrate on supporting more of the constructs we need in order to support the standard headers. This includes, in particular, 128-bit integers and IntegralToPointer casts. As always, I'd like to use this opportunity to thank all the reviewers who spend so much time reviewing my many patches. This includes especially, but not exclusively, Aaron Ballman, Corentin Jabot, Erich Keane, and Shafik Yaghmour.