In the previous article, I discussed the benefits of C and C++ language restrictions in optimized code. In this second half, I present a variety of programming language exemptions and compiler extensions that developers can use to get around aliasing restrictions more or less safely. I will also discuss the common pitfalls of aliasing, resulting both from the extensions and from misuses of standard language constructs, and illustrate common problems these pitfalls might cause.
Exceptions to the rules
The restrictions imposed by the aliasing rules I introduced in Part 1 might seem relatively lenient, but many use cases would be impossible without exceptions to them, especially in system-level code. The C and C++ languages only codify two of these exemptions; the remaining ones are implementation-defined extensions found in popular compilers. Because the expressiveness and power of the implementation-defined extensions tend to be prioritized over safety and the ability to detect mistakes, their use is often fraught with peril—at least as much as, if not more than, using the standard mechanisms. In this section, I will introduce both programming language exemptions and compiler extensions. I will explain how these exceptions to the rules of aliasing are commonly misused and the consequences of abusing them.
Aliasing by character types
Both C and C++ provide an exemption to the type-aliasing requirement introduced in Part 1. This exemption allows you to copy objects by calls to functions like memcpy and memmove, or their user-defined equivalents. This exemption says that, in addition to its own type, an object of any type may have its value accessed by an lvalue of unsigned char or any other narrow character type, although the exemption is best limited to unsigned char. (In recent C++ the special library type std::byte can also be used.)
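As a minimal sketch of such a user-defined equivalent, the function below copies an object one byte at a time through unsigned char lvalues. The names copy_bytes and copy_long are hypothetical and chosen only for this illustration.

```c
#include <stddef.h>

/* Hypothetical user-defined equivalent of memcpy.  Every access goes
   through an lvalue of unsigned char, which may alias any object.  */
void *copy_bytes (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  for (size_t i = 0; i != n; ++i)
    d[i] = s[i];

  return dst;
}

/* Copies the object representation of one long into another.  */
long copy_long (long value)
{
  long result = 0;
  copy_bytes (&result, &value, sizeof result);
  return result;
}
```

Because the copy reads and writes only bytes, it stays valid no matter what type the source and destination objects have.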
We can use an lvalue of unsigned char to prevent the optimization you saw in Part 1: Accesses via pointers to incompatible types. Consider the modified version here (also see Access by character type):
    int f (int *a, long *b)
    {
      int t = *a;
      for (int i = 0; i != sizeof *b; ++i)
        ((unsigned char*)b)[i] = 0;
      return *a - t;   // must not be folded
    }
In this case, the compiler cannot fold the return expression because the function would be valid if called with b equal to a. However, using restrict when declaring the a and b pointers would make calling f with overlapping objects invalid. Doing that would re-enable the optimization opportunity.
This exemption keeps compilers from making assumptions about functions accessing objects via pointers to incompatible types. It does so even in a context where it's safe to assume pointers point to distinct objects, such as when pointers point to distinct types. For example, consider that we might also use the for loop in this example in another function. A compiler would have to avoid folding the return expression if f were to call this other function, even if *b were also read in f.
The permission for unsigned char to access objects of other types doesn't lift the constraint imposed by the restrict keyword, however. In a function like f in the example that follows, using the restrict keyword with the pointer a implies that a and b either do not overlap or, if they do, that the call g(b) doesn't modify *a via *b or by any other means.
The next example shows how at least one compiler leverages this restriction to fold the subtraction in the return expression (see Transitivity of restrict qualifier):
    void g (void *);

    int f (int * restrict a, void * b)
    {
      int t = *a;
      g (b);           // can be assumed not to modify *a
      return *a - t;   // can be folded to zero
    }
This optimization is possible regardless of what type b points to, or whether b is declared with the restrict keyword. Note, however, that for clarity, it's best to use restrict on all pointer arguments that you intend to be subject to this restriction.
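A short sketch of that advice, using a hypothetical helper named add_arrays: both pointer parameters carry the restrict qualifier, so the contract that the two arrays must not overlap is visible in the declaration itself.

```c
#include <stddef.h>

/* Hypothetical helper: adds src[i] to dst[i] for n elements.
   Both pointers are restrict-qualified, so callers promise that
   the two arrays do not overlap.  */
void add_arrays (size_t n, int *restrict dst, const int *restrict src)
{
  for (size_t i = 0; i != n; ++i)
    dst[i] += src[i];
}
```

Calling add_arrays with overlapping arrays would break the promise made by the qualifiers, so the sketch is only meaningful for distinct arrays.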
Common initial sequence
As I mentioned earlier, the requirement to access every object by an lvalue of its type also rules out accessing a member of one struct using a pointer to another struct, even if both members have the same type. However, it turns out that this form of aliasing can be useful between members of the same union. To enable this use case, C and C++ offer the special exemption that when two structs are members of the same union, accessing their common initial sequence is valid. Programmers can use this special exception to modify the initial members of otherwise incompatible struct types, as shown here (see Common initial sequence):
    struct A { int num, a[2]; };
    struct B { int cnt, a[4]; };
    union U { struct A a; struct B b; };

    int f (struct A *a, struct B *b)
    {
      int t = a->num;
      ((union U*)b)->b.cnt = 0;   // may change a->num
      return a->num - t;          // cannot be folded
    }
In this case, it's valid to call f with both arguments pointing to the same object. Conversely, when a compiler compiles a call to a function that takes pointers to distinct structs as arguments, it has to assume that the call could modify the members in the common initial sequence of those structs.
Implementation notes
One thing to note is that the common initial sequence consists of members that have compatible types. For arrays, the compatible type includes their size. The sequence ends with the first occurrence of a pair of members whose types are not strictly compatible. The sequence in the example consists of just the two members a->num and b->cnt. It doesn't extend to the first two elements of the a->a and b->a arrays, however, although they have the same type. The reason is that the types of the arrays are not compatible: int[2] is not compatible with int[4], or even with int[], for that matter.
Another thing to note is that the common initial sequence rule isn't consistently interpreted by all implementers. As a result, different compilers might disagree about how to handle access to initial members of a union. In the most conservative interpretation, the mere definition of a union type implies that any accesses to struct objects that share a common initial sequence should be assumed to alias unless proven otherwise. At the other extreme, the most restrictive interpretation says that the access must involve a cast to the union type. The latter approach, which is shown in the example, is the safest one for portable code. The GNU Compiler Collection (GCC) uses this interpretation.
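A common use of this exemption is a tagged union whose members all begin with the same discriminator. The sketch below uses hypothetical types (circle, rect, shape) and performs every access through the union type, in line with the restrictive interpretation described above.

```c
enum shape_kind { SHAPE_CIRCLE, SHAPE_RECT };

/* Both structs start with the same member, which forms (part of)
   their common initial sequence.  */
struct circle { int kind; int radius; };
struct rect   { int kind; int width, height; };

union shape { struct circle c; struct rect r; };

/* Reads the discriminator through member c regardless of which
   member was stored last.  The access names the union type.  */
int shape_kind_of (const union shape *s)
{
  return s->c.kind;
}

/* Returns the horizontal extent of the shape, or -1 if unknown.  */
int shape_width (const union shape *s)
{
  switch (shape_kind_of (s))
    {
    case SHAPE_CIRCLE: return 2 * s->c.radius;
    case SHAPE_RECT:   return s->r.width;
    default:           return -1;
    }
}
```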
Type punning via a union
The common initial sequence exemption says that when two structs are members of the same union, accessing their common initial sequence is valid. As an extension of this rule, it is also acceptable to access an object of one type via an lvalue of another type, if both types are members of the same union. This kind of access (reading an object by an lvalue of a type other than that of its stored value) is called type punning. It's permitted in standard C but disallowed in C++.
GCC supports type punning in both languages with a restrictive interpretation similar to what it offers for the common initial sequence. In this case, the access expression must involve the union type. See the comments about type punning in the GCC manual.
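The following C sketch shows type punning through a union. The function name float_bits is hypothetical, and the example assumes that float is a 32-bit IEEE 754 type, which the languages do not guarantee.

```c
#include <stdint.h>

/* Assumption of this sketch: float occupies exactly 32 bits.  */
_Static_assert (sizeof (float) == sizeof (uint32_t),
                "sketch assumes a 32-bit float");

/* Returns the object representation of f as an integer.  The value
   is stored via one union member and read via the other, and the
   access expression involves the union type.  */
uint32_t float_bits (float f)
{
  union { float f; uint32_t u; } pun = { .f = f };
  return pun.u;
}
```

On an IEEE 754 target, float_bits (1.0f) yields 0x3f800000.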
Attribute may_alias
Extending the exemption for unsigned char, GCC's may_alias type attribute makes it possible to define a type that is exempt from type-based aliasing restrictions. As with unsigned char, lvalues of a type declared with this attribute can be used to access objects of any type (see Access via a may_alias type):
    int f (int *a, long *b)
    {
      int t = *a;
      typedef __attribute__ ((may_alias)) long AliasLong;
      *(AliasLong*)b = 0;
      return *a - t;
    }
Used this way, may_alias prevents the optimization found in the example of Accesses via pointers to incompatible types. The compiler can't fold the return expression in this example because it's now valid for b to point to the same object as a.
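As a further illustration, the sketch below uses a may_alias typedef to modify the bits of a float in place. The names aliasing_u32 and clear_sign are hypothetical, the attribute is specific to GCC and compatible compilers, and the code assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>

/* Assumption of this sketch: float occupies exactly 32 bits.  */
_Static_assert (sizeof (float) == sizeof (uint32_t),
                "sketch assumes a 32-bit float");

/* A 32-bit unsigned type exempt from type-based aliasing rules.  */
typedef uint32_t __attribute__ ((may_alias)) aliasing_u32;

/* Clears the IEEE 754 sign bit of f by accessing the float object
   through an lvalue of the may_alias type.  */
float clear_sign (float f)
{
  aliasing_u32 *bits = (aliasing_u32 *) &f;
  *bits &= UINT32_C (0x7fffffff);
  return f;
}
```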
Aliases and weak symbols
To support low-level programs and libraries, GCC and compatible compilers provide several extensions that make it possible to define aliases for functions, as well as variables. Like may_alias, these extensions typically take the form of attributes, or sometimes of pragmas.
Attribute alias
When applied to a declaration of a function or variable, the attribute alias tells the compiler that the declared symbol provides an alias, or alternate identity, for the symbol being named. The named symbol is known as the alias target. The target must be defined in the same translation unit as the alias; the alias itself can only be declared, it cannot be defined. Typically, especially in libraries, the alias is declared as an ordinary symbol, without the attribute. This declaration is placed in a header, which a program can include. The target usually is not declared in a public header.
Aliases are most commonly used to provide an alternative name for a function, but they also work for variables. For instance, because b in the example below is declared as an alias (that is, as another name) for the array a, the compiler can no longer fold the return expression to zero (see Attribute alias on a variable):
    int a[8];
    extern __attribute__ ((alias ("a"))) int b[8];

    int f (int i, int j)
    {
      int t = a[i];
      b[j] = 0;          // modifies a
      return a[i] - t;   // cannot be folded
    }
The alias declaration is a definition even when also declared extern. The alias target must be defined in the same translation unit as the alias declaration. Consequently, in the general case, there is no way to declare an alias in a header to let the compiler know that the two refer to the same symbol. This fact violates one of the basic principles of the C and C++ object models, that distinct declarations must designate distinct entities.
If a compiler relies on this principle (as they all inevitably do), using the alias attribute can lead to surprising results. As an example, say that both the alias and the target are declared in a header and used in a program. The compiler might then fold the subtraction in the return statement to zero in a function just like f in the example above, but defined in a different source file.
Aliases are useful and important, but without extreme care, using them can lead to subtle bugs.
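For the more common case of a function alias, a minimal sketch might look as follows. The names impl_add and add are hypothetical, and the attribute is assumed to be available on the target (it is on ELF targets; some other object formats do not support it).

```c
/* The alias target, defined in this translation unit.  In a library
   this internal name would typically not appear in a public header.  */
int impl_add (int a, int b)
{
  return a + b;
}

/* The public name, declared as an alias for impl_add.  A header would
   declare it as an ordinary function: int add (int, int);  */
extern int add (int, int) __attribute__ ((alias ("impl_add")));
```

A call to add (2, 3) runs the body of impl_add, since both names refer to the same function.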
Attribute weak
The weak attribute is similar to the alias attribute: It declares that the function or variable it is attached to denotes a weak symbol, which may be (but need not be) defined elsewhere in the program. If the symbol is not defined, its address is equal to null. Like alias, the weak attribute is also intended primarily to provide a mechanism to declare "special" library functions.
The typical example is the malloc family of functions, which Unix-based implementations of the C library allow programs to replace with alternatives of their own. In this case, the "strong" definition would be used in place of the weak one. Specifying the weak attribute on a variable declaration has the same meaning as it would on a function. Unlike alias, however, weak symbols need not be defined. In those instances, the address of such an undefined weak symbol (either function or variable) is null, and so using such a symbol must be preceded by a test for its address being non-null.
This rule is in conflict with the C and C++ standards, which require that the address of every function and object in a program must be non-null. However, as long as the tested declaration is known to be a weak symbol (that is, it has the attribute weak), compilers will not use the standard requirement to remove such a test.
You can see an example in the following function, where a is not declared as a weak symbol, and so is considered to declare an ordinary or strong symbol. The tests for a are removed, but the test for b is retained (also see Testing address of symbols for equality to null):
    extern int a[8];
    extern __attribute__ ((weak)) int b[8];

    int f (int i, int j)
    {
      int t = a ? a[i] : 0;       // replaced by 'int t = a[i];'
      if (b)                      // test emitted
        b[j] = 0;                 // may modify a
      return a ? a[i] - t : 0;    // folded to zero
    }
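The same null test applies to weak functions. The sketch below declares a hypothetical optional_hook function as weak and deliberately leaves it undefined, so on targets that support undefined weak symbols (such as ELF) its address is null and the call is skipped.

```c
/* Hypothetical optional hook.  It is declared weak and intentionally
   not defined anywhere in this sketch.  */
extern void optional_hook (int *) __attribute__ ((weak));

/* Calls the hook only if some part of the program defines it.
   Because the declaration is weak, the test is not removed.  */
int run_with_hook (int value)
{
  if (optional_hook)
    optional_hook (&value);
  return value;
}
```

If another source file linked into the program defined optional_hook, the test would succeed and the hook would run.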
Weak declarations
Any symbol can be declared weak. If we were to declare a to be weak in a different file, eliminating the tests again would lead to surprising results. Consequently, if one declaration declares a symbol weak, they all should. Compilers tend to translate programs one source file at a time, so issuing warnings for code that does otherwise is rarely feasible.
Additionally, because an external symbol can also be declared to be an alias, if b were declared an alias for a in another source file, the results would be surprising, and likely incorrect. To illustrate the risks, imagine that we compiled the above file and linked it to a complete program with a file containing the following declarations:
    #include <stdio.h>

    int b[8] = { 0, 1 };
    extern __attribute__ ((alias ("b"))) int a[8];

    int f (int, int);

    int main (void)
    {
      int n = f (1, 1);
      printf ("%i %i\n", b[1], n);
    }
Although the complete program compiles and links with no warnings, when we run it, it behaves as if a and b were distinct objects, even though they are one and the same.
Zero-length arrays
In contrast to the extensions discussed so far, GCC's zero-length array feature isn't meant to provide an escape hatch from aliasing rules. Rather, it's an ancient mechanism designed to get around the absence of flexible-array members, which were first introduced in C99. The goal of both a zero-length array and a flexible-array member is to declare a structure with a size that is determined at runtime. It allows the last member of such a structure to be an array with an unspecified number of zero or more elements. However, unlike a flexible-array member, which must always be the last member of a structure object, zero-length arrays are accepted in any context—even serving as interior structure members that are followed by other members.
With the exception of padding, an access to an element of an interior zero-length array is actually an access to a subsequent member. This is not an intended feature, but rather a consequence of overly permissive design. Compilers can (and GCC does) assume that such overlapping accesses do not take place. As a result, GCC 10 uses the new -Wzero-length-bounds warning to diagnose accesses to interior zero-length arrays.
The next example illustrates both the invalid assumption that array accesses may alias other members of the same object and the warning that detects it (also see Aliasing by zero-length array):
    struct A { int n, a[0]; };
    struct B { struct A a; int x; };

    int f (struct B *p, int i)
    {
      int t = p->x;
      p->a.a[i] = 0;     // -Wzero-length-bounds
      return p->x - t;   // can be folded to zero
    }
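For contrast, here is a sketch of the use that trailing arrays were designed for: a structure whose size is determined at runtime, written with a standard C99 flexible-array member as the last member of the outermost struct. The type int_buf and its functions are hypothetical names for this illustration.

```c
#include <stddef.h>
#include <stdlib.h>

/* The flexible-array member is last, and the struct is not nested
   in any other struct, so no other member can overlap its elements.  */
struct int_buf
{
  size_t n;
  int data[];
};

/* Allocates a buffer with room for n elements, set to 0 .. n-1.
   Returns a null pointer if the allocation fails.  */
struct int_buf *int_buf_new (size_t n)
{
  struct int_buf *b = malloc (sizeof *b + n * sizeof b->data[0]);
  if (!b)
    return NULL;

  b->n = n;
  for (size_t i = 0; i != n; ++i)
    b->data[i] = (int) i;
  return b;
}

/* Sums the elements of the buffer.  */
long int_buf_sum (const struct int_buf *b)
{
  long sum = 0;
  for (size_t i = 0; i != b->n; ++i)
    sum += b->data[i];
  return sum;
}
```

The caller releases the buffer with a single call to free.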
Opting out of aliasing rules
As the examples so far show, C and C++ outline exact requirements about the identity of symbols and objects in programs. Programs that abide by these requirements benefit by reducing the number of memory accesses required to reload unchanged values. But what about programs that were not written with these requirements in mind? Legacy software and poorly written code both ignore aliasing requirements, albeit for different reasons. Is there some way for these programs to opt out of aliasing rules?
Although you might think the answer would be yes, it turns out that, for the most part, the answer is no. GCC and compatible compilers do provide the -fno-strict-aliasing option, but it only applies to a subset of the rules; namely, type-based aliasing. The -fno-strict-aliasing option doesn't prevent GCC from making other assumptions I've discussed, including those about the identity of objects and the absence of other forms of aliasing (such as with zero-length arrays).
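Code that reads a byte buffer by casting its address to a pointer to a wider type is a typical candidate for -fno-strict-aliasing. As a sketch of one alternative that needs no opt-out, the hypothetical helpers below go through memcpy, which relies only on the character-type exemption described earlier.

```c
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit value from a possibly unaligned byte buffer without
   accessing the bytes through a uint32_t lvalue.  */
uint32_t load_u32 (const unsigned char *bytes)
{
  uint32_t value;
  memcpy (&value, bytes, sizeof value);
  return value;
}

/* Writes a 32-bit value to a possibly unaligned byte buffer.  */
void store_u32 (unsigned char *bytes, uint32_t value)
{
  memcpy (bytes, &value, sizeof value);
}
```

The byte order of the stored value is that of the host, so the sketch is suitable for round trips on one machine rather than for a portable file format.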
The price of aliasing exemptions
As with most exceptions to the rules, permission for other types to access objects of any type comes at a price. A particular use of the C++ std::string container nicely illustrates this problem. Given that std::string is little more than a wrapper for a character pointer (specifically, const char*), the compiler assumes that any modification to a std::string object could potentially modify any object that may be reachable by that pointer in the program. The only exception would be if a compiler could track the value of the wrapped pointer (or, using a more technical phrase, track its provenance) and prove otherwise. This rule holds even though the class guarantees that the wrapped pointer never points to anything but an internal buffer, which is managed by the object.
You can see this pitfall illustrated in the next example, where we would like the compiler to fold the return expression to zero. Because of the exemption, no compiler is able to do it (also see Access by std::string aliases anything):
    #include <string>

    int x;

    int f (std::string &str)
    {
      int t = x;
      str = "";       // assumed to alias x
      return t - x;   // not folded to zero
    }
The price, in this case, is an efficiency penalty, and the problem isn't limited to std::string. It affects any C or C++ container type that embeds an internal pointer that it uses to access data. With pointers to other types such as int*, the scope of the problem is limited to objects of just the compatible types. So, for example, we must assume that modifying a std::vector<int> may modify any reachable variable of type int.
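The same cost shows up in plain C whenever stores go through a character pointer. In the sketch below (fill_count, fill_reloading, and fill_cached are hypothetical names), the first loop may have to reload the global on every iteration because the stores could modify it, while the second reads it once into a local copy that the stores cannot reach.

```c
#include <stddef.h>

size_t fill_count;   /* hypothetical global read by both functions */

/* Each store through buf may alias fill_count as far as the compiler
   can tell, so the loop bound may be reloaded on every iteration.  */
void fill_reloading (char *buf, char c)
{
  for (size_t i = 0; i < fill_count; ++i)
    buf[i] = c;
}

/* The bound is read once into a local whose address is never taken,
   so no store through buf can change it.  */
void fill_cached (char *buf, char c)
{
  const size_t n = fill_count;
  for (size_t i = 0; i != n; ++i)
    buf[i] = c;
}
```

Both functions produce the same result when buf does not overlap fill_count; whether the first is actually slower depends on the compiler and the surrounding code.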
Besides the loss of efficiency, exemptions from otherwise tight rules have another consequence: They limit the ability to detect and diagnose coding bugs. For instance, by accepting declarations of zero-length arrays even when they are followed by another member, compilers introduce the possibility of bugs into any code that accesses such members (also see Access by zero-length array):
    struct A { int x, a[0], y; };

    int f (struct A *a)
    {
      int t = a->y;
      a->a[0] = 123;      // overlaps with a->y
      return t - a->y;    // folded to zero
    }
With the exception of Visual C++, which doesn't support the extension, all tested compilers fold the return expression in this example to zero. Yet, when called with the address of an object whose member y is set to any value but 123, the function returns an unexpected result: also zero. GCC 10 is the only compiler that detects this likely bug; it does so by issuing the -Wzero-length-bounds warning. Unfortunately, GCC 10 is impotent against instances of the same bug when the zero-length array is the last member of a struct sub-object, which is then followed by another member in some other struct.
This is not a problem with standard flexible-array members. The C language requires those to be defined last in the outermost enclosing struct. That also means that a struct with a flexible array member cannot be used to declare a member of another struct. Regrettably, GCC accepts such invalid uses as another extension, with the possibility of causing the same bug (see Aliasing access by flexible array member):
    struct A { int x, a[]; };
    struct B { struct A a; int y; };

    int f (struct A *a, struct B *b)
    {
      int t = b->y;
      a->a[1] = 123;      // overlaps with b->y
      return t - b->y;    // folded to zero
    }
Warnings to detect aliasing bugs
Most of the aliasing bugs we've discussed are not detected by any of the popular compilers tested for this article. However, GCC and compatible compilers expose two warning options, which are designed to detect disjoint subsets of these bugs: -Wstrict-aliasing and -Wrestrict. -Wstrict-aliasing is a multi-level option designed to detect basic violations of type-based aliasing rules. -Wrestrict, on the other hand, detects overlapping accesses by restrict-qualified pointers in a subset of string-manipulation functions known to GCC, as well as passing the same pointers to restrict-qualified arguments in user-defined functions.
Additionally, GCC 10 includes the new -Wzero-length-bounds warning, which detects accesses to zero-length arrays.
All three warnings are included in GCC's -Wall. While current implementations still leave much room for improvement, they are a sign that compiler implementers are moving in the right direction and attempting to detect these bugs that are otherwise hard to catch.
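To make the string-function case concrete, the sketch below shows a hypothetical helper, drop_first_char, that shifts a string left by one character. Writing it as strcpy (s, s + 1) would pass overlapping regions to restrict-qualified parameters, which is the kind of call -Wrestrict is meant to flag; memmove is specified to handle overlap.

```c
#include <string.h>

/* Hypothetical helper: removes the first character of s in place.
   An empty string is left unchanged.  */
void drop_first_char (char *s)
{
  if (*s == '\0')
    return;

  /* Source and destination overlap, so use memmove rather than
     strcpy or memcpy.  The count includes the terminating nul.  */
  memmove (s, s + 1, strlen (s + 1) + 1);
}
```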
Conclusion
In this two-part series, you have learned that carefully following aliasing rules in C and C++ can benefit runtime efficiency. At the same time, it's quite easy to bypass the rules, either by necessity or by mistake. As many of the examples highlight, these exceptions often come with considerable risks. Using them incorrectly or carelessly can lead to bugs that are hard to find.
Using exemptions means that the compiler can only rely on the rules if it can prove that no exemption was exercised. Otherwise, the compiler must conservatively assume that the rules have been bypassed. Given that compilers have limited visibility into programs, such conservative assumptions typically result in suboptimal performance. This is especially unfortunate when we consider that most code does follow the rules, and only a small fraction of it uses exemptions.
Historically, C and C++ compilers were developed with the philosophy of trusting the programmer. As a result, few resources were devoted to verifying that code meets the underlying assumptions of our optimizations. This picture is starting to change as a result of widely publicized bugs. Still, compilers are only slowly adding checks before optimizing, to verify that the code is valid and that no exemptions have been misused. The efficacy of such checks also tends to be quite limited. Almost none have visibility into whole programs, so they can analyze at most one source file at a time.
To maximize the benefit of aliasing rules and minimize the risks of falling into the many traps I've outlined in this series, I recommend writing code that strictly follows the rules and avoids relying on exemptions and extensions unless they are necessary. Also, I suggest using the -Wall, -Wextra, or equivalent compiler options to enable warnings, and resolving every warning they produce.
Keep in mind that compilers are improving in their ability to uncover problems with every release. Always upgrade to the newest compiler version as early as it is feasible. Finally, if you find a bug that you think your compiler should be able to detect, submit a test case to the GCC Bugzilla (first-time submitters should read how to report bugs or enhancement requests) and ask the compiler's implementer to diagnose it. With increasing sensitivity to the consequences of undefined behavior, the chances are that someone will make an effort to ensure that bug is detectable in a future version of the compiler.
Last updated: June 25, 2020