Radiant Log #012

A fixed point has been reached

Earlier this week, the Radiance compiler reached a fixed point.

In compiler bootstrapping, a fixed point is when two consecutive stages of self-compilation produce bit-for-bit identical binaries. In other words, compiling the Radiance source code with the Radiance binary yields a new binary equal to the one we started with. Using this output to compile the source again keeps producing identical output. This is the fixed point.

Getting there requires the compiler to be fully deterministic, i.e. a pure function from input to output. Failing that, it will keep on producing different outputs and a fixed point will never be reached.

Let’s take a look at the bootstrapping process for Radiance. Each stage – one round of self-compilation – yields a new compiler binary, named s0, s1, s2 and so on. Stage 0 is the bootstrapping compiler itself.

Output Description
s0 Bootstrapping compiler, written in C99
s1 Radiance Stage 1 compiler, written in Radiance, compiled by s0
s2 Radiance Stage 2 compiler, written in Radiance, compiled by s1
s3 Radiance Stage 3 compiler, written in Radiance, compiled by s2

Let’s walk through the bootstrapping process. The first stage is to build radiance.s0, the bootstrapping compiler – a small ~10K LOC C implementation of Radiance that serves only to produce the next stage, s1.

> make bin/radiance.s0

cc   ast.c => ast.o
cc   desugar.c => desugar.o
cc   gen.c => gen.o
cc   io.c => io.o
cc   module.c => module.o
cc   options.c => options.o
cc   parser.c => parser.o
cc   radiance.c => radiance.o
cc   ralloc.c => ralloc.o
cc   riscv.c => riscv.o
cc   scanner.c => scanner.o
cc   strings.c => strings.o
cc   symtab.c => symtab.o
cc   resolver.c => resolver.o
cc   gen/data.c => gen/data.o
cc   gen/emit.c => gen/emit.o
..
ok   bin/radiance.s0

Once s0 is produced, we can use it to compile the Radiance implementation of the compiler, producing s1.

> bin/radiance.s0 radiance.rad -o bin/radiance.s1
ok   bin/radiance.s1

The s1 output is the Radiance (self-hosting) implementation of the compiler, compiled with the bootstrapping compiler (s0).

With s1 built, we can now produce s2, which is just like s1, except we use the Radiance implementation (s1) instead of the bootstrapping compiler. This is the first self-compilation.

> bin/radiance.s1 radiance.rad -o bin/radiance.s2
radiance: generating code ( bin/radiance.s2 ) ..
radiance: ok ( bin/radiance.s2 )

> diff bin/radiance.s1 bin/radiance.s2
Binary files bin/radiance.s1 and bin/radiance.s2 differ

Even though s1 and s2 have the same source code, the s1 and s2 binaries are different from each other because they weren’t produced by the same compiler.

The next step is to use s2 to produce s3. If all goes well, s2 and s3 should match, since they are both produced by semantically equivalent compilers built from the same source.

> bin/radiance.s2 radiance.rad -o bin/radiance.s3
radiance: generating code ( bin/radiance.s3 ) ..
radiance: ok ( bin/radiance.s3 )

> diff bin/radiance.s2 bin/radiance.s3 && echo "ok"
ok

And they match! We’ve reached the fixed point.

Since s3 is identical to s2, the fixed point was actually already reached at s2, but we needed to produce s3 to know for sure.

The compiler at s2 is stable: it can compile itself and reproduce itself identically. Reaching a fixed point is a stronger property than being simply self-hosting, as it proves that the compiler faithfully reproduces its own binary – a pure function with no hidden inputs that could influence the output.

The fixed-point binary (s2) now becomes the seed: a known-good compiler binary checked into the repository that can compile the current source. From here on, everyday development starts from the seed, not from the bootstrapping compiler.

On trust

We can now circle back to the trust problem discussed in Radiant Log #003, i.e. how is one to know that a compiler, even one that has a fixed point, doesn’t inject malicious code into its outputs?

By having a reproducible build process that starts with C source code, we simply need to audit the C code in addition to the Radiance code, and either build the bootstrapping compiler using a C compiler we trust, or reach the same fixed point using more than one C compiler. A backdoored compiler cannot survive independent compilation: if two independent build paths converge to the same fixed point, we have strong evidence the result is a faithful product of the source code.

To verify this, we run the bootstrap process with different C compilers and check that the fixed points match:

> make bin/radiance.s0 CC=clang
> bin/radiance.s0 radiance.rad -o bin/radiance.s1
> bin/radiance.s1 radiance.rad -o bin/radiance.s2
> sha256sum bin/radiance.s2
f405cc05c28e7e32b29086bf5a5e3314b0b69e3fa4e6ef8bbb0e2b74262dcda6

> make bin/radiance.s0 CC=gcc
> bin/radiance.s0 radiance.rad -o bin/radiance.s1
> bin/radiance.s1 radiance.rad -o bin/radiance.s2
> sha256sum bin/radiance.s2
f405cc05c28e7e32b29086bf5a5e3314b0b69e3fa4e6ef8bbb0e2b74262dcda6

If the hashes are equal, we have high confidence that the build process was not compromised and the binary is a faithful product of the source.

What’s next

Having reached this point, we can now focus on planned language features that weren’t strictly required for implementing the compiler. Things like generics, affine types, RAII, etc.

These features will be implemented in the next iteration of the Radiance compiler. When a change is breaking, meaning the current seed can no longer compile the new source, the seed must be updated to a binary that understands the new features. We update the seed by finding the fixed point and checking in the converged binary, so that the repository always maintains the invariant: the checked-in seed can compile the checked-in source.

Meanwhile, the bootstrapping compiler (s0) is effectively frozen in time. It need not change anymore, with the advantage of only needing to be audited once and requiring no future maintenance. It still serves a role in verification: anyone can rebuild from C source and check that they reach the same fixed point as the checked-in seed.

← Back

Radiant is an ongoing research project in personal computing. If this resonates with you, get in touch.