Formal Verification in Ironclad

Formal verification in Ironclad stands for the group of processes and checks that we make the kernel’s code go thru to ensure Absence of Runtime Errors (AoRTE) and the correctness of its components. The main tool that we use to do so is SPARK, a verifiable, Ada subset.

Approach and goals

The goal with Ironclad is having as formally verified of a kernel as possible, a goal that we continuously work towards, while trying not to compromise the growth in features of the project.

Verification is an expensive and exhaustive process, and its complexity increases the more code there is to verify. Since we have a small team working on Ironclad, the way we balance this big cost while trying to make Ironclad a useful kernel is to focus on new features, and then once they are working and stable, we verify them. This leads to Ironclad having less verified code than it would otherwise have if that was the first and foremost priority, but we believe this approach makes Ironclad be “worth verifying”. What is the usecase of verifying a kernel that cannot run any software because the features are just not there! At the end of the day, our limited manpower makes it so we have to pick our battles carefully, and is perfectly suitable to the pre-alpha stage Ironclad is in right now.

As a consequence, Ironclad is not 100% verified code. A majority of its architecture-independent code lies somewhere within SPARK’s spectrum, a lot of it on the lower levels of verification, with the more security-critical and user-facing parts verified to higher SPARK verification levels. Architectural code is much less verified.

SPARK, how it works, and verification levels

SPARK is a formally verifiable Ada subset that allows us to easily and conveniently verify sections of our Ada codebase, while having them be interoperable with the non verified parts. SPARK has a long history of use in the aerospace, defense, and transportation industries, and most importantly, has great FOSS checker options, like GNATProve, the checker we use with Ironclad.

SPARK essentially consists of contracts and constraints written on pre-existing Ada code which, alongside the requirements SPARK puts on the code itself, form a specification of behaviour the code is mathematically checked against. GNATProve uses a cocktail of deductive frameworks to do this checking, like Why3, CVC4, and Z3. No separate specifications are needed as part of SPARK, since these contracts and conditions form in themselves the specification.

Examples of these contracts are:

procedure Increment (X : in out Integer)
   with Pre => X < Integer'Last; --  Ensuring the increment cannot overflow.

subtype Prime is Integer range 2 .. IntegerLast
   with Dynamic_Predicate => (not (for some N in 2 .. Prime - 1 => Prime mod N = 0));
   --  Ensuring only primes may be assigned to this type.

procedure Swap_Bad_Depends (X, Y : in out Integer)
   with Depends => (X => Y, Y => X),
        Post    => X = Y'Old and Y = X'Old
   --  Ensuring the state of X and Y is only affected by each other, and that
   --  the correct swap indeed takes place.

SPARK code, after analysis, is classified in 5 levels building on top of each other, depending on the quality of their verification, what is being verified, and the assurances done with this verification, these levels from less to more verification are:

Level Meaning
Stone Valid SPARK,serves as an intermediary level during adoption, and allows the code to interface with other SPARK code
Bronze Correct data initialization and flow, but no assurance regarding soundness or AoRTE
Silver Full AoRTE, meaning no incorrect data states, memory leaks, or other circumstances occur, but no logical soundness guarantees
Gold Builds on top of silver with specifications and key logical properties checked by GNATProve
Platinum Full verification of logical specifications, purely at the discretion of the developer to determine.

Verification progress

Ironclad is divided on different subsystems, which can be seen in the source code as subdirectories and Ada packages. Here is an overview of the project:

Subsystem SPARK’s stone level progress SPARK’s silver/gold level progress Verification quality
arch Done Pending on arch ports Pending on arch ports
arch/riscv64-limine Not started Pending on stone level Pending on stone level
arch/x86_64-limine Not started Pending on stone level Pending on stone level
cryptography Done Done Full AoRTE
devices Done Mostly done Sporadic small errors
ipc Done Mostly done Full AoRTE selectively
lib Done Done Full AoRTE
memory Done Not started Pending on gold level
networking Done Done Full AoRTE
userland Done Mostly done Syscalls not verified
vfs Done Partial Full AoRTE at places

Obtaining verification reports

If you would like to analyze the formal verification status and information yourself, Ironclad is checkable locally using gnatprove. Further instructions are available on the project’s README.