Formal verification in Ironclad stands for the group of processes and checks that we make the kernel’s code go thru to ensure Absence of Runtime Errors (AoRTE) and the correctness of its components. The main tool that we use to do so is SPARK, a verifiable, Ada subset.
The goal with Ironclad is having as formally verified of a kernel as possible, a goal that we continuously work towards, while trying not to compromise the growth in features of the project.
Verification is an expensive and exhaustive process, and its complexity increases the more code there is to verify. Since we have a small team working on Ironclad, the way we balance this big cost while trying to make Ironclad a useful kernel is to focus on new features, and then once they are working and stable, we verify them. This leads to Ironclad having less verified code than it would otherwise have if that was the first and foremost priority, but we believe this approach makes Ironclad be “worth verifying”. What is the usecase of verifying a kernel that cannot run any software because the features are just not there! At the end of the day, our limited manpower makes it so we have to pick our battles carefully, and is perfectly suitable to the pre-alpha stage Ironclad is in right now.
As a consequence, Ironclad is not 100% verified code. A majority of its architecture-independent code lies somewhere within SPARK’s spectrum, a lot of it on the lower levels of verification, with the more security-critical and user-facing parts verified to higher SPARK verification levels. Architectural code is much less verified.
SPARK is a formally verifiable Ada subset that allows us to easily and conveniently verify sections of our Ada codebase, while having them be interoperable with the non verified parts. SPARK has a long history of use in the aerospace, defense, and transportation industries, and most importantly, has great FOSS checker options, like GNATProve, the checker we use with Ironclad.
SPARK essentially consists of contracts and constraints written on pre-existing Ada code which, alongside the requirements SPARK puts on the code itself, form a specification of behaviour the code is mathematically checked against. GNATProve uses a cocktail of deductive frameworks to do this checking, like Why3, CVC4, and Z3. No separate specifications are needed as part of SPARK, since these contracts and conditions form in themselves the specification.
Examples of these contracts are:
procedure Increment (X : in out Integer)
with Pre => X < Integer'Last; -- Ensuring the increment cannot overflow.
subtype Prime is Integer range 2 .. Integer’Last
with Dynamic_Predicate => (not (for some N in 2 .. Prime - 1 => Prime mod N = 0));
-- Ensuring only primes may be assigned to this type.
procedure Swap_Bad_Depends (X, Y : in out Integer)
with Depends => (X => Y, Y => X),
Post => X = Y'Old and Y = X'Old
-- Ensuring the state of X and Y is only affected by each other, and that
-- the correct swap indeed takes place.
SPARK code, after analysis, is classified in 5 levels building on top of each other, depending on the quality of their verification, what is being verified, and the assurances done with this verification, these levels from less to more verification are:
| Level | Meaning |
|---|---|
| Stone | Valid SPARK,serves as an intermediary level during adoption, and allows the code to interface with other SPARK code |
| Bronze | Correct data initialization and flow, but no assurance regarding soundness or AoRTE |
| Silver | Full AoRTE, meaning no incorrect data states, memory leaks, or other circumstances occur, but no logical soundness guarantees |
| Gold | Builds on top of silver with specifications and key logical properties checked by GNATProve |
| Platinum | Full verification of logical specifications, purely at the discretion of the developer to determine. |
Ironclad is divided on different subsystems, which can be seen in the source code as subdirectories and Ada packages. Here is an overview of the project:
| Subsystem | SPARK’s stone level progress | SPARK’s silver/gold level progress | Verification quality |
|---|---|---|---|
| arch | Done | Pending on arch ports | Pending on arch ports |
| arch/riscv64-limine | Not started | Pending on stone level | Pending on stone level |
| arch/x86_64-limine | Not started | Pending on stone level | Pending on stone level |
| cryptography | Done | Done | Full AoRTE |
| devices | Done | Mostly done | Sporadic small errors |
| ipc | Done | Mostly done | Full AoRTE selectively |
| lib | Done | Done | Full AoRTE |
| memory | Done | Not started | Pending on gold level |
| networking | Done | Done | Full AoRTE |
| userland | Done | Mostly done | Syscalls not verified |
| vfs | Done | Partial | Full AoRTE at places |
If you would like to analyze the formal verification status and information yourself, Ironclad is checkable locally using gnatprove. Further instructions are available on the project’s README.