Welcome to elle’s documentation!¶
Overview¶
Elle is a compiler focused on generating code for the Ethereum Virutal Machine (EVM). What sets is apart from other Ethereum compiler technologies is that it is foundationally verified: that is, it is implemented inside of Isabelle, a proof assistant that enables programmers to state and prove mathematical theorems about their code. In the case of Elle, a theorem exists stating that the behavior of the code output by the Elle compiler matches the behavior of its input (see Correctness).
Assuming we trust the model of Elle’s source language (an LLL/Yul/WASM-like structured programming layer on top of EVM) and the semantics of EVM (drawn from the Eth-Isabelle project, we can have complete confidence that Elle generates EVM programs that match the programmer’s intent: that is, that behave the same way that input programs are supposed to behave.
In the rest of this documentation, we’ll cover :ref:how to install Elle <installation>, how to use Elle’s FourL frontend <usage_> as an end user to compile smart contracts written in the LLL language into EVM bytecode. Next, we’ll dive into the details of Elle’s source-level representation, covering its syntax <syntax_> and its formal semantics <semantics_>. Next, we’ll talk about the internals of the implementation <implementation_> of the Elle compiler (along with the FourL frontend) as well as its correctness proof <correctness_>.
Elle is intended to be supplanted by Gazelle, a . As such, Elle itself is unlikely to see significant changes at this point. Nonetheless, Elle as it exists is a useful system: it can be used to compile real-world LLL smart contracts; namely Dan Ellison’s Echo smart contract, Ben Edgington’s LLL-ERC20 token contract, and the LLL implementation of the ENS registry. For these examples, see the tests directory.
Installation¶
Contents
Elle’s installation is quite straightforward if all you want to do is use the FourL frontend as described in Usage of the LLL Frontend, and somewhat more involved if you want to be able to generate, build, check, or modify the implementation and proofs of Elle.
Getting Elle¶
In the rest of this file, let >
stand for a bash-compatible shell
prompt (that is, lines starting with >
are to be understood as input)
Clone Elle from the Github repo as follows:
> git clone https://github.com/mmalvarez/eth-isabelle
Later in this file, we assume that Elle is checked out into
a directory called eth-isabelle
. (This repository has this name because
it is a fork of Yoichi Hirai’s eth-isabelle <https://github.com/pirapira/eth-isabelle>
project specifying EVM in the Isabelle proof assistant.)
Since Elle is no longer under active development, the master
branch should
be stable enough to build reliably. For a version of Elle that is more guaranteed
to build successfully, try the ITP2019
branch, which contains an artifact
submission for the 2019 edition of the Interactive Theorem Proving conference:
> cd eth-isabelle
> git checkout ITP2019
Installation as an End-User of Elle¶
The following dependencies are required to build llllc
, the
FourL command-line interface to the FourL compiler frontend that makes
use of Elle’s verified compiler.
- Ocaml (tested with v4.05.0)
- OcamlBuild
- OCaml Zarith
On an Ubuntu-like system, these can be installed as follows:
> apt-get install ocaml ocamlbuild libzarith-ocaml libzarith-ocaml-dev
Once these dependencies are installed, navigate to eth-isabelle/elle/generated
and run make
. This should succeed and generate a file called llllc
.
When run on an lll file, it will print (to standard output) a hexadecimal
representation of the bytecode produced by the compiler for that smart contract
(similar to Solidity LLL’s command-line tool, but with fewer options and error
messages). Files to run llllc
on can be found in the eth-isabelle/elle/tests
directory.
For more details on using llllc
, see Basic Usage.
Installation for Modifying and Examining Elle¶
The Elle git repository includes the file eth-isabelle/elle/generated/FourL.ml <https://github.com/mmalvarez/eth-isabelle/blob/master/elle/generated/FourL.ml>. This file is generated from the formal Isabelle model contained in the rest of the Elle repository, and is all that is needed to build a working executable version of Elle/FourL as described in end-user-installation.
In order to work with the formal model directly, Isabelle itself is needed as a dependency. Elle requires Isabelle 2018, which can be downloaded here<https://isabelle.in.tum.de/website-Isabelle2018/index.html> (binaries for Linux, MacOS, and Windows are provided).
Once Isabelle is installed, the user will need to set up Lem, a framework used to generate the some of the Isabelle specifications used by Elle. In order to do this, first run the following, to update the Lem submodule contained in Elle’s git repository:
> cd eth-isabelle
> git submodule init
# output snipped
> git submodule update lem-installation
# output snipped
we have tested with the version of Lem having the Git hash of 0927743c1bd31d7bba20a54260ba4c4dd3ce49e9
.
Newer versions should also work. Older versions may not support generating code compatible with Isabelle2018.
In order to build Lem, run the following:
> cd lem-installation
> make
# output snipped
If this succeeds, it will generate an executable called lem
. Add it to your path, and ensure the
it will look for its libraries in the correct place, by running the following:
> `export PATH=$PWD/lem-installation/bin:$PATH`
> `export LEMLIB=$PWD/lem-installation/library`
Finally, navigate back to the root of the repository (eth-isabelle
), and run the following to
build the .thy
files that Elle depends on of from their Lem sources:
> make lem-thy
Examining Elle Sources¶
Isabelle allows .thy
files representing formal models and
proofs to be grouped together into sessions. Sessions make it easier
to automate the process of compiling Isabelle developments, as well
as allowing for caching the results of compilation and proof-checking
so that work does not need to be repeated each time Isabelle is
re-opened. Elle contains a session called ElleCorrect
, which
packs together all the files containing Elle’s correctness proofs into
a single session file.
However, in order to be able to
step through the proofs contained in the ElleCorrect
session,
it’s better not to run the ElleCorrect
session, since. Therefore,
to examine Elle’s proofs,
run Isabelle-Jedit, with the HOL
session
isabelle jedit -d ./lem -l HOL
For some proofs (particularly the more complex ones in elle/ElleCorrect
)
you will need to increase the editor’s limit on the number of allowed tracing
messages (or else the proofs will pause and appear to get stuck). To do this,
navigate through the Isabelle/JEdit menus as follows
Plugins > Plugin Options > Isabelle > General > Editor Tracing Messages
Increase this value to 30000.
The most interesting proofs are in eth-isabelle/elle/ElleCorrect
. The final
correctness theorems for the compiler are elle_alt_correct*
in eth-isabelle/elle/ElleCorrect/ElleAltSemantics.thy
For more details on the structure of Elle, see Implementation.
Recreating FourL.ml¶
The command-line binary version of the Elle-based FourL compiler depends on
FourL.ml
, an Ocaml file that is produced from a formal Isabelle model
via Isabelle’s built-in extraction mechanism. As such, FourL.ml can be regenerated
from Elle’s sources, provided Isabelle is installed. This can be done as follows:
> isabelle jedit -d ./lem -d ./elle -l ElleCorrect
This will open the ElleCorrect
session (building this session for the first time
can take some time - as much as a couple of hours on a 16Gb machine). Once this session
is done being processed, open the file eth-isabelle/elle/ElleCorrect/FourLExtract.thy
.
If that file is processed to the end (which can be forced by moving the cursor to the end of the file)
it will create a new version of eth-isabelle/elle/generated/FourL.ml
, which can then be built as
described in end-user-installation.
Usage of the LLL Frontend¶
Contents
In order to provide an convenient interface for programmers to work with Elle’s verified compilation system, Elle provides a frontend called FourL that allows users to translate programs written in LLL to EVM bytecode. Unlike other implementations of LLL, FourL uses Elle’s verified translation algorithm as a core layer, ensuring that address resolution and label scoping are handled properly. (For more details about what exactly Elle handles, see Implementation and Correctness).
Basic Usage¶
After building the llllc
frontend as described in Installation as an End-User of Elle,
> cd eth-isabelle/elle/generated
> ./llllc ../tests/if.lll
6001600a576003600d565b60025b
This hex-code is output in the same format as Solidity LLL, and can be used with
other existing tooling for deployment, testing, and static analysis.
(For instance, ganache
and web3.js have been used for testing
eth-isabelle/elle/tests/echo.lll
)
Supported LLL Constructs¶
FourL supports a large subset of LLL, but does not support the entire language. The supported constructs are listed below. For more information about the meaning these and other LLL commands, see the lll documentation
Command | Notes/Caveats |
---|---|
seq |
Unlike Solidity LLL, sequencing does not clean up the stack after push instructions |
if |
Expands to Elle control-flow |
when |
Expands to Elle control-flow |
unless |
Expands to Elle control-flow |
for |
Expands to Elle control-flow |
returnlll |
Supported only in a special case: when a single returnlll instruction occurs at the end of the constructor to return the code for the contract body |
lit |
Implemented using push rather than codecopy. As such, only supports up to 32-bit constants. |
+/add |
|
- |
|
* |
|
div |
|
exp |
|
/ |
|
% |
|
sha3 |
|
keccak256 |
|
& |
|
| |
|
^ |
|
~ |
|
shr |
|
&& |
|
|| |
|
! |
|
= |
|
!= |
|
> |
|
< |
|
<= |
|
>= |
|
mstore |
|
mload |
|
return |
|
stop |
|
calldataload |
|
calldatacopy |
|
calldatasize |
|
callvalue |
|
caller |
|
sstore |
|
sload |
|
log0-log4 |
|
event0-event4 |
|
revert |
Support for new constructs can be added by modifying the
list of FourL macros (default_lll_funs
) in
eth-isabelle/elle/FourL.thy.
This will require regenerating FourL.ml
as described in Installation for Modifying and Examining Elle.
Debugging Failed Compilation¶
Unfortunately, the current version of Elle lacks detailed error reporting. Compilation either succeeds, in which case bytecode is output, or it fails, in which case a failure cause is not reported. This is one aspect of Elle that needs to be corrected in its next incarnation, a generalized compiler called Gazelle.
One option is simply to try to try to identify minimal error cases by writing smaller lll programs and trying to understand the cause of the failure.
Another, more advanced option for understanding failures in the Elle/FourL compiler involves running the compiler inside of the Isabelle proof assistant as described in running-compiler-in-isabelle. In this way, one can run different phases of the compiler separately to identify where exactly the error is happening. This requires setting up Isabelle and Lem as described in Installation for Modifying and Examining Elle.
Inspecting Bytecode¶
To help inspect the output of llllc
, you may find it useful to use the
EVM bytecode parser contained in eth-isabelle:
eth-isabelle/parser/hexparser.rb
You’ll need an installation of Ruby (tested with 2.5.1p57) to use this tool. It takes hex bytecodes like those
output by llllc
(or other compilers for Ethereum, such as Solidity LLL) on standard input and outputs
(on standard output)
a series of mnemonics describing the opcodes in the input.
Elle Syntax¶
Contents
(Note: this documentation page is adapted from an entry in the Elle Github wiki)
Goal: Compiling Structured Code to EVM¶
The Elle source language, also known as Elle-Core, captures structured programming abstractions and enables their translation to Ethereum EVM bytecode through a verified compiler (whose details are described in Implementation).
What exactly is meant by structured programming? It corresponds to some features of languages that we usually take for granted: the ability to perform sequencing, if-statements, and loops in a predictable way that enables us to reason about different sub-components of a program separately, and then combine the results together soundly.
More concretely, suppose we have the following rather contrived program P1, that pushes two values onto the stack (the following is pseudocode for EVM bytecode):
; program P1
push 0x00
JUMPDEST
pop
push 0x01
If run from the beginning, this code is innocuous: the JUMPDEST instruction will have no effect, the first PUSHed value will be POP’d back off, and we will end up with 0x01 at the top of the stack. However, if execution starts from the JUMPDEST, we have a problem: if the stack is empty beforehand, running just the latter 3 lines of P1 will cause a stack underflow and halt execution. This can happen, for instance, if P1 resides at (code-buffer address) 0xA0, and we have a possible path through the program that goes through the following code-snippet, sub-program P2:
; program P2
push 0xA1 ; address of JUMPDEST of p1
jump
We might want to prove that P1 will never cause a stack underflow. However, if P2 begins running with an empty stack, it will call into P1 in an unintended way that will cause the machine to crash. The fundamental problem here is that jumps in EVM are always based on absolute addresses, meaning there is no way to protect your code from another part of the program that happens to know the address of an internal JUMPDEST to which jumping would violate your code’s invariants.
Instead, we can enforce a structured programming discipline - eschewing explicit jumps (i.e., “considering goto harmful”) and instead using conditionals and loops to describe branching control flow. By denying users of the source-language (Elle-Core) the ability to do arbitrary jumps, we are able to compose sub-programs in predictable ways.
De Bruijn Indices and Structured Programming¶
Elle-Core provides a very general kind of structured programming, able to express the usual structured constructs such as if and while, but also more intricate control-flow structures, while maintaining the guarantee that “internal” labels within a sub-program cannot be improperly accessed through a notion of scope.
Elle’s approach to representing scope is inspired by a practice from the programming-languages community called *de Bruijn indices* that provide a convenient way to describe scoped variables (it is also similar to the approach taken by WebAssembly). For an example of how this looks in a traditional context, consider a function that takes two parameters, discards the first, and returns the second. In the lambda calculus, we write this as
(\ x . (\ y . y))
However, we have chosen arbitrary variable names, which brings inconveniences. Nothing stops us from writing
(\ y . (\ x . x))
which expresses the same function. Perhaps more concerning, we could have equally written
(\ x . (\ x . x))
which also corresponds to the same function, because of the scoping convention of the lambda calculus: if we bind a variable name twice, the innermost binding takes precedence.
De Bruijn indices provide a clever trick to eliminate this ambiguity of naming and get a more canonical representation of functions that does not depend on specific variable names. The idea is that, because inner bindings take precedence, we can always describe variables in relative terms: each variable is uniquely distinguished by how many levels up in the syntax tree that variable was bound. So our example above becomes
(\ . (\ . #0))
#0
returns the second (innermost) parameter (we are zero-indexing). Had we wanted to return the first parameter instead, we would have written
(\ . (\ . #1))
With Elle, we do something similar, applying this same notion of scoping discipline to our labels. Each sequencing node (sequencing together Elle subprograms) creates a new context in which a new jump-target (label) can be described. Specifically, sequence nodes in Elle can have exactly zero or one label node poiting up to them, which corresponds (if there is one) to the JUMPDEST instruction that will be the target of this scope’s jump. Jumps work similarly, specifying their targets based on which scope they will jump to (“jump n” means “jump to the label bound in the scope n levels up in the syntax tree”) .
For instance, here is Elle pseudocode for an IF statement:
seq [
seq [
push 0x01
jumpI #0
push 0x02
jump #1
label #0
push 0x03
label #1
]]
Note that this approach provides the locality that we need: two disjoint Seq nodes will have no way of referencing each other’s corresponding bound label.
Elle-Core Syntax¶
To cut to the chase, here is the syntax definition for the Elle-Core language, as implemented in Isabelle:
type_synonym idx = nat
datatype ll1 =
L "inst"
(* de-Bruijn style approach to local binders *)
| LLab "idx"
| LJmp "idx"
| LJmpI "idx"
(* sequencing nodes also serve as local binders *)
| LSeq "ll1 list"
Label Resolution in Elle¶
Hopefully I’ve convinced you that de Bruijn indices are a convenient way to represent the binding structures Elle needs to handle. Next I’m going to describe how we translate this code (that is, syntax trees of type ll1) into EVM bytecode.
Our first step is to calculate locations (referred to in the codebase as quantitative annotations, or qan) for each instruction in our program. The idea is as follows. EVM instructions take up a certain number of bytes as specified in the EVM specification. Seq constructs do not take up any space other than the space taken up by their members. Label constructs take up one byte (the size of a JUMPDEST instruction). Jump instructions take a variable number of bytes, depending on the length of the address to jump to - this number of bytes starts at 2 (one for the JUMP itself, one for the PUSH instruction that puts its address onto the stack) but increases to accommodate the size of the address (the PUSH payload) that is actually calculated.
Once we have locations computed for all of our syntax-tree nodes, we begin examining the binding structure. For each sequence node, we examine all LLab nodes descended from it. If an LLab node is descended from an LSeq node at a distance of n, and that LLab’s parameter (a natural number representing the index) is equal to n-1 (remember that we are using zero-indexing), the LLab’s location within the tree rooted at that LSeq node is recorded.
If more than one such LLab is found for any one LSeq node, the compiler fails, as the user has given an invalid program. After all, it would not do to have the following (this is real Elle code this time, not pseudocode):
LSeq [
LJmp 0,
LSeq [
LLab 1,
L (PUSH_N [0])
],
LLab 0
]
The root LSeq node in this example has two labels “pointing upward” to the same sequence node. This creates an unacceptable ambiguity: to which label should the jump dispatch control flow? Depending on which we pick, we would have have added either 1 or 0 elements to the stack, so clearly they have different behavior. Elle will fail to compile code if it detects this condition.
Resolving Jump Addresses¶
Of course, we’re not quite done: we still have to compute the addresses that each of our LJmp nodes will jump to (i.e., what value will be pushed onto the stack before the jump). At this point things get tricky. To save space, we want to minimize the number of bytes we push for each jump. Thus, we begin with the optimistic assumption that each jump target’s address will be represented with one byte. With this assumption, we begin looking up the addresses of the labels corresponding to each jump and attempting to fit them into the number of bytes we have allocated.
If we ever fail to fit an address in the space we have allotted, we increase the number of bytes allocated to that jump by 1. Then we recalculate the addresses of all the Elle syntax nodes that must now be shifted, forget all the addresses of jumps we have so far resolved, and then begin the process again. Forgetting the previously resolved jumps is necessary, as their targets’ addresses may have changed as a result of the 1-byte adjustment we just made.
Once we have resolved all jump addresses successfully, we have reached a form where we can quite easily write out our program as a sequence of EVM instructions. This forms the bytecode output by the Elle-Core compiler.
Conclusion¶
In this post, I have described the syntax of Elle-Core, the intermediate representation of the Elle system enabling structured programming, and its translation to EVM. The goal of the Elle project is to formally verify this translation. The translation itself is described in Implementation, and additional details about the verification can be found in Correctness.
Elle Semantics¶
Contents
(Note: this documentation page is adapted from an entry in the Elle Github wiki
Goal: A Formal Meaning for Elle Programs¶
This document is intended to describe the semantics of the Elle-Core intermediate representation, the code that is translated to EVM via the Elle system using a verified procedure. What it means for this procedure to be “verified” is precisely that the behaviors of the source programs according to their semantics (described here) match the behaviors of the compiled EVM code, according to the EVM semantics described in Eth-Isabelle.
Before continuing, it is worth noting that this semantics uses the original EVM semantics to describe the behavior of individual instructions. However, all higher-level control flow is described by the Elle semantics, capturing the key abstraction Elle is intended to provide (structured control flow).
In the formal Isabelle development implementing Elle, this is actually described as a *big-step operational semantics*. For clarity of exposition here (i.e., to make the presentation more comprehensible for people not already familiar with inductive semantics) I will describe it here as a “non-deterministic” interpreter.
The big-step version of the semantics can be found in elle/ElleCorrect/ElleAltSemantics.thy
, here
Elle’s semantics is nondeterministic “in theory” but not “in practice”. What I mean by this is that while source programs can describe nondeterministic behaviors, the Elle compiler will refuse to compile them into EVM programs (this is important, as the EVM is by necessity a deterministic virtual machine). Additionally, it is relatively straightforward to characterize which Elle programs are deterministic (a predicate called “valid3” captures precisely this condition, discussed in more detail in :ref: valid3<valid3>).
Elle’s interpreter uses a logical program counter which consists of an index into a syntax tree. In particular, this is expressed as a list of natural numbers, describing the path taken through the tree at each syntax node (that is, “[0,1,2]” means “the root’s first child’s second child’s third child”; these paths are zero-indexed). In the codebase these lists are referred to as childpaths or cp for short. Most of the correctness proof revolves around showing that this logical program counter advances in such a way that the Elle program’s behavior matches that of the EVM program with its standard (integer) program counter.
In the code below, “@” is list concatenation.
Informal Description of Semantics¶
Instructions¶
- Instruction nodes carry an EVM instruction that is guaranteed not to be a JUMP (i.e. no arbitrary modifications of the original program counter)
- If an instruction node is at the end of the tree (there is no next node), we return the result of running that instruction in the EVM semantics
- If the instruction is not at the end, we run the EVM instruction to get a new state, advance the logical program counter to the next instruction, and continue executing the Elle semantics from there
Jumps¶
Jump nodes in Elle carry a “depth” parameter, pointing to a particular Sequence node a certain number of levels “up” in the tree.
Label nodes in Elle also carry a “depth” parameter, which also points “up” the tree a certain number of levels to a Sequence node for which that label is a corresponding jump target
For an Elle program to be valid, each Sequence node must have exactly one (or zero) label nodes “pointing up” to it
- For valid Elle programs, the semantics of a jump involves moving the logical program counter to the location of the single label corresponding to the target of the jump
Elle’s semantics also describes execution of invalid programs, with more than one jump target
- In these cases, Elle’s execution nondeterministically splits, creating “parallel” executions that separately move the program counter to each of the jump targets, and continue executing from there.
- This behavior cannot be realized by the EVM, and is prevented by checks in Elle’s compilation process that assure that programs with multiple labels will not compile successfully to EVM
Conditional Jumps¶
- If the head of the EVM stack in the current state is zero (i.e. the conditional jump will not execute in EVM) and there is no next node in the tree, we are done and can return the current state (after subtracting gas for the jump instruction)
- If the head of the EVM stack is zero and there is a next node, we run the unsuccessful jump (subtracting the gas), increment the logical program counter to point to the next node in the tree, and then continue executing
- If the head of the EVM stack is nonzero (the conditional jump will execute) the semantics are the same as that of a jump (other than the different gas cost) - see above.
Sequences¶
- If a sequence is empty (has no sub-nodes), and there is no next node after the current sequence node, we are already done executing and can return the current state
- If a sequence is empty and there is a next node, we advance the logical program counter to that node and continue executing
- If a sequence is non-empty, we begin executing from that sequence’s first child
Interpreter (Pseudo)Code¶
Function get (root, cp) {
if (cp == []) return root;
else {
if (node_type(root) != Seq) return null;
else {
if(nth (root, head(cp)) = null) return null;
else return get(nth (root, head(cp)), tail(cp));
}
}
}
Function getnext (root, cp) {
if (cp == []) return null;
cp' = butlast (cp);
cpl = last (cp);
if (getnext (root, (cp'@(cpl+1))) = null) {
return getnext (root, cp');
}
else {
return (cp'@(cpl + 1));
}
}
// Function get_label_cp(root)
// returns locations of all labels pointing up to root
Function ellesem (root, cp, state) {
switch (get (root, cp)) {
case null: return emptyset;
// instructions carry which EVM instruction to execute
case Inst(i):
if(getnext (root, cp) = null) return evm_sem i state;
else return ellesem (root, getnext (root, cp), evm_sem i state);
// labels are jumpdests
case Label(d):
if(getnext (root, cp) = null) return evm_sem JUMPDEST state;
else return ellesem (root, getnext(root, cp), evm_sem JUMPDEST state);
// jumps carry a depth - how many scopes up to jump to
case Jump(d):
ctxpath = take((length cp - d), cp); //take all but last d elements
context = get(root, ctxpath);
s = get_label_cp context;
return set{cp' | ellesem(root, cps, state)};
case JumpI(d):
// if we should jump
if(hd (evm_stack (st)) != 0) {
ctxpath = take((length cp - d), cp); //take all but last d elements
context = get(root, ctxpath);
s = get_label_cp context;
return set{cp' | ellesem(root, cps, state)};
}
else {
if(getnext (root, cp) = null) return state;
else return ellesem(root, getnext(root, cp), state);
}
// sequences carry a list of sub-nodes
// we jump to all labels pointing to the Seq node "d" levels up
case Seq(l):
if(l == []) {
if(getnext (root, cp) = null) return state;
else return (ellesem(root, getnext(root, cp), state));
}
// if the list has children, run its first child
else return ellesem(root, cp@[0], state);
}
}
Notes on Interpreter¶
The “jump” and “jumpI” cases in the above code explicitly return sets of states, which captures the nondeterminism of the semantics. All other (deterministic) cases can be considered to be implicitly returning singleton sets containing the single next state (for clarity I have left these implicit).
Again, the actual semantics of Elle programs are phrased somewhat differently, using an inductive relation rather than
an explicit interpreter. While it would be possible to encode this interpreter directly in Isabelle and explicitly
prove that it matches the inductive semantics given in elle/ElleCorrect/ElleAltSemantics.thy
, this has not been done
for Elle, although it is planned for Elle’s successor, Gazelle. Nonetheless,
the intepreter is likely easier to read and understand for most programmers not used to seeing formal semantics.
Implementation¶
This page describes some salient aspects of how the Elle compiler is implemented. Details about the implementation of the correctess proof are deferred until Correctness.
In Elle Syntax, we described the syntax of the Elle language from the user’s
perspective. However, internally, Elle uses a series of annotations to
describe Elle programs at various stages of compilation. These can be found
in elle/ElleSyntax.thy
. This more elaborated representation takes the form
of general datatype
with extension points (type parameters) into which we can insert annotations later,
along with the specific syntax extensions
used at various stages of the compiler.
The Elle compiler proceeds in several phases, outlined in the remainder of this document. Note that links in this
file are to the (frozen) ITP2019
branch of the repository, to ensure consistency of line numbers as master evolves.
Though the line numbers may change, the general ideas should not.
Phase 1 - Generating Size Annotations¶
As a first step, the Elle compiler generates a pair of integer annotations for each node in the syntax tree given to the compiler as input. These annotations correspond to the range of bytes taken up by the code that will be generated from the syntax tree in the program buffer. These are calculated as one would expect: instructions are simply the length of the encoding of the instruction as bytecode, labels correspond to the lengths of EVM JUMPDEST instructions, and sequence nodes have lengths equal to the sum of the lengths of all their children.
The implementation of this compiler phase can be found
here, in elle/ElleCompiler.thy
.
After this phase (and throughout the Elle compiler thereafter), syntax
trees will be proven to conform to the following predicates
ll_valid_q
and ll_validl_q
(on
elaborated Elle syntax trees and lists of elaborated
Elle syntax trees, respectively)
(the size-annotations are referred to as “quantitative annotations”,
and abbreviated by “q” or “(x, x’)”, throughout). These predicates
can be found here, in
elle/ElleCorrect/Qvalid.thy
.
Note that, other than for the s
annotations on JUMP and JUMPI,
none of the syntax tree node annotations have any impact on the size of the
generated code corresponding to a node (thus, no effect on the inference rules
for ll_validl_q
). This makes sense, as they correspond purely to compile-time
artifacts that are not present in the generated code.
Phase 2 - Finding Labels¶
In the second phase of compilation, we enforce the invariant that
each Seq node has exactly 0 or 1 descended labels, as defined
according to the ll3'_descend
predicate, which can be found in
elle/ElleCorrect/Valid3.thy
, here.
For this phase of the compiler, we traverse the Elle syntax tree.
At each Seq
node, we scan the sub-tree for all descended Lab
nodes with an index “pointing back up” at the Seq
node we are
currently considering. If we find no such nodes, we mark the
Seq
node with an annotation indicating there is no label
(an empty list). Otherwise, we take the first Lab
node we
find (in a preorder traversal),
annotate it as having been “consumed” by a Seq
node,
and store the path to that label at the Seq
node. In order to
guard against multiple labels “pointing up” at the same Seq
scope, this compiler pass fails if it ever encounters a Lab
node
that has not been consumed in its top-level traversal of the tree
(since such a node corresponds either to a nonexistent scope,
or to a scope which already has a label corresponding to it).
The compiler pass is implemented in the function ll3_assign_label
in
elle/ElleCorrect/ElleCompiler.thy
, here
For reasoning about this phase and subsequent phases,
we use the aforementioned``ll_descend`` predicate, which relates
two Elle syntax trees and one list of natural numbers. This
predicate captures situations in which the syntax tree
l2
can be found as a descendant of l1
by treating
k
as a path through the tree, selecting which child-node
to choose at each step.
After the second phase of compilation (if successful),
syntax trees will be proven to conform to the
inductive predicate ll_valid3'
, which can be found
here, also in
elle/ElleCorrect/Valid3.thy
. This predicate essentially corresponds to the
intuition that each Sequence node has exactly
zero or one labels referencing it, and that the locations
of these labels are annotated on the sequence nodes.
Phase 3 - Resolving Jumps¶
Once we have located the unique corresponding label (or determined the nonexistence of such a label) for each sequence node in the second phase, we need to calculate target addresses for each jump node based on the locations of those labels.
This process involves, essentially, a fixed-point calculation over the Elle syntax tree,
in order to ensure that sufficient space has been allocated to store the needed address
at each Jump
and JumpI
node.
This process is captured by the function
process_jumps_loop
, which can be found in elle/ElleCorrect/ElleCompiler.thy
,
here.
process_jumps_loop
makes use of two auxiliary functions. The first is
process_jumps
, which captures one iteration of the size checks involved in
the jump-resolution process. process_jumps
returns one of three cases of
result: either Success
if all jumps have sufficient size to store their corresponding
addresses, Fail
if there is an un-recoverable error (such as invalid input)
and Bump
if a node in the tree has a jump-target size that needs to be incremented.
At each Seq
node, process_jumps
checks the node’s annotation to see if there
is a corresponding label. If there isn’t one, process_jumps
scans all descended
Jump
nodes to make sure there are no descended jumps that point to that Seq
node
(as such jumps would have no target), failing if it finds any.
If there is a label according to the annotation,
process_jumps
looks up that label to find its address
(failing if there isn’t one to be found), and then
runs on that sequence node’s descendants (doing an in-order traversal)
to find all jumps that point to the scope corresponding
to this sequence node. If any are found, process_jumps
checks the space allocated to that jump
node against the space required to encode the address from the label that was looked up previously
for the Seq
node.
If there is enough space, process_jumps
continues scanning the
tree for other jumps corresponding to the same Seq
node and performing the same check,
ultimately returning Success
if all of them have enough space. Otherwise, it returns
the absolute location (as a childpath
)
of the first Jump
node without enough space in the form of a
Bump
result.
(For Seq
nodes descended from the root, process_jumps
first performs these checks for
jumps pointing up to the outer Seq
node, then recursively performs the same checks on the
descended Seq
node.)
The second auxiliary function used by process_jumps_loop
is inc_jump
, which takes
a path (corresponding to a Jump
node) returned by process_jumps
and increments its size,
adjusting the size annotations of the rest of the tree in the process as appropriate.
To avoid a complicated termination argument for
process_jumps_loop
(functions in Isabelle need to be proven to terminate
or they become very inconvenient to reason about),
the execution of
this function is “fuelled” (termination is justified by a decreasing natural-number argument,
which is decremented once each time the loop is run - thus, once per time a Jump
node’s
size needs to be incremented).
If this fuel parameter is 0
, process_jumps_loop
returns a failure (None
)
Otherwise, runs process_jumps
on the root
of the Elle syntax tree given as an argument. If process_jumps
returns
Success
, process_jumps_loop
returns the input syntax tree as nothing
needs to be done. If process_jumps
returns Failure
,
process_jumps_loop
also fails (returns None
). Otherwise, if process_jumps
returns Bump
,
process_jumps_loop
calls inc_jump
on the child-path returned by
process_jumps
, and then calls process_jumps_loop
on the same arguments
(with fuel parameter decremented).
The correctness of process_jumps_loop
is established by a series of
validation passes that happen after it runs. However, process_jumps_loop
is proven directly to produce valid_q
results from valid_q
inputs.
Additionally, we define a function, get_process_jumps_fuel
, which
calculates a sufficient amount of fuel to ensure that process_jumps_loop
terminates on its input (although this is not formally established with a proof).
By the end of running process_jumps_loop
, we have a syntax tree that should obey
the predicate ll4_validate_jump_targets
. This predicate essentially makes sure that
the indices of jump nodes (which point to the sequence node corresponding to the jump; i.e.,
to the scope the jump’s target is in) correspond to a scope whose label has an address
matching the address stored at the jump node (which is the address that will ultimately be
written out to bytecode).
The definition of ll4_validate_jump_targets
can be found in
elle/ElleCorrect/Valid4.thy
, here.
The Big Picture¶
At this point, we have produced a syntax tree that is valid as an
ll4
syntax tree, yet meets all of the predicates described above.
In its final form, ll4
contains all the information needed to generate
concrete EVM machine code, including concrete addresses. At this point,
codegen'
is used to emit a list of bytes corresponding to the
output bytecode. (codegen
can be found in elle/ElleCompiler.thy
,
here)
An additional validation
step is used after this point to ensure that all jumps are encodable
in EVM (that is, their addresses are at least 1 byte and not more than 32
bytes). Code for these extra validators can be found
in elle/ElleCorrect/ElleAltSemantics.thy
, here.
Examples of how all these pieces may be put together into a single verified
compilation pipeline can be found in elle/ElleCompilerVerified.thy
(here)
In the next section, Correctness, we will sketch the process by which the generated EVM instructions are proven correct with respect to the input program, making use of the information contained in these intermediate predicates.
Correctness¶
Here we describe the theorem establishing the correctness of the Elle compiler, and sketch its proof.
Note that links in this
file are to the (frozen) ITP2019
branch of the repository, to ensure consistency of line numbers as master evolves.
Though the line numbers may change, the general ideas should not.
In Implementation, we discussed the steps taken by the Elle compiler when translating code from the Elle language to EVM, and the specifications proven about the code generated after each step. In this section we will discuss the proofs that formally establish these invariants for each step, as well as the final theorem about the end-to-end process of compilation that ties them all together.
- For each phase of the compiler (other than jump resolution), we prove that only annotations with no impact on the semantics of the program when translated to EVM are modified. This enables us to lift results about the outputs of compilation passes to results about their input programs.
- Syntax trees generated by
ll_phase1
satisfy the predicatell_valid_q
, so long as they only contain valid instructions according to theinst_valid
predicate (a predicate that rules out use of instructions that would violate Elle’s guarantees, such as arbitrary jumps not governed by Elle’s scoping mechanism). This is proved by a relatively straightforward induction on the structure of the tree. - We prove that subsequent passes of the compiler preserve
the
ll_valid_q
predicate. Because only annotations without effect on the generated code are changed in most passes, this is relatively straightforward. For jump resolution, we additionally prove that expanding jump nodes to allow for larger addresses preservesll_valid_q
via a lemma about the effect of increasing the size of a jump by 1 (ll_bump
). - We prove that the output of the second pass of the
compiler meets the predicate
ll_valid3
. This is done by means of a verified validator pass that runs after the second pass of the compiler. This pass essentially supplies an executable version ofll_valid3
: it iterates over the structure of the tree, gathering the set of label nodes that point up to each sequence node. Each sequence node annotation is then checked against the gathered label nodes to ensure the annotations match (that is, either there are 0 nodes pointing up at the sequence node in question and the annotation is[]
, or the annotation is a nonempty list corresponding to the child-path of the single node that points up to the sequence node that holds the annotation). Otherwise, the validation pass fails and the compiler produces no output. Verification of this pass involves a relatively straightforward induction on the structure of the input tree (strictly speaking, on the structure of the proof that the input tree is valid according toll_valid_q
, which mirrors the syntactic structure of the tree.) - For the third phase of the compiler (resolving and resizing jumps) we make use of another validation pass, which runs after the body of the compiler. This validator is proven to only accept input code which satisfies the predicate given in (TODO: link to code).
- After this final compilation phase, we run a validator to ensure
that the lengths of encoded jump addresses match the length annotations
on the
Jump
andJumpI
nodes; that is, that all addresses we are going to encode fit exactly into the space we have allocated for them. - Finally, we run one last validator to sanity-check the jumps contained in the final code output by Elle. This validator checks to ensure that jumps are encodable in EVM (that is, that no jumps to addresses less than 1 byte or more than 32 bytes in length are present in the code).
- At this point, we can easily dump the final, fully annotated version of the Elle syntax tree to EVM instructions. It is these instructions that the final proof of Elle’s correctness will reason over.
For details of the theorem statement, see elle_alt_correct
in elle/ElleCorrect/ElleAltSemantics.thy
.
The Corrcetness Theorem¶
This correctness theorem
can be paraphrased as follows: suppose we have an Elle
program t
that ends in a state st'
when started in state st
at node
cp
(under the Elle semantics). If the t
is valid under the valid3
predicate as well as passing jump-targets validation),
the result of dumping this Elle program to EVM bytecode yields a program that steps
from ir
(with program counter set to the program counter corresponding to the
start of the instruction pointed to by st
), it will end in a state that
will differ from ir'
only in the value of the program counter (unless insufficient
interpreter fuel has been given to the EVM interpreter, in which case a larger
value of fuel would yield such a final state - this is what is captured by the predicate
program_sem stopper prog fuel net (setpc_ir st targstart) = InstructionToEnvironment act vc venv
).
program_sem
comes from the original eth-isabelle
project on which Elle is based.
Setting Up the Proof¶
Programs produced by the Elle compiler meet the validity predicates valid3'
and pass the jump-target validator, so this means that when the output of the
Elle compiler is run in EVM semantics, the result will always be the same as
the result given by running the source Elle program under Elle’s semantics.
This establishes one direction of simulation: that Elle programs are simulated
by the EVM programs output by the Elle compiler under the conditions given above.
Because the state spaces for the input and output program semantics are
virtually the same, and both languages are deterministic (assuming the Elle
program in question is valid, but the compiler will produce None
as an
output in the case that it is not), proving this one direction is sufficient
to establish the stronger bisimulation results that are common in the
compiler-correctness literature -
informally the argument is as follows: suppose we have that the EVM
semantics steps from st
to st'
for some program prog
, that
prog
was produced from an Elle program eprog
, and that st
has
a program counter value of pc
. Unless pc
is the address-value of
the middle of a multi-word instruction (see below), pc
will be the address
of the start of an instruction, meaning there will be at least one node
in eprog
with a left-side annotation equal to this address.
Since the
Elle semantics is deterministic for valid programs, and eprog
must be valid
(or the compiler would not have produced any output program), we know that there
must exist some st''
such that eprog
steps from st
to st''
under the
Elle semantics with a childpath pointing to any node with an annotation matching
the value of pc
. (There may be more than one such node, if the node is the
first element of a sequence, but the sequence semantics implies that the
semantics of running the Elle program from any of these nodes will produce the
same result). Now, we have two cases. If st'' = st'
(with the possible exception
of the final program-counter values being different), we have the converse result
that we wanted to prove and are done. Otherwise, we can apply the correctness
theorem of Elle to get that prog
steps from st
to st''
under the EVM
semantics, a contradiction since st' != st''
(beyond just differences in program counter values) and the EVM semantics is
deterministic.
As alluded to above, there are some EVM states which do not have a corresponding state in the Elle semantics for a particular program, but these states correspond to instruction-pointer values that are invalid in the sense of not being the address of the beginning of any instruction. An EVM program executing from the beginning should never reach such a program-counter value, so it is safe not to consider such states if what we care about ultimately is the behavior of whole EVM programs run from the beginning (which it is).
Correctness of Elle: Proof Sketch¶
In order to establish the correctness theorem of Elle, we use the
induction principle for the elle_alt_sem
predicate.
This requires us to prove 11 goals (corresponding to the
11 cases in the semantics).
For the first case, we need to prove that instructions at the end of an Elle program behave the same way as running the same instruction at the end of the corresponding EVM program. This amounts to a straightforward case-analysis on possible EVM instructions, in order to demonstrate that running the instructions (followed by
elle_halt
) yields the same result as running the same instruction at the end of the corresponding EVM program.For the second case, we need to prove that program executions beginning with instructions not at the end of the program behave the same way as their corresponding EVM programs. This proof is similar to the first case, except that instead of running
elle_halt
at the end and comparing the final results, we need to appeal to our inductive hypothesis (which says that running the Elle program and EVM programs after the given instruction yield the same result.) In order to apply our inductive hypothesis, we need to prove that the states match after executing a single instruction, which is similar to the beginning to the proof of the first case.The third case is for label nodes at the end of Elle programs, its proof is almost identical to the first case (minus the exhaustive case-analysis of EVM instructions).
The fourth case is similar to the second case in the same way that the third case is to the first: again we need to appeal to an inductive hypothesis about running the remainder of the program, and doing so involves reasoning about the execution of a single JUMPDEST instruction (as in the third case).
The fifth case involves reasoning about the effects of the
Jump
Elle instruction. The core of this proof is a case-analysis on the three disjunctive cases of the specification for jump-target validity. Since we are talking about a whole tree (rather than a tree in context) we only have two cases: one where the jump in question points up to the context corresponding to the root of the tree (aSeq
node) and one where the jump points up to an intermediate node which is, in turn, a descendant of the root.In both cases, the
ll_valid3
predicate is used to show that, because the label node corresponding to the jump node’sSeq
node is unique, the Elle semantics selects a single jump target which corresponds to the jump target given by the EVM semantics of the compiled code.Additionally, some (rather tedious) reasoning is needed to prove that when the target address of the jump is serialized in to an EVM stack value and then deserialized again (to calculate the new program-counter value in the EVM semantics) the value is preserved. This mostly boils down to proving that the address in question does not overflow a 256-bit EVM integer, which is guaranteed by the fact that the address is not more than 32 bytes (shown by an additional validation pass run at the end of the compiler).
The second case is largely similar to the first, except that some extra reasoning needs to be done to show that the descended
Seq
node also satisfies thell_valid3
predicate, and then to translate the results of the reasoning on the subtree in which the jump is taking place back up to a statement about the meaning of the jump in the context of the overall syntax tree descended from the root node. For details, the reader can refer to our Isabelle formalization.The sixth case involves reasoning about the effects of the
JumpI
Elle instruction when the conditional jump is taken. This case is similar to the fifth case, except that a bit of additional reasoning is needed to prove that the jump is indeed taken.The seventh case involves reasoning about
JumpI
in the case where the conditional jump is not taken and theJumpI
instruction in question is at the end of the code. This case is similar overall to cases 1 and 3, since the semantics ofJumpI
where the jump is not taken are not dissimilar to that of the label case (decrement gas by the correct amount, increment the program counter)The eighth case involves reasoning about
JumpI
in the case where the conditional jump is not taken and theJumpI
instruction in question is not at the end of the code. This case is similar overall to cases 2 and 4.The ninth case is the case of executing an empty sequence node at the end of an Elle program. Because the empty sequence has no effect on the machine state (nor does running an empty series of EVM instructions), this corresponds to showing that the effects of running
elle_halt
matches the effect of running at the end of an EVM program. Essentially this is an easier version of cases 1, 3, and 7.The tenth case corresponds to executing an empty sequence node somewhere other than the end of an Elle program. As with cases 2, 4, and 8, this involves applying an inductive hypothesis stating that the execution behaviors of the remainder of the program are the same between the Elle an EVM versions when started in the same state. Because an empty sequence of instructions leaves the state unchanged in both the Elle and EVM versions of the program, the hypothesis almost immediately applies.
The eleventh case corresponds to running a nonempty sequence in Elle. In this case, we get an inductive hypothesis about the effect of running the given Elle program starting at the first element of that sequence. We need to show that the address in the output code corresponding to the start of the sequence in Elle is the same as that of the start of its first element, which is straightforwardly proved using auxiliary lemmas about the behavior of the
ll_valid_q
predicate on lists. Once we have this, we can apply our inductive hypothesis to complete the proof.
It should be noted that we prove two different versions of this lemma.
The first, elle_alt_correct
talks about correctness in cases where the execution of the
Elle and EVM programs terminate successfully; the second,
elle_alt_correct_fail
, deals with the
case where the
execution ends in a “crashed” state (either because the EVM stack limit was exceeded,
the EVM was not supplied enough gas to complete the execution,
or an invalid instruction was reached).