The Mu Micro VM¶
This tutorial is about the Mu micro virtual machine.
A micro VM is a minimalist virtual machine as an underlying substrate for high-level programming language implementations. Mu is designed for languages that are concurrent, JIT compiled and use garbage collection.
Language Implementor’s Tutorial¶
This part is suitable for those who wish to implement a programming language on top of Mu.
Introduction¶
The Mu project addresses the difficulties in implementing managed languages, which include those languages that have a garbage collector and likely run on a virtual machine.
What Makes Language Implementation Difficult?¶
We identify three major concerns that make language implementation difficult, namely concurrency, just-in-time compiling and garbage collection.
Each of them is already difficult enough individually. But when handling all three at the same time, their combined complexity will multiply.
For example, in a multi-threaded JIT-compiled garbage-collected program, when GC is triggered, the GC thread needs to ask all mutator threads running in JIT-compiled machine code to pause at the nearest GC-safe points (or “yieldpoints”), where stack maps are available to identify all references held by variables in the stack. In this task, the yieldpoints are inserted by the JIT-compiler, which is aware of GC. The stack maps are generated by the JIT-compiler, too. The handshake between mutators and the GC is very difficult to get right, too. Yieldpoints involve all three concerns mentioned above.
Because they are difficult, language implementations either omit them or implement them naively.
Take CPython, the official Python implementation, as an example:
- lack of concurrency: In CPython, there is a global interpreter lock (GIL) which must be obtained for any Python thread to run. This makes the execution of all multi-threaded Python programs sequential.
- lack of JIT-compiling: CPython is interpreter-only and does not perform run-time optimisation based on JIT-compiling and type inference. A simple computation-intensive program may be up to 20 times slower than an equivalent C program. For comparison, PyPy, a Python implementation that does JIT-based optimisation, can run as fast as unoptimised C (GCC with -O0) in some cases.
- naive garbage collector: CPython uses a naive reference counting garbage collector. It increments or decrements the count every time a reference is created or destroyed. This naive algorithm is known to be more than 30% slower than a naive mark-sweep GC in total execution time.
In reality, many languages designs are implementation driven. Difficulties in implementation will result in bad decisions being baked into the design and hamper the evolution of the language in the future.
Take PHP as an example. PHP was never supposed to be a programming language, but a set of pre-processing macros for personal home pages. As it evolves, it uses naive reference counting GC, copy-by-value semantic for arrays, and has reference types. An optimisation was also made to use copy-on-write to avoid copying. In 2002, a user found a problem involving arrays and references. Fives days later, the PHP developers decided that properly fixing the bug will put a considerably slowdown on the PHP performance and this behaviour is documented.
Up till now in 2015, it has been 12 years since the problem was spotted. The problem can still be reproduced in today’s PHP.
In conclusion, concurrency, JIT and GC are difficult. The difficulties lead to bad language designs and implementations, and should be abstracted out in a layer.
Why Micro Virtual Machines?¶
There are basically two ways to implement a managed language.
- Building a VM from scratch. This is difficult because the developers have to address all of the three concerns. In reality, there are many successful projects taking this approach, such as PyPy, HHVM, V8 and so on. Most don’t address all three concerns, and no code is shared between those projects.
- Targeting an existing virtual machine, such as the JVM or the .NET CLR, which has high-quality implementations. The problems are semantic gaps and huge dependencies. Existing VMs are designed for particular kinds of languages. For example, the JVM is designed for static object-oriented languages. Optimisations usually done by JVM do not work for dynamic languages like Python. Jython, for example, still performs in the same order of magnitude as CPython. Moreover, using JVM introduces many unnecessary dependencies on Java-related packages.
Both approaches use monolithic virtual machines. Each such VM handles all aspects of the runtime of the language. For example, the JVM handles concurrency, JIT and GC, but also class loading, run-time type information, object-oriented programming (including virtual methods), and it comes with a comprehensive Java standard library. Since the VM is huge, it is difficult to build from scratch. Since it is monolithic, it is difficult to reuse its parts for other languages.
We propose an alternative concept: micro virtual machines. Analogous to microkernels in the operating system literature, a micro virtual machine only contains the parts that are absolutely necessary to be handled in the core of the VM. We suppose those parts are concurrency, JIT and GC. There is a separate program, called a client, sitting on the top of a micro virtual machine, interacting with the micro virtual machine and handling other (mostly language-specific) parts, in a similar fashion as servers interacting with the microkernel.
A micro virtual machine must be low-level and minimal.
- Being low-level means being close to the machine and thus minimise the semantic gap.
- Being minimal means it must push jobs that are not essential to the higher
level, that is, the client. This is a separation of concern.
- The micro virtual machine itself will be easier to design and implement. The minimalism also makes it practical to create a formally verified VM.
- The client is not bound to the assumptions made by the low-level VM and can implement the language with maximum flexibility. The client has more responsibility, too, but it can still rely on the micro virtual machine for the three major concerns that are extremely difficult.
The minimalism pushes as much job as possible to the client side, potentially making the language implementer’s job harder than targeting traditional VMs like JVM. To mitigate this limitation, libraries can be provided to assist the implementation of certain kinds of languages, like dynamic languages, functional languages, object-oriented languages and so on. Libraries are not part of the micro virtual machine. The library can be a framework or a package that is part of the client; it can also be pre-written code snippets.
The Mu project is a concrete micro virtual machine, in the same way seL4 is a concrete microkernel.
Separating the Specification and the Implementation¶
Mu separates its specification and its implementations, making it possible to create many different implementations for different purposes. In theory, there may be a simple proof-of-concept implementation, a high-performance implementation for productional use, an extensible modular implementation for researching, a formally verified implementation for highly-assured applications, and so on.
The Mu specification defines the behaviour and the interface of Mu.
Currently, Mu has two implementations:
- Holstein, the reference implementation, is a simple Mu implementation and allow early evaluators to do experiment with the interface of Mu. The simplicity allows Mu reference implementation to be agilely changed when the specification changes. It is not a high-performance implementation.
- Zebu, the fast implementation, is developed from the first day to be fast. It is written in Rust, and has a optimizing compiler and a high-performance garbage collector. It is currently implemented as an ahead-of-time compiler. Some functionalities are still in development.
Getting Started¶
Where is Everything?¶
The specification is the canonical source of information about the implementation-neutral parts of Mu.
This tutorial will use Holstein, the current reference implementation. Keep in mind that it is not high-performance. It is interpreted and single-threaded, does excessive checking and is very slow.
Get Mu¶
Go to https://gitlab.anu.edu.au/mu/mu-impl-ref2 and follow the README file. That repository also contains a sample factorial program compiled from an RPython client and a loader to run it. Read the README file.
Pick your language¶
The client of the reference implementation can be written in Scala or in C. If
you use Scala, you can just use the classes in the microvm-refimpl2
repository since the reference implementation is already written in Scala. If
you use C, there is a C binding in the cbinding
directory. Read the README
file for instructions. There is also a sample client program test_client
.
The following part of the tutorial assume you use Scala. Don’t worry if you use C because the interface is quite similar.
Basic Interaction¶
You are writing a Mu client, which controls the Mu micro VM via the Mu client interface, usually simply called “the API”.
In Scala, the class uvm.refimpl.MicroVM
, which implements the MuVM
struct in the specification, represents a micro VM instance. You can start Mu by
creating a MicroVM
instance:
val microVM = MicroVM()
Just as simple as this. The instance also internally allocates memory for its heap. The default heap size is quite big, and is usually enough for experiment.
Mu Contexts¶
The client more often interacts with the micro VM via “Mu context”. They are
called “MuCtx” in the spec, and uvm.refimpl.MuCtx
in Holstein. A MuCtx is
a context created from a MicroVM instance. It can hold Mu values for the client,
access the Mu memory, load bundles and let the client perform many tasks on Mu.
Why not directly do these things on the MicroVM instance? Why add another layer? There are two reasons.
Reason 1: because Mu is designed to be multi-threaded. By design, multiple client threads can interact with the Mu micro VM concurrently.
Caution
In fact, unfortunately, the reference implementation is based on a single-thread interpreter. The program itself is not thread safe. Do not run more than one client threads in the reference implementation.
But Mu is designed for multi-threaded envirionments. The limitation in the implementation does not change the fact that Mu and its clients need to think in a multi-threaded way. A carefully designed interface will eventually allow a more efficient implementation.
Despite this limitation, the reference implementation can still run multiple Mu threads. Mu threads are interpreted one instruction for each thread on a round robin scheduler.
If multiple client threads were accessing the single MicroVM object concurrently, synchronisation (such as locks) must be employed to guarantee thread safety. This is where the “context” comes in. In Mu’s design, MuCtx instances are not allowed to be used by two threads concurrently, thus avoided many cases where synchronisation were necessary. For example, accessing the garbage-collected Mu memory via MuCtx does not need locking. A MuCtx instance can also hold a thread-local memory pool so that a client thread can allocate memory in the garbage-collected Mu heap without having to deal with the shared global memory pool every time unless the local pool is exhausted.
Reason 2: because the type system of Mu is usually very different from the client’s. Notably, the Mu’s type system contain object reference types which points to the Mu heap and must be traced by the garbage collector. MuCtx can hold Mu references for the client so that whenever garbage collection happens, it always knows what objects are still kept alive by a reference held for the client.
If you used JNI before, you may find this design familiar. In fact, this design is inspired by the JNI. JNI uses opaque “handles” to refer to Java object, so does the Mu API. However, for performance reason, opaque handles are not the only way to expose garbage-collected Mu objects to native programs. Mu has a more efficient but unsafe native interface which supports “object pinning”. That is an advanced topic.
Note
There is a difference between a Mu client and a native program.
A Mu client is a program that controls the Mu micro VM. In theory, a Mu client can be written in any languages, from C to Scala, Python, JavaScript, etc. The Mu API is the interface between the client and the micro VM. It includes the IR and the API, and the purpose is to control the micro VM.
A native program is a program that does not run in the Mu micro VM, and is
usually written in C or other unmanaged languages. libc is one such example.
The native interface involves pointer-based raw memory access and calling
conventions that allow Mu programs to call C programs and vice versa in some
specific ways. The main purpose is to make system calls (obviously a VM
that cannot read
or write
is almost useless), and to interact with
programs that do not run on Mu, including C programs and those written in
other languages.
Using Mu Contexts¶
You can create a MuCtx
instance by invoking the newContext()
method on
the MicroVM
instance:
val ctx = microVM.newContext()
and you need to close the context in order to release the resources it is holding inside:
ctx.closeContext()
The API, i.e. the methods of MicroVM
and MuCtx
, are defined by the API
chapter of the
specification. The scala binding
matches the spec.
Let’s see what MuCtx can do. You don’t need to understand all of them now, since they will be covered in more depth in later chapters.
Create contexts from MicroVM
instance:
val microVM = MicroVM()
val ctx = microVM.newContext()
Holstein can load Mu bundles from the text form. This method of loading bundle is deprecated in favour for the IR building API which avoids the text-form IR parser.
ctx.loadBundle("""
.typedef @i64 = int<64>
// more Mu IR code here
""")
MuCtx
can hold Mu values for the client. Mu values have a specific int size.
val handle1 = ctx.handleFromInt(0x123456789abcdef0L, 64)
val handle2 = ctx.handleFromInt(0x12345678L, 32)
val handle3 = ctx.handleFromInt(0x1234L, 16)
val handle4 = ctx.handleFromDouble(3.14)
val handle5 = ctx.handleFromPtr(ctx.idOf("@someType"), 0x7fff0000018L)
It can allocate objects in the Mu heap. The handle is held in ctx so that GC can find all of them.
val handle6 = ctx.newFixed(ctx.idOf("@someType"))
It can create Mu stacks and Mu threads
val hFunc = ctx.handleFromFunc(ctx.idOf("@some_function"))
val hStack = ctx.newStack(hFunc)
val hArg0 = ....
val hArg1 = ....
val hArg2 = ....
val hThread = ctx.newThread(hFunc, PassValues(Seq(hArg0, hArg1, hArg2)))
It can access the Mu memory
val hObjRef = ctx.newFixed(ctx.idOf("@int_of_64_bits"))
val hIRef = ctx.getIRef(hObjRef)
val hValue = ctx.load(MemoryOrder.SEQ_CST, hIRef)
It can introspect the stack states
val hStack2 = .....
val hCursor = ctx.newCursor(hStack2)
val funcID = ctx.curFunc(hCursor) // function ID
val hVars = ctx.dumpKeepalives(hCursor) // local variables
It can modify the stack states (a.k.a. on-stack replacement, OSR)
ctx.nextFrame(hCursor)
ctx.popFramesTo(hCursor)
val hFunc2 = ctx.handleFromFunc(...)
ctx.pushFrame(hFunc2)
Threads and Stacks¶
Mu programs are executed on Mu threads. A thread is the unit of CPU scheduling, and Mu threads are usually implemented mirroring operating system threads. Multiple Mu threads may execute concurrently.
Each Mu thread runs on a Mu stack. A stack, commonly known as a control stack, is the state of execution, represented in Mu as a list of frames. Each frame corresponds to a Mu function version, and records which instruction should be executed next and what are the values of local variables.
Mu clearly distinguish between threads and stacks. If you used traditional thread APIs, such as the Java or the PThread API, you may already have the mental model that “a thread has a stack, which has many frames, so threads and stacks are interchangeable”. But in Mu, the relation of stacks and threads is much more flexible. A thread can stop executing on one stack and resume another stack, which gives “coroutine” behaviours. Multiple threads can also share a much bigger stack pool and implement the M*N threading model.
In order to start executing a Mu program, the client should create a Mu stack
and a Mu thread. In order to stop executing, the Mu thread should execute the
@uvm.thread_exit
instruction.
Trap Handling¶
There is one special instruction, TRAP
, that needs special attention since
the beginning. During the execution of Mu programs, if a Mu thread executes a
TRAP
instruction, the thread temporarily detaches from its stack and gives
control back to the client. At any moment, there is one trap handler registered
in a Mu instance. A trap handler is a client function that will be called
whenever a TRAP
instruction is executed. The trap handler gains access to
the thread and the stack that caused the TRAP
.
Using the API, the client can to introspect the execution state of each of its frames, see the values of local variables, and even replace existing frames with new frames for new functions (this is called on-stack replacement, or OSR).
The trap handler is a great opportunity for the client to do many things. The
clever placement of TRAP
instructions and the implementation of the trap
handler is key to a good language implementation. Traps can be placed after
sufficient run-time statistics are collected so that the client can optimise the
program. Traps can also be used for lazy code loading, de-optimising
speculatively generated code, and debugging.
Scala API¶
The trap handler is registered by the setTrapHandler
method on the
MicroVM
instance.
microVM.setTrapHandler(theTrapHandler)
The trap handler is an instance of the uvm.refimpl.TrapHandler
trait.
trait TrapHandler {
def handleTrap(ctx: MuCtx, thread: MuThreadRefValue, stack: MuStackRefValue, watchPointID: Int): TrapHandlerResult
}
A new MuCtx
instance is created for this particular trap event. It is passed
to the trap handler as the first argument ctx
. The thread
and the
stack
argument are handles of the thread that executed the TRAP
, and the
stack it was bound to, respectively. These two handles are held by ctx
. The
watchPointID
argument is about “watch points”, which will be discussed
later.
You probably only need one trap handler per program, so it is recommended to register it after created the Mu instance.
Inside the trap handler, you can use any API functions.
The return value of the trap handler tells Mu “how the current thread should continue”. There are three options:
- Terminate the current thread.
- Rebind the thread to a stack.
- When rebinding, pass some values to the top frame and let it continue normally.
- When rebinding, raise an exception and continue exceptionally.
When rebinding, the stack could be the previous stack, i.e. the stack
argument of the trap handler, or a totally different stack. In the former case,
it will continue after the TRAP
instruction. In the latter case, the trap
handler swaps the thread to a different stack, so the thread will continue in a
totally different context.
The return type uvm.refimpl.TrapHandlerResult
has several cases:
abstract class TrapHandlerResult
object TrapHandlerResult {
case class ThreadExit() extends TrapHandlerResult
case class Rebind(newStack: MuStackRefValue, htr: HowToResume) extends TrapHandlerResult
}
abstract class HowToResume
object HowToResume {
case class PassValues(values: Seq[MuValue]) extends HowToResume
case class ThrowExc(exc: MuRefValue) extends HowToResume
}
So you can return one of the cases from the trap handler:
// Just for convenience
import TrapHandlerResult._
import HowToResume._
// Terminate the thread.
return ThreadExit()
// Rebind to the old stack, pass some values and contunue normally
// Assume "stack" is the argument of the handleTrap method
val v1 = ctx.handleFrom......(...)
val v2 = ctx.handleFrom......(...)
val v3 = ctx.handleFrom......(...)
return Rebind(stack, PassValues(Seq(v1, v2, v3)))
// Rebind to the old stack, pass an empty list of values and contunue normally
// Assume "stack" is the argument of the handleTrap method
return Rebind(stack, PassValues(Seq()))
// Rebind to the old stack, throw an exception.
// Assume "stack" is the argument of the handleTrap method
// In Mu, an exception is just an object reference.
val e = ctx.newFixed(......)
return Rebind(stack, ThrowExc(e))
// Rebind to a different stack, passing 0 values.
val func = ctx.handleFromFunc(...)
val stack2 = ctx.newStack(func)
return Rebind(stack2, PassValues(Seq()))
C API¶
In the C API, you should use mvm->set_trap_handler(mvm, handler, user_data)
to register the trap handler.
The signature of the trap handler is a bit complicated:
typedef void (*MuTrapHandler)(MuCtx *ctx, MuThreadRefValue thread,
MuStackRefValue stack, int wpid, MuTrapHandlerResult *result,
MuStackRefValue *new_stack, MuValue **values, int *nvalues,
MuValuesFreer *freer, MuCPtr *freerdata, MuRefValue *exception,
MuCPtr userdata);
The first four arguments ctx
, thread
, stack
and wpid
are the
same as Scala. The next seven arguments result
, new_stack
, values
,
nvalues
, freer
, freedata
and exception
are output arguments.
They allow the C program to encode the counterpart of TrapHandlerResult
.
Since the client in C needs to pass an array of values to Mu, it also needs to
tell Mu how to de-allocate that array because there is not a standard way to
de-allocate C objects. The userdata
is an arbitrary pointer the client
provided when registering the trap handler. This allows the trap handler to
depend on extra client-decided contexts, because C does not have closures.
See the trap handling section of the Mu Specification for more information about trap handling in C.
Working Example¶
The following Scala program will create a Mu micro VM and execute a simple Mu IR program. You can ignore details of the Mu IR now (except the TRAP instruction), but instead focus on the interaction between the client, the Mu thread, and the trap handler.
The order of execution is labelled from #1
to #26
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 | package tutorial
import uvm.refimpl._
object Interact extends App {
// Create the Mu instance
val microVM = MicroVM() // #1
// Implicitly convert names to IDs
implicit def idOf(name: String) = microVM.idOf(name) // #2
// Create the context
val ctx = microVM.newContext() // #3
ctx.loadBundle(""" // #4
.typedef @i64 = int<64>
.const @I64_1 <@i64> = 1
.funcsig @main.sig = (@i64) -> ()
.funcdef @main VERSION %v1 <@main.sig> { // #12
%entry(<@i64> %n):
%n2 = ADD <@i64> %n @I64_1 // #13
[%trap] TRAP <> KEEPALIVE (%n2) // #14
COMMINST @uvm.thread_exit // #21
}
""")
// Create the trap handler
val myTrapHandler = new TrapHandler {
def handleTrap(ctx: MuCtx, thread: MuThreadRefValue,
stack: MuStackRefValue, watchPointID: Int): TrapHandlerResult = {
// Create a cursor to introspect the stack
val cursor = ctx.newCursor(stack) // #15
val curInstID = ctx.curInst(cursor) // #16
ctx.nameOf(curInstID) match {
case "@main.v1.entry.trap" => { // #17
// Dump the keep-alive variables
val Seq(n2: MuIntValue) = ctx.dumpKeepalives(cursor) // #18
ctx.closeCursor(cursor)
// Print the value
val n2Int = ctx.handleToSInt(n2) // #19
printf("The value of n2 is %d.\n", n2Int)
// Return to Mu from the trap handler
TrapHandlerResult.Rebind(stack, HowToResume.PassValues(Seq())) // #20
}
}
}
}
// Set the trap handler
microVM.setTrapHandler(myTrapHandler) // #5
// Create the stack and the thread
val mainFunc = ctx.handleFromFunc("@main") // #6
val st = ctx.newStack(mainFunc) // #7
val fortyTwo = ctx.handleFromInt(42, 64) // #8
val th = ctx.newThread(st, None, HowToResume.PassValues(Seq(fortyTwo))) // #9
// Close the context
ctx.closeContext() // #10
// Let the reference implementation run
microVM.execute() // #11
// #22
}
|
There are some key steps:
#1
and#3
creates the Mu instance and a context, respectively.#4
loads the initial bundle using the MuCtx. You can directly embed the text-form Mu IR in your source code.#5
registers the trap handler. This handler will handle all traps in the future.- Using the MuCtx,
#6-9
creates a Mu thread from a given Mu function, whose name happens to be@main
. “main” is not a special name. When creating the thread, the initial argument, the 64-bit integer 42, is passed to the@main
function. At#10
, the MuCtxctx
is no longer useful and can be closed. - Once created, the Mu thread can execute. But the reference implementation
needs to call the
microVM.execute()
method#11
to execute all Mu threads in the only Scala (JVM) thread. - During the execution of the Mu thread, it hits the trap at line
#14
. The control then transfers to the trap handler. Note that theKEEPALIVE
clause specifies which local variables are eligible for introspection. In this case, it is only%n2
. - In the trap handler, the value of client-specified local variables (keep-alive
variables) are dumped at
#18
and printed. Since the previousADD
instruction#13
adds one to the number, it should print “43” here. - Then at
#20
, the trap handler re-binds the thread with the old stack, passing an empty list of values back to the TRAP. - Then it continues from the Mu function after TRAP
#14
. Thethread_stop
instruction at#21
stops the Mu thread. - In the reference implementation, the
execute()
function at#11
ends when the last Mu thread stopped. Then the example program quits. The specification does not specify how a Mu micro VM should end.
Summary¶
- A MicroVM instance is the heart of the Mu micro VM.
- The client interacts with the micro VM mostly via MuCtx. A context serves only one client thread. It holds Mu values, including garbage-collected object references.
- In Mu, threads and stacks are loosely coupled. Threads can swap from one stack to another.
- The
TRAP
instruction gives the control back to the client from an executing Mu thread. - To start everything: create a MicroVM, create a MuCtx, load a bundle, create a
stack and create a thread. The
MicroVM.execute()
API function is specific to the reference implementation.
Mu Intermediate Representation¶
The client generates code in the format of Mu Intermediate Representation, or Mu IR for short. The IR is the language in which programs are represented in the Mu micro VM. The Mu IR is defined by the Mu IR chapter of the Mu specification.
The structure of the Mu IR is an AST. Mu bundle has a text form for human readability. A bundle can also be built using the IR building API which calls into Mu to build an AST inside Mu. The IR building API is designed for productional setting. This tutorial will use the text-form API.
Bundle¶
The client submits one bundle (a piece of Mu IR code) at a time to the Mu micro VM.
ctx.loadBundle("""
// insert your bundle here
""")
A bundle defines many types, function signatures, constants, global cell and functions. The client may submit multiple bundles one after another.
After submitting, Mu knows about those things. Types can be used, global cells are allocated, and functions are callable.
A bundle looks like this:
// Type
.typedef @i64 = int<64>
// Function signature
.funcsig @i_to_i = (@i64) -> (@i64)
// Constant
.const @I64_1 <@i64> = 1
.const @I64_2 <@i64> = 2
// Global cell
.global @g_foo <@i64>
// Function declaration (no body)
.funcdecl @factorial <@i_to_i>
// Function definition (with body)
.funcdef @fibonacci VERSION %v1 <@i_to_i> {
%entry(<@i64> %n): // Basic block
%lt = SLT <@i64> %n @I64_2 // Instructions
BRANCH2 %lt %small(%n) %big(%n) // Instructions
%small(<@i64> %n): // Basic block
RET %n // Interaction
%big(<@i64> %n): // Basic block
%nm1 = SUB <@i64> %n @I64_1 // Instruction
%nm2 = SUB <@i64> %n @I64_2 // Instruction
%v1 = CALL <@i_to_i> @fibonacci (%nm1) // Instruction
%v2 = CALL <@i_to_i> @fibonacci (%nm2) // Instruction
%rv = ADD <@i64> %v1 %v2 // Instruction
RET %rv // Instruction
}
If you have used LLVM before, the Mu IR is the counterpart of LLVM modules.
Names¶
You may have noticed that there are names for almost all entities. In the Mu IR,
there are two kinds of names: global names and local names. Global names start
with @
and local names start with %
. The allowed characters in names are
[a-zA-Z0-9_.]
.
In fact, local names are just syntax sugars of some global names, that is, anything that has a name has a global name. This will be discussed later.
At this moment, you only need to know what a bundle may contain. Their details will be discussed in the following chapters.
Type System¶
The Mu type system is defined in the specification.
Like many programming languages and frameworks, Mu also has a type system.
The Mu type system is low level. There is no object-oriented programming concepts, such as class, inheritance, polymorphism. There is no high-level concepts such as strings, either. The language implementer is responsible to implement these high-level concepts.
But the Mu type system is also not too low level. Notably, unlike C, C++ or LLVM, the Mu type system still has object reference types in it, and the garbage collector is fully aware of the presence of them. The main idea is, as long as you use the Mu type system, and refer to heap objects using references, you can forget about garbage collection details, such as stack maps, GC-safe points, and read/write barriers.
Overview¶
Some of the types (actually type constructors, explained later) contain angular brackets. These are parameters to these types which may be integer literals, other types or function signatures.
The types in the Mu type system can be put into several categories:
- Scalar value types:
int<n>
,float
,double
,uptr<T>
andufuncptr<sig>
. These types represent plain values. - Scalar reference types:
ref<T>
,iref<T>
,weakref<T>
,funcref<sig>
,threadref
,stackref
,framecursorref
,irbuilderref
andtagref64
. These types refer to “things” in the Mu micro VM. All such references are opaque in the sense that their representation is implementation dependent. - Composite types:
struct<F1 F2 ...>
,array<T n>
,hybrid<F1 F2 ... V>
andvector<T n>
. These types combine simpler types into more complex types. - The void type
void
. It has only one use case: when used as the “referenced type” of references or pointers, it conveys the meaning of “reference/pointer to anything”.
This is a complete list of Mu types. All values of Mu come from one of these types.
Define Types and Function Signatures¶
Type definition¶
To define a type in Mu, you use the top-level type definition: .typedef
.
.typedef @i1 = int<1>
.typedef @i8 = int<8>
.typedef @i16 = int<16>
.typedef @i32 = int<32>
.typedef @i64 = int<64>
.typedef @float = float
.typedef @double = double
.typedef @ptri32 = uptr<@i32>
.typedef @foo.fp = ufuncptr<@foo_sig>
.typedef @refi32 = ref <@i32>
.typedef @irefi64 = iref<@i64>
.typedef @weakreffloat = weakref<@float>
.funcsig @foo.sig = (@i32) -> (@i32)
.typedef @foo.fr = funcref<@foo_sig>
.typedef @stackref = stackref
.typedef @threadref = threadref
.typedef @stkref = stackref
.typedef @thrref = threadref
.typedef @fcref = framecursorref
.typedef @ibref = irbuilderref
.typedef @struct1 = struct<@i32 @i32 @i32>
.typedef @struct2 = struct<@i64 @double>
.typedef @array1 = array<@i32 10>
.typedef @array2 = array<@i32 4096>
.typedef @hybrid1 = hybrid<@i64 @i32>
.typedef @hybrid2 = hybrid<@i8>
.typedef @hybrid3 = hybrid<@i64 @i64 @i64 @float>
.typedef @4xi32 = vector<@i32 4>
.typedef @void = void
Every type defined in the Mu IR has a name, which is on the left side of the
equal sign. All characters in [a-zA-Z0-9_.]
are legal. You can use the dot
.
arbitrarily in the name. So the dot in @foo.sig
does not mean anything
special to Mu.
On the right side of the equal sign is the type constructor: it constructs a type. Some type constructors take parameters while others do not.
What are type constructors?
If we imagine a Mu type as a Java or C++ object, then the type constructor
is like the constructor of such an object. int
is just the abstract concept
of integer, but int<32>
is a concrete 32-bit integer type. Similarly
ref<@i32>
constructs a reference type to @i32
:
.typedef @i32 = int<32>
.typedef @refi32 = ref<@i32>
Some type constructors, such as float
, double
, threadref
or
void
, do not take any parameters. You can consider them as C++/Java
constructors with an empty parameter list. You may have written new Object()
or new StringBuilder()
before. Similarly you define a concrete instance of
float
type in this way:
.typedef @float = float
.typedef @blahblah = float
, where the name @float
or @blahblah
are just names.
When types or function signatures are taken as argument, their names (such as
@i32
, @float
and @void
, not int<32>
, float
or void
) are
used. So the following are not accepted by Holstein:
.typedef @refi32 = ref<int<32>> // ERROR! int<32> must be defined separately.
.typedef @refvoid = ref<void> // ERROR! void must be defined separately.
.typedef @bar.ref = funcref<(@i32) -> (@float)> // ERROR! The signature must be defined separately.
But these are right:
.typedef @i32 = int<32>
.typedef @refi32 = ref<@i32> // Correct.
.typedef @void = void
.typedef @refvoid = ref<@void> // Correct.
.typedef @float = float
.funcsig @bar.sig = (@i32) -> (@float)
.typedef @bar.ref = funcref<@bar.sig> // Correct.
Note
So why does Mu force all types to be “constructed” at the top level? Well,
that’s what Holstein accepts now. There are alternative text Mu IR parsers that accept in-line types
such as ref<int<32>>
.
Actually, productional Mu and client implementations will use the IR building API. It will skip the text parsing phase completely.
The reason why Holstein was designed like that was to let the text match the
actual data structure of the IR. In the IR building API, each type is a
“node”. Types that have parameters (such as ref<T>
) refer to other
nodes by their IDs. Similarly, in the text form, such type constructors
refer to other types by names.
If you have used LLVM before, you may find that you can write types “directly”, “inline”, in the LLVM IR, such as:
%c = add i32 %a, %b
%f = fadd double %d, %e
%g = load i32* %x
But have a look at the C++ API of the LLVM:
Type *i32 = Type::getInt32Ty(ctx);
Type *i64 = IntegerType::get(ctx, 64); // alternative method
Type *floatTy = Type::getFloatTy(ctx);
Type *doubleTy = Type::getDoubleTy(ctx);
Type *voidTy = Type::getVoidTy(ctx);
Type *blahblah = Type::getFloatTy(ctx);
Type *ptri32 = Type::getInt32PtrTy(ctx);
Type *ptri64 = PointerType::getUnqual(i64);
In this API, the programmer still needs to refer to types by pointers to the types. So this API is more similar to having to define (or, at least, make pointers to) the types separately.
On the other hand, there is only 19 types in the Mu type system, among which only 6 do not take arguments. Even if the client programmer has to define each and every types, all common types can be defined in about 20 lines as above, and his/her pain ends there.
Function signature definition¶
A function signature defines the parameter types and the return types of a
function. It is defined by the .funcsig
top-level definition:
.typedef @i32 = int<32>
.typedef @float = float
.funcsig @sig1 = (@i32) -> (@float)
.funcsig @sig2 = (@i32 @i32 @i32) -> (@i32 @float)
.funcsig @sig3 = () -> (@i32)
.funcsig @sig4 = (@i32) -> ()
.funcsig @sig5 = () -> ()
.typedef @funcref1 = funcref <@sig1>
.typedef @ufuncptr1 = ufuncptr<@sig1>
On the left side of =
is the name of the signature. On the right side is the
function signature constructor. In Mu, a function takes 0 or more parameters and
return 0 or more values. It is written in the form (parameter types) ->
(return types)
.
A function signature is not a type. Unlike the C or C++ programming language, there is no “function type” in Mu. In fact, in C, if an expression has function type, it is implicitly converted to the pointer of that function. Mu takes the explicit approach: there are two types that use function signatures:
- The
funcref<sig>
type refers to a Mu function which has signaturesig
. - The
ufuncptr<sig>
type is a pointer that points to a native function that has signaturesig
.
When defining or declaring functions, such as:
.funcdecl @foo <@sig1>
.funcdef @bar VERSION %v1 <@sig2> {
...
%rv = CALL <@sig1> @foo (...) // arguments omitted
...
}
The names of the functions @foo
and @bar
has the funcref<@sig1>
and
the funcref<@sig2>
type, respectively, when used as a value.
Details¶
This section will only discuss the most important types. For more details, you can read the Type System section of the specification.
Integer and FP types¶
int<n>
is the integer type of n
bits. Like LLVM, the int
type is
fixed-length. For example, int<32>
is the 32-bit integer type.
.typedef @i32 = int<32>
.typedef @i64 = int<64>
It is also signedness-neutral: whether an integer is signed or not depends on
the operation, not the type. Most instructions, such as ADD
, SUB
,
MUL
, work correctly for both signed and unsigned integers. Some instructions
have signed and unsigned variants, such as SDIV
/UDIV
,
FPTOSI
/FPTOUI
.
Like LLVM, int<1>
is returned by most instructions that return Boolean
results, such as EQ
and SLT
.
.typedef @i1 = int<1>
float
and double
are the IEEE 754 single and double-precision floating
point number types, respectively.
.typedef @float = float
.typedef @double = double
Like LLVM but unlike some intermediate languages such as C minus minus, Mu does not use a single type (such as “bits32”) to hold both integers and FP numbers, because in modern machines integers and FP numbers are usually held in different kinds of registers.
References to the memory¶
ref<T>
is the object reference type. It refers to objects in the
garbage-collected Mu heap.
iref<T>
is the internal reference type. It refers to a memory
location, that is, a place in the Mu memory that can be loaded or stored. A
field of a heap object is a memory location.
Attention
“Memory location” does not mean “address”. Do not assume a Mu heap object or any other memory locations have addresses. This is very important in Mu. This will discussed in details in later chapters. The specification contains some explanation
Both ref
and iref
may be NULL
.
Note
Sorry for the billion-dollar mistake, but NULL
is
really easy to implement, and Mu is closer to the machine. The client, on
the other hand, should implement a decent language and help the programmers
prevent such mistakes.)
The <T>
type parameter is the type of the heap object it refers to.
For ref<T>
, the T
means it refers to a heap object of type T
. For
example, ref<@i32>
refers to a heap object of @i32
type, which we
previously defined as int<32>
:
.typedef @refi32 = ref<@i32>
.typedef @refdouble = ref<@double>
.typedef @link = ref<@link>
In the last line, @link
is recursively defined as ref<@link>
. It means
it refers to a heap object, whose entire content is an object reference to the
same type, or NULL
. It is very similar to the C definition: struct Link {
struct Link *next; }
. Mu does not need struct
to construct recursive
types.
For iref<T>
, the T
means it refers to a memory location of type T
.
So if you use the LOAD
instruction on an iref<@i32>
, you get a value of
type @i32
. You can also STORE
an @i64
value to a memory location
referred by an iref<@i64>
.
.typedef @irefi32 = iref<@i32>
.typedef @irefi64 = iref<@i64>
References to Mu functions¶
funcref<sig>
is the function reference type. It refers to a Mu
function. Whenever you call a Mu function, you call it with its function
reference. sig
is the function signature.
.funcsig @sig1 = (@i32) -> (@float)
.typedef @funcref1 = funcref <@sig1>
funcref
only refers to Mu functions. It cannot refer to C functions (that is
what ufuncptr
is for).
Like other references, funcref
can also be NULL
(sorry).
Aggregate types¶
Among all aggregate types, hybrid
is the only “variable-length” types. All
others are “fixed-length”.
Fixed-length aggregate types¶
struct<F1 F2 ...>
is the structure type. Like the struct type in
C, it has many fields of types F1
, F2
, ...
.typedef @struct1 = struct<@i32 @i32 @i32>
.typedef @struct2 = struct<@i64 @double>
Structs may contain other structs, arrays or vectors, but cannot contain themselves (otherwise it will be infinitely big). It must have at least one field. But it may contain references so that you can allocate many structs in the Mu heap, each refer to another object.
.typedef @ListNode = struct<@i64 @ListNodeRef>
.typedef @ListNodeRef = ref<@ListNode>
array<T n>
is the fixed-size array type. T
is the element type.
n
is an integer literal and it is part of the type. A particular
array<T n>
holds exactly n
instances of T
. For example, array<@i32
10>
contains exactly 10 @i32
values:
.typedef @array1 = array<@i32 10>
.typedef @array2 = array<@i32 4096>
Like structs, arrays may contain other structs, arrays or vectors, but not itself. It must have at least one element. Arrays of references are allowed.
vector<T n>
is the vector type. It is designed for single-instruction
multiple-data (SIMD) operations. Most modern desktop processors have SIMD
capabilities. Vectors are used in very different ways compared to arrays.
Vectors are usually small and are usually similar to the vector sizes supported
by the machine.
Even today, architectures still do not agree upon any particular vector sizes. Mu only mandate the following three vector types to be implemented:
.typedef @4xi32 = vector<@i32 4>
.typedef @4xfloat = vector<@float 4>
.typedef @2xdouble = vector<@double 2>
The hybrid¶
hybrid<F1 F2 ... V>
is a hybrid of a struct and an array. It starts
with a fixed part: F1
, F2
, ... which is like a struct. It is
followed by a variable part: an array of many elements of type V
.
.typedef @hybrid1 = hybrid<@i64 @i32>
.typedef @hybrid2 = hybrid<@i8>
.typedef @hybrid3 = hybrid<@i64 @i64 @i64 @float>
In the above example, @hybrid1
has one @i64
field in its fixed part, and
many @i32
elements in its variable part. @hybrid2
has an empty fixed
part, and its variable parts are many @i8
elements. @hybrid3
has three
@i64
fields in its fixed part, and many @float
elements in the variable
part.
hybrid
is the only variable-size type type in Mu whose size is determined
at allocation site rather than the type itself. A hybrid must be allocated by
special instructions, such as NEWHYBRID
, which takes not only the type but
also the length as its arguments.
%length1 = .....
%length2 = .....
%r1 = NEWHYBRID <@hybrid1 @i64> %length1 // @i64 is the length of %length1
%r2 = NEWHYBRID <@hybrid1 @i64> %length2 // @i64 is the length of %length2
In the above example, %r1
and %r2
refers to two different objects. Both
have type @hybrid1
, but the length of their variable parts are %length1
and %length2
, respectively.
Since the length cannot be determined by the type itself, it cannot be embedded in other aggregate types, not even other hybrids:
.typedef @some_struct = struct<@i64 @hybrid1 @hybrid2> // ERROR! cannot embed hybrids
Hybrid is the counterpart of the C99 structs with “flexible array elements”. In C99, you can write something like:
struct hybrid1 { int64_t f1; int32_t v[]; };
struct hybrid2 { int32_t v[]; };
struct hybrid3 { int64_t f1, f2, f3; float v[]; };
struct hybrid1 *p1 = malloc(sizeof(int64_t) + 1000*sizeof(int32_t));
struct hybrid1 *p2 = malloc(sizeof(int64_t) + 2000*sizeof(int32_t));
Once malloc-ed with enough memory, C can access the dynamically allocated “tail” elements.
The void type¶
void
means “anything”, and can only be used as the target of references or
pointers. For example:
.typedef @void = void
.typedef @refvoid = ref<@void>
.typedef @irefvoid = iref<@void>
.typedef @uptrvoid = uptr<@void>
ref<@void>
means the object reference can refer to any object.
iref<@void>
means the internal reference can refer to any memory location.
uptr<@void>
means it is... err... just a pointer, and has not been assigned
a type yet.
Mu does not have the concept of “inheritance”, but there are some “prefix rules”
so that a reference may refer to some more complex objects than its <T>
parameter. void
is just the “simplest” type: no content at all.
You can allocate heap objects of the void
type.
%r = NEW <@void>
Such objects have no contents, but each allocated void
object is different,
and compares equal (EQ
) to only itself.
In Java, such use is like Object o1 = new Object();
. There are some corner
cases where such objects can be used as a “key” to identify something.
Other types¶
stackref
, threadref
and framecursorref
refers to “special
things” in Mu: stacks, threads and frame cursors. You will need the first two to
start a Mu program, and need the third to perform stack introspection and
on-stack replacement.
weakref<T>
is the weak object reference type.
tagref64
is the tagged reference type. It uses some clever bit-magic to
reuse the NaN space of double
to represent a tagged union of double
,
int<52>
and a struct of ref<void>
and int<6>
.
uptr<T>
and ufuncptr<sig>
are untraced (raw) pointers. They are
defined to be represented as integers of the pointer size, which is
implementation-specific. For example, on a 64-bit implementation, it is 64 bits.
But if you want to perform pointer arithmetic, you need to convert them to
integers first.
You are unlikely to use raw pointers unless your program interacts with native programs (usually written in C). The garbage collector will not trace them: they are treated just like integers.
If you worked with x86 before, you may ask: Wait! Pointer is not just the address, but also its segment. Sorry, x86. But we see the trend is to move away from segmented architecture (x86_64 moved away from segments, too). For embedded systems that may have multiple address spaces, Mu is not designed for such systems, but supporting such architectures is an open topic.
Bonus section¶
Note
These contents should be moved to other chapters in the future. But if you are interested and patient enough, you can keep reading.
In fact, an internal reference refers to a “memory location” (discussed
in later chapters) of type T
. Memory location is a very important
concept in Mu. It is a location in the Mu memory that can hold a Mu value.
A field of an object is one kind of memory location. All memory accessing
operations, such as LOAD
and STORE
directly work on internal
references. This is different from JVM, where there are getfield
and
setfield
instructions that work on object references.
If you worked with C before, it is the counterpart of the concept of “object”. (What? You say C does not have “objects” but C++ does? Go ahead and read the C specification. In C, “object” means “a region of data storage” and does not mean object-oriented programming.) But the word “object” is used as a synonym of “heap object” in Mu. To avoid ambiguity, we use the word “memory location” instead.
Simple Functions¶
The constant and function syntax is defined in the Intermediate Representation chapter of the specification.
The Mu IR uses a variant of the static single assignment (SSA) or static single information (SSI) form. Mu IR has control flow graphs (CFG), each has many basic blocks, each then has many parameters and instructions. The main difference is that basic blocks take parameters, which are the counterpart of the PHI-nodes in SSA. At the end of a basic block, if it branches to another basic block, it must also specify the arguments to the destination, which are the counterpart of the SIGMA-nodes in SSI. There is no explicit PHI- or SIGMA-node. Instructions can only use global variables, the parameters of the basic block it is in, or the variables evaluated before that instruction in the same basic block. In other words, each basic block is a local scope and is like a single-exit “straight-line” function.
SSA Variable¶
In the Mu IR, as in SSA, a variable is defined in exactly one place and never redefined. For this reason, we still call them SSA variables since they are only assigned in one place.
In the Mu IR, variables are referred to by names, such as @foo
or %bar
.
Variables can be global or local. Global SSA variables are globally valid and never change. (Sorry but we still call them “variables”.) Local SSA variables are only valid within a basic block and gets a value every time it is evaluated. Oh, did I say variables are defined in one place? Yes, they are, but this does not prevent them from being assigned multiple times. That is what “static” single assignment means.
Constant Definitions¶
Constant definitions, i.e. .const
, are a kind of top-level definition.
They construct values using literals.
.typedef @i8 = int<8>
.typedef @i16 = int<16>
.typedef @i32 = int<32>
.typedef @i64 = int<64>
.typedef @float = float
.typedef @double = double
.typedef @refi64 = ref<@i64>
.const @I8_10 <@i8> = 10
.const @I16_10 <@i16> = 10
.const @I32_10 <@i32> = 10
.const @I64_10 <@i64> = 10
.const @MAGIC_NUMBER1 <@i64> = 0x123456789abcdef0
.const @MAGIC_NUMBER2 <@i64> = 0xfedcba9876543210
.const @MAGIC_NUMBER3 <@i64> = -0x8000000000000000
.const @F_PI <@float> = 3.14f
.const @D_2PI <@double> = 6.28d
.const @MY_CONSTANT_REF <@refi64> = NULL
On the left side of =
, there are the name of the constant (such as
@I8_10
) and its type (such as @i8
). On the right side it is, as you can
guess, the constant constructor.
To construct an integer, you can write it in the decimal form or the hexadecimal
form (add 0x
before). It may also have a sign. Since integers themselves in
Mu do not have signs, the integer literal is just used to encode the bit
pattern. Mu uses the 2’s complement representation for negative numbers, so
0xffffffff
and -1
are the same if both are 32-bit.
To construct a floating point number, you can write it in the decimal form, with
a decimal point (that is, 1.0, not 1), and append an f
for float or a d
for double. nanf
, +inff
and -inff
will construct NaN, positive
infinity and negative infinity of the float
type. Replace the last f
with d
and it will be the double
type.
.const @F_NAN <@float> = nanf // Mu will interpret it as an arbitrary NaN
.const @F_PINF <@float> = +inff // positive infinity
.const @F_NINF <@float> = -inff // negative infinity
If you are an FP number wizard, you can also explicitly specify the bit layout of an FP constant:
.const @D_1 <@double> = bitsd(0x3ff0000000000000) // +1.0
.const @D_2 <@double> = bitsd(0x400c000000000000) // +3.5
.const @D_PINF <@double> = bitsd(0x7ff0000000000000) // +inf
.const @D_NAN <@double> = bitsd(0x7ff0000000000001) // nan (one possible encoding)
You can define constants of general reference types (ref
, iref
,
funcref
, threadref
, stackref
and framecursorref
), too. But
the only possible constant value is NULL
.
.typedef @refi64 = ref<@i64>
.typedef @irefi64 = iref<@i64>
.funcsig @foo.sig = () -> ()
.typedef @foo.fr = funcref<@foo.sig>
.typedef @tr = threadref
.typedef @sr = stackref
.typedef @fcr = framecursorref
.const @NULLREF <@refi64> = NULL
.const @NULLIREF <@irefi64> = NULL
.const @NULLFR <@foo.fr> = NULL
.const @NULLTR <@tr> = NULL
.const @NULLSR <@sr> = NULL
.const @NULLFCR <@fcr> = NULL
That is, you cannot define a constant reference to any heap object.
Note
Why there is no constant references to objects?
First of all, constants, as the name suggests, never change. If a constant refers to an object, the object is immortal! But the reason why we use the heap is to use GC, which eventually recycles the object.
Secondly, from the implementation’s point of view, the advantage of using
constants is that they can exist as immediate values in machine
instructions, or be created by some machine code idioms (e.g. xor rax,
rax
makes rax 0, and the instruction decoder in modern processors (since
IvyBridge) can eliminate such “idioms” in the front end), rather than being
stored in the memory and loaded when needed (memory is slow nowadays
compared to 20 years ago). But if the type is object reference, perhaps the
only feasible way to implement such constant is to store it in the memory so
that copying GC can update it when the referenced object is moved.
Non-copying GC sucks, because the VM will eventually die of heap
fragmentation. (R.I.P. lighttpd. You know, C programmers are
responsible for memory management. If C’s malloc cannot manage the memory
well and kills long-running servers, we should use a VM with copying GC,
instead. If usual VMs perform too bad, that’s why we build the Mu micro VM.)
But if the GC ends up modifying the machine code to fix the reference, it
will be too painful.
If we really need some permanent global memory space, Mu has another
top-level definition: global cells, i.e. .global
(it will be discussed
in details when we talk about memory access). Global cells are memory
locations: they are mutable. They can be loaded and stored, and they are
permanent. Just store an object reference in a global cell and it has all
the benefits of constant references.
For other references, constant function reference is unnecessary because the name of the function is already a constant function reference. Stacks are similar to heap objects. Threads and frame cursors have their own lifecycles, so you can’t possibly create such constants that remain valid.
However, pointers are not references. They are just integers and can be constructed as integers.
.typedef @i64 = int<64>
.typedef @ptri64 = uptr<@i64>
.const @MY_POINTER <@ptri64> = 0x123456789000
.funcsig @bar.sig = () -> ()
.typedef @bar.fp = ufuncptr<@bar.sig>
// The address can be looked up by dlsym.
.const @MY_FUNCTION_POINTER <@bar.fp> = 0x7fff00001230
Mu support constants of non-hybrid composite types, too. A composite constant is constructed by referring to other constants. Please put as many fields/elements as there should be.
.typedef @i32 = int<32>
.typedef @struct1 = struct<@i32 @i32 @i32>
.typedef @array1 = array<@i32 2>
.typedef @vector1 = vector<@i32 4>
.const @I32_1 <@i32> = 1
.const @I32_2 <@i32> = 2
.const @I32_3 <@i32> = 3
.const @I32_4 <@i32> = 4
.const @S1 <@struct1> = { @I32_1 @I32_2 @I32_3 }
.const @A1 <@array1> = { @I32_1 @I32_2 }
.const @V1 <@vector1> = { @I32_1 @I32_2 @I32_3 @I32_4 }
.typedef @struct2 = struct<@struct1 @i64>
.const @S2 <@struct2> = { @S1 @I32_4 } // correct
.const @WRONG <@struct2> = { {@I32_1 @I32_2 @I32_3} @I32_4 } // ERROR: cannot nest braces. Define separately
But it is not recommended to use constants of composite types, unless they are small. Mu may not be able to allocate big values into registers, in which case it may perform stupid copying. The micro VM may not be smart enough to do too much optimisation.
Function definition¶
TODO