Retep

WASM101 | Runtime


The previous blog post explained how meaningful modules are constructed from the bytes in a WebAssembly (WASM) binary file. This time, we delve into the topic of the WASM runtime—how to execute these bytes on a computer. We’ll start with the most basic computational model.

Computational Model

The computational model of WASM is a stack machine.

Tip

Computational Model: The computational model describes how to compute outputs based on a set of input values (CPU instructions).

Computational models can be categorized into stack machines, accumulators, and register machines. The CPU instruction set varies depending on the computational model, as each model has its own way of accomplishing tasks.

Stack Machine

A stack machine is a computational model that uses a stack as its primary data structure, relying on the Last-In-First-Out (LIFO) principle to manage data. The stack machine maintains a stack, and all operands are assumed to be retrieved from the top of the stack. For example, to calculate the value of 1 + 2 in a stack machine, the instructions would be:

PUSH 1 // Push 1 onto the stack
PUSH 2 // Push 2 onto the stack
ADD    // Pop 2 from the stack, pop 1 from the stack, perform addition, and push the result 3 onto the stack

Stack machines have several advantages over accumulator and register machines:

  • Zero- or Few-operand Instructions: Stack machine instructions typically do not include addresses of operands (0-operand instruction set) since operands are popped from the stack.
  • Simplified Instruction Set: Due to using a stack to store data, the instruction set of stack machines is usually simpler than that of register machines.
  • Efficient Function Calls: Recursion and function calls are natural on stack machines because each function call’s local variables and return address can be naturally stored in the stack.

You can see the stack machine’s computational model in action with a simple WASM program:

(i32.add
	(i32.const 1) ; Push constant 1 onto the stack
	(i32.const 2) ; Push constant 2 onto the stack
) ; Execute the i32.add instruction to calculate the sum of the top two elements on the stack

Accumulator

An accumulator uses a special register (the accumulator) to store results. To compute the value of 1 + 2, the instructions for an accumulator would be:

LOAD 1  // Load 1 into the accumulator
ADD 2   // Add 2 to the value in the accumulator, no need for
STORE   // Store the value 3 in memory

Register Machine

A register machine uses multiple registers to store results. To compute the value of 1 + 2, the instructions for a register machine would be:

LOAD R1, 1     // Load 1 into register R1
LOAD R2, 2     // Load 2 into register R2
ADD R3, R1, R2 // Add the values in R1 and R2, and store the result in register R3

Value Types

The next question is, what elements will be stored in the stack of a stack machine? First, let’s introduce the value types defined by WASM, which fall into three categories:

  • Numeric Types: i32, i64, f32, `f64
  • Vector Types: v128
  • Reference Types: ref.null t, ref funcaddr, ref.extern externaddr Numeric types are straightforward, with 32-bit/64-bit integer/floating-point types. Numeric types do not differentiate between positive and negative; their meaning is determined by the operand. Vector types consist of 128-bit integer/floating-point arrays. This type was added later to support SIMD (Single Instruction, Multiple Data) operations, which can improve computational performance for multimedia processing, encryption algorithms, and scientific computing, among other scenarios. Reference types help WASM operate on complex data structures (such as host objects). Reference types are divided into three kinds: funcref for function references, allowing functions to be passed as arguments or stored in data structures; externref for external references, which can reference any host object; and null t representing a null reference of type t (either function reference or external reference).

Branching

Labels allow us to jump to a specific position in the code, enabling branching, looping, function calls, and other capabilities. Labels are commonly used in many programming languages for jumps. For example, in the following JavaScript example:

let str = "";

loop1: for (let i = 0; i < 5; i++) {
  if (i === 1) {
    continue loop1; // specify to continue the statement labeld as loop1
  }
  str = str + i;
}

console.log(str);
// Expected output: "0234"

Labels in WASM similarly mark a position in the code, allowing control flow based on operands. Specifically, each label type contains a continuous piece of code, indicated by its start and end positions (i.e., a start_pos and an end_pos). Labels come in two main uses:

  • Block: Marks a block of code. If a br instruction occurs within the block, execution continues after end_pos.
  • Loop: Marks a loop. If a br instruction occurs within the loop, execution returns to start_pos.
(block $my_block
	br $my_block
)
// <- br jumps to here


(loop $my_loop // <- br jumps to here
	br $my_loop
)

Info

loop statement and if statement are both realized by branch under the hood。

In WASM binary, there are no actual identifiers like $my_block. Instead, the br instruction takes a parameter l, which indicates the number of labels to pop from the stack. For example:

(loop // level 1 for this br
  (block // level 0 this br
    br 1
  )
)

Here, br takes the parameter 1, which means it pops two labels: the inner block and the outer loop. This behavior is similar to the JavaScript example where an inner if (as an inner block) continues the outer loop.

Loops

Here’s a loop that counts to three, including arithmetic, logical calculations, loops, variables, and more, summarizing the content of this blog post:

(module
  (func $count_to_three
    (local $i i32) ; Define a local variable named $i for counting

    ;; Set the initial value of the counter to 0
    (local.set $i (i32.const 0))

    ;; Start the loop structure
    (loop $loop_label          ;  **Here, a label is defined to mark the top of the loop**


      ;; Increment the value of the counter
      (local.get $i)            ; Get the current value of $i
      (i32.const 1)             ; Prepare the value to add
      (i32.add)                 ; Perform addition
      (local.set $i)            ; Update the value of $i  

      ;; Check if the counter has reached the upper limit
      (local.get $i)            ; Get the current value of $i
      (i32.const 3)
      (i32.lt_s)                ; Check if $i < 3

      ;; **If true, jump to $loop_label which points to the start of the loop, continue**
      (br_if $loop_label)

      ;; When $i >= 3, end the loop
    )
  )
)