# About

This is an exploratory project into virtual machines and assembly language. It
is by no means ready for production use, nor is it particularly well maintained.
The language is inspired by x86 and ARM assembly and does very little
hand-holding.

Check out the `bin/example.wasm` example source for an overview of the
language, or keep on reading!

# Design

## From Text To Runtime Behaviour

Turning the source text into executable code takes three passes:
- Pass 1: tokenization (syntax check) and preprocessing (substitution)
- Pass 2: interpretation (semantics check)
- Pass 3: execution (runtime check)

After pass 2 all ties to the source text are lost, so an error that occurs
afterwards (at runtime) can be cryptic about where it originated.

## Notation

- `[operation][number type]`, e.g. `divi` for divide (`div`) integer (`i`)
- `%[register]` for addressing registers
- `$[value]` for literal (immediate) values
- `;` for end of statement (mandatory)
- `[label]:` for labels
- `#[text]` for comments: all text up to the next newline (`\n`) is ignored
- `[[%register|$value]]` for accessing memory
- Elements must be separated by a whitespace character
  - Good: `addi $2 $5 %A;`
  - Bad: `addi $2$5%A;`

## Examples

Divide register A by 5 and store the result in register A:
`divi %A $5 %A;`

Increment B until it is 10:
```
# Set B to zero
addi $0 $0 %B;

loop:
addi $1 %B %B;
lti %B $10;
jmp loop;
```

Read the integer at memory location `1024` into register A:
```
seti %A [$1024];
```
Remember not to put spaces inside the square brackets.

## Reserved Symbols

The following whitespace characters are used to separate symbols:
- space (` `)
- tab (`\t`)
- carriage return (`\r`)
- newline (`\n`)

The following characters have a reserved meaning:
- dollar (`$`) for immediate (literal) values
- percent (`%`) for register identifiers
- colon (`:`) for jump labels
- semicolon (`;`) for statement termination
- hash (`#`) for comments
- square brackets (`[` and `]`) for addressing memory

## Memory Model

The stack, which you interact with through push/pop operations, grows from
memory location 0 towards the end of the memory. There is no strict checking on
whether your own memory operations through `[]` affect the stack: this is a
feature, not a bug. Keep in mind that the stack can underflow and overflow.

Memory is addressed in bytes (8 bits), whereas the registers are all 32 bits
wide. This means that a read from location `$900` shares 3 bytes with a read
from location `$901`: the first byte read at `$901` is the second byte read at
`$900`.

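The overlap can be made concrete with two reads, using only the `seti` memory form shown in the examples. This is a small sketch and says nothing about the byte order inside a register, which this document does not specify:

```
# Reads bytes 900, 901, 902 and 903
seti %A [$900];
# Reads bytes 901, 902, 903 and 904: three of them were also read above
seti %B [$901];
```

To read two independent 32-bit values, keep their addresses at least 4 bytes apart, for example `$900` and `$904`.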
## Symbols

All symbols are reserved keywords and can therefore NOT be used as labels.
This is currently not strictly checked, so be careful.

## Preprocessor

All preprocessor directives are prefixed by a `#`. Ill-formed preprocessor
directives do not halt compilation; they are merely reported and then ignored.

- `DEFINE` replaces any occurrence of the first argument by the second argument.
  The second argument may be empty, effectively deleting occurrences of the
  first argument. Quotes are currently not supported and arguments are
  separated by whitespace. If multiple defines exist for the same substitution,
  the first one declared is used.

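As an illustration, the counting loop from the examples could be written with named substitutions. This is only a sketch: it assumes a directive is written as `#DEFINE` on its own line and ends at the newline, which this document does not spell out, and the names `COUNTER` and `LIMIT` are made up for the example.

```
#DEFINE COUNTER %B
#DEFINE LIMIT $10

addi $0 $0 COUNTER;

loop:
addi $1 COUNTER COUNTER;
lti COUNTER LIMIT;
jmp loop;
```

Because every occurrence of the first argument is replaced, it is safest to pick names that do not appear inside any other word of the source.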
### Operands

- `addi` adds the first argument to the second and stores the result in the
  third argument
- `subi` subtracts the first argument from the second and stores the result in
  the third argument
- `divi` divides the first argument by the second and stores the result in the
  third argument
- `muli` multiplies the first argument by the second and stores the result in
  the third argument
- `shli` shifts the first argument left by the number of positions given by the
  second argument and stores the result in the third argument
- `shri` shifts the first argument right by the number of positions given by
  the second argument and stores the result in the third argument
- `seti` sets the first (register) argument to the second argument
- `int` calls the interrupt specified by the first (integer) argument

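The argument order is easy to get wrong, especially for `subi`, so here is a short sketch that chains a few of the operations above. The values in the comments follow from the descriptions in this list and have not been checked against the implementation:

```
# A = 3 + 4 = 7
addi $3 $4 %A;
# A = A - 2 = 5 (the first argument is subtracted from the second)
subi $2 %A %A;
# A = A * 2 = 10
muli %A $2 %A;
# B = A shifted left by 1 position = 20
shli %A $1 %B;
```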
### Control Flow

- `jmp` jumps to the label given by the first argument
- `call` pushes the next statement to execute onto the stack and jumps to the
  label given by the first argument
- `ret` pops the next statement to execute off the stack, i.e. execution
  resumes at the statement following the matching `call`
- `lti` executes the next statement if argument 1 is less than argument 2,
  otherwise it skips the next statement
- `gti` executes the next statement if argument 1 is greater than argument 2,
  otherwise it skips the next statement
- `eqi` executes the next statement if argument 1 is equal to argument 2,
  otherwise it skips the next statement

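A subroutine built from `call` and `ret` might look like the sketch below. The labels `double` and `main` are made up for the example, and it assumes that the interrupt number is passed to `int` as an immediate (`$1`) and that execution stops after the last statement; neither is confirmed by this document.

```
# Skip over the subroutine body
jmp main;

# Doubles the value in register A
double:
addi %A %A %A;
ret;

main:
seti %A $21;
call double;
# Execution resumes here after ret; register A now holds 42
int $1;
```

Since `call` and `ret` use the same stack as `pushi` and `popi`, a subroutine has to pop everything it pushed before it reaches `ret`.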
## Memory

- `popi` pops the top value off the stack into the register given as the first
  argument
- `pushi` pushes the register or immediate value given as the first argument
  onto the stack

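Because the last value pushed is the first one popped, two registers can be swapped through the stack. A small sketch based on the descriptions above:

```
seti %A $1;
seti %B $2;

pushi %A;
pushi %B;
# The value pushed last (2) comes off first
popi %A;
# Then the value pushed first (1)
popi %B;
# A now holds 2 and B holds 1
```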
## Interrupts

- [0..3] Output to STDOUT
  - `0` put the value of register A as an ASCII character on STDOUT
  - `1` put the value of register A as a decimal integer on STDOUT
  - `2` put the value of register A as a hexadecimal integer on STDOUT
  - `3` put the string pointed at by register A on STDOUT, for the number of
    characters given by register B
- [4..5] Input from STDIN
  - `4` get a single ASCII character from STDIN and store it in register A
  - `5` get a string of a maximum length determined by register B and store it
    at the address specified by register A. After execution, register B will
    contain the number of characters actually read.
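
As a sketch, echoing a single character combines interrupts `4` and `0`. It assumes the interrupt number is written as an immediate value (`$4`, `$0`), which this document does not state explicitly:

```
# Read one ASCII character from STDIN into register A
int $4;
# Write the character in register A back to STDOUT
int $0;
```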