A VM Translator is a program that takes VM code as input and spits out assembly code as output. That’s all well and good if you know what VM code even is.
The hardware side is where the first portion of the Nand to Tetris course picks up. I previously summarized of the topics discussed in the first six modules of the course.
VM Translation sits in the middle to the tail end of the software hierarchy. The Jack programming language, modeled after Java, uses a two-tier compilation. High-level languages are compiled to VM code as an intermediate step before being translated into a hardware-specific assembly language. The VM Translator performs the step of taking the VM code that a compiler spits out and returning assembly code. By splitting compilation into two steps, each can be made simpler.
Another benefit of two-tiered compilation is that it allows multiple programming languages to compile to the intermediate VM code and then share the same VM translation software. For example, Java, Groovy, Kotlin, Scala, and Clojure compile from their respective files into *.class files, which can then be run on the Java virtual machine (JVM) instead of all needing separate code to compile to assembly on different host platforms.
For the nand2tetris course, the VM Translator I built creates code for the Hack computer I made in the first half of this course.
The underpinning of the virtual machine is something called a Stack Machine. Yes, it’s a tad confusing to be introduced to more than one abstract thing called a machine. The critical insight here is that using the concept of a stack machine enables us to turn complex programs into a lower-level abstraction that is still sufficiently high-level to save mental energy when thinking about them. That’s the beauty of having different levels of abstraction; it helps shape the world in models that are simple enough to fit into our dumb monkey brains. Stack machines are Turing complete, which as Nisan and Schocken say,
“Any program, written in any high-level programming language, can be translated into a sequence of operations on a stack.”
A stack is a sequence of memory that can grow and shrink as needed. The two key operations on the stack are
The hack VM model is stack-based in which all of the operations take their operand either from the stack or store their results on the stack. The Hack VM uses eight different virtual segments. Each of the segments acts like it’s own stack and is accesed via push and pop commands.
Separating out memory into different virtual segments allows the meaning of different variables to be preserved. Memory segments make the virtual machine easier to reason about.
// virtual memory segments Segment Role --------------------------------------------------------------- argument Represents the function's arguments local Represents the function's local variables static Represents the static variables seen by the function constant Represents the constant values 0,1,2,3,..., 32767 this Used to store 'this' of the caller that Used to store 'that' of the caller pointer Used to manipulate this & that. More detail below temp Mapped directly on RAM locations 5-12
I’ve included the specification for what the VM commands are supposed to do below. These specs are directly from Ehe Elements of Computing Systems.
Push / Pop commands transfer data between the stack and memory segments.
push segment index: Pushes the value of segment[index] onto the stack, where segment is argument, local, static, this, that, pointer, or temp and index is a nonnegative integer.
pop segment index: Pops the top stack value and stores it in segment[index], where segment is argument, local, static, this, that, pointer, or temp and index is a nonnegative integer.
These commands perform arithmetic and logical operations.
// arithmetic-logical commands of the VM language cmd op comment ------------------------------------------------------- add x + y integer addition (two's complement) sub x - y integer subtraction (two's complement) neg -y arithmetic negation (two's complement) eq x == y equality gt x > y greater than lt x < y less than and x And y bit-wise And or x Or y bit-wise Or not Not y bit-wise Not
Branching commands facilitate conditional and unconditional branching.
label label: Labels the current location in the function’s code. Only labeled locations can be jumped to. The scope of the label is the function in which it is defined. The label is a string composed of any sequence of letters, digits, underscore, dot, and color that does not begin with a digit. The
label command can be located anywhere in the function, before or after the goto commands that refer to it.
goto label: Effects an conditional goto operation, causing execution to continue from the location marked by the label. The goto command and the labeled jump destination must be located in the same function.
if-goto label: Effects a conditional goto operation. The stack’s topmost value is popped; if the value is not zero, execution continues from the location makred by the label; otherwise, execution continues from the next command in the program. The if-goto command and the labeled jump destination must be located in the same function.
Function commands facilitate function call-and-return operations.
function functionName nVars: Marks the beginning of a function named functionName. The command informs that the function has nVars local variables.
call functionName nArgs: Calls the named function. The command informs that nArgs arguments have been pushed onto the stack before the call.
return: Transfers execution to the command just following the call command in the code of the function that called the current function.