Rosette dismemberment and re-composition

Background

The Rosette C++ VM was written in the early 1990s using the techniques of the day. As a result, it is tightly integrated and difficult to understand.

We propose to separate the subsystems into clearer, stand-alone software implementations that can be used individually or in combination to test and verify the Scala version of the VM.

We propose to create specifications for the following subsystems:

  • Data Types and Primitives
  • RBL Syntax
  • Opcodes
  • VM & Actor model
  • RBL Compiler

Action items

  • Refactor pointer piggy-backed data: reimplement it as a pOb object that stores the piggy-backed data in separate member data items
  • Remove 32-bit dependencies and replace them with 64-bit-capable code
  • Understand Opcodes and Data Type interaction
  • Separate RBL Compiler into stand-alone implementation
  • Implement / re-implement data memory image dump
  • Understand and separate Tuple space implementation
  • Additional cleanups as they present themselves
  • Add multi-threading

Pointer Meta-data

object_ptr<T>

    T * obj
    type_info typebits
    ...
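
To make the refactoring concrete, here is a minimal C++ sketch of the idea. It assumes, purely for illustration, that the piggy-backed data sits in the low two bits of an aligned pointer; the actual bit layout in Rosette may differ. The TypeBits enum, kTagMask, packTagged and unpackTagged are hypothetical names, not existing Rosette code, and TypeBits stands in for the type_info member shown above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tag values; the real set of type bits is not specified here.
enum class TypeBits : std::uint8_t { Pointer = 0, Fixnum = 1, Char = 2 };

// Assumed legacy layout: the tag is piggy-backed on the low two pointer bits.
constexpr std::uintptr_t kTagMask = 0x3;

// Pack a tag into the unused low bits of an aligned pointer (legacy style).
inline std::uintptr_t packTagged(void* p, TypeBits t) {
    auto raw = reinterpret_cast<std::uintptr_t>(p);
    assert((raw & kTagMask) == 0 && "pointer must be at least 4-byte aligned");
    return raw | static_cast<std::uintptr_t>(t);
}

// Proposed style: the pointer and its metadata are separate members,
// so nothing depends on pointer width or alignment tricks.
template <typename T>
struct object_ptr {
    T* obj = nullptr;
    TypeBits typebits = TypeBits::Pointer;

    T* operator->() const { return obj; }
    T& operator*() const { return *obj; }
    explicit operator bool() const { return obj != nullptr; }
};

// Convert a legacy tagged word into the explicit representation.
template <typename T>
object_ptr<T> unpackTagged(std::uintptr_t word) {
    object_ptr<T> p;
    p.typebits = static_cast<TypeBits>(word & kTagMask);
    p.obj = reinterpret_cast<T*>(word & ~kTagMask);
    return p;
}
```

A conversion helper like unpackTagged would only be needed while both representations coexist during the migration. Once every use site holds an object_ptr<T>, the masking code can be removed, which also eliminates one source of 32-bit assumptions.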


Compiler separation

The RBL compiler within Rosette currently depends on data structures within Rosette. The compiler needs to be made into a standalone module that can be run separately from the Rosette environment. This will require code and data references to be resolved in a separate "link" step.
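
One possible shape for such a link step is sketched below in C++. It is an illustration under assumptions, not the existing design: Relocation, CompiledUnit and linkUnit are hypothetical names, and the environment is modelled as a simple map from symbol name to index.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record emitted by a standalone compiler: "the operand of
// instruction instrIndex must be patched with the index of symbol".
struct Relocation {
    std::size_t instrIndex;
    std::string symbol;
};

// Hypothetical compiler output: operand slots plus unresolved references.
struct CompiledUnit {
    std::vector<std::size_t> operands;  // one operand slot per instruction
    std::vector<Relocation> relocations;
};

// Resolve each relocation against the environment and patch the operand.
// Returns the symbols that could not be resolved; those would have to fall
// back to a lookup at run time.
std::vector<std::string> linkUnit(
    CompiledUnit& unit,
    const std::unordered_map<std::string, std::size_t>& env) {
    std::vector<std::string> unresolved;
    for (const Relocation& r : unit.relocations) {
        assert(r.instrIndex < unit.operands.size());
        auto it = env.find(r.symbol);
        if (it == env.end()) {
            unresolved.push_back(r.symbol);
            continue;
        }
        unit.operands[r.instrIndex] = it->second;
    }
    return unresolved;
}
```

With this split, the compiler itself never touches the environment; only linkUnit does, and it could run inside the VM at load time.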

Per Timm:

SymbolNode::initialize in src/Compile.cc should be the function.
Also SymbolNode::primNumber probably.

One difficulty with this is that the RBL compiler needs input from an environment. However, this seems to be the case only for symbol lookup.

During compilation, the RBL compiler looks up a symbol in the environment and emits an opcode that contains the index of that symbol in the environment.
If the compiler can't find the symbol in the environment, it appends the symbol itself to the code object and emits an opcode that tells the VM to look up the symbol at run time.

If we change the compiler so that it always does the run-time lookup, and boot.rbl still runs successfully with it, we can be confident that the compiler can be separated from the VM.
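
The two lookup paths, and the proposed experiment of forcing the run-time path, can be sketched in C++ as follows. This is a simplified model, not the code in src/Compile.cc: Op, Instr, Env, Code, compileSymbol and the forceRuntimeLookup flag are all hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical opcodes for the two lookup strategies.
enum class Op { LookupIndexed, LookupByName };

struct Instr {
    Op op;
    std::size_t operand;  // environment index, or index into Code::literals
};

// Hypothetical stand-in for the environment the compiler consults.
struct Env {
    std::vector<std::string> names;

    std::optional<std::size_t> indexOf(const std::string& sym) const {
        for (std::size_t i = 0; i < names.size(); ++i)
            if (names[i] == sym) return i;
        return std::nullopt;
    }
};

// Hypothetical code object: symbols kept for run-time lookup, plus opcodes.
struct Code {
    std::vector<std::string> literals;
    std::vector<Instr> instrs;
};

// Emit the lookup for one symbol. With forceRuntimeLookup set, the
// environment is never consulted, which is the experiment described above.
void compileSymbol(const std::string& sym, const Env& env, Code& code,
                   bool forceRuntimeLookup = false) {
    assert(!sym.empty());
    if (!forceRuntimeLookup) {
        if (auto idx = env.indexOf(sym)) {
            // Found at compile time: the opcode carries the index.
            code.instrs.push_back({Op::LookupIndexed, *idx});
            return;
        }
    }
    // Not found (or forced): store the symbol and defer to the VM.
    code.literals.push_back(sym);
    code.instrs.push_back({Op::LookupByName, code.literals.size() - 1});
}
```

In this model, a compiler that always passes forceRuntimeLookup = true has no remaining dependency on Env, which is the property the boot.rbl experiment is meant to confirm for the real compiler.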

It may also help to compare (show-code (compile '(+ 1 2))) with (show-code (compile '($ 1 2))).
The first does the lookup at compile time, while the second does it at run time.