Rholang Interpreter

Introduction

Purpose:

This project is an experimental attempt to 'proceed more directly to the RhoVM Milestone' in the long term project plan.  Presently, the project plan involves the Rosette VM as the execution engine for Rholang contracts.  In a future milestone, the plan was to eliminate the Rosette VM and compile Rholang directly to machine code.  The current architecture involves 2 compilers as described in the (outdated )Implementation Architecture.  This architecture creates complexity and adds a significant amount of time to the project, as the team has to focus on the correctness of both the Rholang compiler and the Rosette RBL compiler. 


References:

(outdated )Implementation Architecture - Describes the compilation and execution path in RChain originally planned.

Term Normalization and Structural Equivalence. - Describes how normalization results in only needing a basic byte compare to achieve structural equivalence.

(deprecated) RSpace 0.2 Specification - Describes the storage component.

Definitions:

RBL - Rosette Base Language

Par - a class containing a list of Sends, Receives, Evals, News, and Expressions 

Scope

Implement a normalizer and interpreter for the Rholang language which will create Scala objects which will be then executed by the JVM directly.  

Rholang Interpreter & Task Queue

Queue[(Env, Par)]

Build a Continuation (par objects+ environment) queue where we will queue the normalized processes to be evaluated(?).  Use a thread pool of eval(?) workers.  Workers pull processes from the queue and evaluate(?) them.  The thread pool will provide an adequate level of non-determinism. 

Build an interpreter  - takes the Scala objects and runs them.  Has hooks to the storage layer for reads and writes.  It will need hooks to some system functionality - Print for a start.  future functionality tbd.   Multiple objects in a pool and will pull objects and execute them.   

Normalizer

(Described in order of declaration not doing)

Build a Normalizer that compiles Rholang into normalized Scala objects.  Sort order can be arbitrary but has to be total.  Nil is the identity, it doesn't put nils in (except when it is the only process) ex: the empty string is the identity for string concatenation.  Private processes can only be written by the interpreter.  When writing a private process, serialize a private flag into the first few bits, so the process could be inspected and identified as 'private'.  Pretty printing normalized code would help making the argument compelling.  Could also be used for testing.

Storage Layer

Storage layer will accept these pars and keys.  The keys will need to be hashed as LMDB will limit the size to 511 bytes.  Implement like a hash bucket.  For continuations, we will not use  Scala's de-limited continuations.  The continuation object will be the closure plus its' environment.  Storage layer will communicate via protobufs.  Leave it up to LMDB to determine when data is put to disk.



Use Cases

Provide a list of Use Cases that the software will address.

  • Demonstrate how de-referenced names are handled better
  • Sending on non-trivial names is possible. Ex: '@{ *x | *y }'

Design Considerations

Describe the issues which need to be addressed or resolved before attempting to devise a complete design.

  • Non-Determinism - We must preserve the ability to have non-determinism.  We intend on doing this by using a thread pool with the Java task queue.
  • We need to be able to run straight line code without any contention as a long trace. 
  • Garbage Collection for private names.  (Applies when a private name is created - and there are no processes that can write on that name, similarly if you create a send and no one can receive on the send).  How does this work. (Need to provide for this, not implement now)
    • Separate process for pruning of names - exists in the DB.
    • Is there a continuation that could read/write from this name?
    • Reference counting as an option to solve this problem?
  • Persistence of continuations is not possible in the evaluator. Every time a consume goes into the database we need to keep a flag indicating if the continuation needs to persist.  

Interfaces  

Describe how the software will interface with other software components.

System Interface

TODO - sort this section out.

Hardware Interface

Marshalled to  the JVM.  The JVM will generate the machine code for the system to execute.

Software Interface

Will software be used in creating the system? If so, indicate what software (name and version), how it is to be used, and any details about how it interfaces with the software being specified.

The Storage layer uses LMDB 

User Interface

What is the user interface for the software? How will users interact with the software?  List out all the aspects involved in making the UI better for all users that will interact or support the software.

Kyle Butt: We will need a basic command line interface which takes in a Rholang file. 

Error messages -The initial release of the Interpreter will include line numbers for debugging purposes.

Communications Interface

This will be marshalled to the JVM

System Overview

Provide a description of the software system, including its functionality and matters relating to the overall system and design.  Feel free to split this up into subsections, or use data flow diagrams or process flow diagrams.  

Normalizer

See Term Normalization and Structural Equivalence.

The normalizing parser produces terms where processes in parallel have been binned according to their type (sends, receives, expressions). 

  • Consumes a Rholang file
    • Creates an AST
  • Normalizes the terms into Pars
    • Do the Pars have a normal form or are the Pars the normal form of the source code? 
  • Sorts the Pars/Sets/Maps
    • A score tree is recursively built for each term and is used to sort the insides of Par/ESet/EMap. For most terms, the current term type's absolute value based on the Score object is added as a Leaf to the left most branch and the score tree built for the inside terms are added to the right. The Score object is a container of constants that arbitrarily assigns absolute values to term types. The sort order is total as every term type is assigned an unique value in the Score object. For ground types, the appropriate integer representation is used as the base score tree. For var types, the Debruijn level from the normalization is used.
    • Requirements for Sort:

      • Take the unsorted output and return a sorted output (in the same in memory form)
      • Order is total but arbitrary. The order is over all processes,  operators in the term language.  
      • Need to differentiate between bound and free variables.

      Sorting and then normalizing requires that the normalization preserves the sorting order (just like a stable sort). Normalizing first and then sorting is provably correct.

  • Patterns for receives each start their free variables at 0. After the patterns and channels are all normalized, they are sorted.
    • For execution, the values that are matched are assigned left to right, 0..k, 0..l, 0..m.

Interpreter

  • Starts with the empty environment.
  • Parallel terms are evaluated on a threadpool in the same environment. (Neither side of a par term can modify the binding environment of the other)
  • When a read or write occurs, to get the actual name, we need to substitute from the environment to the name, and then sort.
    • For example if x is bound to the name @{for (z ← @Nil) { z!(7) } }, and a send occurs on @{ x | 8 }, it has to be rewritten to: @{for (z ← @Nil) { z!(7) } | 8 } as a key in the tuplespace.

  • When a read occurs, the continuation stored with the pending read will have references to the surrounding environment. We can substitute or store the environment.
  • When sending a process with bound variables, we must perform an explicit substitution.
  • Upon send, all expressions are evaluated: x!(7 + 5) | for (y <- x)   here y receives the process 12, not the process 7 + 5
  • Upon eval, we must re-number the term to be evaled. See the paper: Explicit substitutions with de Bruijn's levels.
  • Terms that can't reduce should print a warning and reduce to Nil. e.g. 7 + (P | Q) Here we do not try and cause (P | Q) to reduce to an integer first.

Storage

(See (deprecated) RSpace 0.1 Specification)

Matcher

  • Basically the spatial type checking algorithm.
  • True is a free variable.
    • We don't currently allow references to variables not bound in the pattern:

      for (x <- y) {
      	match x with {
      		{ *y } => P
      	}
      }
      
      for (x <- y) {
      	match x with {
      		{ for (ret, @x, @y <- a) { ret!(x + y) } } => P
      	}
      }
    • In the first pattern above The y in the pattern doesn't refer to the same y as the bind. It is a shadowing bind of y. So in P, y refers to the name being dropped in x.
    • In the second pattern above, ret, x and y are bound by the for in the pattern. This pattern is looking for an adder fragment on a channel a.
      • a is free in the pattern, and may be used in P.
    • Because free variables are a binder, we have previously checked the pattern to verify that a free variable isn't used more than once
  • The matcher will be used by both the interpreter (To evaluate matches (wink)) and the tuplespace (To check patterns)

Assumptions and Dependencies

Version of the JVM that we will support is:

  • openjdk version "1.8.0_151"
    OpenJDK Runtime Environment (build 1.8.0_151-8u151-b12-0ubuntu0.17.10.2-b12)
    OpenJDK 64-Bit Server VM (build 25.151-b12, mixed mode)

Version of LMDB for Storage Layer:

Reference: (deprecated) RSpace 0.1 Specification

Architectural Strategies

Describe any design decisions and/or strategies that affect the overall organization of the system and its higher-level structures. These strategies should provide insight into the key abstractions and mechanisms used in the system architecture. Describe the reasoning employed for each decision and/or strategy (possibly referring to previously stated design goals and principles) and how any design goals or priorities were balanced or traded-off. Such decisions might concern (but are not limited to) things like the following:

  • Use of a particular type of product (programming language, database, library, etc. ...)
  • Reuse of existing software components to implement various parts/features of the system
  • Future plans for extending or enhancing the software
  • User interface paradigms (or system input and output models)
  • Hardware and/or software interface paradigms
  • Error detection and recovery
  • Memory management policies
  • External databases and/or data storage management and persistence
  • Distributed data or control over a network
  • Generalized approaches to control
  • Concurrency and synchronization
  • Communication mechanisms
  • Management of other resources


System Architecture


Helloworld.rho →Parser → Normalizer  → Creates AST Proto (Pars) → Sorter → Sorted Proto (Throws away sort trees, as they're only for normalizing) 


Interpreter uses storage and normalized protos to run.


Steps that need a home:

Spatial Type check (Matcher)

Execution by JVM ← This is the interpreter

Thread queue ← This is a small part of the interpreter

Storage ← Kyle Butt This is used by the interpreter.

Henry Till - Please fill in your understanding of how storage plays a part in the system architecture.  I love pictures.  (smile)