Storage & Tuplespace Meeting notes

Date

23 Jan 2018

Attendees

Medha Parlikar (Unlicensed)
Michael Stay (Unlicensed)
Henry Till
Greg Meredith
Timm Schäuble
Chris Kirkwood-Watts
Former user (Deleted) (invited & debriefed with Mike in advance)
Medha Parlikar (Unlicensed) (Scribe)

Goals

Determine how the storage layer works with the Tuplespace

Discussion items

Time	Item	Who	Notes
5 min		Michael Stay (Unlicensed)	With Kent's recent PR, we need to determine how the Tuplespace interacts with Storage. See: Node Integration Meeting notes. To record the state of a virtual machine that is not running on the blockchain, use "table add" and "table get". For storing things off the blockchain, a trie may be sufficient. For the SDK, only the LMDB hooks for "table add" and "table get" are needed.
	On blockchain versus off blockchain contracts, and how they interact with Storage.	Michael Stay (Unlicensed)	Going to the blockchain: we will want to execute the code itself using the VM, which has the hooks for "table add" and "table get". It is not yet clear to me how the system contracts that are not running on the blockchain will work with the contracts that are running on the blockchain. Does anyone have any ideas on that - Greg? We have system contracts that do things like use the communications layer, the file system and other local resources. My impression of the architecture is that there are 2 different places where Rholang is running. Off the blockchain there are the powerbox contracts, which have the ability to create side effects (writing to disk and sending messages); those system contracts are delegated the authority to create the blockchain. Then there are Rholang contracts that run on the blockchain and on the virtual machine. How do we run both, when we don't have a simulator to run contracts on the blockchain (and we don't want to expose the same authority as the powerbox contracts)? In the former case, we can expose "table add" and "table get", and it can store the continuations and the VM state ~~and it may need to serialize the heap~~. (No need to serialize the heap with the Scala VM: Scala supports serializing the state.) In the blockchain case, we probably want to store continuations ~~(not sure about the heap)~~, but we will need the whole state of the system in a trie until Casper tells us to abandon a particular fork. We will also need a trie for the names of the processes that are listening, based on the AST of the bytes in a hash. We will also need the hashes of the blocks for the storage layer, which is needed for running code on the blockchain, and not for the system contracts. This is another example of a trie.
We need to work out how the blockchain Rholang code is going to run and interact with the storage layer.
	Recommended solution	Greg Meredith	Simplest solution: a class of ports that are blockchain enabled. All I/O on those ports has to capture enough state. You don't do the I/O unless you have consensus. When you have consensus you have to capture the state. There is not necessarily a point at which the system knows it has reached consensus; nodes won't receive a notification from the network that consensus has been reached. If you get acknowledgement on the last finalized block, then you know consensus has been reached. You package up a bundle of these transactions, together with the state related to that I/O, commit to it, and continue moving forward. If you have to roll back, you have to roll back to the I/O that was in your last finalized block. The blocks that are issued in Casper contain 3 kinds of events: (1) deploy of a new contract, (2) addition of a new message, (3) synchronization event (COM event) (this is the same as #1, because you have a system-level contract that has to be listening for a new name). Every event will be an update to a state, and the nodes will contain deltas for the state change. Greg: that has been the plan for a long time. Hold on to the deltas and then use those to roll back.
			Doing this in the VM: the VM state is captured neatly thanks to Scala. In the C++ VM there is an entry point, 'Dump world', which Greg hasn't exercised in a very long time. Once upon a time he could take a snapshot of the state. Taking a snapshot of the state and identifying the delta would be very slow. The format of the dump is also not a good serialization target. With C++ this doesn't seem to be very practical. The C++ plan is a backup, kept as a lower-priority development thread; the focus is on the Scala solution. The 64-bit port is estimated at 6 months and is not part of Mercury. Maybe we could have it for the Earth milestone or another interim release. We still need to sort out how snapshotting the VM state will be implemented; this is not yet spec'd out. Timm Schäuble: Greg, do you think that the VM will be the bottleneck? Greg: it's possible. Back in the day the C++ VM was very fast, and it still has lots of headroom. If we need the headroom on performance, we could squeeze it out of the VM.
	Question on heap, why is heap something we need to concern ourselves with?	Chris Kirkwood-Watts	The heap: it looks like the Scala VM state is something you can serialize, and that comprises most of the state. I don't know if there is an independent heap in the Scala VM. Greg: No. Everything hangs off the context object. Seems straightforward. Mike mentioned this for non-blockchain stuff. Greg: all of these are the necessary building blocks for the state storage solution. Now we have to design a serialization mechanism that is delta based. That is the part that is going to require engineering. We can't walk the state every time. We could snapshot every so often and record the COM events that brought us from one snapshot to another. Michael Stay (Unlicensed): If we have the latest finalized block, that will contain the VM state that we can restore, and then we can run the events to restore the state. Chris: is the data in the block ordered? Yes, it is causally ordered. From the blocks, you can recover the state of the VM up to some equivalence. If there are processes running in memory (Medha to fill this in). Serialization under C++ requires a heap. Is there any insight on how that serialization ought to go? There are a couple of usual strategies that are terrible in performance. Even if it is efficient, you will still come away with something you need to clean up. The Dump world code has that machinery in place. It would need to be modified to support deltas; that is an optimization, but it is going to be necessary. Greg: That code needs to be shaken, but at one point it worked. It's basically GC.
	Next Steps:	Chris Kirkwood-Watts / Medha Parlikar (Unlicensed)	Can we write down what has to go into a block? Michael Birch is the guy to talk to. He is starting to write the Rholang contracts for Casper, so he needs to write down what has to go into a block. Decide which of the following options we are going to implement. Option 1: At finality, snapshot the VM state. In the event of a rollback, roll the state back to the last finalized block, and replay the COM events in the block to reach the updated VM state. Option 2: Implement a delta-based serialization mechanism to store the machine state, and snapshot every so often. Roll back based upon the deltas. Option 3: Compute the state for every block in advance and remember it in case that block wins.
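
To make the difference between Option 1 and Option 2 concrete, here is a minimal sketch in Python. It is not the design agreed in the meeting: `ToyStore`, the `(key, value)` event shape and every method name are hypothetical, and a plain dictionary stands in for the VM/Tuplespace state. Option 1 restores a snapshot taken at finality and replays events; Option 2 undoes recorded deltas in reverse order.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple

# Hypothetical event shape: (key, new_value); a value of None deletes the key.
Event = Tuple[str, Optional[Any]]
_MISSING = object()  # marks "key did not exist before this event"


@dataclass
class ToyStore:
    """Toy stand-in for VM/Tuplespace state; not the real storage layer."""
    state: Dict[str, Any] = field(default_factory=dict)
    snapshot: Dict[str, Any] = field(default_factory=dict)  # state at last finality
    undo_log: List[Tuple[str, Any]] = field(default_factory=list)  # deltas since finality

    def apply(self, event: Event) -> None:
        key, value = event
        # Record the previous value so this change can be undone later.
        self.undo_log.append((key, self.state.get(key, _MISSING)))
        if value is None:
            self.state.pop(key, None)
        else:
            self.state[key] = value

    def finalize(self) -> None:
        # Finality: snapshot the state and drop deltas that are no longer needed.
        self.snapshot = copy.deepcopy(self.state)
        self.undo_log.clear()

    def rollback_by_snapshot(self) -> None:
        # Option 1: restore the snapshot taken at the last finalized block.
        self.state = copy.deepcopy(self.snapshot)
        self.undo_log.clear()

    def rollback_by_deltas(self) -> None:
        # Option 2: undo the recorded deltas in reverse order.
        while self.undo_log:
            key, old = self.undo_log.pop()
            if old is _MISSING:
                self.state.pop(key, None)
            else:
                self.state[key] = old


def replay(store: ToyStore, events: List[Event]) -> None:
    # Re-run a block's events on top of the restored state.
    for event in events:
        store.apply(event)
```

In this sketch, Option 1 pays for a full copy of the state at every finality, while Option 2 pays a small cost on every event, which mirrors the "we can't walk the state every time" concern raised in the discussion.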

Action items

Medha Parlikar (Unlicensed): Clean up the notes and follow up on details of what has to be done.
Medha Parlikar (Unlicensed): Get video from Greg and link in.