Community Update 170

General

  • Release

    • Most main net nodes are running the Last Finalized State version 0.10.2 https://github.com/rchain/rchain/releases/tag/v0.10.2 , which contains the memory leak fix https://github.com/rchain/rchain/pull/3353 in addition to the PRs that refactor rspace to use the KeyValueStore ( https://github.com/rchain/rchain/pull/3295 and https://github.com/rchain/rchain/pull/3328 ) and some dependencies. This fix substantially improves node startup speed and reduces memory usage. We're running version 0.11.0.alpha-2 on one of the main net observer nodes.

    • Main net has 30 nodes, with a few large-configuration machines running 5 nodes each.

    • Test net has been running version 0.11.0.alpha-2 for a few days https://github.com/rchain/rchain/releases/tag/v0.11.0-alpha.2 ; this version fixes the memory issue in 0.11.0.alpha-1. This should allow us to continue testing the block merge version that will go on to the main net.

    • In addition, we are setting up testnet 2 with the same 0.11.0.alpha-2 version for now, with block merge enabled, to allow further testing and gathering of performance numbers. Going forward, the version on testnet 2 will have all block merge code changes, including the ones requiring a hard fork. Testnet 1 will get only the changes that can make it to the main net prior to the hard fork.

    • The team is reviewing the final PRs for block merge version 0.11.0. Once they are merged, we will have that release. After testing, this release is the block merge version targeted for the main net.

  • We continue to work on improvements to the block merge version. Details are at Core team · GitHub

  • Will is working on finishing up the work needed for the first hard fork (carry forward balances without prior state), as well as some of the block merge related issues such as rejected deploys.

  • Note on the main difference between testing on observer nodes vs. validator nodes: when we test on an observer node, a large part of the code base is executed, but the block creation code runs only on validator nodes. As a part of that, the portion of the Runtime Manager that plays the deploy runs on validator nodes but not on observer nodes.

Sprint 83 in progress

    • The main focus is to continue block merge improvements, resolve any identified bugs, prepare for hard fork 1 to eliminate the slashed validator, harden the main net, and improve performance. The current PR list is at https://github.com/rchain/rchain/pulls .

    • Tomislav is making good progress on removing the channels map, allowing us to create the 'hard-forked block merge' version on Testnet 2 that will allow us to measure performance in that configuration, which will be the eventual long-term version of block merge.

    • Gurinder will present preliminary performance data next week for both versions of the block merge (with and without the channels map) and then a few more times after that, as the team makes progress on completing block merge optimizations for the hard fork version. One issue we're running into is a limit on the number of CPUs allocated by the cloud provider. We are trying to resolve this so that we can test scaling across a larger number of nodes. Roughly speaking, you need as many CPUs (or threads) as the square of the number of nodes you want to test. So if we want to test 40 nodes, we need 1600+ CPUs (or threads).

    • We are working towards making the testnet 2 available for the community in the next couple of weeks.

    • Nutzipper is continuing to work on the remaining block merge changes, including enhancements. Tomislav and Will will help him with writing tests on all parts of the block merge functionality.

    • One of the community members has done some very nice work on a Rust implementation of parts of the Rholang interpreter, demonstrating the potential for 100x performance improvement. A lot of this is future potential, not immediately leverageable by the current code base. We will start exploring with him next week to see what parts of his work may be applicable and/or what other activities can be targeted for the near-term improvement of RNode performance.  As we work through that, we will update the roadmap.  I will also be adding a 'TPS/Performance Improvement' milestone to the roadmap, as requested by the community.

    • Will is also close to completing the run of the balances report (new state) for the first 'balances only' hard fork. We should be able to make progress on the verification of the balances etc. next week in preparation for the first hard fork.

    • VERY Preliminary performance numbers on the leaderful, 0.99 synchrony constraint (unoptimized) block merge version that will be eventually promoted to the main net:

      • 7x to 10x improvement over what we currently see on the main net (41 sec per block on the main net vs. 4 to 6 sec per block with block merge), measured with 5 nodes on testnet 2 running the 0.11.0.alpha-2 version.

      • Numbers are likely to change as we move to the leaderless 0.67 synchrony constraint, remove the channels map, and vary the number of nodes, the number of blocks per test, machine resources, network distribution, etc.

    • Tomislav is also identifying additional items that can be included in the hard fork changes. 

    • Current Work In Progress

      • Good progress continuing on block merge improvements / enhancements

      • The main net version of block merge works with synchrony constraint 0.99. It was easier to do it this way to identify and analyze bugs and take corrective action. The focus here is simply to make sure the algorithm is correct and works well. For now, we will simply kick the unmerged deploys back to the user. In the future, we will see if there is a better way to handle this.
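
The two back-of-the-envelope figures quoted in the list above (the CPU count needed for a scaling test and the preliminary speedup range) can be reproduced with a short calculation. This is only a rough sketch of the arithmetic stated in this update. The function names `cpus_needed` and `speedup_range` are made up for illustration and are not part of RNode or any RChain tooling.

```python
def cpus_needed(nodes: int) -> int:
    """Rule of thumb from this update: CPUs (or threads) ~ nodes squared."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return nodes ** 2


def speedup_range(baseline_secs: float, fastest_secs: float, slowest_secs: float):
    """Return (low, high) speedup of block merge vs. the baseline block time."""
    if min(baseline_secs, fastest_secs, slowest_secs) <= 0:
        raise ValueError("block times must be positive")
    # The slowest merged block time gives the low end of the range,
    # the fastest merged block time gives the high end.
    return baseline_secs / slowest_secs, baseline_secs / fastest_secs


if __name__ == "__main__":
    # 40 nodes is the example used in this update.
    print(cpus_needed(40))

    # 41 sec per block on main net vs. 4 to 6 sec per block with block merge.
    low, high = speedup_range(41, 4, 6)
    print(f"{low:.2f}x to {high:.2f}x")
```

With the numbers from this update, the first call gives 1600 and the second gives roughly 6.83x to 10.25x, which rounds to the quoted "7x to 10x". These remain very preliminary figures and the quadratic CPU rule is only a rough estimate.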

As we have been emphasizing in the last several community debriefs, we discovered an unanticipated need to remove the channel mappings to get the full performance on block merge. This removal is a data structure change and will need a hard fork. For this reason, the full performance of block merge will not be visible on the main net until after we perform the data hard fork. There are also a few optimizations that need to be completed. Status of all block merge tasks can be seen at <https://github.com/rchain/rchain/projects/4> . We are approximately 2 to 3 weeks away from completing the remaining block merge enhancements. 

The block merge code that does not need the hard fork will continue to be tested on the current testnet and move on to the main net. The current testnet remains available for regular use by dApp developers and others, as normal.

TEST NET 2:  To test the version of block merge with data changes that need a hard fork, we are creating a separate feature branch and a TEST NET 2 as the testbed for this feature branch.

  1. We will use TEST NET 2 to estimate what the eventual block merge performance numbers may look like. To enable this, Will and Gurinder will be working on setting up a transaction server on Testnet 2, so that revdefine.io can report the statistics on this net also.

  2. Once block merge tasks are complete, we will continue to work on other hard fork changes and test them on TEST NET 2 as well.

  3. Additionally, we will start experimenting with decentralized nodes on TEST NET 2 when we are ready for that.

  4. Through this whole duration, there will be no guarantees of data storage format compatibility on Test Net 2, as the data storage format changes are incrementally implemented. However, we encourage the community to start using Test Net 2 in addition to the current test net, so that we can quickly identify and resolve any issues with the upcoming hard forks. All new development should be targeted to TEST NET 2 and all current code MUST BE tested against TEST NET 2 to ensure future compatibility. 

The anticipated changes are mostly storage level format changes visible only to node operators. When we hard fork on the main net, we will be starting from an empty state, with REV balances only. On Test Net 2, this will be repeated multiple times.

Tech-Governance meetings on Thursdays 10 AM Eastern, 7 AM Pacific



Mercury requirements and acceptance criteria

Details on the acceptance criteria: Mercury acceptance criteria

Please see the documentation at https://github.com/rchain/rchain/blob/dev/docs/features.md

Testnet status

Please see RChain public testnet information to learn more about public testnet as well as a FAQ.

Tech Governance + Community testing

Thursdays at 14:00 UTC. Please see RChain community RNode testing for more information.

Blockers to Mainnet

NA


Risks to code completion for Mercury

 

Developer website

https://developer.rchain.coop 

Date

May 5, 2021