Community Update 172
General
Release
Most main net nodes are running the Last Finalized State version 0.10.2 https://github.com/rchain/rchain/releases/tag/v0.10.2 , which contains the memory leak fix https://github.com/rchain/rchain/pull/3353 , the PRs that refactor rspace to use the KeyValueStore https://github.com/rchain/rchain/pull/3295 and https://github.com/rchain/rchain/pull/3328 , and some dependencies. This fix improves node startup speed and substantially reduces memory usage. We are running version 0.11.0.alpha-2 on one of the main net observer nodes.
Main net has 30 nodes, with a few large configuration machines running 5 nodes each.
Test net has been running version 0.11.0.alpha-2 for a few days https://github.com/rchain/rchain/releases/tag/v0.11.0-alpha.2 , which fixes the memory issue in 0.11.0.alpha-1. This should allow us to continue testing the block merge version that will go on to the main net.
In addition, we are running testnet 2 with the hard fork version of block merge. Going forward, Testnet 1 will get only the changes that can make it to the main net prior to hard fork.
We continue to work on improvements to the block merge version. Details are at Core team · GitHub
Note: Main difference between testing on Observer nodes vs. Validator nodes: When we test on the observer node, a large part of the code base is executed. The Block creation code runs only on the validator nodes. As a part of that, the portion of the Runtime Manager that plays the deploy is run on the validator node but not on the observer node.
Sprint 84 in progress
Main Focus is to continue Block merge work and improvements, resolve any identified bugs, prepare for hard fork 1 to eliminate the slashed validator, harden the main net, improve performance. Current PR list is at https://github.com/rchain/rchain/pulls .
Significant ongoing work in the following streams simultaneously
(1) Leaderful block merge targeted for the current main net
(2) Leaderless block merge changes that will get on the main net, even though they may not become immediately operational
(3) Block merge tasks that can be on the main net only after Hard fork 2
(4) Hard fork 1 for balances
(5) Planning for Hard fork 2 - identifying all items/tasks necessary, analyzing impact and interactions, specifying and reviewing the designs etc.
(6) Implementing Hard fork 2 changes for items that are clear how to do
(7) Performance improvements
Currently block merge is being tested on both Testnet 1 and Testnet 2. There's a 'hard-forked block merge' version on Testnet 2 that is being run as needed to apply patches, run tests, measure performance etc. This will be the eventual long-term version of block merge. The team is currently creating the necessary feature branches and rebasing code on them. Some of the performance tests need a high number of CPUs and nodes, run for a brief period of time and then stopped (because this is expensive). Once we are done with tests and have a stable version, we should be able to make Testnet 2 accessible to the community.
Nutzipper is continuing to work on the remaining block merge changes, including enhancements. Tomislav and Will will help him with writing tests on all parts of the block merge functionality.
Will has the PRs for finishing up the work needed for the first hard fork (carry forward balances without prior state). He has discovered at least one issue of incorrect balances with three addresses and is investigating the root cause.
Performance Improvement efforts: One of the community members has done some very nice work on a Rust implementation of parts of the Rholang interpreter, demonstrating the potential for 100x performance improvement. A lot of this is future potential, not immediately leverageable by the current code base. We will start exploring with him next week to see what parts of his work may be applicable and/or what other activities can be targeted for the near-term improvement of RNode performance, even as we plan for a future Rust client if that makes sense. One of the candidate near term activities for performance is to replace protobuf with CapnProto. We anticipate that this would increase performance, as well as ensure (currently being verified) data structure compatibility between Scala and Rust clients. As we work through all of this, we will update the roadmap. I will also be adding a 'TPS/Performance Improvement' milestone to the roadmap, as requested by the community. Tomislav has created an issue with critical paths for performance improvements https://github.com/rchain/rchain/issues/3413 so that we can report metrics in a coherent fashion and identify which paths offer best return on effort.
As we work through block merge and the PoS changes, the team is discovering more and more items/changes that need to be included in the second hard fork. We are being mindful to not expand the scope too much because that would increase risk of introducing too many bugs and interactions. One approach being considered is to create a safe process for soft forks, such that all changes need not go in as part of the hard fork.
VERY Preliminary performance numbers on the leaderful, 0.99 synchrony constraint (unoptimized) block merge version that will be eventually promoted to the main net:
7x to 10x improvement over the current main net (41 seconds per block on the main net vs. 4 to 6 seconds per block with block merge), with 5 nodes on Testnet 2 running version 0.11.0.alpha-2.
Numbers are likely to change as we move to the leaderless 0.67 synchrony constraint, remove the channels map, and vary the number of nodes, the number of blocks for tests, machine resources, network distribution etc.
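As a quick sanity check on the quoted range, the speedup can be recomputed from the block times given above. This is only an illustrative sketch; the helper name `speedup_range` is hypothetical and not part of any RChain tooling.

```python
def speedup_range(baseline_secs, improved_min_secs, improved_max_secs):
    """Return (low, high) speedup factors for a baseline block time
    compared against a range of improved block times."""
    # The slowest improved time gives the smallest speedup,
    # the fastest improved time gives the largest.
    low = baseline_secs / improved_max_secs
    high = baseline_secs / improved_min_secs
    return low, high


# Block times quoted in this update: 41 s on main net, 4 to 6 s with block merge.
low, high = speedup_range(41.0, 4.0, 6.0)
print(f"speedup between {low:.2f}x and {high:.2f}x")
```

41 / 6 is about 6.8 and 41 / 4 is 10.25, which is consistent with the rounded "7x to 10x" figure.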
Block merge scope
The main net version of block merge works with synchrony constraint 0.99. It was easier to do it this way to identify and analyze bugs and take corrective action. The focus here is simply to make sure the algorithm is correct and works well. For now, we will simply return the unmerged deploys to the user. In the future, we will see if there is a better way to handle this.
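For readers unfamiliar with the term, the sketch below illustrates one reading of a synchrony constraint: a validator may propose a new block only after it has seen new blocks from other validators holding at least the given fraction of the remaining stake. This reading, the function name `can_propose`, and the equal stakes in the example are assumptions made for illustration; they are not taken from the RNode code.

```python
def can_propose(stakes, proposer, seen_new_block_from, threshold):
    """Hypothetical check of a synchrony constraint.

    stakes: dict mapping validator name -> stake
    proposer: the validator that wants to propose
    seen_new_block_from: set of validators that produced a new block
        since the proposer's last block
    threshold: required fraction of the other validators' stake (0.0 to 1.0)
    """
    others_total = sum(s for v, s in stakes.items() if v != proposer)
    if others_total == 0:
        # A lone validator has nobody to wait for.
        return True
    seen = sum(stakes[v] for v in seen_new_block_from
               if v != proposer and v in stakes)
    return seen / others_total >= threshold


# Five validators with equal stake (assumed values).
stakes = {"v1": 100, "v2": 100, "v3": 100, "v4": 100, "v5": 100}

# v1 has seen new blocks from three of the four other validators (75% of their stake).
print(can_propose(stakes, "v1", {"v2", "v3", "v4"}, 0.67))  # True
print(can_propose(stakes, "v1", {"v2", "v3", "v4"}, 0.99))  # False
```

Under this reading, a 0.99 threshold makes a validator wait for essentially every other validator before proposing, while 0.67 lets it proceed once roughly two thirds of the other stake has been heard from.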
As we have been emphasizing in the last several community debriefs, we discovered an unanticipated need to remove the channel mappings to get the full performance on block merge. This removal is a data structure change and will need a hard fork. For this reason, the full performance of block merge will not be visible on the main net until after we perform the data hard fork. There are also a few optimizations that need to be completed. Status of all block merge tasks can be seen at <https://github.com/rchain/rchain/projects/4> . We are approximately 2 to 3 weeks away from completing the remaining block merge enhancements.
The block merge code that does not need the hard fork will continue to be tested on the current testnet and move on to the main net. The current testnet remains available for regular use by dApp developers and others, as normal.
TEST NET 2: To test the version of block merge with data changes that need a hard fork, we are creating a separate feature branch and a TEST NET 2 as the testbed for this feature branch.
We will generate from TEST NET 2 what the eventual block merge performance numbers may look like. To enable this, Will and Gurinder will be working on setting up a transaction server on Testnet 2, so that revdefine.io can report the statistics on this net also.
Once block merge tasks are complete, we will continue to work on other hard fork changes and test them on TEST NET 2 as well.
Additionally, we will start experimenting with decentralized nodes on TEST NET 2 when we are ready for that.
Through this whole duration, there will be no guarantees of data storage format compatibility on Test Net 2, as the data storage format changes are incrementally implemented. However, we encourage the community to start using Test Net 2 in addition to the current test net, so that we can quickly identify and resolve any issues with the upcoming hard forks. All new development should be targeted to TEST NET 2 and all current code MUST BE tested against TEST NET 2 to ensure future compatibility.
The anticipated changes are mostly storage level format changes visible only to node operators. When we hard fork on the main net, we will be starting from an empty state, with REV balances only. On Test Net 2, this will be repeated multiple times.
Tech-Governance meetings on Thursdays at 10 AM Eastern, 7 AM Pacific
Mercury requirements and acceptance criteria
Details on the acceptance criteria: Mercury acceptance criteria
Please see the documentation at https://github.com/rchain/rchain/blob/dev/docs/features.md.
Testnet status
Please see RChain public testnet information to learn more about public testnet as well as a FAQ.
Tech Governance + Community testing
Thursdays at 14:00 UTC. Please see RChain community RNode testing for more information.
Blockers to Mainnet
NA
Risks to code completion for Mercury
Developer website
| Date |
|---|
| May 19, 2021 |