Production node versions are unchanged from last week. The dev team is currently testing 0.9.25. It's available on Docker Hub for any community members who want to try it out.
Node version 0.9.24 is the current release on observer nodes. It contains housekeeping changes and meets the exchanges' requirements for reporting state and transaction history (see the RNode-0.9.24 release plan).
RNode 0.9.24 changes impact only the observer/read only node. We've set up an 'exchanges only' read only node for exclusive use by the exchanges.
The dev team is working on a feature branch with improvements to Last Finalized State, block store, and DAG store, and is now also beginning to store Casper state in LMDB. 0.9.25 will be released when we complete this feature branch and are able to merge it into the main dev branch. It has substantial improvements and some bug fixes for both observer/read-only and validator nodes. Full current PR list at https://github.com/rchain/rchain/pulls?q=is%3Apr+label%3Anext-release+
Testnet is running rchain/rnode-staging:v0.9.23-8-g0eb25aa24 - this has a couple of patches beyond 0.9.23.
The sequence of updates is testnet first, then main net observers, and then main net validators if applicable. The current philosophy is to minimize updates/disruptions to validator nodes while enabling improved observer node functionality.
Focus is to make sure that the network can handle the anticipated volume from the exchanges and that exchanges can have responsive monitoring and customer service.
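The testnet image tag mentioned above can be read mechanically. The following is a minimal sketch, assuming the tag follows the usual `git describe` layout (base tag, number of commits past that tag, abbreviated commit hash prefixed with `g`). The names `ImageVersion` and `parse_image_tag` are hypothetical helpers for illustration, not part of RNode.

```python
import re
from typing import NamedTuple, Optional


class ImageVersion(NamedTuple):
    base: str               # release the build is based on, e.g. "0.9.23"
    commits_ahead: int      # commits on top of that release (0 for a plain release)
    commit: Optional[str]   # abbreviated commit hash, if present


# Assumed layout: v<major>.<minor>.<patch>[-<commits>-g<hash>]
_TAG = re.compile(
    r"^v(?P<base>\d+\.\d+\.\d+)(?:-(?P<ahead>\d+)-g(?P<commit>[0-9a-f]+))?$"
)


def parse_image_tag(tag: str) -> ImageVersion:
    """Split a docker image tag into release, patch count and commit."""
    m = _TAG.match(tag)
    if m is None:
        raise ValueError(f"unrecognized tag layout: {tag!r}")
    return ImageVersion(m["base"], int(m["ahead"] or 0), m["commit"])


version = parse_image_tag("v0.9.23-8-g0eb25aa24")
print(version.base, version.commits_ahead, version.commit)
```

Under that assumption, the testnet build is 0.9.23 plus a small number of commits, which matches the "couple of patches" described above.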
Sprint 55 in progress
Main Focus: Work on Last Finalized State, harden the main net, improve performance, and make usability improvements (including configuration) and API improvements (including functionality needed by exchanges).
0.9.25 available in docker hub and currently being tested in the different networks.
Currently testing. Tests will take a while because of:
configuration changes - we need to set up each environment, migrate from the old to the new config, and make sure there are no problems.
replicating the slashing behavior prevalent in the main net.
the large number of PRs - we want to make sure everything behaves as expected.
Slashing investigation: One validator node was slashed due to a tuple space mismatch. The first part of debugging revealed that the problem manifests when the Trie is recalculated after an insert/delete of nodes that share a common prefix.
We are fixing this issue, but while the fix would certainly help reduce errors, it's not clear that this is the ONLY source of the problem. At this time this is a non-deterministic and rare error. It took 3 months to manifest, and only in one of the ten main net nodes. We will continue to watch and analyze/debug it. We have to put in place a strategy to handle such errors. This is a future ToDo.
New insight on slashing: In developing a dashboard to display validator rewards and costs, Tomislav noticed that the sequence/timing of application of the various charges and rewards may be incorrect. Currently investigating further. This may provide a more direct fix for the current slashing situation.
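To illustrate why trie recalculation around shared prefixes is a sensitive spot, here is a minimal sketch of a generic compressed prefix trie. It is not RChain's actual trie implementation; `Node`, `insert`, `delete`, `root_hash` and `build` are hypothetical names chosen for this example. The property it shows is that the trie shape, and therefore the root hash, should depend only on the final set of keys, not on the order of inserts and deletes. If the split or merge step were wrong, two nodes holding the same data could compute different hashes.

```python
import hashlib
from typing import Dict, Optional, Tuple


class Node:
    def __init__(self, value: Optional[str] = None):
        self.value = value
        # first character of the edge label -> (edge label, child node)
        self.children: Dict[str, Tuple[str, "Node"]] = {}


def _common_prefix_len(a: str, b: str) -> int:
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def insert(node: Node, key: str, value: str) -> None:
    if not key:
        node.value = value
        return
    entry = node.children.get(key[0])
    if entry is None:
        node.children[key[0]] = (key, Node(value))
        return
    label, child = entry
    n = _common_prefix_len(label, key)
    if n < len(label):
        # Keys share only part of this edge: split it at the common prefix.
        mid = Node()
        mid.children[label[n]] = (label[n:], child)
        node.children[key[0]] = (label[:n], mid)
        child = mid
    insert(child, key[n:], value)


def delete(node: Node, key: str) -> None:
    if not key:
        node.value = None
        return
    entry = node.children.get(key[0])
    if entry is None or not key.startswith(entry[0]):
        return  # key is not present
    label, child = entry
    delete(child, key[len(label):])
    if child.value is None and not child.children:
        del node.children[key[0]]  # drop an empty leaf
    elif child.value is None and len(child.children) == 1:
        # Undo an earlier split: merge the lone grandchild back into one edge.
        sub_label, grandchild = next(iter(child.children.values()))
        node.children[key[0]] = (label + sub_label, grandchild)


def root_hash(node: Node) -> bytes:
    h = hashlib.sha256()
    h.update(repr(node.value).encode())
    for first in sorted(node.children):
        label, child = node.children[first]
        h.update(label.encode())
        h.update(root_hash(child))
    return h.digest()


def build(ops) -> Node:
    root = Node()
    for op, key in ops:
        if op == "put":
            insert(root, key, "v:" + key)
        else:
            delete(root, key)
    return root


direct = build([("put", "abc"), ("put", "abd")])
detour = build([("put", "abd"), ("put", "abx"), ("put", "abc"), ("del", "abx")])
print(root_hash(direct) == root_hash(detour))
```

Comparing root hashes across different operation histories that end in the same key set is one way to build a regression test for this class of bug.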
A couple of the PRs targeted for 0.9.26 are ready for review - we're seeing much improved memory utilization
The 0.9.26 release, possibly in 3 weeks, should complete the last finalized state changes.
We completed the DAG storage changes and the migration logic from the current file storage - this is currently being tested but will not be released in 0.9.25. The key-value store that Tomislav demoed earlier is now being used for DAG storage, and we plan to also use it for caching the transactions and state changes. Full list at https://github.com/rchain/rchain/pulls
Addressing discovered bugs: Investigating the 'tuple space error' that we occasionally see on the main net.
Ongoing - Improvements to last finalized state have been issued, but quite a bit of work is still involved. There has been significant progress, some of which will be released in 0.9.25. The PR and the branch are structured so that multiple people can collaborate and work on different parts of the feature at the same time. The scope of this work (a) enables faster catchup by new nodes - you can start from the last finalized state - which is a differentiator for RChain, (b) enables offloading older data, with differentiated storage and retrieval strategies for it, and (c) allows for a leaner / less bloated node. Tomislav is continuing to work on and test this. Nutzipper and Will are helping to accelerate delivery. We are having to pick between refactoring and work-arounds in various parts, since this change touches most parts of the code base. We are trying to get a more modular and future-beneficial approach.
Ongoing SRE - Adding diversity of cloud providers. Sandbox with new configuration created on IBM. Testnet migration from Google to IBM in progress. Completed Oracle cloud onboarding, will start using their resources after we exhaust IBM resources. Main net validator nodes are still on Google cloud.
Validator expansion and Validator support:
Exploring methods of enabling easy scaling and decentralization of nodes. A couple of options are (a) one-click install buttons for various clouds and (b) running your own 'shard in a box' or 'node in a box' using low-cost devices like the Raspberry Pi based Antsle etc.
Looking at improving the communication channels for two way feedback and training of validators.
Ongoing - CI/CD:
We are slowly moving the development team from Jira to GitHub and have started working on release notes in GitHub. For a while, we will maintain both Jira and GitHub. The recent issue of tests failing without errors on GitHub is mostly resolved.
0.9.25 release notes will be in GitHub - hopefully the community finds the format improved.
In the last meeting, we cleaned up the item numbering. Items will move from issues+discussion to an RCHIP number once approved.
We will start reporting on the Zulip backend and other dApp efforts once we have more to report. Zulip meeting on Thursdays, 3 PM Eastern, 12 Noon Pacific.
Arthur Greef's auction dApp demo
Theo to demo functional cryptography based anonymous voting component when ready.
Current Backlog (partial)
Improve merging in system deploys
Improve Casper by enabling more tests and resolving identified code issues
Improve BlockMerge including refactoring RunTimeManager
Improve multi-parent Casper enablement
Implement sharding capabilities
Improve logging so we can learn which API calls are being used and relate them to resource use, performance, etc.
Rholang 1.1 to improve syntax and user experience / learning curve