Sharding Meeting notes

Date

27 Feb 2018

Attendees

Goals

Work out details on sharding

Discussion items

Time	Item	Who	Notes
	Statement of the concerns		Regions are how we are doing static annotations in the source code. We are still unclear on how a validator can work on a subset of the entire blockDAG and still be able to do anything worthwhile. We are worried about transitive closure properties when we have races across regions.
			This gets worked out in the static annotations. They assign contracts to regions; the assignment of validators is still unclear. → Mike: understand the dependency relationships for each region. Each region has a path up to the root. Kent has a diagram in his proposal. Mike - does it require each validator to validate all the transactions it has and all the ones above it? Greg - no, only the transactions that impinge upon it. Kent - you want to have some control over your parents because they control you. Reviews Nash's proposal. The key point is the transactions that impinge upon both sides of the lower regions: the static analysis up front has figured out these dependencies for us.
	Nomenclature		A sharding solution is needed in order to be confident about the numbers we are quoting publicly. Does anyone have a problem with the position that we need sharding? The concern comes from ETH, where all miners compete for the global state lock. All the sharding work is to avoid the global state lock. However, we naturally have a lot more room in RChain just by the nature of sending and receiving, and we are not doing big long computations as big chunks of trace that we are committing to. Greg - that is debatable - I can put lots of math inside a send. Mike - straight-line code can't affect anything else. It doesn't need a lock to do that computation. Greg - that is not what we are calling a transaction; a transaction is where you have I/O (a COMM event). Our COMM events are much more finely grained; they are commutative unless they are on the same channel. A state lock is only needed once there is contention. Mike - it's not clear to me that we will have ETH's issue. We can avoid a global state lock by the use of names. We can only shard on processes or names. Processes wiggle, so we can't easily shard on processes; Greg doesn't know how to do it right now. Names are much more stable and they line up with transactions very nicely. This seems to be the path of least resistance. Michael - the payment solution needs to shard in the exact same way. Greg - agrees. Greg - a shard will correspond to a group of names. Arrange the code so we understand the groups of names. Provided an algorithm from Andy Gordon. Michael - there is a declarative aspect to the regions in the Pi Calculus; it is not obvious to him what the groups of names will look like and how we will declare them. Ideally we want to load balance the shards. Greg - think of groups of names as logical entities. They are not necessarily tied to physical resources in a way that is constant. The groups of names might have different resources assigned to them at different times, similar to IP server farms. Michael - I don't follow that.
If we see a COMM event as a fundamental computational event, then names are resources (because COMM events happen through names) and so name groups become resource allocations. It has to be tied to the RChain notion of computation. Michael and Greg agree. Greg - we cannot force people to write good code. Now we need to assign validators to the groups of names such that we have security and consistency. Distinction - there are effectively 2 different type systems. One is about structure: if something is a pair, I should be able to break it into 2 pieces. This is structural typing. The namespace idea is breaking things up into chunks based upon their structure. The other is nominal typing: in a language like Java, with a foo pair and a bar pair, even though the internal structure is the same, the name foo is different from the name bar, and therefore if you try to type check a foo as a bar it won't typecheck unless you have put a relationship in place. The name group proposal is nominal typing - 'collecting a group of names' - decorated within the program. We calculate post facto what names are in the group by analyzing the program, and once we have calculated that, we calculate a predicate that would select out those names. The reason we want to name these name groups is that we want to add metadata about the kinds of resources that can serve them, like validators; they have to provide some kind of proof that they have certain capacities, e.g. storage, bandwidth, uptime. How this affects policy - when we select validators, we select on the metadata that is supplied in addition to the other kinds of criteria, including what is known about their load and stake. By calculating the name groups upon which we will shard the contracts, we give contract owners back control. This impinges upon the policy of selecting validators. We need to consider how the child validator relates to the parent validator.
	Concerns		Mike - I haven't heard how the mechanics of blocks come together in a parent and how they separate. Kent and Nash have been talking.
	Discussion about Concerns, Kent walks through his proposal doc		Kent shows a diagram of A and B with a merge block above. Kyle asks - what happens if A forks and doesn't participate in the merge block? Concept of dominant name groups - A is dominant on B, or the combination C is dominant on both. Is the domination total or partial? Partial - there is another constraint: you cannot have a cycle. If A dominates B and B dominates C, then C cannot dominate A. Weighting for the root block is the combined stake of the children. Mike - if a merge block is proposed, must it take precedence? Michael - the dominance thing is an implementation detail to meet the actual definitions that Vlad has put out. If a merge block is final, it must be final in all the things it touches. If it is finally excluded, i.e. it cannot be in 1 of the chains, it cannot be in any of the chains. Greg - the name groups are just a hierarchy coming out of the groups of names. In Casper you always need to be thinking about the view and whether a block is final or not in your view. In order to validate a merge block you have to take on the transactions of A, and then you can revert back to your own shard. Mike - given that A is listening and B is sending, who proposes? Either. They trust each other on the state of A and B while they are trying to construct the merge block. Mike - what does the structure of the merge block look like? Does A have to query B? Kent - it consists of all the transactions in A and B and the last finalized blocks for each. The post-states for each get included in the merge block independently. Mike - it sounds to him like the A's and B's therefore do not need a separate set of validators to handle inter-A/B blocks. They get together and then split apart. This only works if the set is the union of the 2; otherwise it doesn't work. Greg and Kyle agree. Kyle needs to think more, and he believes it has to include all of A and B.
In this case, all validators are validators in the top name. Michael - we don't want to think of an individual name and an individual entity. Greg - the power set is going to be ridiculously large. There is a homomorphism and we are restricting the view by carving out a chunk from either side. Still hung up on how we decide, on the name side, the sparse chunk of the powerset we are thinking of: what do the leaves of the tree look like? They do need to be defined in advance of writing contracts. The contracts have to have annotations of which name groups they belong to in advance. Greg - name groups are ephemeral - we can change the algorithm to give a principled name grouping for any given contract. You can check if the name groups exist in the chain, and if not, add them? Greg - this is done by the contracts we serve. The problem is the number of validators in the set. There is a lot more fringe than there is interior; a cartel could buy off a section of the fringe. You can declare leaf shards ad hoc and everything is built up from referencing these shards. Suppose I write a contract and say it belongs in Shard 1, and someone else does the same thing in another shard, with the same name on both. How does this work with the asynchronicity - if there are 2 names that are the same in different shards, how does that work? Does the second contract get rejected? Check in with Mike and Kyle: Concerns - what is drawn in Kent's diagram, if you can only deploy contracts into leaf nodes? Looks good. Can we deploy contracts into the other namespaces? What if Rev isn't living in one of the leaf nodes? It ought to run in all of them. Greg - it has to; it has to live in all shards. Mike - I don't understand how Rev can be in all of them. I only understand how contracts can be in the leaves. Next meeting scheduled this week to continue discussion.
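
Illustration (not from the meeting): the two constraints stated during Kent's walkthrough, that dominance between name groups must not form a cycle and that the root is weighted by the combined stake of its children, can be sketched in Python as below. The dictionary representation, the function names `has_cycle` and `combined_stake`, and the stake numbers are assumptions made up for this sketch; they are not taken from Kent's or Nash's proposals.

```python
from typing import Dict, Set


def has_cycle(dominates: Dict[str, Set[str]]) -> bool:
    """Return True if the (hypothetical) dominance relation contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    nodes = set(dominates) | {c for cs in dominates.values() for c in cs}
    colour = {n: WHITE for n in nodes}

    def visit(node: str) -> bool:
        colour[node] = GREY
        for child in dominates.get(node, ()):
            if colour[child] == GREY:  # back edge: child is on the current path
                return True
            if colour[child] == WHITE and visit(child):
                return True
        colour[node] = BLACK
        return False

    return any(colour[n] == WHITE and visit(n) for n in nodes)


def combined_stake(group: str,
                   dominates: Dict[str, Set[str]],
                   stake: Dict[str, int]) -> int:
    """Sum the stake of a group and of every group it dominates, each counted once."""
    seen: Set[str] = set()
    stack = [group]
    total = 0
    while stack:
        g = stack.pop()
        if g in seen:
            continue
        seen.add(g)
        total += stake.get(g, 0)  # groups with no direct stake contribute 0
        stack.extend(dominates.get(g, ()))
    return total


# Assumed example: merge group C dominates leaf groups A and B.
dominates = {"C": {"A", "B"}}
stake = {"A": 30, "B": 20}

assert not has_cycle(dominates)
assert combined_stake("C", dominates, stake) == 50
# A dominates B, B dominates C, C dominates A is rejected as a cycle.
assert has_cycle({"A": {"B"}, "B": {"C"}, "C": {"A"}})
```

The sketch treats dominance as a directed graph, so the "partial, no cycles" rule becomes an ordinary acyclicity check. How stake is actually attributed to merge blocks was left open in the discussion.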