Cost accounting specification

Summary

In order to protect from DOS attack we charge for the execution of the rholang program. Interpreting Rholang programs consists of:

  • lexing and parsing
  • normalizing the program
  • reducing the program

There are possible DOS attacks at each of these steps:

  1. Lexer/Parser:
    1. One can write a big program that will cause lexer/parser to stack overflow.
  2. Normalizer:
    1. Structuring the program in such a way that during normalizing the AST we run into stack overflow.
  3. Reducer:
    1. Again, it is possible to structure a program in such a way that the reduction will throw StackOverflow error.
    2. Building a program that costs more to execute than provided phlogiston limit.

Current obstacles

We don't control the lexer/parser parts yet, so it's still possible to provide a program that won't fit into memory. We need to control these parts as well in a manner that doesn't risk with stack overflow errors (executing in the stack-safe monad). We also might investigate whether it's possible to assess if provided phlo limit is sufficient during lexing/parsing.

Questions

- Should we charge for lexer/parser phase?
- Should we charge for normalizer phase?

Business requirements

  1. Stop program execution when deployment runs out of phlos. If it's not possible to do immediately, do it as soon as possible.
  2. Charge for execution of the rholang program.
  3. Charge for storage of the rholang program in the tuplespace.

Technical requirements

  1. Given that rholang interpreter is multi-threaded and the phlo limit is global we need to share and update its state correctly.
  2. There are multiple types of errors that can be throw during the interpretation. Program errors are not "fatal", in a sense that they don't interrupt the deployment's execution - they cause a deployment to error and are reported at the end but they don't interrupt reduction of other terms in the deployment. OutOfPhlogistonError needs to be fatal. It must stop the execution of other terms in the deployment:

    term1 | term2 | term3


    If we run out of phlogistons during reducing term1, term2 and term3 need to be stopped as soon as possible. We don't want to make more work that user is willing to pay for.

Interpreter phases and cost accounting

Lexer/Parser

We don't charge during this phase.

Normalizer

We don't charge during this phase.

Reducer

Reducer is where the magic happens, it's the workhorse of the rholang interpreter. We charge for every _meaningful_ operation.

From the cost accounting perspective, reduction of the rholang program is one of the following:
1. Evaluating a term.
2. Substituting variables in the term.
3. Spatial match algorithm.
4. Methods called on rholang terms.
5. Interaction with RSpace.

Evaluating a term

We charge constant cost for evaluating a term like: send, receive, method call, variable. This is a constant cost for which order is the same as basic methods like nth method call, keys etc.

Substitution

Substitution is the process where we substitute bound variables with the environment variables they bind to. This process can either succeed or fail (failing to substitute is an error but we charge nevertheless). The cost of substitution is:
- for success cost proportional to the byte size of the substitute term
- for failure cost proportional to the byte size of the original term

Spatial match

Spatial match is a powerful feature of rholang which allows to write complex match between rholang terms and patterns. The more complex the (term, pattern) pair is the more expensive it is to compute the match.

For example

3 matches 1


is simple, both 3 and 1 are concrete terms so they can be compared using plain java equivalence,

match 3 {
  2 \/ 1 => …
}


is more expensive because it involves logical connective AND. This means that the algorithm needs to check that all of the branches of the connective satisfy the predicate.

match 3 \/ 4 {
  ~2 /\ ~1 => …
}


is even more complex because we need to check all the permutations of terms and patterns.

Matching can quickly become complex as the more complex patterns and/or terms are being matched against each other. General idea behind cost accounting is that we charge for making an actual comparison between terms:

  • in @x!(10) matches @y!(11) we will charge for comparing channels and data
  • in [1,2,3] matches [=x, =y, =z] we will charge for comparing respective elements of the lists against each other
  • etc.

so the more (term, pattern) comparisons match requires the more expensive it is. The actual cost of comparing term against pattern is proportional to the byte size of the smaller object. This is based on the java equality method.

Methods



nthtoByteArrayhexToBytestoUtf8BytesuniondiffadddeletecontainsgetgetOrElsesetkeyssizelengthsliceappendinterpolate
ListC

--


n/an/an/an/an/an/an/an/an/an/an/an/aCproportional to value of higher indexproportional to the length of underlying stringn/a
GByteArrayn/an/an/an/an/an/an/an/anan/an/an/an/an/an/aproportional to value of higher indexproportional to the logarithm of length of target byte arrayn/a
TupleC--n/an/an/an/an/an/an/an/an/an/an/an/an/an/an/an/a
Setn/a--n/an/aproportional to the size of collection that is an argument method proportional to the size of collection that is an argument methodCCCn/an/an/an/aproportional to the size of base collection on which this method is called. Complexity of the size method is O(n)n/an/asame as unionn/a
Mapn/a--n/an/aproportional to the size of collection that is an argument method proportional to the size of collection that is an argument methodn/aCCCCCCproportional to the size of base collection on which this method is called. Complexity of the size method is O(n)n/an/asame as unionn/a
any termn/aproportional to the byte size of the underlying rholang termn/an/an/an/an/an/an/an/an/an/an/an/an/an/an/an/a
Stringn/a--proportional to the length of underlying stringproportional to the length of underlying stringn/an/an/an/an/an/an/an/an/an/aCproportional to value of higher indexproportional to the sum of the lengths of both stringsproportional to length_of_string * number_of_entries_in_map
  • C - constant
  • n/a - method not available

Interaction with RSpace

Cost accounting for interaction with RSpace has two flavors:

  1. Charge for storing the term
  2. Charge for work done required to find a match between `produce` and `consume`.

Storage

Before calling produce or consume on RSpace we charge for storing the term. We do that to secure the required phlos for executing the action. If our produce / consume is linear (not persistent) and it finds a match we refund the phlos secured before executing an action. The cost of storage is proportional to the byte size of the term we want to store.

Matching

In order to find a match between `produce` and `consume` RSpace runs a spatial match algorithm on supplied bind patterns and data residing on the channel. The spatial match algorithm can be complex (expensive). The nature of the interpreter and RSpace (see Technical details section) requires making all available phlos (at the moment of interacting with RSpace) the upper limit for the spatial match algorithm triggered in the RSpace. The spatial match algorithm tracks and updates the cost state while it runs. It means that as soon as available phlos are used up in the middle of running the match we will short circuit and do not try matching anymore.

Upon leaving the RSpace we charge with the amount of phlos used in the RSpace or, if RSpace returns OutOfPhlos error, we zero the cost accounting state.

Technical details of the cost accounting in the interpreter

TODO