RNode performance discovery

WIP

Setup

A new implementation of span tracking: https://github.com/rchain/rchain/pull/2569

To enable it, set:

rnode.server.metrics.zipkin = true
kamon.trace.sampler = "always"

To avoid stack overflow errors until RCHAIN-3618 is fixed, also increase the JVM thread stack size:

-J-Xss5M


Contract execution analysis

Rev transfer on empty tuplespace

Run 1

https://github.com/rchain/rchain/blob/dev/rholang/examples/vault_demo/3.transfer_funds.rho

The whole proposal flow is captured in two stages.

Significant observations:

  • for simple contracts, the 'par' mechanisms used in Rholang make no difference: very little runs in parallel because each step of the execution depends on the one before it

  • there are some rogue invocations but it looks like they do not contribute to execution time

  • in most cases an interaction with RSpace takes on the order of milliseconds
  • in some cases an RSpace interaction takes about 20x longer
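
The observation about 'par' can be illustrated with a small critical-path calculation. This is a minimal sketch, not taken from the RNode code base: the task names, durations and dependency edges are hypothetical and are not measurements from these runs. It only shows why a chain of dependent steps gains nothing from parallel execution, while independent steps could.

```python
from functools import lru_cache

# Hypothetical task graphs: name -> (duration in ms, prerequisite task names).
# Names and durations are illustrative only, not measured values.
CHAIN = {
    "lookup_vault": (1, []),
    "check_balance": (1, ["lookup_vault"]),
    "withdraw": (1, ["check_balance"]),
    "deposit": (1, ["withdraw"]),
}

INDEPENDENT = {
    "a": (1, []),
    "b": (1, []),
    "c": (1, []),
    "d": (1, []),
}


def total_work(tasks):
    """Time needed if every task runs one after another on a single worker."""
    return sum(duration for duration, _ in tasks.values())


def critical_path(tasks):
    """Shortest possible wall-clock time with unlimited parallel workers."""

    @lru_cache(maxsize=None)
    def finish(name):
        duration, deps = tasks[name]
        # A task can only start once its slowest prerequisite has finished.
        return duration + max((finish(dep) for dep in deps), default=0)

    return max(finish(name) for name in tasks)


def max_speedup(tasks):
    """Upper bound on the speedup that parallel execution can deliver."""
    return total_work(tasks) / critical_path(tasks)


if __name__ == "__main__":
    print("chain:", max_speedup(CHAIN))
    print("independent:", max_speedup(INDEPENDENT))
```

For the dependent chain the critical path equals the total work, so the speedup bound is 1 regardless of how many terms are composed with 'par'.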

Run 2

A more detailed view

Reset and restore installs

Produce + create checkpoint

Rev transfer after 500 blocks with simple transfers

https://github.com/rchain/rchain/blob/dev/rholang/examples/vault_demo/3.transfer_funds.rho

500 transfers were done (of 1 REV each instead of 100).

It seems that the length of the chain slowed the transfer execution by a factor of about 3.

  • checkpointing still takes about 150 ms

Rev transfer after 500 blocks with other contracts


Wide contract

https://github.com/rchain/perf-harness/blob/master/contracts/wide_a_setup.rho

https://github.com/rchain/perf-harness/blob/master/contracts/wide_b_run.rho

Run 1 (100 comm events)

A dedicated implementation of span tracking was used in these measurements: https://github.com/dzajkowski/rchain/tree/metrics/replace-task-local-spans-with-explicit-traceid-2

High level execution

High level execution of wide run

Create block

Reset

Soft checkpoint

<1ms

Deserialisation of 100 terms

Task ordering

Run 2 (1000 comm events)

A dedicated implementation of span tracking was used in these measurements: https://github.com/dzajkowski/rchain/tree/metrics/replace-task-local-spans-with-explicit-traceid-2

High level execution

Create block

Reset

Soft checkpoint

<1ms

Deserialisation of 1000 terms

Task ordering

Thoughts after run 1 & run 2

  • reset & restore installs look stable
  • soft checkpoint is a major improvement over checkpoint
  • the RSpace actions look fairly stable (produces and consumes on the same channel sometimes collide, but usually one action takes about 1 ms)
  • the number of tasks started up front is worrying
  • with so many tasks scheduled, one would expect them to reach RSpace sooner, but instead they all seem to bunch up after scheduling (perhaps the Rholang reducer creates a lot of Monix async boundaries, causing a lot of context switching that could be avoided)
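
One way to test the bunching hypothesis is to separate the time a task spends waiting after it was scheduled from the time it spends running. The sketch below assumes each task records a scheduling, a start and a finish timestamp; these fields are hypothetical and may not be available in the traces collected here. The example data is synthetic: it models tasks that are all scheduled up front and then executed back to back on one worker.

```python
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class TaskSpan:
    """Hypothetical per-task timing record; all timestamps in milliseconds."""
    name: str
    scheduled_ms: float
    started_ms: float
    finished_ms: float

    @property
    def wait_ms(self):
        # Time between scheduling the task and the task actually starting.
        return self.started_ms - self.scheduled_ms

    @property
    def run_ms(self):
        # Time spent executing the task itself.
        return self.finished_ms - self.started_ms


def summarize(spans):
    """Compare queueing time with execution time across all spans."""
    waits = [span.wait_ms for span in spans]
    runs = [span.run_ms for span in spans]
    total = sum(waits) + sum(runs)
    return {
        "median_wait_ms": median(waits),
        "median_run_ms": median(runs),
        # Share of the accumulated task time that was spent waiting.
        "wait_share": sum(waits) / total if total else 0.0,
    }


def example_spans(n=100, run_ms=1.0):
    """Synthetic data: n tasks scheduled at t=0, run one after another."""
    return [
        TaskSpan(f"produce-{i}", 0.0, i * run_ms, (i + 1) * run_ms)
        for i in range(n)
    ]


if __name__ == "__main__":
    print(summarize(example_spans()))
```

If real traces showed a wait share close to 1 while the median run time stayed around 1 ms, that would support the idea that the cost lies in scheduling rather than in RSpace itself.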