1. Introduction

1.1 Purpose of the system

The latest messages data structure represents a mapping between validator public keys and the hash of the latest block the respective validator has seen. It is used in various places by Casper protocol such as equivocation detection and score map building, but, most importantly, it is used by safety oracle to count agreement graph's edges as it is computation heavy task.

In order to count agreement graph's edges, safety oracle, provided by a list of validator candidates, checks for every pair of validators if they see an agreement with each other. This process involves looking up the latest message for the first validator and checking its justifications to find one sent by the second validator and compatible with the estimate (block to detect safety on). The same process is then done for the second validator with respect to the first validator. As you can see, the process of counting agreement graph's edges involves looking up latest messages for n^2 number of validator pairs, where n is the number of validators which can reach a few thousands.

The purpose of this document is to provide a specification of latest messages data structure storage on disk that will be efficient enough to allow safety oracle computation to do constant-time lookups in the storage.

1.2 Design goals

The design goals of the latest message storage include:

Do constant-time lookups which happen n^2 times (n is the number of validators) every time a block is received by a node
Do updates which happen once every time a block is received by a node
Do inserts which happen rarely as validators pool will be fairly stable in the long run
Use as less I/O operators as we can to support the previous three types of operations
An ability to restore the latest messages on node startup in case of file corruption after a system crash

1.3 Definitions, acronyms, and abbreviations

1.4 References

2. Current software architecture

As for now the latest messages data structure is stored in-memory only and represented as Map[Validator, BlockMessage].

3. Proposed software architecture

3.1 Overview

The API for storing latest messages is represented by LatestMessagesStorage.

LatestMessagesStorage

trait LatestMessagesStorage[F[_]] {
  def access[A](fa: F[Map[Validator, BlockHash] => A]): F[A]

  def insert(validator: Validator, blockHash: BlockHash): F[Unit]
}

The proposed implementation LatestMessagesFileStorage of this API is designed as a combination of in-memory representation and synchronized on-disk representation implemented as an arbitrarily ordered sequence of publickey-blockhash pairs with in-memory index.

LatestMessagesFileStorage

final class LatestMessagesFileStorage[F[_]: Monad: Concurrent: Sync: Log] private (
    lock: MVar[F, Unit],
    latestMessagesRef: Ref[F, Map[Validator, BlockHash]],
    indexRef: Ref[F, Map[Validator, Long]],
    dataFileResource: Resource[F, RandomAccessFile],
    crcPath: Path
) extends LatestMessagesStorage[F] {
  def access[A](fa: F[Map[Validator, BlockHash] => A]): F[A] =
    for {
      _              <- lock.take
      latestMessages <- latestMessagesRef.get
      result         <- fa.map(_(latestMessages))
      _              <- lock.put(())
    } yield result

  def insert(validator: Validator, blockHash: BlockHash): F[Unit] =
    for {
      _      <- lock.take
      _      <- updateFile(validator, blockHash)
      result <- latestMessagesRef.update(_ + (validator -> blockHash))
      _      <- lock.put(())
    } yield result

  private def updateFile(validator: Validator, blockHash: BlockHash): F[Unit]
}

object LatestMessagesFileStorage {
  final case class Config(
      dataPath: Path,
      crcPath: Path
  )

  def apply[F[_]: Monad: Concurrent: Sync: Log](config: Config): F[LatestMessagesFileStorage[F]]
}

Every instance of LatestMessagesFileStorage can only be created by using LatestMessagesFileStorage.apply provided by the LatestMessagesFileStorage.Config configuration containing the path to the on-disk storage file. On apply invocation, this file is read into memory as a List[(Long, Validator, BlockHash)]. The first element of the tuple stands for the offset in the file, the second and third elements stand for the elements of the publickey-blockhash pair. After that, two maps are built out of the list: Map[Validator, BlockHash] and Map[Validator, Long]. The first map has the same semantics as the legacy in-memory representation of latest messages. The second map provided by validator public key gives back the offset of the corresponding pair in the file. We will call the first map as in-memory representation and the second map as in-memory index.

The in-memory representation and in-memory index are wrapped into cats.effect.concurrent.Ref to represent mutable atomic references and passed as the second and third arguments to the LatestMessagesFileStorage constructor respectively. The first argument is a cats.effect.concurrent.MVar used as a mutex for synchronizing the storage and the fourth argument is a random access file wrapped into cats.effect.Resource to capture the effectful allocation of the file.

Required operations are handled in the following fashion:

Lookups can be done by acquiring lock and running the user provided function of type Map[Validator, BlockHash] on the in-memory representation. The function can do lookups in the provided map in effectively constant time. lock is released after the function is executed.
Updates can be done by acquiring lock and looking up the offset in the in-memory index in effectively constant time, seeking to the corresponding part of the file and rewriting the blockhash value. In-memory representation is updated with the new blockhash value. lock is released afterwards.
Inserts can be done by acquiring lock and appending the pair to the end of the file as the order of the pairs is completely arbitrary. In-memory representation is updated with the new validator-blockhash pair and in-memory index is updated with the new validator-offset pair. lock is released afterwards.

crcPath is used to store the CRC of the on-disk storage file. As CRC is a linear function, it is possible to recompute CRC after changing a part of the file by xoring the old value of CRC with the old part of the file and the new part of the file (i.e. newCrc = oldCrc ⊕ crc(oldPart) ⊕ crc(newPart)). The file can be updated atomically by writing the new CRC value to a temporary file and replacing the old CRC file with a new one since renaming is an atomic operation. CRC file is used to detect corrupted data after a system crash.

3.2 Software/hardware mapping

3.2.1 Memory allocation

Both validator public key size and block has size are 52 bytes each (8 bytes for pointer + 12 bytes for array header + 32 bytes for the content), so a single pair requires 112 bytes of memory (52 * 2 + 8 as pair itself also requires a pointer). Hence, the in-memory representation requires about 33 KiB of RAM for test net bond file with 300 validators and only 1 MiB of RAM in case the amount of validators bloats up to 10000. The in-memory index will at most double the memory consumption. Hence, we do not expect that holding in-memory representation will make a significant effect on memory consumption.

3.2.2 Disk Memory

In case of on-disk representation, the actual size of a pair requires only 64 bytes as headers and pointers are not required for storing data on the disk. Hence, only 19 KiB is required for storing a file with 300 publickey-blockhash pairs and 625 KiB is required for the severe case of 10000 publickey-blockhash pairs.

Note: I think that using memory-mapped is unnecessary as we can easily fit the whole data structure into the memory. Lookups do not involve any I/O, inserts require a simple append of the pair to the disk and by using the in-memory index we can rewrite the necessary place in the file using a single seek operation and a single write operation (i.e. without actually scanning the whole file or using binary search on it).

3.3 Persistent data management

As both publickey and blockhash are invariable in size, the proposed method of storing latest messages data structure on disk is to consequently write out the publickey-blockhash pairs byte-by-byte (i.e. 32 bytes of publickey followed by 32 bytes of blockhash).

This data will be stored in the file specified by LatestMessagesFileStorage.Config.dataPath and the CRC value will be stored in the file specified by LatestMessagesFileStorage.Config.crcPath. The configuration for latest messages storage will be added to the existing node configuration coop.rchain.node.configuration.Configuration:

Node Configuration

final class Configuration(
    val command: Command,
    val server: Server,
    val grpcServer: GrpcServer,
    val tls: Tls,
    val casper: CasperConf,
    val blockstorage: LMDBBlockStore.Config,
    val latestMessagesStorage: LatestMessagesFileStorage.Config,
    private val options: commandline.Options
)

The default value for latestMessagesStorage is defined by using dataDir (which defaults to $HOME/.rnode) as follows:

LatestMessagesFileStorage.Config default

val latestMessagesStorage = LatestMessagesFileStorage.Config(
  dataDir.resolve("casper-latest-messages-file-storage"),
  dataDir.resolve("casper-latest-messages-crc")
)

The default values can be redefined by adding the following command-line options to coop.rchain.node.configuration.commandline.Options:

New options

val casperLatestMessagesData =
  opt[Path](required = false, descr = "Path to latest messages data file") // --casper-latest-messages-data
val casperLatestMessagesCrc =
  opt[Path](required = false, descr = "Path to latest messages crc file") // --casper-latest-messages-crc

LatestMessagesFileStorage will be initialized in NodeRuntime.main using the provided configuration. As SafetyOracle depends on latest messages data structure, it will have to be initialized after LatestMessagesFileStorage.

3.5 Operational aspects

3.5.1 Metrics

LatestMessagesFileStorage will export the following metrics by reusing the existing class coop.rchain.metrics.Metrics:

Accessing the in-memory representation map by invoking access method will be reported with name "latest-messages-file-storage-access".
Appending a new key-value pair to the data file by invoking insert method will be reported with name "latest-messages-file-storage-insert".
Overwriting the value of an existing key-value pair in the data file by invoking insert method will be reported with name "latest-messages-file-storage-update".

By using this metrics data one could determine how often lookups/updates/inserts are performed by node.

3.5.2 Logs

LatestMessagesFileStorage will not log anything.

3.6 Global software control

Both lookups and inserts/updates require acquiring the lock and are hence mutually exclusive. On the other hand, doing a bulk of lookups still requires taking only a single lock by using the access method.

3.7 Boundary conditions

On node startup on-disk storage file is read into memory. In-memory representation and in-memory index are built from the read data.
On node shutdown no additional actions are required as the system is always fully synchronized
In case of node error, on startup the storage file is checked for corruptions by comparing the CRC to the actual check value of the storage file. In case of a corruption, the file is rebuilt from scratch using the information from persisted DAG.

RChain wiki

Latest Message Storage Specification (Draft)

1. Introduction

1.1 Purpose of the system

1.2 Design goals

1.3 Definitions, acronyms, and abbreviations

1.4 References

2. Current software architecture

3. Proposed software architecture

3.1 Overview

3.2 Software/hardware mapping

3.2.1 Memory allocation

3.2.2 Disk Memory

3.3 Persistent data management

3.5 Operational aspects

3.5.1 Metrics

3.5.2 Logs

3.6 Global software control

3.7 Boundary conditions

Latest Message Storage Specification (Draft)

[data-colorid=h8lwdrijs9]{color:#333333} html[data-color-mode=dark] [data-colorid=h8lwdrijs9]{color:#cccccc}[data-colorid=yr160dhc8c]{color:#333333} html[data-color-mode=dark] [data-colorid=yr160dhc8c]{color:#cccccc}1. Introduction

1.1 Purpose of the system

1.2 Design goals

1.3 Definitions, acronyms, and abbreviations

1.4 References

2. Current software architecture

3. Proposed software architecture

3.1 Overview

3.2 Software/hardware mapping

3.2.1 Memory allocation

3.2.2 Disk Memory

3.3 Persistent data management

3.5 Operational aspects

3.5.1 Metrics

3.5.2 Logs

3.6 Global software control

3.7 Boundary conditions

1. Introduction