THIS PAGE IS WORK IN PROGRESS, PLEASE DO NOT COMMENT ON IT UNTIL I REMOVE THIS BANNER

Jira epic: RCHAIN-3661

Summary:
The purpose of the ListenForDataAtName API is to provide access to RChain data pointed to by the Rholang names of deployed and proposed contracts that are finalized.

This is a developer-level API that does not impact the state of the blockchain, and thus we should not complicate our approach by trying to build in some payment tooling. Instead, I propose we simplify the solution to just the rnode updates needed to deliver the data.

The purpose of this page is to highlight how developers would like to get data from RChain and use it in their app. There is currently a gRPC API that developers can use called listenForDataAtName that most agree is close to what developers want. With this page, I hope to capture the developer requirements as well as the proposed design to deliver this feature.

Current Solution
The current implementation works on a pull (polling) model. The flow is:

deploy contract
propose
do ListenForDataAtName while there is no data
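
The polling flow above can be sketched roughly as follows. This is only an illustration of the pattern, not the rnode client API: `poll_for_data`, `FakeNode`, and the shape of the returned data are hypothetical stand-ins for a real gRPC client call.

```python
import time
from typing import Callable, List, Optional


def poll_for_data(
    listen: Callable[[str, int], List[object]],
    name: str,
    depth: int = 1,
    interval_s: float = 0.0,
    max_attempts: int = 10,
) -> Optional[List[object]]:
    """Call `listen` repeatedly until it returns data or attempts run out.

    `listen` is a hypothetical stand-in for a ListenForDataAtName call
    that takes a name and a depth and returns a (possibly empty) list.
    """
    for _ in range(max_attempts):
        data = listen(name, depth)
        if data:
            return data
        time.sleep(interval_s)
    return None


class FakeNode:
    """Hypothetical node that only has data after a few calls."""

    def __init__(self, calls_until_data: int) -> None:
        self.calls = 0
        self.calls_until_data = calls_until_data

    def listen_for_data_at_name(self, name: str, depth: int) -> List[object]:
        self.calls += 1
        if self.calls < self.calls_until_data:
            return []  # nothing at the name yet
        return [{"name": name, "value": 0}]
```

The client has to choose a polling interval and a retry budget up front, which is part of what makes this model awkward for dApp code.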

Due to how this call returns data, this solution is not ideal for dApp developers. First, the data returned from this call is difficult to deconstruct. Second, data is returned whenever a node receives new blocks, regardless of whether the name being listened to has actually received a new value.

JIRA tickets

(Chris Boscolo / Kelly Foster: add Jira tickets that are associated with this feature.)

Requirements

For most queries, I just need the last/current "data-at-name". I don't care about the previous values; I want to know what the "data-at-name" is for a given rnode based on its current fork-choice tip. I find it troublesome to always have to supply the "depth" parameter.
Occasionally, I need to get "data-at-name" in the past, for example "at block height 1000, tell me the value at this unforgeable name".
I want to know if the data I get back is in a finalized block.
I would like an API that works using a pub-sub model, where I ask for data at a name and receive updates when the data at that name changes. (Ideally, this API would also let me know if the updates have been finalized.)
Since listening for data does not update the blockchain, I want to be able to do this on an observer node without incurring any blockchain (REV) costs.
If this feature is built into rnode, as a validator, I want the ability to prevent clients from using it and consuming all my node's system resources managing the off-chain communication.
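
To make the first two requirements concrete, here is a minimal sketch of a per-name history that answers both "what is the current value?" and "what was the value at block height N?". The `NameHistory` class is hypothetical, and it assumes a single linear chain with heights recorded in increasing order; it ignores forks entirely.

```python
import bisect
from typing import List, Optional


class NameHistory:
    """Hypothetical record of the values seen at one name, by block height."""

    def __init__(self) -> None:
        self._heights: List[int] = []
        self._values: List[object] = []

    def record(self, height: int, value: object) -> None:
        # Assumption: updates arrive in non-decreasing height order.
        if self._heights and height < self._heights[-1]:
            raise ValueError("heights must be recorded in order")
        self._heights.append(height)
        self._values.append(value)

    def latest(self) -> Optional[object]:
        """Current value, with no depth parameter needed."""
        return self._values[-1] if self._values else None

    def at_height(self, height: int) -> Optional[object]:
        """Most recent value written at or before `height`."""
        idx = bisect.bisect_right(self._heights, height)
        return None if idx == 0 else self._values[idx - 1]
```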

Question: If the name being listened to receives data that is identical to what was previously there, and the caller is using a pub-sub style API, should an update be sent on the pub-sub channel?

Yes
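
A minimal in-process sketch of the pub-sub behavior discussed above, including the answer to this question: identical values are still delivered, and each update carries a finalized flag. `Update` and `NameFeed` are hypothetical names used only for illustration; a real implementation would sit behind a streaming API on the node.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class Update:
    name: str        # the name being listened to
    value: object    # the data received at that name
    block_hash: str  # block in which the data appeared
    finalized: bool  # whether that block is finalized


class NameFeed:
    """Hypothetical pub-sub registry keyed by name."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Callable[[Update], None]]] = defaultdict(list)

    def subscribe(self, name: str, callback: Callable[[Update], None]) -> None:
        self._subs[name].append(callback)

    def publish(self, update: Update) -> None:
        # Deliberately no de-duplication: a value identical to the
        # previous one still produces an update for subscribers.
        for callback in list(self._subs.get(update.name, [])):
            callback(update)
```

Sending every update, even for an unchanged value, keeps the node stateless with respect to subscribers; clients that only care about changes can compare values themselves.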

Proposal

This solution to LFDAN (pronounced laugh-dan) assumes that there are validator nodes and observer nodes. Observer nodes listen for blocks and update their local state based on these blocks, but do not participate in consensus with the validators. That is, observer nodes do not propose blocks. Although a validator node can service LFDAN calls, it is presumed that validator nodes will not want to take on this extra computational burden. Thus, full validator nodes should be able to turn off the ability for clients to use the LFDAN feature.

Question: So that clients only need to make a connection to an observer node, should observer nodes have the ability to forward deploys to a full validator node?

High-level steps:

These steps are described in more detail in the sections below.

Client deploys Rholang to a validator node. This is a signed message, and the signature represents the `DeployID`.
Client figures out the unforgeable names that it is interested in. (More on that below.)
Client requests a report from an observer node on all updates to names, matches, and continuations, and filters the results to only those names in which it is interested.
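
The filtering in the last step can be sketched as below. The event shape (a dict with `kind`, `channel`, and `data` keys) is an assumption made for illustration; the actual report format is not defined on this page.

```python
from typing import Dict, Iterable, Iterator, Set

# Hypothetical report event, e.g.
# {"kind": "produce", "channel": "<unforgeable name>", "data": 0}
Event = Dict[str, object]


def filter_events(events: Iterable[Event], names_of_interest: Set[str]) -> Iterator[Event]:
    """Yield only the events whose channel is one the client cares about."""
    for event in events:
        if event.get("channel") in names_of_interest:
            yield event
```

Because the function is a generator, it can be applied directly to a stream from the node without buffering the whole report.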

What unforgeable name?

One of the challenges in getting updates from RChain is that, on-chain, all Rholang names are represented as unforgeable names using the output of a hash. Each unforgeable name is determined algorithmically, so that other validators derive the same hash and can confirm that the Merkle root representing the state of RSpace is valid. But the actual unforgeable names (other than those in the top-level new) cannot be known until the Rholang is actually executed. This presents a challenge to any code that wants to watch certain unforgeable names to see when they have been updated.

Currently, most code that wants to listen for data is only interested in names that are at the top level of the Rholang code, and thus the unforgeable names can be calculated using the algorithm used by the validators. This means a client can deploy Rholang, then call a method to get all updates for the names that can be calculated. For names that are nested deeper in the code, there is currently no simple mechanism for determining the unforgeable names.

The approach we are proposing here is to instrument the deployed Rholang to write unforgeable names that need to be listened on to the DeployID channel.

Sample with instrumented code

new _nameToListenOn, MyContract, deployId(`rho:deploy:id`) in {
  // Seed the name with an initial value
  _nameToListenOn!(0) |

  contract MyContract(@"hello", @name) = {
    for (@value <- _nameToListenOn) {
      // Do something with the value received on _nameToListenOn
      Nil
    }
  } |

  // Send the interesting channels down an easily accessible channel, thanks to deployId
  deployId!(("NAME_TO_LISTEN_ON", *_nameToListenOn))
}
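
On the client side, recovering the instrumented names could look roughly like this. The sketch assumes the client can see the produces of a deploy as `(channel, data)` pairs and that the tuple written to the deploy's channel arrives as a two-element `(label, name)` tuple; both the input shape and the function name are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def names_from_deploy_channel(
    produces: Iterable[Tuple[str, object]], deploy_id: str
) -> Dict[str, List[object]]:
    """Collect the (label, name) pairs written to one deploy's channel.

    Returns a mapping from label (e.g. "NAME_TO_LISTEN_ON") to the
    names that were sent with that label.
    """
    found: Dict[str, List[object]] = {}
    for channel, data in produces:
        if channel != deploy_id:
            continue  # data sent on some other channel
        # Only accept the (label, name) shape used by the instrumentation.
        if isinstance(data, tuple) and len(data) == 2 and isinstance(data[0], str):
            label, name = data
            found.setdefault(label, []).append(name)
    return found
```

The names collected this way are the ones the client would then watch for updates.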

Changes to rnode/rspace

Assumptions:

we are able to pick a data scraping solution that is good enough to attach a data aggregation solution to (e.g. Kafka, Elasticsearch, etc.)
this data scraping solution does not need to be "always on-line", as it can rely on an observer node
we are able to propagate information about processed and finalized blocks (as viewed by a node)
we are able to propagate information about the deployId-to-block mapping

Proposed solution:

we have an opt-in mode for a node that opens an additional api endpoint (report)
the report api accepts a hash of a block
as a response the node produces a stream of data that has been produced while processing the passed block

This stream can be consumed "as-is" without any additional post-processing (this way a user who is interested in the current value is able to just look for their deploy, scrape the data, and move on), or it can be gathered and dumped into whatever big data processing engine is preferred.
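
A rough sketch of how the opt-in report endpoint might behave from a client's point of view. `ReportService`, `ReportEvent`, and `ReportingDisabled` are hypothetical names, and the in-memory dictionary stands in for the node actually replaying the block.

```python
from typing import Dict, Iterator, List, NamedTuple


class ReportEvent(NamedTuple):
    deploy_id: str  # deploy that caused the event
    kind: str       # e.g. "produce", "consume", "comm"
    channel: str    # name the event happened on
    data: object


class ReportingDisabled(Exception):
    """Raised when the node has not opted in to the report API."""


class ReportService:
    """Hypothetical report endpoint backed by pre-computed events."""

    def __init__(self, enabled: bool, events_by_block: Dict[str, List[ReportEvent]]) -> None:
        self._enabled = enabled
        self._events_by_block = events_by_block

    def report(self, block_hash: str) -> Iterator[ReportEvent]:
        if not self._enabled:
            raise ReportingDisabled("report API is opt-in and not enabled")
        if block_hash not in self._events_by_block:
            raise KeyError(f"unknown block: {block_hash}")
        # A real node would replay the block here and stream events out.
        return iter(self._events_by_block[block_hash])
```

A user interested only in their own deploy would iterate over `report(block_hash)` and keep the events whose `deploy_id` matches.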

Details:

the propagation of information about processed and finalized blocks (and deployId) needs to be discussed. It seems that a "blockStatus" or "deployStatus" API should be good enough for querying the node (internal API), perhaps along with a "blocks" API that accepts a block hash and returns higher block hashes (to be useful as a means of figuring out the state of the chain, this needs to be similar to how catch-up works)
the report api would start a replay of the given block via a modified replay rspace that "reports" all the matches per deploy.
- this gives an introspection mechanism
- allows viewing the details of a given deploy
- catches all data (not only the tuplespace state at the end of deploy)
- can be done off-line on a read only node
- does not introduce a "read only" continuation
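
The idea of a replay space that "reports" matches can be illustrated with a toy model. This is not RSpace: `ReportingSpace` is a hypothetical, heavily simplified tuplespace with FIFO matching and no patterns or persistence. It only shows why logging during replay captures data that is no longer present in the final state.

```python
from collections import defaultdict, deque
from typing import DefaultDict, Deque, List, Tuple

LogEntry = Tuple[str, str, str, object]  # (deploy_id, kind, channel, data)


class ReportingSpace:
    """Toy tuplespace that logs every produce, consume, and match."""

    def __init__(self) -> None:
        self.data: DefaultDict[str, Deque[object]] = defaultdict(deque)
        self.waiting: DefaultDict[str, Deque[str]] = defaultdict(deque)
        self.log: List[LogEntry] = []

    def produce(self, deploy_id: str, channel: str, value: object) -> None:
        self.log.append((deploy_id, "produce", channel, value))
        if self.waiting[channel]:
            # A consumer was already waiting: this is a match (COMM).
            self.waiting[channel].popleft()
            self.log.append((deploy_id, "comm", channel, value))
        else:
            self.data[channel].append(value)

    def consume(self, deploy_id: str, channel: str) -> None:
        self.log.append((deploy_id, "consume", channel, None))
        if self.data[channel]:
            # Data was already waiting: consume it and report the match.
            value = self.data[channel].popleft()
            self.log.append((deploy_id, "comm", channel, value))
        else:
            self.waiting[channel].append(deploy_id)
```

After a produce followed by a consume on the same channel, the channel is empty in the final state, yet the log still contains the value that passed through it.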

It could be viewed as continuing work on listenForDataAtName with the difference of not having any special entities in the system to allow data extraction.

In the end it's all deploys and deployid we already have.