Comm Layer Implementation
The communication layer of an RChain node comprises several levels of abstraction. It is responsible for assembling an overlay network upon the Internet, handling peers appearing and disappearing. It is further responsible for providing reliable communication among subsets of RChain nodes suitable for implementing time-sensitive protocols like Casper. In addition, it exposes a set of networking primitives to the RhoVM, giving Rholang code some communications ability.
It is simplest to describe this structure from the bottom protocols upwards, from peer-to-peer networking to direct communication functions exposed to the VM.
Please note that since namespacing has not yet been finalized, much of this is subject to change once the namespace-related duties of the communication layer are decided. Of particular interest are the effects that namespaces might have on which remote nodes get included in a node's peer group.
Wire Protocol
For all structured protocol messages, the wire format is serialized protocol buffers. This is a thoroughly debugged, forwards-compatible message format, and there is a protocol buffer implementation for nearly every major language and quite a few minor ones.
The format of messages is controlled at compile time by code generated from a simple schema. This code, native to the language for which it is generated, allows messages to be deserialized into native types (instances of classes or structs, for example) and used directly within the language.
Protocol buffer messages yield relatively compact serialization formats. Most integral types are varint encoded, for example, and fields without data may be elided. Since schema enforcement is done at compile time, the serialized format carries no schema, only numeric field tags. A few rules of construction are all that is required to ensure forwards compatibility of a protocol, by allowing previous versions of systems to ignore unrecognized fields.
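To make the compactness claim concrete, here is a minimal sketch of the base-128 varint scheme that protocol buffers use for most integer fields. The function names are illustrative only, and the sketch covers only non-negative integers; a real application would rely on a generated protocol buffer library rather than hand-rolled encoding.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles only non-negative integers")
    out = bytearray()
    while True:
        low7 = value & 0x7F          # take the lowest seven bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one varint starting at offset; return (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, offset
        shift += 7
```

Small values cost one byte and larger ones grow seven bits at a time, so `encode_varint(300)` occupies two bytes where a fixed-width 32-bit integer would occupy four.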
This lack of rich type information in serialized protocol buffers, although yielding good network performance, is not without its annoyances. A protocol buffer of unknown type cannot be deserialized upon receipt; the receiver must assume its type. This leads to the use of a union-type pattern in protocol design, in which a single envelope message carries exactly one of the possible message types. Moreover, there is no native framing protocol, so messages cannot be streamed effectively without cobbling together one of your own. Finally, if protocol version is important, it must be included explicitly as part of a message.
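The three workarounds mentioned above (a union-style type tag, a home-made framing protocol, and an explicit version field) can be sketched together. This is an illustration under stated assumptions, not the RChain wire format: the 4-byte length prefix, the two-field header, the version number, and the message type constants are all hypothetical, and the payload bytes stand in for a serialized protocol buffer.

```python
import io
import struct

PROTOCOL_VERSION = 1          # hypothetical version number
MSG_PING, MSG_LOOKUP = 1, 2   # hypothetical tags, one per case of the union


def frame(msg_type: int, payload: bytes) -> bytes:
    """Wrap a payload in an envelope and prefix it with its length."""
    envelope = struct.pack("!HH", PROTOCOL_VERSION, msg_type) + payload
    return struct.pack("!I", len(envelope)) + envelope


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes. A real socket reader would loop until satisfied."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("stream ended mid-frame")
    return data


def read_frame(stream):
    """Read one frame and return (version, msg_type, payload)."""
    (length,) = struct.unpack("!I", read_exact(stream, 4))
    if length < 4:
        raise ValueError("frame too short to hold the envelope header")
    envelope = read_exact(stream, length)
    version, msg_type = struct.unpack("!HH", envelope[:4])
    return version, msg_type, envelope[4:]


# Two messages written back to back can be separated again by the reader.
stream = io.BytesIO(frame(MSG_PING, b"a") + frame(MSG_LOOKUP, b"bc"))
first = read_frame(stream)
second = read_frame(stream)
```

The receiver inspects the type tag before choosing how to deserialize the payload, which is the job a union-type envelope performs in a protocol buffer schema.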
Overlay Network
Like Ethereum, RChain borrows the peer-to-peer mesh overlay from BitTorrent and other file-sharing technologies. The current implementation in RChain is based on the parts of the Kademlia protocol that establish and maintain a view of some subset of the network (a node's peers). This is described in more detail here.
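For readers unfamiliar with Kademlia, the core idea behind its view of the network is an XOR distance metric over node identifiers. The sketch below shows that metric in general terms; the integer identifiers and the function names are assumptions for illustration and say nothing about identifier width or bucket size in the RChain implementation.

```python
from typing import List


def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two node identifiers."""
    return a ^ b


def bucket_index(own_id: int, peer_id: int) -> int:
    """Bucket i holds peers whose distance d satisfies 2**i <= d < 2**(i + 1)."""
    d = xor_distance(own_id, peer_id)
    if d == 0:
        raise ValueError("a node does not store itself in a bucket")
    return d.bit_length() - 1


def closest_peers(target: int, peers: List[int], k: int) -> List[int]:
    """Return the k known peers closest to target under XOR distance."""
    return sorted(peers, key=lambda p: xor_distance(p, target))[:k]
```

Because buckets cover exponentially growing distance ranges, a node keeps detailed knowledge of its near neighborhood and only sparse knowledge of distant regions, which is what keeps each node's view to a subset of the network.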
Each peer included in the network overlay for a particular node may carry subjective measures. These include reputation, a measure of how well the peer is performing its duties from the protocol's perspective. Also included might be metric information about latency to the peer or measured throughput (if doing large transfers, for instance). These measures may be used for a number of purposes. A node may wish to replace underperforming peers. Alternatively, a node may wish to pick a certain number of the best-performing peers to use in a validation subnet.
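As a rough illustration of how such measures might be used, the following sketch ranks peers by a combined score. Everything here is hypothetical: the `PeerStats` fields, the reputation scale, the latency weight, and the scoring formula are placeholders rather than anything the design specifies.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PeerStats:
    peer_id: str
    reputation: float  # higher is better; the scale is an assumption
    latency_ms: float  # lower is better


def score(peer: PeerStats, latency_weight: float = 0.01) -> float:
    """Combine the subjective measures into one number (hypothetical formula)."""
    return peer.reputation - latency_weight * peer.latency_ms


def best_peers(peers: List[PeerStats], n: int) -> List[PeerStats]:
    """Pick the n best-scoring peers, e.g. as candidates for a validation subnet."""
    return sorted(peers, key=score, reverse=True)[:n]


def underperformers(peers: List[PeerStats], min_score: float) -> List[PeerStats]:
    """Peers scoring below min_score are candidates for replacement."""
    return [p for p in peers if score(p) < min_score]
```

Since the measures are subjective, two nodes may score the same peer differently, so any selection made this way is local to the node making it.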
Peer-to-Peer Protocol
There is a need for security and compatibility testing above the basic mesh overlay, so that peers which are either incompatible (running too-old versions of the software, for example) or which are simply unfamiliar with the cryptographic protocols in RChain (running different or malicious software, say) are not joined to the node's view.
Belonging in this layer, but not yet implemented, is filtering based on namespace interests. If a validator's view of the network should be limited to other nodes interested in processing transactions from the same or related namespaces, this should form part of a potential peer's inclusion-or-exclusion decision.
Other RChain Protocols
For consensus and any other activity under the control of the nodes themselves, a more direct, connected, reliable topology may be required. Currently, a ZeroMQ-based communications object is implemented to fill that role. ZeroMQ is not required but simplifies a number of network-programming chores, including reconnection and buffering. However, those characteristics that make it easy to program network-aware applications using ZeroMQ argue against exposing it to higher layers in the RChain node code, since they are also useful tools in carrying out denial-of-service or other resource-exhaustion attacks.
ZeroMQ imposes few, if any, restrictions on the form that communication takes. It provides enough of a framing protocol that in combination with protocol buffers, the communication needs of a robust consensus protocol like Casper could be met. If the overhead imposed by a message queuing system is too high, the system could simply use a less reliable, connectionless transport like UDP, as a sort of optimistic communications layer.
External Network Communication
If RChain is to support connection to and communication with nodes that may not even be part of the RChain peer network, then clearly these connections cannot be part of any peer-to-peer networking overlay. As such, the current plan is to implement these connections directly, but within a different part of the communications layer.
There are perhaps two broad types of network resources, termed here "active" and "passive" resources. An active resource is a specific resource requested from the communication system for the exclusive use of a running Rholang program (or by the RhoVM). By contrast, an example of a passive resource might be "external network input on name N," which causes the communications layer to wait for input that can be routed to the appropriate place in a higher layer of the system.
For active network communication, network resources (sockets, for example) should be exposed in a manner consistent with other system-level resources. My assumption is that this will be done using a "resource handle" of some type, most likely an index into a table. The table of raw resources would never be exposed to the RhoVM layer or higher. All actions, including setting and getting configuration parameters, reading data, writing data with or without blocking, and so on, are done through primitives via the handle. Examples in a fictional Rosette-like language might include (for unreliable, connectionless communication):
- reserving a system UDP socket resource
    (make-socket 'udp)
    => handle
- writing data to the socket
    (write handle address data)
    => bool
or (for connection-oriented input, here a simplified version of a pattern that enables servers to be written):
- reserving a system TCP socket resource
    (make-socket 'tcp)
    => handle1
- readying a socket for serving connections
    (bind-and-listen handle1 address)
- accepting connections on the socket
    (accept handle1)
    => handle2
- reading data on the new socket
    (read handle2)
    => data
For an active resource, like a raw connection over TCP or UDP, a socket is created in the JVM. The socket is entered into a table and a handle returned. Since all socket operations are thereafter indirect via RhoVM-called primitives, they can be monitored and charged to the calling account appropriately. An entry in the table may be associated at creation time with information that controls its lifespan, which may be updated from higher layers in the RChain system.
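The handle-table idea can be sketched in a few lines. This is a Python stand-in for what would live in the JVM, and the class name, method names, and the one-unit-per-operation charging rule are all assumptions made for illustration. Lifespan control is omitted.

```python
import itertools
import socket


class ResourceTable:
    """Maps opaque integer handles to raw sockets and meters every operation."""

    def __init__(self):
        self._sockets = {}   # handle -> raw socket, never exposed to callers
        self._owners = {}    # handle -> account that reserved the resource
        self._next_handle = itertools.count(1)
        self.charges = {}    # account -> number of metered operations

    def _charge(self, account):
        self.charges[account] = self.charges.get(account, 0) + 1

    def make_socket(self, account, kind):
        """Reserve a socket ('udp' or 'tcp') and return its handle."""
        sock_type = {"udp": socket.SOCK_DGRAM, "tcp": socket.SOCK_STREAM}[kind]
        handle = next(self._next_handle)
        self._sockets[handle] = socket.socket(socket.AF_INET, sock_type)
        self._owners[handle] = account
        self._charge(account)
        return handle

    def write(self, handle, address, data):
        """Send a datagram through a UDP handle; True if all bytes were sent."""
        sock = self._sockets[handle]
        self._charge(self._owners[handle])
        return sock.sendto(data, address) == len(data)

    def close(self, handle):
        """Release the underlying socket and forget the handle."""
        self._sockets.pop(handle).close()
        self._owners.pop(handle)
```

Because callers only ever hold an integer, every operation has to pass through the table, which is the point at which monitoring and charging can be applied.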
Your crack communications layer team does not yet have a great idea of how passive network operations might work, beyond a simple model of reserving one known (or discoverable) TCP and one known (or discoverable) UDP port for data multiplexing. Any incoming data will have to be well-formed, in the sense that it must be name-routable. There can be a protocol buffer for generalized input that carries routing information as well as a binary payload that is only restricted in size, not form.
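One way to picture the simple model described above is a router that accepts generalized input, checks its size, and hands the payload to whatever is listening on the named channel. This is only a sketch of the idea: `NameRouter`, its methods, and the 64 KiB limit are hypothetical, and the `(name, payload)` pair stands in for the generalized-input protocol buffer.

```python
MAX_PAYLOAD = 64 * 1024  # hypothetical size limit on the binary payload


class NameRouter:
    """Routes incoming (name, payload) pairs to listeners registered by name."""

    def __init__(self):
        self._listeners = {}  # name -> callable taking the payload bytes

    def listen(self, name, callback):
        """Register interest in external network input on a name."""
        self._listeners[name] = callback

    def deliver(self, name, payload):
        """Route one input; return True if it was delivered."""
        if len(payload) > MAX_PAYLOAD:
            return False  # restricted in size, not form
        callback = self._listeners.get(name)
        if callback is None:
            return False  # not name-routable: nobody is listening
        callback(payload)
        return True
```

Input that exceeds the limit or names no registered listener is simply dropped here; whether a real implementation should drop, queue, or report such input is an open question.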