Title: **DEP-0000: Wire Protocol** Short Name: `0000-wire-protocol` Type: Standard Status: Undefined (as of 2018-02-04) Github PR: (add HTTPS link here after PR is opened) Authors: [Paul Frazee](https://github.com/pfrazee), [Bryan Newbold](https://github.com/bnewbold) # Summary [summary]: #summary This DEP describes the Dat wire protocol: a transport-agnostic message stream spoken between nodes in a swarm of hypercore network peers (including Dat clients). The wire protocol includes mechanisms for framing, stream encryption, and feed key authentication. # Motivation [motivation]: #motivation The protocol described here is already in use as of 2017 (by hypercore, Dat, and Beaker Browser users), and was partially described in an earlier [whitepaper][whitepaper]. This document fills in some additional details. [whitepaper]: https://TODO # Stream Connections [stream-details]: #stream-details The Dat wire protocol depends on a lower binary transport channel which provides the following semantics: - reliable delivery (no dropped messages, or partial messages) - in-order delivery of messages Peers wishing to connect need to discover each other using some mechanism or another (see forthcoming DEPs on some options; this process is modular and swappable), and both need to have the public key for the primary hypercore they wish to exchange. Messages are framed by the Dat protocol itself (see Messages section for details). ## Channels [channels]: #channels Multiple hypercore registers can be synchronized over the same protocol connection. Messages pertaining to the separate registers (aka, "Feeds", "channels") are tagged with an id for disambiguation. Note that at least one feed is necessary for each connection (for handshaking to succeed), and that the first feed is the one used for discovery and as an encryption key. To initiate a new channel (after the primary is established), ## Handshake Procedure [handshake]: #handshake A handshake procedure needs to occur for each feed on a channel; the first part of the first handshake happens in cleartext and both validates discovery keys and establishes encyption paramters used for the rest of the connection. The first (primary) channel has `id=0`. The first (cleartext) message is a Feed message, and includes two fields: a nonce and a discovery key. The **nonce** is generated by each peer as a random 32-byte sequence. The **discovery key** is generated from the public encryption key for a hypercore register (in this case the first, or "primary" register) by using the public key to sign the 9-byte ASCII string "hypercore" (with no trailing NULL byte) using the BLAKE2b keyed hashing algorithm (provided by most BLAKE2b hash implementations). The discovery key is 32 bytes long. The discovery key is used in cleartext instead of the public key to avoid leaking the public key to the network; read access to hypercore registers (including Dat archives) is controlled by limiting access to public keys. When the connection is first opened, the connecting peer sends their Feed message. The receiving peer checks that the discovery key was what they were expecting (eg, that they know of a public key matching that discovery key and are willing to synchronize the register associated with that key). If so, they reply with their own Feed. If not, they drop the connection. Once Feed messages are exchanged, both peers have all information they need to encrypt all further content on the channel, and do so (see below for details). The second part of the handshake is to exchange Handshake messages, which set some parameters for the channel. Handshakes also include the self-identified peer id, which can be used to detect accidental self-connections or redundant connections to the same peer (eg, over different transports). ## Encryption Scheme [encryption]: #encryption After the first Feed messages are exchanged (one message in each direction, in cleartext), all further bytes exchanged over the channel are encrypted. Framing metadata (aka, message length and type) is encrypted, but a third party could likely infer message lengths (and thus potentially message types) by observing packet sizes and timing; no padding is applied at the protocol layer. The encryption scheme used is libsodium's stream primative, specifically the XSalsa20 cipher. The cipher is fed a shared key (the primary hypercore register public key), a nonce (selected by the sender and exchanged during handshake), and a block offset (representing all encrypted bytes sent on the connection in this direction so far). *TODO: the following paragraph gets in to implementation details... some mention of the 64 byte chunks is needed, but maybe not this much detail?* The specific libsodium function used is usually `crypto_stream_xsalsa20_xor_ic()`. Some interfacing code is necessary to process messages that don't align with the cipher's 64-byte chunk size; unused bytes in any particular chunk can be ignored. For example, if 1000 encrypted bytes had been sent on a connection already, and then a new 50 byte message needed to be encrypted and sent, then one would offset the message by `1000 % 64 = 40` bytes and XOR the first 24 bytes against block 15, then XOR the remaining 26 bytes against block 16. The bytes would be shifted back and recombined before sending, so only 50 bytes would go down the connection; the same process would be followed by the receiver. ## Block Tree Digest [block-tree-digest]: #block-tree-digest When asking for a block of data we want to reduce the amount of duplicate hashes that are sent back. To communicate which hashes we have, we just have to communicate two things: which uncles we have and whether or not we have any parent node that can verify the tree. Consider the following tree: ``` 0 1 2 3 4 5 6 ``` If we want to fetch block 0, we need to communicate whether of not we already have the uncles (2, 5) and the parent (3). This information can be compressed into very small bit vector using the following scheme: - Let the trailing bit denote whether or not the leading bit is a parent and not a uncle. - Let the previous trailing bits denote whether or not we have the next uncle. Let's consider an example. Suppose we want to fetch block 0, and we have 2 and 3 but not 5. We need to therefore communicate: ``` the leading bit is a parent, not an uncle we already have the first uncle, 2 so don't send us that we don't have the next uncle, 5 we have the next parent, 3 ``` We would encode this into the bit vector `1011`. Decoded: ``` 101(1) <-- the leading bit is a parent, not an uncle 10(1)1 <-- we already have the first uncle, 2 so don't send us that 1(0)11 <-- we don't have the next uncle, 5 (1)000 <-- we have the next parent, 3 ``` So using this digest the recipient can easily figure out that they only need to send us one hash, 5, for us to verify block 0. The bit vector 1 (only contains a single one) means that we already have all the hashes we need so just send us the block. These digests are very compact in size, only `(log2(number-of-blocks) + 2) / 8` bytes needed in the worst case. For example if you are sharing one trillion blocks of data the digest would be `(log2(1000000000000) + 2) / 8 ~= 6` bytes long. # Message Details [message-details]: #message-details TODO: description of framing Wire format is `(
)`. `header` is a varint, of form `channel << 4 | <4-bit-type>`. Messages are encoded (serialized) using Google's [profobuf][protobuf] encoding. [protobuf]: https://TODO
`type` code Name
N/A [Keep-Alive][msg-keepalive]
0 [Feed][msg-feed]
1 [Handshake][msg-handshake]
2 [Info][msg-info]
3 [Have][msg-have]
4 [Unhave][msg-unhave]
5 [Want][msg-want]
6 [Unwant][msg-unwant]
7 [Request][msg-request]
8 [Cancel][msg-cancel]
9 [Data][msg-data]
15 [Extension][msg-extension]
#### Keep-Alive [msg-keepalive]: #msg-keepalive A message of body length 0 (giving a total message size of 1 byte for the `len` varint) is a keep-alive. Depending on transport and application needs, peers may optionally send keep-alive messages to help detect and prevent channel loss. Peers must always handle keep-alive messages correctly (aka, ignore them), regardless of transport. TODO: what is a good default interval? #### Feed [msg-feed]: #msg-feed // type=0, should be the first message sent on a channel message Feed { required bytes discoveryKey = 1; optional bytes nonce = 2; } #### Handshake [msg-handshake]: #msg-handshake // type=1, overall connection handshake. should be send just after the feed message on the first channel only message Handshake { optional bytes id = 1; optional bool live = 2; // keep the connection open forever? both ends have to agree optional bytes userData = 3; repeated string extensions = 4; } TODO: What are semantics of 'live' bit? what if there is disagreement? #### Info [msg-info]: #msg-info // type=2, message indicating state changes etc. // initial state for uploading/downloading is true // if both ends are not downloading and not live it is safe to consider the stream ended message Info { optional bool uploading = 1; optional bool downloading = 2; } #### Have [msg-have]: #msg-have // type=3, what do we have? message Have { required uint64 start = 1; optional uint64 length = 2 [default = 1]; // defaults to 1 optional bytes bitfield = 3; } #### Unhave [msg-unhave]: #msg-unhave // type=4, what did we lose? message Unhave { required uint64 start = 1; optional uint64 length = 2 [default = 1]; // defaults to 1 } #### Want [msg-want]: #msg-want // type=5, what do we want? remote should start sending have messages in this range message Want { required uint64 start = 1; optional uint64 length = 2; // defaults to Infinity or feed.length (if not live) } #### Unwant [msg-unwant]: #msg-unwant // type=6, what don't we want anymore? message Unwant { required uint64 start = 1; optional uint64 length = 2; // defaults to Infinity or feed.length (if not live) } #### Request [msg-request]: #msg-request // type=7, ask for data message Request { required uint64 index = 1; optional uint64 bytes = 2; optional bool hash = 3; optional uint64 nodes = 4; } #### Cancel [msg-cancel]: #msg-cancel // type=8, cancel a request message Cancel { required uint64 index = 1; optional uint64 bytes = 2; optional bool hash = 3; } #### Data [msg-data]: #msg-data // type=9, get some data message Data { message Node { required uint64 index = 1; required bytes hash = 2; required uint64 size = 3; } required uint64 index = 1; optional bytes value = 2; repeated Node nodes = 3; optional bytes signature = 4; } #### Extension [msg-extension]: #msg-extension `type=15` is an extension message that is encoded like: # Examples ## Simple Download Alyssa P Hacker and Ben Bitdiddle want to share a book... B connects to A. - full public key and discovery key for the connection - example nonces (in full) - messages in "struct" syntax and raw hex: - B: sends Feed - A: replies Feed - B: Handshake (downloading, not live) - A: Handshake (uploading, not live) - B: Info: downloading only - A: Info: uploading only - B: Have: nothing - A: Have: everything - B: Want: first register entry - A: Data: first entry - (repeat for all other chunks) - connection closes ## Multiple Feeds Describe in detail how to "add" a new channel (feed/register) to an existing connection, using Feed (and Handshake?) messages. ## Swarm Synchronization TODO: should this more involved example actually live here? or in hypercore DEP? It feels pretty message-level, but does involve more hypercore semantics. This example wouldn't include actual messages, but would describe an N-way (3+) node swarm, with a single (complete) seeder, two peers that both download from the seeder and exchange messages, and a fourth peer that downloads from one of the non-seeder peers only. - Peer A: seeder, writer. Starts with full history and appends to log - Peer B: swarm, reader. Starts with full history. Live connection. Connected to A, C, D. - Peer C: swarm, reader. Starts with sparse (old) history. Only wants "latest" data. Connected to A and, B. - Peer D: like Peer C, but only connected to B. # Rationale and alternatives [alternatives]: #alternatives - Why is this design the best in the space of possible designs? - What other designs have been considered and what is the rationale for not choosing them? - What is the impact of not doing this? # Unresolved questions [unresolved]: #unresolved-questions What are extension strings? What can 'userData' bytes be used for? Encryption might not make sense in some contexts (eg, IPC, or if the transport layer is already providing encryption). Should this DEP recognize this explicitly? - What parts of the design do you expect to resolve through the DEP consensus process before this gets merged? - What parts of the design do you expect to resolve through implementation and code review, or are left to independent library or application developers? - What related issues do you consider out of scope for this DEP that could be addressed in the future independently of the solution that comes out of this DEP? # Changelog [changelog]: #changelog A brief statemnt about current status can go here, follow by a list of dates when the status line of this DEP changed (in most-recent-last order). - YYYY-MM-DD: First complete draft submitted for review