From fc894b547e6eaeea37420bb902e62695b341cb1c Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 4 Mar 2018 17:22:45 -0800
Subject: WIP on multiwriter DEP

---
 proposals/0000-multiwriter.md | 255 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 255 insertions(+)
 create mode 100644 proposals/0000-multiwriter.md

diff --git a/proposals/0000-multiwriter.md b/proposals/0000-multiwriter.md
new file mode 100644
index 0000000..ec5d2d3
--- /dev/null
+++ b/proposals/0000-multiwriter.md
@@ -0,0 +1,255 @@
+
+Title: **DEP-0000: Multi-Writer**
+
+Short Name: `0000-multiwriter`
+
+Type: Standard
+
+Status: Undefined (as of 2018-03-XX)
+
+Github PR: (add HTTPS link here after PR is opened)
+
+Authors:
+[Bryan Newbold](https://github.com/bnewbold),
+[Stephen Whitmore](https://github.com/noffle),
+[Mathias Buus](https://github.com/mafintosh)
+
+
+# Summary
+[summary]: #summary
+
+Multi-Writer is a set of schema, API, and feature extentions to multiple agents
+(users, devices, or software) to write to the same HyperDB feed. By building on
+top of this abstraction layer, future versions of hyperdrive and Dat will gain
+these features.
+
+Mechanisms for distributed consistency and granting trust are specified here;
+the need for merge conflict algorithms and secure key distribution are
+mentioned but specific solutions are not specified.
+
+
+# Motivation
+[motivation]: #motivation
+
+The current hypercore/Dat ecosystem currently lacks solutions two fundamental
+use cases:
+
+- individual users should be able to modify distributed archives under their
+  control from multiple devices, at a minimum to prevent loss of control of
+  content if a single device (containing secret keys) is lost
+- contributions from and collaboration between multiple users on a single
+  archive or database should be possible, with appropriate trust and access
+  control semantics
+
+Access to a single secret key is currently required to make any change to a
+hypercore feed, and it is broadly considered best practice not to distribute
+secret keys between multiple users or multiple devices. In fact, the current
+hypercore implementation has no mechanism to resolve disputes or recover if
+multiple agents used the same secret key to append to the same feed.
+
+Solutions to these two use cases are seen as essential for many current and
+future Dat ecosystem applications.
+
+
+# Semantics and Usage
+[usage-documentation]: #usage-documentation
+
+TODO: semantics, terminology
+
+TODO: things left to application developers (secure key distribution, non-trivial merge resolution)
+
+TODO: brief note on scaling properties (eg, reasonable numbers of feeds per database)
+
+
+# Implementation
+[reference-documentation]: #reference-documentation
+
+The protobuf schema fields of interest for multi-writer (from the "Node"
+message type specified in the HyperDB DEP) are:
+
+- `seq`: the sequence number of this entry in the owner's hypercore. 0 is the
+  first, 1 the second, and so forth.
+- `feed`: the ID of the hypercore writer that wrote this
+- `clock`: vector clock to determine node insertion causality
+
+## Directed acyclic graph
+
+The combination of all operations performed on a HyperDB by all of its members
+forms a DAG (*directed acyclic graph*). Each write to the database (setting a
+key to a value) includes information to point backward at all of the known
+"heads" in the graph.
+
+To illustrate what this means, let's say Alice starts a new HyperDB and writes 2
+values to it:
+
+```
+// Feed
+
+0 (/foo/bar = 'baz')
+1 (/foo/2   = '{ "some": "json" }')
+
+
+// Graph
+
+Alice:  0  <---  1
+```
+
+Where sequence number 1 (the second entry) refers to sequence number 0 on the
+same feed (Alice's).
+
+Now Alice *authorizes* Bob to write to the HyperDB. Internally, this means Alice
+writes a special message to her feed saying that Bob's feed (identified by his
+public key) should be read and replicated in by other participants. Her feed
+becomes
+
+```
+// Feed
+
+0 (/foo/bar = 'baz')
+1 (/foo/2   = '{ "some": "json" }')
+2 (''       = '')
+
+
+// Graph
+
+Alice: 0  <---  1  <---  2
+```
+
+Authorization is formatted internally in a special way so that it isn't
+interpreted as a key/value pair.
+
+Now Bob writes a value to his feed, and then Alice and Bob sync. The result is:
+
+```
+// Feed
+
+//// Alice
+0 (/foo/bar = 'baz')
+1 (/foo/2   = '{ "some": "json" }')
+2 (''       = '')
+
+//// Bob
+0 (/a/b     = '12')
+
+
+// Graph
+
+Alice: 0  <---  1  <---  2
+Bob  : 0
+```
+
+Notice that none of Alice's entries refer to Bob's, and vice versa. This is
+because neither has written any entries to their feeds since the two became
+aware of each other (authorized & replicated each other's feeds).
+
+Right now there are two "heads" of the graph: Alice's feed at seq 2, and Bob's
+feed at seq 0.
+
+Next, Alice writes a new value, and her latest entry will refer to Bob's:
+
+```
+// Feed
+
+//// Alice
+0 (/foo/bar = 'baz')
+1 (/foo/2   = '{ "some": "json" }')
+2 (''       = '')
+3 (/foo/hup = 'beep')
+
+//// Bob
+0 (/a/b     = '12')
+
+
+// Graph
+
+Alice: 0  <---  1  <---  2  <--/  3
+Bob  : 0  <-------------------/
+```
+
+Because Alice's latest feed entry refers to Bob's latest feed entry, there is
+now only one "head" in the database. That means there is enough information in
+Alice's seq=3 entry to find any other key in the database. In the last example,
+there were two heads (Alice's seq=2 and Bob's seq=0); both of which would need
+to be read internally in order to locate any key in the database.
+
+Now there is only one "head": Alice's feed at seq 3.
+
+
+## Authorization
+
+The set of hypercores are *authorized* in that the original author of the first
+hypercore in a hyperdb must explicitly denote in their append-only log that the
+public key of a new hypercore is permitted to edit the database. Any authorized
+member may authorize more members. There is no revocation or other author
+management elements currently.
+
+
+## Vector clock
+
+Each node stores a [vector clock](https://en.wikipedia.org/wiki/Vector_clock) of
+the last known sequence number from each feed it knows about. This is what forms
+the DAG structure.
+
+A vector clock on a node of, say, `[0, 2, 5]` means:
+
+- when this node was written, the largest seq # in my local fed is 0
+- when this node was written, the largest seq # in the second feed I have is 2
+- when this node was written, the largest seq # in the third feed I have is 5
+
+For example, Bob's vector clock for Alice's seq=3 entry above would be `[0, 3]`
+since he knows of her latest entry (seq=3) and his own (seq=0).
+
+The vector clock is used for correctly traversing history. This is necessary for
+the `db#heads` API as well as `db#createHistoryStream`.
+
+
+# Examples
+[examples]: #examples
+
+TODO:
+
+
+# Security and Privacy Concerns
+[privacy]: #privacy
+
+TODO:
+
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+TODO: Why should we *not* do this?
+
+
+# Rationale and alternatives
+[alternatives]: #alternatives
+
+TODO:
+
+- Why is this design the best in the space of possible designs?
+- What other designs have been considered and what is the rationale for not choosing them?
+- What is the impact of not doing this?
+
+
+# Unresolved questions
+[unresolved]: #unresolved-questions
+
+TODO:
+
+- What parts of the design do you expect to resolve through the DEP consensus process before this gets merged?
+- What parts of the design do you expect to resolve through implementation and code review, or are left to independent library or application developers?
+- What related issues do you consider out of scope for this DEP that could be addressed in the future independently of the solution that comes out of this DEP?
+
+
+# Changelog
+[changelog]: #changelog
+
+As of March 2018, Mathias Buus (@mafintosh) is leading development of a hyperdb
+nodejs module on [github](https://github.com/mafintosh/hyperdb), which includes
+multi-writer features and is the basis for this DEP.
+
+- 2017-12-06: @noffle publishes `ARCHITECTURE.md` overview in the
+  [hyperdb github repo][arch_md]
+- 2018-03-XX: First partial draft submitted for review
+
+[arch_md]: https://github.com/mafintosh/hyperdb/blob/master/ARCHITECTURE.md
-- 
cgit v1.2.3


From 27d1cdc77676d5db996db6c3504f8ff3d6de159c Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 18 Mar 2018 23:57:56 -0700
Subject: work in progress on multi-writer DEP

---
 proposals/0000-multiwriter.md | 158 +++++++++++++++++++++++++++++++++++-------
 1 file changed, 133 insertions(+), 25 deletions(-)

diff --git a/proposals/0000-multiwriter.md b/proposals/0000-multiwriter.md
index ec5d2d3..a9b8542 100644
--- a/proposals/0000-multiwriter.md
+++ b/proposals/0000-multiwriter.md
@@ -7,7 +7,7 @@ Type: Standard
 
 Status: Undefined (as of 2018-03-XX)
 
-Github PR: (add HTTPS link here after PR is opened)
+Github PR: [Draft](https://github.com/datprotocol/DEPs/pull/10)
 
 Authors:
 [Bryan Newbold](https://github.com/bnewbold),
@@ -18,21 +18,24 @@ Authors:
 # Summary
 [summary]: #summary
 
-Multi-Writer is a set of schema, API, and feature extentions to multiple agents
-(users, devices, or software) to write to the same HyperDB feed. By building on
-top of this abstraction layer, future versions of hyperdrive and Dat will gain
-these features.
+Multi-Writer is a set of schema, API, and feature extentions to allow multiple
+agents (users, devices, or software) to write to the same hyperdb database. By
+building on top of this abstraction layer, future versions of hyperdrive and
+Dat will gain these features.
 
 Mechanisms for distributed consistency and granting trust are specified here;
 the need for merge conflict algorithms and secure key distribution are
 mentioned but specific solutions are not specified.
 
+This DEP forms the second half of the hyperdb specification; the first half
+covered only the key/value database aspects of hyperdb.
+
 
 # Motivation
 [motivation]: #motivation
 
-The current hypercore/Dat ecosystem currently lacks solutions two fundamental
-use cases:
+The current hypercore/Dat ecosystem currently lacks solutions for two
+fundamental use cases:
 
 - individual users should be able to modify distributed archives under their
   control from multiple devices, at a minimum to prevent loss of control of
@@ -51,35 +54,133 @@ Solutions to these two use cases are seen as essential for many current and
 future Dat ecosystem applications.
 
 
-# Semantics and Usage
+# Concepts, Behavior, and Usage
 [usage-documentation]: #usage-documentation
 
-TODO: semantics, terminology
+The multi-writer features of hyperdb are implemented by storing and replicating
+the contributions of each writer in a separate hypercore feed. This
+specification concerns itself with the details of how changes from multiple
+feeds (which may be written and replicated concurrently or asynchronously) are
+securely combined to present a unified key/value interface.
+
+The following related concerns are explicitly left to application developers to
+design and implement:
+
+- secure key distribution and authentication (eg, if a friend should be given
+  write access to a hyperdb database, how is that friend's feed key found and
+  verified?)
+- merge conflict resolution, potentially using application-layer semantics
+
+Before we go any further, a few definitions:
+
+*Feed*: A hypercore feed: an append-only log of *Entries*, which can be
+arbitrary data blobs. Hyperdb is built on top of several Feeds.
+
+*Writer*: a user (or user controlled device or software agent) that has a
+distinct feed with a public/private key pair, and thus the ability to append
+hypercore entries and "write" changes to their version of the database.
+
+*Original Writer*: the writer who created a given hyperdb database in the form
+of the *Original Feed*. The public key of the original feed is the one used to
+reference the database as a collection of feeds (eg, for the purpose of
+discovery).
+
+At a high level, multi-writer hyperdb works by having existing authorized
+writers (starting with the original writer) include authorization of new
+writers by appending metadata to their own feed which points to the new feeds
+(by public key). Each entry in each writer's feed contains "clock" metadata
+that records the known state of the entire database (all writers) seen from the
+perspective of that writer at the time they created the entry, in the form of
+"clock" version pointers. This metadata (a "[vector clock][vc]") can be used by
+other writers to resolve (or at least identify) conflicting content in the
+database. The technical term for this type of system is a "Conflict-free
+replicated data type" ([CRDT][crdt]).
+
+[vc]: https://en.wikipedia.org/wiki/Vector_clock
+[crdt]: https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type
+
+
+## API
+[api]: #api
+
+The `db.get(key)` method described in the [hyperdb DEP][dep-hyperdb] actually
+returns an array of values. If it is unambiguous what the consistent value is,
+the array will have only that one value. If there is a conflict, multiple
+values will be returned. If multiple values are returned, the writer should
+attempt to merge the values (or chose one over the other) and write the result
+to the database with `db.put(key, value)`.
+
+`db.authorize(key)` will write metadata to the local feed authorizing the new
+feed (corresponding to `key`) to be included in the database.
 
-TODO: things left to application developers (secure key distribution, non-trivial merge resolution)
+<!-- TODO: update link -->
+[dep-hyperdb]: https://github.com/bnewbold/dat-deps/blob/dep-hyperdb/proposals/0000-hyperdb.md
 
-TODO: brief note on scaling properties (eg, reasonable numbers of feeds per database)
+
+## Scaling
+[scaling]: #scaling
+
+TODO: brief note on scaling properties (eg, reasonable numbers of writers per
+database)
 
 
 # Implementation
 [reference-documentation]: #reference-documentation
 
-The protobuf schema fields of interest for multi-writer (from the "Node"
-message type specified in the HyperDB DEP) are:
+The complete protobuf schemas for the hyperdb "Entry" and "InflatedEntry"
+message types (as specified in the hyperdb DEP) are:
+
+
+```
+message Entry {
+  required string key = 1;
+  optional bytes value = 2;
+  required bytes trie = 3;
+  repeated uint64 clock = 4;
+  optional uint64 inflate = 5;
+}
+
+message InflatedEntry {
+  message Feed {
+    required bytes key = 1;
+  }
+
+  required string key = 1;
+  optional bytes value = 2;
+  required bytes trie = 3;
+  repeated uint64 clock = 4;
+  optional uint64 inflate = 5;
+  repeated Feed feeds = 6;
+  optional bytes contentFeed = 7;
+}
+```
+
+The fields of interest for multi-writer are:
+
+- `clock`: a "vector clock" to record observed state at the time of writing the
+  entry. Included in every Entry and InflatedEntry.
+- `inflate`: a back-pointer to the entry index of the most recent InflatedEntry
+  (containing a feed metadata change). Included in every Entry and
+  InflatedEntry.
+- `feeds`: list of public keys for each writer's feed. Only included in
+  InflatedEntry, and only when feeds have changed.
+
+When serialized on disk in a SLEEP directory, the original feed is written
+under `./source/`. If the "local" feed is different from the original feed, it
+is written to `./local/`. All other feeds are written under directories
+prefixed `./peers/<feed-discovery-key>/`.
 
-- `seq`: the sequence number of this entry in the owner's hypercore. 0 is the
-  first, 1 the second, and so forth.
-- `feed`: the ID of the hypercore writer that wrote this
-- `clock`: vector clock to determine node insertion causality
+TODO: the above disk format is incorrect?
 
 ## Directed acyclic graph
+[dag]: #dag
 
-The combination of all operations performed on a HyperDB by all of its members
+The combination of all operations performed on a hyperdb by all of its members
 forms a DAG (*directed acyclic graph*). Each write to the database (setting a
 key to a value) includes information to point backward at all of the known
 "heads" in the graph.
 
-To illustrate what this means, let's say Alice starts a new HyperDB and writes 2
+To illustrate what this means, let's say Alice starts a new hyperdb and writes 2
 values to it:
 
 ```
@@ -97,7 +198,7 @@ Alice:  0  <---  1
 Where sequence number 1 (the second entry) refers to sequence number 0 on the
 same feed (Alice's).
 
-Now Alice *authorizes* Bob to write to the HyperDB. Internally, this means Alice
+Now Alice *authorizes* Bob to write to the hyperdb. Internally, this means Alice
 writes a special message to her feed saying that Bob's feed (identified by his
 public key) should be read and replicated in by other participants. Her feed
 becomes
@@ -218,7 +319,11 @@ TODO:
 # Drawbacks
 [drawbacks]: #drawbacks
 
-TODO: Why should we *not* do this?
+Significant increase in implementation, application, and user experience
+complexity.
+
+Two hard problems (merges and secure key distribution) are left to application
+developers.
 
 
 # Rationale and alternatives
@@ -234,11 +339,14 @@ TODO:
 # Unresolved questions
 [unresolved]: #unresolved-questions
 
-TODO:
+What is the technical term for the specific CRDT we are using?
+"Operation-based"?
+
+If there are conflicts resulting in ambiguity of whether a key has been deleted
+or has a new value, does `db.get(key)` return an array of `[new_value, None]`?
 
-- What parts of the design do you expect to resolve through the DEP consensus process before this gets merged?
-- What parts of the design do you expect to resolve through implementation and code review, or are left to independent library or application developers?
-- What related issues do you consider out of scope for this DEP that could be addressed in the future independently of the solution that comes out of this DEP?
+What is a reasonable large number of writers to have in a single database?
+Write "Scaling" section.
 
 
 # Changelog
-- 
cgit v1.2.3


From 33541cbfbcfc2d45fa5b4336b969fbd9181fb8ef Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sat, 9 Jun 2018 18:12:21 -0700
Subject: address early maf comments on PR

---
 proposals/0000-multiwriter.md | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/proposals/0000-multiwriter.md b/proposals/0000-multiwriter.md
index a9b8542..fb60efc 100644
--- a/proposals/0000-multiwriter.md
+++ b/proposals/0000-multiwriter.md
@@ -69,7 +69,8 @@ design and implement:
 - secure key distribution and authentication (eg, if a friend should be given
   write access to a hyperdb database, how is that friend's feed key found and
   verified?)
-- merge conflict resolution, potentially using application-layer semantics
+- merge conflict resolution (using the provided API), potentially using
+  application-layer semantics
 
 Before we go any further, a few definitions:
 
@@ -120,8 +121,14 @@ feed (corresponding to `key`) to be included in the database.
 ## Scaling
 [scaling]: #scaling
 
-TODO: brief note on scaling properties (eg, reasonable numbers of writers per
-database)
+There is some overhead associated with each "writer" added to the feed,
+impacting the number of files on disk, memory use, and the computational cost
+of some lookup oprations. The design should easily accomodate dozens of
+writers, and should scale to 1,000 writers without too much additional
+overhead. Note that a large number of writers also implies a larger number and
+rate of append operations, and additional network connections, which may cause
+scaling issues on their own. More real-world experience and benchmarking is
+needed in this area.
 
 
 # Implementation
@@ -298,7 +305,8 @@ A vector clock on a node of, say, `[0, 2, 5]` means:
 - when this node was written, the largest seq # in the third feed I have is 5
 
 For example, Bob's vector clock for Alice's seq=3 entry above would be `[0, 3]`
-since he knows of her latest entry (seq=3) and his own (seq=0).
+since he knows of her latest entry (seq=3) and his own (seq=0). Note that the
+order of clocks is not consistent across writers, only within the same feed.
 
 The vector clock is used for correctly traversing history. This is necessary for
 the `db#heads` API as well as `db#createHistoryStream`.
@@ -329,6 +337,13 @@ developers.
 # Rationale and alternatives
 [alternatives]: #alternatives
 
+Design goals for hyperdb (including the multi-writer feature) included:
+
+- ability to execute operations (get, put) with a sparse (partial) replication
+  of the database, using as few additional network requests as possible
+- minimal on-disk and on-wire overhead
+- implemented on top of an append-only log (to build on top of hypercore)
+
 TODO:
 
 - Why is this design the best in the space of possible designs?
@@ -344,6 +359,7 @@ What is the technical term for the specific CRDT we are using?
 
 If there are conflicts resulting in ambiguity of whether a key has been deleted
 or has a new value, does `db.get(key)` return an array of `[new_value, None]`?
+Answer: `get` always returns nodes (not just values), so context is included. In the case of a deletion, a the value within the node will be `null`.
 
 What is a reasonable large number of writers to have in a single database?
 Write "Scaling" section.
@@ -356,8 +372,10 @@ As of March 2018, Mathias Buus (@mafintosh) is leading development of a hyperdb
 nodejs module on [github](https://github.com/mafintosh/hyperdb), which includes
 multi-writer features and is the basis for this DEP.
 
+Jim Pick (@jimpick) has been an active contributor working out multi-writer details.
+
 - 2017-12-06: @noffle publishes `ARCHITECTURE.md` overview in the
   [hyperdb github repo][arch_md]
-- 2018-03-XX: First partial draft submitted for review
+- 2018-06-XX: First partial draft submitted for review
 
 [arch_md]: https://github.com/mafintosh/hyperdb/blob/master/ARCHITECTURE.md
-- 
cgit v1.2.3


From 89c41746bc8c3a43651ede387b9ca39fc2ba2a2d Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sat, 9 Jun 2018 22:10:14 -0700
Subject: tweaks, updates, some progress

---
 proposals/0000-multiwriter.md | 65 ++++++++++++++++++++++++++++---------------
 1 file changed, 43 insertions(+), 22 deletions(-)

diff --git a/proposals/0000-multiwriter.md b/proposals/0000-multiwriter.md
index fb60efc..b83606a 100644
--- a/proposals/0000-multiwriter.md
+++ b/proposals/0000-multiwriter.md
@@ -75,7 +75,10 @@ design and implement:
 Before we go any further, a few definitions:
 
 *Feed*: A hypercore feed: an append-only log of *Entries*, which can be
-arbitrary data blobs. Hyperdb is built on top of several Feeds.
+arbitrary data blobs.
+
+*Database*: in this context, a Hyperdb key/value database. Built from several
+Feeds (two Feeds per Writer).
 
 *Writer*: a user (or user controlled device or software agent) that has a
 distinct feed with a public/private key pair, and thus the ability to append
@@ -101,21 +104,33 @@ replicated data type" ([CRDT][crdt]).
 [crdt]: https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type
 
 
-## API
+## Core API
 [api]: #api
 
-The `db.get(key)` method described in the [hyperdb DEP][dep-hyperdb] actually
-returns an array of values. If it is unambiguous what the consistent value is,
-the array will have only that one value. If there is a conflict, multiple
-values will be returned. If multiple values are returned, the writer should
-attempt to merge the values (or chose one over the other) and write the result
-to the database with `db.put(key, value)`.
+A "node" is a data structure representing a database entry, including the
+`key`, `value`, and feed that the entry is commited to.
+
+`db.get(key)` (as described in the [hyperdb DEP][dep-hyperdb])
+returns an array of nodes. If it is unambiguous what the consistent state of a key is,
+the array will have only that one value. If there is a conflict (ambiguity),
+multiple nodes will be returned.
+
+If multiple nodes are returned from a `get`, the writer should attempt to merge
+the values (or chose one over the other) and write the result to the database
+with `db.put(key, value)`. Client libraries can make this process more
+ergonomic by accepting a helper function to select between multiple nodes.
+Libraries can also offer an option to either directly return the value of a
+single node (instead of the node itself), or raise an error; this is likely to
+be more ergonomic for applications which do not intend to support multiple
+writers per database.
 
 `db.authorize(key)` will write metadata to the local feed authorizing the new
 feed (corresponding to `key`) to be included in the database.
 
-<!-- TODO: update link -->
-[dep-hyperdb]: https://github.com/bnewbold/dat-deps/blob/dep-hyperdb/proposals/0000-hyperdb.md
+`db.authorized(key)` (returning a boolean) indicates whether the given `key` is
+an authorized writer to the hyperdb database.
+
+[dep-hyperdb]: https://github.com/datprotocol/DEPs/blob/master/proposals/0004-hyperdb.md
 
 
 ## Scaling
@@ -131,7 +146,7 @@ scaling issues on their own. More real-world experience and benchmarking is
 needed in this area.
 
 
-# Implementation
+# Implementation Details
 [reference-documentation]: #reference-documentation
 
 The complete protobuf schemas for the hyperdb "Entry" and "InflatedEntry"
@@ -142,9 +157,10 @@ message types (as specified in the hyperdb DEP) are:
 message Entry {
   required string key = 1;
   optional bytes value = 2;
-  required bytes trie = 3;
-  repeated uint64 clock = 4;
-  optional uint64 inflate = 5;
+  optional bool deleted = 3;
+  required bytes trie = 4;
+  repeated uint64 clock = 5;
+  optional uint64 inflate = 6;
 }
 
 message InflatedEntry {
@@ -154,11 +170,12 @@ message InflatedEntry {
 
   required string key = 1;
   optional bytes value = 2;
-  required bytes trie = 3;
-  repeated uint64 clock = 4;
-  optional uint64 inflate = 5;
-  repeated Feed feeds = 6;
-  optional bytes contentFeed = 7;
+  optional bool deleted = 3;
+  required bytes trie = 4;
+  repeated uint64 clock = 5;
+  optional uint64 inflate = 6;
+  repeated Feed feeds = 7;
+  optional bytes contentFeed = 8;
 }
 ```
 
@@ -173,9 +190,9 @@ The fields of interest for multi-writer are:
   InflatedEntry, and only when feeds have changed.
 
 When serialized on disk in a SLEEP directory, the original feed is written
-under `./source/`. If the "local" feed is different from the original feed, it
-is written to `./local/`. All other feeds are written under directories
-prefixed `./peers/<feed-discovery-key>/`.
+under `source/`. If the "local" feed is different from the original feed (aka,
+local is not the "Original Writer"), it is written to `local/`. All other feeds
+are written under directories prefixed `./peers/<feed-discovery-key>/`.
 
 TODO: the above disk format is incorrect?
 
@@ -364,6 +381,10 @@ Answer: `get` always returns nodes (not just values), so context is included. In
 What is a reasonable large number of writers to have in a single database?
 Write "Scaling" section.
 
+The javascript hyperdb implementation has a new `deleted` flag and `Header`
+prefix message. These fall under the scope of the `hyperdb` DEP; should they be
+mentioned here?
+
 
 # Changelog
 [changelog]: #changelog
-- 
cgit v1.2.3


From 714563f4e49add95f13f42e8dfb4e34c9b964fd0 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sat, 9 Jun 2018 22:56:32 -0700
Subject: update unresolved, security, usage

---
 proposals/0000-multiwriter.md | 90 +++++++++++++++++++++++++++----------------
 1 file changed, 57 insertions(+), 33 deletions(-)

diff --git a/proposals/0000-multiwriter.md b/proposals/0000-multiwriter.md
index b83606a..07116a2 100644
--- a/proposals/0000-multiwriter.md
+++ b/proposals/0000-multiwriter.md
@@ -113,7 +113,10 @@ A "node" is a data structure representing a database entry, including the
 `db.get(key)` (as described in the [hyperdb DEP][dep-hyperdb])
 returns an array of nodes. If it is unambiguous what the consistent state of a key is,
 the array will have only that one value. If there is a conflict (ambiguity),
-multiple nodes will be returned.
+multiple nodes will be returned. If a key has been deleted from the database, a
+node with the `deleted` flag will be returned; note that such "tombstone" nodes
+can still have a `value` field, which may contain application-specific metadata
+(such as a timestamp).
 
 If multiple nodes are returned from a `get`, the writer should attempt to merge
 the values (or chose one over the other) and write the result to the database
@@ -125,11 +128,14 @@ be more ergonomic for applications which do not intend to support multiple
 writers per database.
 
 `db.authorize(key)` will write metadata to the local feed authorizing the new
-feed (corresponding to `key`) to be included in the database.
+feed (corresponding to `key`) to be included in the database. Once authorized,
+a feed may further authorize additional feeds (recursively).
 
 `db.authorized(key)` (returning a boolean) indicates whether the given `key` is
 an authorized writer to the hyperdb database.
 
+At the time of this DEP there is no mechanism for revoking authorization.
+
 [dep-hyperdb]: https://github.com/datprotocol/DEPs/blob/master/proposals/0004-hyperdb.md
 
 
@@ -192,9 +198,8 @@ The fields of interest for multi-writer are:
 When serialized on disk in a SLEEP directory, the original feed is written
 under `source/`. If the "local" feed is different from the original feed (aka,
 local is not the "Original Writer"), it is written to `local/`. All other feeds
-are written under directories prefixed `./peers/<feed-discovery-key>/`.
+are written under directories prefixed `peers/<feed-discovery-key>/`.
 
-TODO: the above disk format is incorrect?
 
 ## Directed acyclic graph
 [dag]: #dag
@@ -300,14 +305,6 @@ to be read internally in order to locate any key in the database.
 Now there is only one "head": Alice's feed at seq 3.
 
 
-## Authorization
-
-The set of hypercores are *authorized* in that the original author of the first
-hypercore in a hyperdb must explicitly denote in their append-only log that the
-public key of a new hypercore is permitted to edit the database. Any authorized
-member may authorize more members. There is no revocation or other author
-management elements currently.
-
 
 ## Vector clock
 
@@ -338,17 +335,50 @@ TODO:
 # Security and Privacy Concerns
 [privacy]: #privacy
 
-TODO:
+As noted above, there is no existing mechanism for removing authorization for a
+feed once added, and an authorized feed may recursively authorize additional
+feeds. There is also no mechanism to restrict the scope of an authorized feed's
+actions (eg, limit to only a specific path prefix). This leaves application
+designers and users with few tools to control trust or access ("all or
+nothing"). Care must be taken in particular if self-mutating software is being
+distributed via hyperdb, or when action may be taken automatically based on the
+most recent content of a database (eg, bots or even third-party tools may
+publish publicly, or even take real-world action like controlling an electrical
+relay).
+
+There is no mechanism to remove malicious history (or any history for that
+matter); if an authorized (but hostile) writer appends a huge number of key
+operations (bloating hyperdb metadata size), or posts offensive or illegal
+content to a database, there is no way to permanently remove the data without
+creating an new database.
+
+The read semantics of hyperdb are unchanged from hypercore: an actor does not
+need to be "authorized" (for writing) to read the full history of a database,
+they only need the public key.
 
 
 # Drawbacks
 [drawbacks]: #drawbacks
 
-Significant increase in implementation, application, and user experience
-complexity.
-
-Two hard problems (merges and secure key distribution) are left to application
-developers.
+Mutli-writer capability incurs a non-trivial increase in library, application,
+and user experience complexity. For many applications, collaboration is an
+essential feature, and the complexity is easily justified. To minimize
+complexity for applications which do not need multi-writer features,
+implementation authors should consider configuration modes which hide the
+complexity of unused features. For example, by having an option to returning a
+single node for a `get()` (and throw an error if there is a conflict), or a
+flag to throw an error if a database unexpectedly contains more than a single
+feed.
+
+Two tasks (conflict merges and secure key distribution) are left to application
+developers. Both of these are Hard Problems. The current design mitigates the
+former by reducing the number of merge conflicts that need to be handled by an
+application (aka, only the non-trivial ones need to be handled), and
+implementation authors are encouraged to provide an ergonomic API for writing
+conflict resolvers. The second problem (secure key distribution) is out of
+scope for this DEP. It is hoped that at least one pattern or library will
+emerge from the Dat ecosystem such that each application author doesn't need to
+invent a solution from scratch.
 
 
 # Rationale and alternatives
@@ -361,29 +391,23 @@ Design goals for hyperdb (including the multi-writer feature) included:
 - minimal on-disk and on-wire overhead
 - implemented on top of an append-only log (to build on top of hypercore)
 
-TODO:
+If a solution for core use cases like collaboration and multi-device
+synchronization is not provided at a low level (as this DEP provides), each
+application will need to invent a solution at a higher level, incuring
+duplicated effort and a higher risk of bugs.
 
-- Why is this design the best in the space of possible designs?
-- What other designs have been considered and what is the rationale for not choosing them?
-- What is the impact of not doing this?
+As an alternative to CRDTs, Operational Transformation (OT) has a reputation
+for being more difficult to understand and implement.
 
 
 # Unresolved questions
 [unresolved]: #unresolved-questions
 
 What is the technical term for the specific CRDT we are using?
-"Operation-based"?
-
-If there are conflicts resulting in ambiguity of whether a key has been deleted
-or has a new value, does `db.get(key)` return an array of `[new_value, None]`?
-Answer: `get` always returns nodes (not just values), so context is included. In the case of a deletion, a the value within the node will be `null`.
-
-What is a reasonable large number of writers to have in a single database?
-Write "Scaling" section.
+"Operation-based" or "State-based"?
 
-The javascript hyperdb implementation has a new `deleted` flag and `Header`
-prefix message. These fall under the scope of the `hyperdb` DEP; should they be
-mentioned here?
+What is the actual on-disk layout (folder structure), if not what is documented
+here?
 
 
 # Changelog
-- 
cgit v1.2.3


From 3b8215a805048819813f39f9181a4553442f37b4 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 10 Jun 2018 01:18:51 -0700
Subject: update impl reference and example

---
 proposals/0000-multiwriter.md | 244 +++++++++++++++++++++++++++---------------
 1 file changed, 158 insertions(+), 86 deletions(-)

diff --git a/proposals/0000-multiwriter.md b/proposals/0000-multiwriter.md
index 07116a2..ed35df6 100644
--- a/proposals/0000-multiwriter.md
+++ b/proposals/0000-multiwriter.md
@@ -200,136 +200,192 @@ under `source/`. If the "local" feed is different from the original feed (aka,
 local is not the "Original Writer"), it is written to `local/`. All other feeds
 are written under directories prefixed `peers/<feed-discovery-key>/`.
 
+## Feeds and Vector Clocks
+
+At any point in time, each writer has a potentially unique view of the
+"current" state of the database as a whole; this is the nature of real-world
+distributed systems. For example, a given write might have the most recent
+appends from one peer (eg, only seconds old), but be missing much older appends
+from another (eg, days or weeks out of date). By having each writer include
+metadata about their percieved state of the system as a whole in operations to
+their Feed, all writers are able to collectively converge on an "eventually
+consistent" picture of the database as whole (this process will be described in
+the next section).
+
+A writer's "current known state" of the database consists of the set of active
+Feeds, and for each the most recent entry sequence number ("version"). This
+state can be serialized as an array of integers, refered to as a [vector
+clock](https://en.wikipedia.org/wiki/Vector_clock).
+
+Each `put()` operation on the database appends a node to the writer's `local`
+feed, and contains the writer's vector clock as observed at that time.
+`InflatedEntry` nodes also contain a list of all known authorized Feeds;
+inflated nodes only need to be written when the Feed list changes. Every
+non-inflated entry contains a pointer back to the most recent inflated entry;
+inflated entries themselves contain a pointer back to the previous inflated
+entry (the first inflated entry has a null pointer). Elements of a vector clock
+are ordered by the Feed list from the corresponding Inflated entry.
+
+By convention, the order of Feed lists is to start with the writer's local
+feed first, then proceed by the order in which Feeds were discovered. Note that
+this ordering is not consistent across writers, only within the same feed.
+
+As an example, if a node (non-inflated entry) had a vector clock of `[0, 2,
+5]`, that would mean:
+
+- when this node was written, the largest seq # in the writer's local fed was 0
+- when this node was written, the largest seq # in the second known feed was 2
+- when this node was written, the largest seq # in the third known feed was 5
+
+
+## Multi-Feed Aware hyperdb
+[multi-aware]: #multi-aware
+
+The [hyperdb DEP](hyperdb-dep) specifies the procedures for lookup (`get()`)
+and append (`put()`) operations to the database, as well as binary encoding
+schemes for entry messages.
+
+Note that the trie encoding specifies pointers in a `(feed, entry)` pair
+format. The `feed` integer is an index into the most recent Feed list (found in
+the most recent inflated entry; see the last section). When working with a
+multi-writer hyperdb database, simply look up entries in the appropriate feed,
+instead of only looking in the current feed. The next section ("Consistent
+History") describes which entry (or entries) to start with instead of simply
+assuming the most recent entry from the local feed.
+
+
+## Consistent History
+[consist-history]: #consist-history
+
+The set of all appended nodes in all feeds of a hyperdb, and all the vector
+clock pointers between them, forms a "directed acyclic graph" (DAG). Any node
+which does not have another node pointing to it is called a "head" (this
+terminology is similar to that used in git). At any point in time, an observed
+copy of a database has one or more heads, each representing the top of a
+tree-like graph. In the trivial case of a non-multi-writer hyperdb, there is
+always only a single head: the most recent entry in the local feed. Just after
+appending to the local feed, there is also always a single head, because that
+node's vector clock will reference all know most recent entries from other
+feeds. It is only when nodes are appended by separate writers who did not know
+of the others' concurrent action (and then these changes are replicated) that
+there are multiple heads.
+
+When operating on the database (eg, executing a `get()` operation), all heads
+must be considered. The lookup proceedure documented in the [hyperdb
+DEP](hyperdb-dep) must be repeated for each head, and nodes returned
+representing the set of all unique values.
+
+The situation where a `get()` operation multiple heads returns different values
+for the same key is called a "conflict" and requires a "merge" to resolve. Some
+writer (either a human being or application-layer code) must decide on the
+correct value for the key and write that value as a new entry (with a vector
+clock that includes the previous heads). The procedure for chosing the best
+value to use in a given conflict is sometimes easy to determine, but is
+impossible to determine algorithmically in the general case. See the "Usage"
+section for more details.
 
-## Directed acyclic graph
-[dag]: #dag
 
-The combination of all operations performed on a hyperdb by all of its members
-forms a DAG (*directed acyclic graph*). Each write to the database (setting a
-key to a value) includes information to point backward at all of the known
-"heads" in the graph.
+# Examples
+[examples]: #examples
 
-To illustrate what this means, let's say Alice starts a new hyperdb and writes 2
-values to it:
+Let's say Alice starts a new hyperdb and writes two key/value entries to it:
 
 ```
-// Feed
-
-0 (/foo/bar = 'baz')
-1 (/foo/2   = '{ "some": "json" }')
+// Operations
+Alice: db.put('/foo/bar', 'baz')
+Alice: db.put('/foo/2',   '{"json":3}')
 
+// Alice's Feed
+0 (key='/foo/bar', value='baz',
+   vector_clock=[], inflated=null, feeds=['a11ce...']) (InflatedEntry)
+1 (key='/foo/2', value='{"json":3}',
+   vector_clock=[0], inflated=0)
 
 // Graph
-
 Alice:  0  <---  1
 ```
 
-Where sequence number 1 (the second entry) refers to sequence number 0 on the
-same feed (Alice's).
+The vector clock at `seq=1` points back to `seq=0`.
 
-Now Alice *authorizes* Bob to write to the hyperdb. Internally, this means Alice
-writes a special message to her feed saying that Bob's feed (identified by his
-public key) should be read and replicated in by other participants. Her feed
-becomes
+Next Alice *authorizes* Bob to write to the database. Internally, this means Alice
+writes an Inflated entry to her feed that contains Bob's Feed (identified by his
+public key) in her feed list.
 
 ```
-// Feed
-
-0 (/foo/bar = 'baz')
-1 (/foo/2   = '{ "some": "json" }')
-2 (''       = '')
+// Operations
+Alice: db.authorize('b0b123...')
 
+// Alice's Feed
+0 (key='/foo/bar', value='baz',
+   vector_clock=[], inflated=null, feeds=['a11ce...']) (InflatedEntry)
+1 (key='/foo/2', value='{"json":3}',
+   vector_clock=[0], inflated=0)
+2 (key=null, value=null,
+   vector_clock=[1], inflated=0, feeds=['a11ce...', 'b0b123...']) (InflatedEntry)
 
 // Graph
-
 Alice: 0  <---  1  <---  2
 ```
 
-Authorization is formatted internally in a special way so that it isn't
-interpreted as a key/value pair.
-
-Now Bob writes a value to his feed, and then Alice and Bob sync. The result is:
+Bob writes a value to his feed, and then Alice and Bob sync. The result is:
 
 ```
-// Feed
+// Operations
+Bob: db.put('/a/b', '12)
 
-//// Alice
-0 (/foo/bar = 'baz')
-1 (/foo/2   = '{ "some": "json" }')
-2 (''       = '')
-
-//// Bob
-0 (/a/b     = '12')
+// Alice's Feed
+0 (key='/foo/bar', value='baz',
+   vector_clock=[], inflated=null, feeds=['a11ce...']) (InflatedEntry)
+1 (key='/foo/2', value='{"json":3}',
+   vector_clock=[0], inflated=0)
+2 (key=null, value=null,
+   vector_clock=[1], inflated=0, feeds=['a11ce...', 'b0b123...']) (InflatedEntry)
 
+// Bob's Feed
+0 (key='/a/b', value='12',
+   vector_clock=[], inflated=null, feeds=['b0b123...']) (InflatedEntry))
 
 // Graph
-
 Alice: 0  <---  1  <---  2
 Bob  : 0
 ```
 
-Notice that none of Alice's entries refer to Bob's, and vice versa. This is
-because neither has written any entries to their feeds since the two became
-aware of each other (authorized & replicated each other's feeds).
-
+Notice that none of Alice's entries refer to Bob's, and vice versa. Neither has
+written any entries to their feeds since the two became aware of each other.
 Right now there are two "heads" of the graph: Alice's feed at seq 2, and Bob's
-feed at seq 0.
+feed at seq 0. Any `get()` operations would need to descend from both heads,
+though in this situation there would be no conflicts as the keys in the two
+feeds are disjoint.
 
 Next, Alice writes a new value, and her latest entry will refer to Bob's:
 
 ```
-// Feed
+// Operations
+Alice: db.put('/foo/hup', 'beep')
 
-//// Alice
-0 (/foo/bar = 'baz')
-1 (/foo/2   = '{ "some": "json" }')
-2 (''       = '')
-3 (/foo/hup = 'beep')
+// Alice's Feed
+0 (key='/foo/bar', value='baz',
+   vector_clock=[], inflated=null, feeds=['a11ce...']) (InflatedEntry)
+1 (key='/foo/2', value='{"json":3}',
+   vector_clock=[0], inflated=0)
+2 (key=null, value=null,
+   vector_clock=[1, null], inflated=0, feeds=['a11ce...', 'b0b123...']) (InflatedEntry)
+3 (key='/foo/hup', value='beep',
+   vector_clock=[2,0], inflated=2)
 
-//// Bob
-0 (/a/b     = '12')
+// Bob's Feed
+0 (key='/a/b', value='12',
+   vector_clock=[], inflated=null, feeds=['b0b123...']) (InflatedEntry))
 
 
 // Graph
-
 Alice: 0  <---  1  <---  2  <--/  3
 Bob  : 0  <-------------------/
 ```
 
-Because Alice's latest feed entry refers to Bob's latest feed entry, there is
-now only one "head" in the database. That means there is enough information in
-Alice's seq=3 entry to find any other key in the database. In the last example,
-there were two heads (Alice's seq=2 and Bob's seq=0); both of which would need
-to be read internally in order to locate any key in the database.
-
-Now there is only one "head": Alice's feed at seq 3.
-
-
-
-## Vector clock
-
-Each node stores a [vector clock](https://en.wikipedia.org/wiki/Vector_clock) of
-the last known sequence number from each feed it knows about. This is what forms
-the DAG structure.
-
-A vector clock on a node of, say, `[0, 2, 5]` means:
-
-- when this node was written, the largest seq # in my local fed is 0
-- when this node was written, the largest seq # in the second feed I have is 2
-- when this node was written, the largest seq # in the third feed I have is 5
-
-For example, Bob's vector clock for Alice's seq=3 entry above would be `[0, 3]`
-since he knows of her latest entry (seq=3) and his own (seq=0). Note that the
-order of clocks is not consistent across writers, only within the same feed.
-
-The vector clock is used for correctly traversing history. This is necessary for
-the `db#heads` API as well as `db#createHistoryStream`.
-
-
-# Examples
-[examples]: #examples
-
-TODO:
+Alice's latest feed entry now points to Bob's latest feed entry, and there is
+only one "head" in the database. This means that any `get()` operations only
+need to run once, starting at `seq=3` in Alice's feed.
 
 
 # Security and Privacy Concerns
@@ -356,6 +412,10 @@ The read semantics of hyperdb are unchanged from hypercore: an actor does not
 need to be "authorized" (for writing) to read the full history of a database,
 they only need the public key.
 
+As noted in other DEPs, a malicious writer can potentially execute a denial of
+service (DoS) attack by appending hyperdb entries that for a cyclic loop of
+references.
+
 
 # Drawbacks
 [drawbacks]: #drawbacks
@@ -409,6 +469,18 @@ What is the technical term for the specific CRDT we are using?
 What is the actual on-disk layout (folder structure), if not what is documented
 here?
 
+Couldn't the local feed's sequence number be skipped from vector clocks,
+because it's implied by the sequence number of the hypercore entry itself? Same
+with the key in the feed list (for inflated entries).
+
+If there are multiple heads, but they all return the same `value` for a `get()`
+operation, how is it decided which `node` will be returned? AKA, values are the
+same, but node metadata might not be (order of vector clock, etc).
+
+Suspect that some details are off in the example: shouldn't the InflatedEntry
+authorizing a new feed include a vector clock reference to a specific seq in
+that feed? Should new local (not yet authorized) feeds reference reference
+their source in an initial InflatedEntry (at seq=0)?
 
 # Changelog
 [changelog]: #changelog
-- 
cgit v1.2.3


From 0e9bf58695348e6a0409d5f6a08e0e19a9b318df Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Sun, 10 Jun 2018 01:26:36 -0700
Subject: tweak vector_clock and inflated refs in example

---
 proposals/0000-multiwriter.md | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/proposals/0000-multiwriter.md b/proposals/0000-multiwriter.md
index ed35df6..d2e6318 100644
--- a/proposals/0000-multiwriter.md
+++ b/proposals/0000-multiwriter.md
@@ -191,9 +191,12 @@ The fields of interest for multi-writer are:
   entry. Included in every Entry and InflatedEntry.
 - `inflate`: a back-pointer to the entry index of the most recent InflatedEntry
   (containing a feed metadata change). Included in every Entry and
-  InflatedEntry.
+  InflatedEntry. Should not be included for the very first entry in a feed
+  (which is an InflatedEntry).
 - `feeds`: list of public keys for each writer's feed. Only included in
-  InflatedEntry, and only when feeds have changed.
+  InflatedEntry, and only when feeds have changed. Does include a
+  self-reference to the current (local) Feed's key, always as the first
+  element.
 
 When serialized on disk in a SLEEP directory, the original feed is written
 under `source/`. If the "local" feed is different from the original feed (aka,
@@ -297,7 +300,7 @@ Alice: db.put('/foo/2',   '{"json":3}')
 
 // Alice's Feed
 0 (key='/foo/bar', value='baz',
-   vector_clock=[], inflated=null, feeds=['a11ce...']) (InflatedEntry)
+   vector_clock=[0], inflated=null, feeds=['a11ce...']) (InflatedEntry)
 1 (key='/foo/2', value='{"json":3}',
    vector_clock=[0], inflated=0)
 
@@ -317,7 +320,7 @@ Alice: db.authorize('b0b123...')
 
 // Alice's Feed
 0 (key='/foo/bar', value='baz',
-   vector_clock=[], inflated=null, feeds=['a11ce...']) (InflatedEntry)
+   vector_clock=[0], inflated=null, feeds=['a11ce...']) (InflatedEntry)
 1 (key='/foo/2', value='{"json":3}',
    vector_clock=[0], inflated=0)
 2 (key=null, value=null,
@@ -335,7 +338,7 @@ Bob: db.put('/a/b', '12)
 
 // Alice's Feed
 0 (key='/foo/bar', value='baz',
-   vector_clock=[], inflated=null, feeds=['a11ce...']) (InflatedEntry)
+   vector_clock=[0], inflated=null, feeds=['a11ce...']) (InflatedEntry)
 1 (key='/foo/2', value='{"json":3}',
    vector_clock=[0], inflated=0)
 2 (key=null, value=null,
@@ -343,7 +346,7 @@ Bob: db.put('/a/b', '12)
 
 // Bob's Feed
 0 (key='/a/b', value='12',
-   vector_clock=[], inflated=null, feeds=['b0b123...']) (InflatedEntry))
+   vector_clock=[0], inflated=null, feeds=['b0b123...']) (InflatedEntry))
 
 // Graph
 Alice: 0  <---  1  <---  2
@@ -365,7 +368,7 @@ Alice: db.put('/foo/hup', 'beep')
 
 // Alice's Feed
 0 (key='/foo/bar', value='baz',
-   vector_clock=[], inflated=null, feeds=['a11ce...']) (InflatedEntry)
+   vector_clock=[0], inflated=null, feeds=['a11ce...']) (InflatedEntry)
 1 (key='/foo/2', value='{"json":3}',
    vector_clock=[0], inflated=0)
 2 (key=null, value=null,
@@ -375,7 +378,7 @@ Alice: db.put('/foo/hup', 'beep')
 
 // Bob's Feed
 0 (key='/a/b', value='12',
-   vector_clock=[], inflated=null, feeds=['b0b123...']) (InflatedEntry))
+   vector_clock=[0], inflated=null, feeds=['b0b123...']) (InflatedEntry))
 
 
 // Graph
@@ -479,8 +482,9 @@ same, but node metadata might not be (order of vector clock, etc).
 
 Suspect that some details are off in the example: shouldn't the InflatedEntry
 authorizing a new feed include a vector clock reference to a specific seq in
-that feed? Should new local (not yet authorized) feeds reference reference
-their source in an initial InflatedEntry (at seq=0)?
+that feed? Should new local (not yet authorized) feeds reference
+their source in an initial InflatedEntry (eg, Bob at seq=0)? Should the first
+InflatedEntry in a feed point to itself in it's vector clock?
 
 # Changelog
 [changelog]: #changelog
-- 
cgit v1.2.3


From 89504750ce92362fa23e5573753eaa5329b490c3 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Fri, 6 Jul 2018 11:28:19 -0400
Subject: multiwriter: op-based CRDT

---
 proposals/0000-multiwriter.md | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/proposals/0000-multiwriter.md b/proposals/0000-multiwriter.md
index d2e6318..e8f931a 100644
--- a/proposals/0000-multiwriter.md
+++ b/proposals/0000-multiwriter.md
@@ -98,7 +98,8 @@ perspective of that writer at the time they created the entry, in the form of
 "clock" version pointers. This metadata (a "[vector clock][vc]") can be used by
 other writers to resolve (or at least identify) conflicting content in the
 database. The technical term for this type of system is a "Conflict-free
-replicated data type" ([CRDT][crdt]).
+replicated data type" ([CRDT][crdt]), and specifically an "Operation-based" (as
+opposed to "State-based") CRDT.
 
 [vc]: https://en.wikipedia.org/wiki/Vector_clock
 [crdt]: https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type
@@ -466,9 +467,6 @@ for being more difficult to understand and implement.
 # Unresolved questions
 [unresolved]: #unresolved-questions
 
-What is the technical term for the specific CRDT we are using?
-"Operation-based" or "State-based"?
-
 What is the actual on-disk layout (folder structure), if not what is documented
 here?
 
-- 
cgit v1.2.3


From a81c069501ab3c99d8129fde9e4f97958f62fe5f Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Fri, 6 Jul 2018 11:30:04 -0400
Subject: multiwriter: review feedback

Thanks @mafintosh, @pfrazee
---
 proposals/0000-multiwriter.md | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/proposals/0000-multiwriter.md b/proposals/0000-multiwriter.md
index e8f931a..8264786 100644
--- a/proposals/0000-multiwriter.md
+++ b/proposals/0000-multiwriter.md
@@ -112,12 +112,15 @@ A "node" is a data structure representing a database entry, including the
 `key`, `value`, and feed that the entry is commited to.
 
 `db.get(key)` (as described in the [hyperdb DEP][dep-hyperdb])
-returns an array of nodes. If it is unambiguous what the consistent state of a key is,
-the array will have only that one value. If there is a conflict (ambiguity),
-multiple nodes will be returned. If a key has been deleted from the database, a
-node with the `deleted` flag will be returned; note that such "tombstone" nodes
-can still have a `value` field, which may contain application-specific metadata
-(such as a timestamp).
+returns an array of nodes. If it is unambiguous what the consistent state of a
+key is, the array will have only that one value. If there is a conflict
+(ambiguity), multiple nodes will be returned. If a key has unambiguously been
+removed from the database, a "null" or empty datatype is returned. If one
+branch of a conflict has a deletion (but at least one of the others does not),
+a node with the `deleted` flag will be returned; note that such "tombstone"
+nodes can still have a `value` field, which may contain application-specific
+metadata (such as a self-reported timestamp), which may help resolve the
+conflict.
 
 If multiple nodes are returned from a `get`, the writer should attempt to merge
 the values (or chose one over the other) and write the result to the database
@@ -470,9 +473,10 @@ for being more difficult to understand and implement.
 What is the actual on-disk layout (folder structure), if not what is documented
 here?
 
-Couldn't the local feed's sequence number be skipped from vector clocks,
-because it's implied by the sequence number of the hypercore entry itself? Same
-with the key in the feed list (for inflated entries).
+The local feed's sequence number could skipped from vector clocks, because it's
+implied by the sequence number of the hypercore entry itself. Same with the key
+in the feed list (for inflated entries). In both cases, the redundant data is
+retained for simplicity.
 
 If there are multiple heads, but they all return the same `value` for a `get()`
 operation, how is it decided which `node` will be returned? AKA, values are the
-- 
cgit v1.2.3


From eb1334949faed0238d5251a0794067caa643b713 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Fri, 6 Jul 2018 11:35:38 -0400
Subject: multiwriter on-disk folder structure

---
 proposals/0000-multiwriter.md | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/proposals/0000-multiwriter.md b/proposals/0000-multiwriter.md
index 8264786..eb25c3d 100644
--- a/proposals/0000-multiwriter.md
+++ b/proposals/0000-multiwriter.md
@@ -202,10 +202,15 @@ The fields of interest for multi-writer are:
   self-reference to the current (local) Feed's key, always as the first
   element.
 
-When serialized on disk in a SLEEP directory, the original feed is written
-under `source/`. If the "local" feed is different from the original feed (aka,
-local is not the "Original Writer"), it is written to `local/`. All other feeds
-are written under directories prefixed `peers/<feed-discovery-key>/`.
+When serialized on disk in a SLEEP directory:
+
+- `source/`: the original feed, as created or cloned by this writer
+- `local/`: if "local" feed is different from "source", it goes here
+- `peers/<discover-key>/`: all other writers' feed go under this directory (the
+  discovery key is lower-case hex-encoded)
+- `content/<discovery-key>/`: if a higher-level protocol is being used that
+  uses multiple linked hypercore feeds (eg, hyperdrive), the linked "content"
+  feeds all go under this directory
 
 ## Feeds and Vector Clocks
 
-- 
cgit v1.2.3


From 96eb0d32363deaa085bc133db1b6fda25cff2b55 Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Fri, 6 Jul 2018 13:39:56 -0400
Subject: multiwriter as DEP-0008

---
 proposals/0000-multiwriter.md | 509 -----------------------------------------
 proposals/0008-multiwriter.md | 512 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 512 insertions(+), 509 deletions(-)
 delete mode 100644 proposals/0000-multiwriter.md
 create mode 100644 proposals/0008-multiwriter.md

diff --git a/proposals/0000-multiwriter.md b/proposals/0000-multiwriter.md
deleted file mode 100644
index eb25c3d..0000000
--- a/proposals/0000-multiwriter.md
+++ /dev/null
@@ -1,509 +0,0 @@
-
-Title: **DEP-0000: Multi-Writer**
-
-Short Name: `0000-multiwriter`
-
-Type: Standard
-
-Status: Undefined (as of 2018-03-XX)
-
-Github PR: [Draft](https://github.com/datprotocol/DEPs/pull/10)
-
-Authors:
-[Bryan Newbold](https://github.com/bnewbold),
-[Stephen Whitmore](https://github.com/noffle),
-[Mathias Buus](https://github.com/mafintosh)
-
-
-# Summary
-[summary]: #summary
-
-Multi-Writer is a set of schema, API, and feature extentions to allow multiple
-agents (users, devices, or software) to write to the same hyperdb database. By
-building on top of this abstraction layer, future versions of hyperdrive and
-Dat will gain these features.
-
-Mechanisms for distributed consistency and granting trust are specified here;
-the need for merge conflict algorithms and secure key distribution are
-mentioned but specific solutions are not specified.
-
-This DEP forms the second half of the hyperdb specification; the first half
-covered only the key/value database aspects of hyperdb.
-
-
-# Motivation
-[motivation]: #motivation
-
-The current hypercore/Dat ecosystem currently lacks solutions for two
-fundamental use cases:
-
-- individual users should be able to modify distributed archives under their
-  control from multiple devices, at a minimum to prevent loss of control of
-  content if a single device (containing secret keys) is lost
-- contributions from and collaboration between multiple users on a single
-  archive or database should be possible, with appropriate trust and access
-  control semantics
-
-Access to a single secret key is currently required to make any change to a
-hypercore feed, and it is broadly considered best practice not to distribute
-secret keys between multiple users or multiple devices. In fact, the current
-hypercore implementation has no mechanism to resolve disputes or recover if
-multiple agents used the same secret key to append to the same feed.
-
-Solutions to these two use cases are seen as essential for many current and
-future Dat ecosystem applications.
-
-
-# Concepts, Behavior, and Usage
-[usage-documentation]: #usage-documentation
-
-The multi-writer features of hyperdb are implemented by storing and replicating
-the contributions of each writer in a separate hypercore feed. This
-specification concerns itself with the details of how changes from multiple
-feeds (which may be written and replicated concurrently or asynchronously) are
-securely combined to present a unified key/value interface.
-
-The following related concerns are explicitly left to application developers to
-design and implement:
-
-- secure key distribution and authentication (eg, if a friend should be given
-  write access to a hyperdb database, how is that friend's feed key found and
-  verified?)
-- merge conflict resolution (using the provided API), potentially using
-  application-layer semantics
-
-Before we go any further, a few definitions:
-
-*Feed*: A hypercore feed: an append-only log of *Entries*, which can be
-arbitrary data blobs.
-
-*Database*: in this context, a Hyperdb key/value database. Built from several
-Feeds (two Feeds per Writer).
-
-*Writer*: a user (or user controlled device or software agent) that has a
-distinct feed with a public/private key pair, and thus the ability to append
-hypercore entries and "write" changes to their version of the database.
-
-*Original Writer*: the writer who created a given hyperdb database in the form
-of the *Original Feed*. The public key of the original feed is the one used to
-reference the database as a collection of feeds (eg, for the purpose of
-discovery).
-
-At a high level, multi-writer hyperdb works by having existing authorized
-writers (starting with the original writer) include authorization of new
-writers by appending metadata to their own feed which points to the new feeds
-(by public key). Each entry in each writer's feed contains "clock" metadata
-that records the known state of the entire database (all writers) seen from the
-perspective of that writer at the time they created the entry, in the form of
-"clock" version pointers. This metadata (a "[vector clock][vc]") can be used by
-other writers to resolve (or at least identify) conflicting content in the
-database. The technical term for this type of system is a "Conflict-free
-replicated data type" ([CRDT][crdt]), and specifically an "Operation-based" (as
-opposed to "State-based") CRDT.
-
-[vc]: https://en.wikipedia.org/wiki/Vector_clock
-[crdt]: https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type
-
-
-## Core API
-[api]: #api
-
-A "node" is a data structure representing a database entry, including the
-`key`, `value`, and feed that the entry is commited to.
-
-`db.get(key)` (as described in the [hyperdb DEP][dep-hyperdb])
-returns an array of nodes. If it is unambiguous what the consistent state of a
-key is, the array will have only that one value. If there is a conflict
-(ambiguity), multiple nodes will be returned. If a key has unambiguously been
-removed from the database, a "null" or empty datatype is returned. If one
-branch of a conflict has a deletion (but at least one of the others does not),
-a node with the `deleted` flag will be returned; note that such "tombstone"
-nodes can still have a `value` field, which may contain application-specific
-metadata (such as a self-reported timestamp), which may help resolve the
-conflict.
-
-If multiple nodes are returned from a `get`, the writer should attempt to merge
-the values (or chose one over the other) and write the result to the database
-with `db.put(key, value)`. Client libraries can make this process more
-ergonomic by accepting a helper function to select between multiple nodes.
-Libraries can also offer an option to either directly return the value of a
-single node (instead of the node itself), or raise an error; this is likely to
-be more ergonomic for applications which do not intend to support multiple
-writers per database.
-
-`db.authorize(key)` will write metadata to the local feed authorizing the new
-feed (corresponding to `key`) to be included in the database. Once authorized,
-a feed may further authorize additional feeds (recursively).
-
-`db.authorized(key)` (returning a boolean) indicates whether the given `key` is
-an authorized writer to the hyperdb database.
-
-At the time of this DEP there is no mechanism for revoking authorization.
-
-[dep-hyperdb]: https://github.com/datprotocol/DEPs/blob/master/proposals/0004-hyperdb.md
-
-
-## Scaling
-[scaling]: #scaling
-
-There is some overhead associated with each "writer" added to the feed,
-impacting the number of files on disk, memory use, and the computational cost
-of some lookup oprations. The design should easily accomodate dozens of
-writers, and should scale to 1,000 writers without too much additional
-overhead. Note that a large number of writers also implies a larger number and
-rate of append operations, and additional network connections, which may cause
-scaling issues on their own. More real-world experience and benchmarking is
-needed in this area.
-
-
-# Implementation Details
-[reference-documentation]: #reference-documentation
-
-The complete protobuf schemas for the hyperdb "Entry" and "InflatedEntry"
-message types (as specified in the hyperdb DEP) are:
-
-
-```
-message Entry {
-  required string key = 1;
-  optional bytes value = 2;
-  optional bool deleted = 3;
-  required bytes trie = 4;
-  repeated uint64 clock = 5;
-  optional uint64 inflate = 6;
-}
-
-message InflatedEntry {
-  message Feed {
-    required bytes key = 1;
-  }
-
-  required string key = 1;
-  optional bytes value = 2;
-  optional bool deleted = 3;
-  required bytes trie = 4;
-  repeated uint64 clock = 5;
-  optional uint64 inflate = 6;
-  repeated Feed feeds = 7;
-  optional bytes contentFeed = 8;
-}
-```
-
-The fields of interest for multi-writer are:
-
-- `clock`: a "vector clock" to record observed state at the time of writing the
-  entry. Included in every Entry and InflatedEntry.
-- `inflate`: a back-pointer to the entry index of the most recent InflatedEntry
-  (containing a feed metadata change). Included in every Entry and
-  InflatedEntry. Should not be included for the very first entry in a feed
-  (which is an InflatedEntry).
-- `feeds`: list of public keys for each writer's feed. Only included in
-  InflatedEntry, and only when feeds have changed. Does include a
-  self-reference to the current (local) Feed's key, always as the first
-  element.
-
-When serialized on disk in a SLEEP directory:
-
-- `source/`: the original feed, as created or cloned by this writer
-- `local/`: if "local" feed is different from "source", it goes here
-- `peers/<discover-key>/`: all other writers' feed go under this directory (the
-  discovery key is lower-case hex-encoded)
-- `content/<discovery-key>/`: if a higher-level protocol is being used that
-  uses multiple linked hypercore feeds (eg, hyperdrive), the linked "content"
-  feeds all go under this directory
-
-## Feeds and Vector Clocks
-
-At any point in time, each writer has a potentially unique view of the
-"current" state of the database as a whole; this is the nature of real-world
-distributed systems. For example, a given write might have the most recent
-appends from one peer (eg, only seconds old), but be missing much older appends
-from another (eg, days or weeks out of date). By having each writer include
-metadata about their percieved state of the system as a whole in operations to
-their Feed, all writers are able to collectively converge on an "eventually
-consistent" picture of the database as whole (this process will be described in
-the next section).
-
-A writer's "current known state" of the database consists of the set of active
-Feeds, and for each the most recent entry sequence number ("version"). This
-state can be serialized as an array of integers, refered to as a [vector
-clock](https://en.wikipedia.org/wiki/Vector_clock).
-
-Each `put()` operation on the database appends a node to the writer's `local`
-feed, and contains the writer's vector clock as observed at that time.
-`InflatedEntry` nodes also contain a list of all known authorized Feeds;
-inflated nodes only need to be written when the Feed list changes. Every
-non-inflated entry contains a pointer back to the most recent inflated entry;
-inflated entries themselves contain a pointer back to the previous inflated
-entry (the first inflated entry has a null pointer). Elements of a vector clock
-are ordered by the Feed list from the corresponding Inflated entry.
-
-By convention, the order of Feed lists is to start with the writer's local
-feed first, then proceed by the order in which Feeds were discovered. Note that
-this ordering is not consistent across writers, only within the same feed.
-
-As an example, if a node (non-inflated entry) had a vector clock of `[0, 2,
-5]`, that would mean:
-
-- when this node was written, the largest seq # in the writer's local fed was 0
-- when this node was written, the largest seq # in the second known feed was 2
-- when this node was written, the largest seq # in the third known feed was 5
-
-
-## Multi-Feed Aware hyperdb
-[multi-aware]: #multi-aware
-
-The [hyperdb DEP](hyperdb-dep) specifies the procedures for lookup (`get()`)
-and append (`put()`) operations to the database, as well as binary encoding
-schemes for entry messages.
-
-Note that the trie encoding specifies pointers in a `(feed, entry)` pair
-format. The `feed` integer is an index into the most recent Feed list (found in
-the most recent inflated entry; see the last section). When working with a
-multi-writer hyperdb database, simply look up entries in the appropriate feed,
-instead of only looking in the current feed. The next section ("Consistent
-History") describes which entry (or entries) to start with instead of simply
-assuming the most recent entry from the local feed.
-
-
-## Consistent History
-[consist-history]: #consist-history
-
-The set of all appended nodes in all feeds of a hyperdb, and all the vector
-clock pointers between them, forms a "directed acyclic graph" (DAG). Any node
-which does not have another node pointing to it is called a "head" (this
-terminology is similar to that used in git). At any point in time, an observed
-copy of a database has one or more heads, each representing the top of a
-tree-like graph. In the trivial case of a non-multi-writer hyperdb, there is
-always only a single head: the most recent entry in the local feed. Just after
-appending to the local feed, there is also always a single head, because that
-node's vector clock will reference all know most recent entries from other
-feeds. It is only when nodes are appended by separate writers who did not know
-of the others' concurrent action (and then these changes are replicated) that
-there are multiple heads.
-
-When operating on the database (eg, executing a `get()` operation), all heads
-must be considered. The lookup proceedure documented in the [hyperdb
-DEP](hyperdb-dep) must be repeated for each head, and nodes returned
-representing the set of all unique values.
-
-The situation where a `get()` operation multiple heads returns different values
-for the same key is called a "conflict" and requires a "merge" to resolve. Some
-writer (either a human being or application-layer code) must decide on the
-correct value for the key and write that value as a new entry (with a vector
-clock that includes the previous heads). The procedure for chosing the best
-value to use in a given conflict is sometimes easy to determine, but is
-impossible to determine algorithmically in the general case. See the "Usage"
-section for more details.
-
-
-# Examples
-[examples]: #examples
-
-Let's say Alice starts a new hyperdb and writes two key/value entries to it:
-
-```
-// Operations
-Alice: db.put('/foo/bar', 'baz')
-Alice: db.put('/foo/2',   '{"json":3}')
-
-// Alice's Feed
-0 (key='/foo/bar', value='baz',
-   vector_clock=[0], inflated=null, feeds=['a11ce...']) (InflatedEntry)
-1 (key='/foo/2', value='{"json":3}',
-   vector_clock=[0], inflated=0)
-
-// Graph
-Alice:  0  <---  1
-```
-
-The vector clock at `seq=1` points back to `seq=0`.
-
-Next Alice *authorizes* Bob to write to the database. Internally, this means Alice
-writes an Inflated entry to her feed that contains Bob's Feed (identified by his
-public key) in her feed list.
-
-```
-// Operations
-Alice: db.authorize('b0b123...')
-
-// Alice's Feed
-0 (key='/foo/bar', value='baz',
-   vector_clock=[0], inflated=null, feeds=['a11ce...']) (InflatedEntry)
-1 (key='/foo/2', value='{"json":3}',
-   vector_clock=[0], inflated=0)
-2 (key=null, value=null,
-   vector_clock=[1], inflated=0, feeds=['a11ce...', 'b0b123...']) (InflatedEntry)
-
-// Graph
-Alice: 0  <---  1  <---  2
-```
-
-Bob writes a value to his feed, and then Alice and Bob sync. The result is:
-
-```
-// Operations
-Bob: db.put('/a/b', '12)
-
-// Alice's Feed
-0 (key='/foo/bar', value='baz',
-   vector_clock=[0], inflated=null, feeds=['a11ce...']) (InflatedEntry)
-1 (key='/foo/2', value='{"json":3}',
-   vector_clock=[0], inflated=0)
-2 (key=null, value=null,
-   vector_clock=[1], inflated=0, feeds=['a11ce...', 'b0b123...']) (InflatedEntry)
-
-// Bob's Feed
-0 (key='/a/b', value='12',
-   vector_clock=[0], inflated=null, feeds=['b0b123...']) (InflatedEntry))
-
-// Graph
-Alice: 0  <---  1  <---  2
-Bob  : 0
-```
-
-Notice that none of Alice's entries refer to Bob's, and vice versa. Neither has
-written any entries to their feeds since the two became aware of each other.
-Right now there are two "heads" of the graph: Alice's feed at seq 2, and Bob's
-feed at seq 0. Any `get()` operations would need to descend from both heads,
-though in this situation there would be no conflicts as the keys in the two
-feeds are disjoint.
-
-Next, Alice writes a new value, and her latest entry will refer to Bob's:
-
-```
-// Operations
-Alice: db.put('/foo/hup', 'beep')
-
-// Alice's Feed
-0 (key='/foo/bar', value='baz',
-   vector_clock=[0], inflated=null, feeds=['a11ce...']) (InflatedEntry)
-1 (key='/foo/2', value='{"json":3}',
-   vector_clock=[0], inflated=0)
-2 (key=null, value=null,
-   vector_clock=[1, null], inflated=0, feeds=['a11ce...', 'b0b123...']) (InflatedEntry)
-3 (key='/foo/hup', value='beep',
-   vector_clock=[2,0], inflated=2)
-
-// Bob's Feed
-0 (key='/a/b', value='12',
-   vector_clock=[0], inflated=null, feeds=['b0b123...']) (InflatedEntry))
-
-
-// Graph
-Alice: 0  <---  1  <---  2  <--/  3
-Bob  : 0  <-------------------/
-```
-
-Alice's latest feed entry now points to Bob's latest feed entry, and there is
-only one "head" in the database. This means that any `get()` operations only
-need to run once, starting at `seq=3` in Alice's feed.
-
-
-# Security and Privacy Concerns
-[privacy]: #privacy
-
-As noted above, there is no existing mechanism for removing authorization for a
-feed once added, and an authorized feed may recursively authorize additional
-feeds. There is also no mechanism to restrict the scope of an authorized feed's
-actions (eg, limit to only a specific path prefix). This leaves application
-designers and users with few tools to control trust or access ("all or
-nothing"). Care must be taken in particular if self-mutating software is being
-distributed via hyperdb, or when action may be taken automatically based on the
-most recent content of a database (eg, bots or even third-party tools may
-publish publicly, or even take real-world action like controlling an electrical
-relay).
-
-There is no mechanism to remove malicious history (or any history for that
-matter); if an authorized (but hostile) writer appends a huge number of key
-operations (bloating hyperdb metadata size), or posts offensive or illegal
-content to a database, there is no way to permanently remove the data without
-creating an new database.
-
-The read semantics of hyperdb are unchanged from hypercore: an actor does not
-need to be "authorized" (for writing) to read the full history of a database,
-they only need the public key.
-
-As noted in other DEPs, a malicious writer can potentially execute a denial of
-service (DoS) attack by appending hyperdb entries that for a cyclic loop of
-references.
-
-
-# Drawbacks
-[drawbacks]: #drawbacks
-
-Mutli-writer capability incurs a non-trivial increase in library, application,
-and user experience complexity. For many applications, collaboration is an
-essential feature, and the complexity is easily justified. To minimize
-complexity for applications which do not need multi-writer features,
-implementation authors should consider configuration modes which hide the
-complexity of unused features. For example, by having an option to returning a
-single node for a `get()` (and throw an error if there is a conflict), or a
-flag to throw an error if a database unexpectedly contains more than a single
-feed.
-
-Two tasks (conflict merges and secure key distribution) are left to application
-developers. Both of these are Hard Problems. The current design mitigates the
-former by reducing the number of merge conflicts that need to be handled by an
-application (aka, only the non-trivial ones need to be handled), and
-implementation authors are encouraged to provide an ergonomic API for writing
-conflict resolvers. The second problem (secure key distribution) is out of
-scope for this DEP. It is hoped that at least one pattern or library will
-emerge from the Dat ecosystem such that each application author doesn't need to
-invent a solution from scratch.
-
-
-# Rationale and alternatives
-[alternatives]: #alternatives
-
-Design goals for hyperdb (including the multi-writer feature) included:
-
-- ability to execute operations (get, put) with a sparse (partial) replication
-  of the database, using as few additional network requests as possible
-- minimal on-disk and on-wire overhead
-- implemented on top of an append-only log (to build on top of hypercore)
-
-If a solution for core use cases like collaboration and multi-device
-synchronization is not provided at a low level (as this DEP provides), each
-application will need to invent a solution at a higher level, incuring
-duplicated effort and a higher risk of bugs.
-
-As an alternative to CRDTs, Operational Transformation (OT) has a reputation
-for being more difficult to understand and implement.
-
-
-# Unresolved questions
-[unresolved]: #unresolved-questions
-
-What is the actual on-disk layout (folder structure), if not what is documented
-here?
-
-The local feed's sequence number could skipped from vector clocks, because it's
-implied by the sequence number of the hypercore entry itself. Same with the key
-in the feed list (for inflated entries). In both cases, the redundant data is
-retained for simplicity.
-
-If there are multiple heads, but they all return the same `value` for a `get()`
-operation, how is it decided which `node` will be returned? AKA, values are the
-same, but node metadata might not be (order of vector clock, etc).
-
-Suspect that some details are off in the example: shouldn't the InflatedEntry
-authorizing a new feed include a vector clock reference to a specific seq in
-that feed? Should new local (not yet authorized) feeds reference
-their source in an initial InflatedEntry (eg, Bob at seq=0)? Should the first
-InflatedEntry in a feed point to itself in it's vector clock?
-
-# Changelog
-[changelog]: #changelog
-
-As of March 2018, Mathias Buus (@mafintosh) is leading development of a hyperdb
-nodejs module on [github](https://github.com/mafintosh/hyperdb), which includes
-multi-writer features and is the basis for this DEP.
-
-Jim Pick (@jimpick) has been an active contributor working out multi-writer details.
-
-- 2017-12-06: @noffle publishes `ARCHITECTURE.md` overview in the
-  [hyperdb github repo][arch_md]
-- 2018-06-XX: First partial draft submitted for review
-
-[arch_md]: https://github.com/mafintosh/hyperdb/blob/master/ARCHITECTURE.md
diff --git a/proposals/0008-multiwriter.md b/proposals/0008-multiwriter.md
new file mode 100644
index 0000000..8047ca4
--- /dev/null
+++ b/proposals/0008-multiwriter.md
@@ -0,0 +1,512 @@
+
+Title: **DEP-0008: Multi-Writer**
+
+Short Name: `0008-multiwriter`
+
+Type: Standard
+
+Status: Draft (as of 2018-07-06)
+
+Github PR: [Draft](https://github.com/datprotocol/DEPs/pull/10)
+
+Authors:
+[Bryan Newbold](https://github.com/bnewbold),
+[Stephen Whitmore](https://github.com/noffle),
+[Mathias Buus](https://github.com/mafintosh)
+
+
+# Summary
+[summary]: #summary
+
+Multi-Writer is a set of schema, API, and feature extentions to allow multiple
+agents (users, devices, or software) to write to the same hyperdb database. By
+building on top of this abstraction layer, future versions of hyperdrive and
+Dat will gain these features.
+
+Mechanisms for distributed consistency and granting trust are specified here;
+the need for merge conflict algorithms and secure key distribution are
+mentioned but specific solutions are not specified.
+
+This DEP forms the second half of the hyperdb specification; the first half
+covered only the key/value database aspects of hyperdb.
+
+
+# Motivation
+[motivation]: #motivation
+
+The current hypercore/Dat ecosystem currently lacks solutions for two
+fundamental use cases:
+
+- individual users should be able to modify distributed archives under their
+  control from multiple devices, at a minimum to prevent loss of control of
+  content if a single device (containing secret keys) is lost
+- contributions from and collaboration between multiple users on a single
+  archive or database should be possible, with appropriate trust and access
+  control semantics
+
+Access to a single secret key is currently required to make any change to a
+hypercore feed, and it is broadly considered best practice not to distribute
+secret keys between multiple users or multiple devices. In fact, the current
+hypercore implementation has no mechanism to resolve disputes or recover if
+multiple agents used the same secret key to append to the same feed.
+
+Solutions to these two use cases are seen as essential for many current and
+future Dat ecosystem applications.
+
+
+# Concepts, Behavior, and Usage
+[usage-documentation]: #usage-documentation
+
+The multi-writer features of hyperdb are implemented by storing and replicating
+the contributions of each writer in a separate hypercore feed. This
+specification concerns itself with the details of how changes from multiple
+feeds (which may be written and replicated concurrently or asynchronously) are
+securely combined to present a unified key/value interface.
+
+The following related concerns are explicitly left to application developers to
+design and implement:
+
+- secure key distribution and authentication (eg, if a friend should be given
+  write access to a hyperdb database, how is that friend's feed key found and
+  verified?)
+- merge conflict resolution (using the provided API), potentially using
+  application-layer semantics
+
+Before we go any further, a few definitions:
+
+*Feed*: A hypercore feed: an append-only log of *Entries*, which can be
+arbitrary data blobs.
+
+*Database*: in this context, a Hyperdb key/value database. Built from several
+Feeds (two Feeds per Writer).
+
+*Writer*: a user (or user controlled device or software agent) that has a
+distinct feed with a public/private key pair, and thus the ability to append
+hypercore entries and "write" changes to their version of the database.
+
+*Original Writer*: the writer who created a given hyperdb database in the form
+of the *Original Feed*. The public key of the original feed is the one used to
+reference the database as a collection of feeds (eg, for the purpose of
+discovery).
+
+At a high level, multi-writer hyperdb works by having existing authorized
+writers (starting with the original writer) include authorization of new
+writers by appending metadata to their own feed which points to the new feeds
+(by public key). Each entry in each writer's feed contains "clock" metadata
+that records the known state of the entire database (all writers) seen from the
+perspective of that writer at the time they created the entry, in the form of
+"clock" version pointers. This metadata (a "[vector clock][vc]") can be used by
+other writers to resolve (or at least identify) conflicting content in the
+database. The technical term for this type of system is a "Conflict-free
+replicated data type" ([CRDT][crdt]), and specifically an "Operation-based" (as
+opposed to "State-based") CRDT.
+
+[vc]: https://en.wikipedia.org/wiki/Vector_clock
+[crdt]: https://en.wikipedia.org/wiki/Conflict-free_replicated_data_type
+
+
+## Core API
+[api]: #api
+
+A "node" is a data structure representing a database entry, including the
+`key`, `value`, and feed that the entry is commited to.
+
+`db.get(key)` (as described in the [hyperdb DEP][dep-hyperdb])
+returns an array of nodes. If it is unambiguous what the consistent state of a
+key is, the array will have only that one value. If there is a conflict
+(ambiguity), multiple nodes will be returned. If a key has unambiguously been
+removed from the database, a "null" or empty datatype is returned. If one
+branch of a conflict has a deletion (but at least one of the others does not),
+a node with the `deleted` flag will be returned; note that such "tombstone"
+nodes can still have a `value` field, which may contain application-specific
+metadata (such as a self-reported timestamp), which may help resolve the
+conflict.
+
+If multiple nodes are returned from a `get`, the writer should attempt to merge
+the values (or chose one over the other) and write the result to the database
+with `db.put(key, value)`. Client libraries can make this process more
+ergonomic by accepting a helper function to select between multiple nodes.
+Libraries can also offer an option to either directly return the value of a
+single node (instead of the node itself), or raise an error; this is likely to
+be more ergonomic for applications which do not intend to support multiple
+writers per database.
+
+`db.authorize(key)` will write metadata to the local feed authorizing the new
+feed (corresponding to `key`) to be included in the database. Once authorized,
+a feed may further authorize additional feeds (recursively).
+
+`db.authorized(key)` (returning a boolean) indicates whether the given `key` is
+an authorized writer to the hyperdb database.
+
+At the time of this DEP there is no mechanism for revoking authorization.
+
+[dep-hyperdb]: https://github.com/datprotocol/DEPs/blob/master/proposals/0004-hyperdb.md
+
+
+## Scaling
+[scaling]: #scaling
+
+There is some overhead associated with each "writer" added to the feed,
+impacting the number of files on disk, memory use, and the computational cost
+of some lookup oprations. The design should easily accomodate dozens of
+writers, and should scale to 1,000 writers without too much additional
+overhead. Note that a large number of writers also implies a larger number and
+rate of append operations, and additional network connections, which may cause
+scaling issues on their own. More real-world experience and benchmarking is
+needed in this area.
+
+
+# Implementation Details
+[reference-documentation]: #reference-documentation
+
+The complete protobuf schemas for the hyperdb "Entry" and "InflatedEntry"
+message types (as specified in the hyperdb DEP) are:
+
+
+```
+message Entry {
+  required string key = 1;
+  optional bytes value = 2;
+  optional bool deleted = 3;
+  required bytes trie = 4;
+  repeated uint64 clock = 5;
+  optional uint64 inflate = 6;
+}
+
+message InflatedEntry {
+  message Feed {
+    required bytes key = 1;
+  }
+
+  required string key = 1;
+  optional bytes value = 2;
+  optional bool deleted = 3;
+  required bytes trie = 4;
+  repeated uint64 clock = 5;
+  optional uint64 inflate = 6;
+  repeated Feed feeds = 7;
+  optional bytes contentFeed = 8;
+}
+```
+
+The fields of interest for multi-writer are:
+
+- `clock`: a "vector clock" to record observed state at the time of writing the
+  entry. Included in every Entry and InflatedEntry.
+- `inflate`: a back-pointer to the entry index of the most recent InflatedEntry
+  (containing a feed metadata change). Included in every Entry and
+  InflatedEntry. Should not be included for the very first entry in a feed
+  (which is an InflatedEntry).
+- `feeds`: list of public keys for each writer's feed. Only included in
+  InflatedEntry, and only when feeds have changed. Does include a
+  self-reference to the current (local) Feed's key, always as the first
+  element.
+
+When serialized on disk in a SLEEP directory:
+
+- `source/`: the original feed, as created or cloned by this writer
+- `local/`: if "local" feed is different from "source", it goes here
+- `peers/<discover-key>/`: all other writers' feed go under this directory (the
+  discovery key is lower-case hex-encoded)
+- `content/<discovery-key>/`: if a higher-level protocol is being used that
+  uses multiple linked hypercore feeds (eg, hyperdrive), the linked "content"
+  feeds all go under this directory
+
+## Feeds and Vector Clocks
+
+At any point in time, each writer has a potentially unique view of the
+"current" state of the database as a whole; this is the nature of real-world
+distributed systems. For example, a given write might have the most recent
+appends from one peer (eg, only seconds old), but be missing much older appends
+from another (eg, days or weeks out of date). By having each writer include
+metadata about their percieved state of the system as a whole in operations to
+their Feed, all writers are able to collectively converge on an "eventually
+consistent" picture of the database as whole (this process will be described in
+the next section).
+
+A writer's "current known state" of the database consists of the set of active
+Feeds, and for each the most recent entry sequence number ("version"). This
+state can be serialized as an array of integers, refered to as a [vector
+clock](https://en.wikipedia.org/wiki/Vector_clock).
+
+Each `put()` operation on the database appends a node to the writer's `local`
+feed, and contains the writer's vector clock as observed at that time.
+`InflatedEntry` nodes also contain a list of all known authorized Feeds;
+inflated nodes only need to be written when the Feed list changes. Every
+non-inflated entry contains a pointer back to the most recent inflated entry;
+inflated entries themselves contain a pointer back to the previous inflated
+entry (the first inflated entry has a null pointer). Elements of a vector clock
+are ordered by the Feed list from the corresponding Inflated entry.
+
+By convention, the order of Feed lists is to start with the writer's local
+feed first, then proceed by the order in which Feeds were discovered. Note that
+this ordering is not consistent across writers, only within the same feed.
+
+As an example, if a node (non-inflated entry) had a vector clock of `[0, 2,
+5]`, that would mean:
+
+- when this node was written, the largest seq # in the writer's local fed was 0
+- when this node was written, the largest seq # in the second known feed was 2
+- when this node was written, the largest seq # in the third known feed was 5
+
+
+## Multi-Feed Aware hyperdb
+[multi-aware]: #multi-aware
+
+The [hyperdb DEP](hyperdb-dep) specifies the procedures for lookup (`get()`)
+and append (`put()`) operations to the database, as well as binary encoding
+schemes for entry messages.
+
+Note that the trie encoding specifies pointers in a `(feed, entry)` pair
+format. The `feed` integer is an index into the most recent Feed list (found in
+the most recent inflated entry; see the last section). When working with a
+multi-writer hyperdb database, simply look up entries in the appropriate feed,
+instead of only looking in the current feed. The next section ("Consistent
+History") describes which entry (or entries) to start with instead of simply
+assuming the most recent entry from the local feed.
+
+
+## Consistent History
+[consist-history]: #consist-history
+
+The set of all appended nodes in all feeds of a hyperdb, and all the vector
+clock pointers between them, forms a "directed acyclic graph" (DAG). Any node
+which does not have another node pointing to it is called a "head" (this
+terminology is similar to that used in git). At any point in time, an observed
+copy of a database has one or more heads, each representing the top of a
+tree-like graph. In the trivial case of a non-multi-writer hyperdb, there is
+always only a single head: the most recent entry in the local feed. Just after
+appending to the local feed, there is also always a single head, because that
+node's vector clock will reference all know most recent entries from other
+feeds. It is only when nodes are appended by separate writers who did not know
+of the others' concurrent action (and then these changes are replicated) that
+there are multiple heads.
+
+When operating on the database (eg, executing a `get()` operation), all heads
+must be considered. The lookup proceedure documented in the [hyperdb
+DEP](hyperdb-dep) must be repeated for each head, and nodes returned
+representing the set of all unique values.
+
+The situation where a `get()` operation multiple heads returns different values
+for the same key is called a "conflict" and requires a "merge" to resolve. Some
+writer (either a human being or application-layer code) must decide on the
+correct value for the key and write that value as a new entry (with a vector
+clock that includes the previous heads). The procedure for chosing the best
+value to use in a given conflict is sometimes easy to determine, but is
+impossible to determine algorithmically in the general case. See the "Usage"
+section for more details.
+
+
+# Examples
+[examples]: #examples
+
+Let's say Alice starts a new hyperdb and writes two key/value entries to it:
+
+```
+// Operations
+Alice: db.put('/foo/bar', 'baz')
+Alice: db.put('/foo/2',   '{"json":3}')
+
+// Alice's Feed
+0 (key='/foo/bar', value='baz',
+   vector_clock=[0], inflated=null, feeds=['a11ce...']) (InflatedEntry)
+1 (key='/foo/2', value='{"json":3}',
+   vector_clock=[0], inflated=0)
+
+// Graph
+Alice:  0  <---  1
+```
+
+The vector clock at `seq=1` points back to `seq=0`.
+
+Next Alice *authorizes* Bob to write to the database. Internally, this means Alice
+writes an Inflated entry to her feed that contains Bob's Feed (identified by his
+public key) in her feed list.
+
+```
+// Operations
+Alice: db.authorize('b0b123...')
+
+// Alice's Feed
+0 (key='/foo/bar', value='baz',
+   vector_clock=[0], inflated=null, feeds=['a11ce...']) (InflatedEntry)
+1 (key='/foo/2', value='{"json":3}',
+   vector_clock=[0], inflated=0)
+2 (key=null, value=null,
+   vector_clock=[1], inflated=0, feeds=['a11ce...', 'b0b123...']) (InflatedEntry)
+
+// Graph
+Alice: 0  <---  1  <---  2
+```
+
+Bob writes a value to his feed, and then Alice and Bob sync. The result is:
+
+```
+// Operations
+Bob: db.put('/a/b', '12)
+
+// Alice's Feed
+0 (key='/foo/bar', value='baz',
+   vector_clock=[0], inflated=null, feeds=['a11ce...']) (InflatedEntry)
+1 (key='/foo/2', value='{"json":3}',
+   vector_clock=[0], inflated=0)
+2 (key=null, value=null,
+   vector_clock=[1], inflated=0, feeds=['a11ce...', 'b0b123...']) (InflatedEntry)
+
+// Bob's Feed
+0 (key='/a/b', value='12',
+   vector_clock=[0], inflated=null, feeds=['b0b123...']) (InflatedEntry))
+
+// Graph
+Alice: 0  <---  1  <---  2
+Bob  : 0
+```
+
+Notice that none of Alice's entries refer to Bob's, and vice versa. Neither has
+written any entries to their feeds since the two became aware of each other.
+Right now there are two "heads" of the graph: Alice's feed at seq 2, and Bob's
+feed at seq 0. Any `get()` operations would need to descend from both heads,
+though in this situation there would be no conflicts as the keys in the two
+feeds are disjoint.
+
+Next, Alice writes a new value, and her latest entry will refer to Bob's:
+
+```
+// Operations
+Alice: db.put('/foo/hup', 'beep')
+
+// Alice's Feed
+0 (key='/foo/bar', value='baz',
+   vector_clock=[0], inflated=null, feeds=['a11ce...']) (InflatedEntry)
+1 (key='/foo/2', value='{"json":3}',
+   vector_clock=[0], inflated=0)
+2 (key=null, value=null,
+   vector_clock=[1, null], inflated=0, feeds=['a11ce...', 'b0b123...']) (InflatedEntry)
+3 (key='/foo/hup', value='beep',
+   vector_clock=[2,0], inflated=2)
+
+// Bob's Feed
+0 (key='/a/b', value='12',
+   vector_clock=[0], inflated=null, feeds=['b0b123...']) (InflatedEntry))
+
+
+// Graph
+Alice: 0  <---  1  <---  2  <--/  3
+Bob  : 0  <-------------------/
+```
+
+Alice's latest feed entry now points to Bob's latest feed entry, and there is
+only one "head" in the database. This means that any `get()` operations only
+need to run once, starting at `seq=3` in Alice's feed.
+
+
+# Security and Privacy Concerns
+[privacy]: #privacy
+
+As noted above, there is no existing mechanism for removing authorization for a
+feed once added, and an authorized feed may recursively authorize additional
+feeds. There is also no mechanism to restrict the scope of an authorized feed's
+actions (eg, limit to only a specific path prefix). This leaves application
+designers and users with few tools to control trust or access ("all or
+nothing"). Care must be taken in particular if self-mutating software is being
+distributed via hyperdb, or when action may be taken automatically based on the
+most recent content of a database (eg, bots or even third-party tools may
+publish publicly, or even take real-world action like controlling an electrical
+relay).
+
+There is no mechanism to remove malicious history (or any history for that
+matter); if an authorized (but hostile) writer appends a huge number of key
+operations (bloating hyperdb metadata size), or posts offensive or illegal
+content to a database, there is no way to permanently remove the data without
+creating an new database.
+
+The read semantics of hyperdb are unchanged from hypercore: an actor does not
+need to be "authorized" (for writing) to read the full history of a database,
+they only need the public key.
+
+As noted in other DEPs, a malicious writer can potentially execute a denial of
+service (DoS) attack by appending hyperdb entries that for a cyclic loop of
+references.
+
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+Mutli-writer capability incurs a non-trivial increase in library, application,
+and user experience complexity. For many applications, collaboration is an
+essential feature, and the complexity is easily justified. To minimize
+complexity for applications which do not need multi-writer features,
+implementation authors should consider configuration modes which hide the
+complexity of unused features. For example, by having an option to returning a
+single node for a `get()` (and throw an error if there is a conflict), or a
+flag to throw an error if a database unexpectedly contains more than a single
+feed.
+
+Two tasks (conflict merges and secure key distribution) are left to application
+developers. Both of these are Hard Problems. The current design mitigates the
+former by reducing the number of merge conflicts that need to be handled by an
+application (aka, only the non-trivial ones need to be handled), and
+implementation authors are encouraged to provide an ergonomic API for writing
+conflict resolvers. The second problem (secure key distribution) is out of
+scope for this DEP. It is hoped that at least one pattern or library will
+emerge from the Dat ecosystem such that each application author doesn't need to
+invent a solution from scratch.
+
+
+# Rationale and alternatives
+[alternatives]: #alternatives
+
+Design goals for hyperdb (including the multi-writer feature) included:
+
+- ability to execute operations (get, put) with a sparse (partial) replication
+  of the database, using as few additional network requests as possible
+- minimal on-disk and on-wire overhead
+- implemented on top of an append-only log (to build on top of hypercore)
+
+If a solution for core use cases like collaboration and multi-device
+synchronization is not provided at a low level (as this DEP provides), each
+application will need to invent a solution at a higher level, incuring
+duplicated effort and a higher risk of bugs.
+
+As an alternative to CRDTs, Operational Transformation (OT) has a reputation
+for being more difficult to understand and implement.
+
+
+# Unresolved questions
+[unresolved]: #unresolved-questions
+
+What is the actual on-disk layout (folder structure), if not what is documented
+here?
+
+The local feed's sequence number could skipped from vector clocks, because it's
+implied by the sequence number of the hypercore entry itself. Same with the key
+in the feed list (for inflated entries). In both cases, the redundant data is
+retained for simplicity.
+
+If there are multiple heads, but they all return the same `value` for a `get()`
+operation, how is it decided which `node` will be returned? AKA, values are the
+same, but node metadata might not be (order of vector clock, etc).
+
+Suspect that some details are off in the example: shouldn't the InflatedEntry
+authorizing a new feed include a vector clock reference to a specific seq in
+that feed? Should new local (not yet authorized) feeds reference
+their source in an initial InflatedEntry (eg, Bob at seq=0)? Should the first
+InflatedEntry in a feed point to itself in it's vector clock?
+
+# Changelog
+[changelog]: #changelog
+
+As of March 2018, Mathias Buus (@mafintosh) is leading development of a hyperdb
+nodejs module on [github](https://github.com/mafintosh/hyperdb), which includes
+multi-writer features and is the basis for this DEP.
+
+Jim Pick (@jimpick) has been an active contributor working out multi-writer details.
+
+- 2017-12-06: @noffle publishes `ARCHITECTURE.md` overview in the
+  [hyperdb github repo][arch_md]
+- 2018-05-02: First round of public review
+- 2018-05-23: hyperdb 3.0.0 node.js implementation released
+- 2018-06-10: Second draft submitted for review
+- 2018-07-06: Accepted with Draft status (after edits)
+
+[arch_md]: https://github.com/mafintosh/hyperdb/blob/master/ARCHITECTURE.md
-- 
cgit v1.2.3