From 6570f5f3139cd2f917da01f6ce20cde257f66801 Mon Sep 17 00:00:00 2001 From: Bryan Newbold Date: Sat, 3 Mar 2018 14:28:42 -0800 Subject: hyperdb progress --- proposals/0000-hyperdb.md | 142 +++++++++++++++++++++++++++++----------------- 1 file changed, 91 insertions(+), 51 deletions(-) diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md index e81fc59..2a0656e 100644 --- a/proposals/0000-hyperdb.md +++ b/proposals/0000-hyperdb.md @@ -5,11 +5,11 @@ Short Name: `0000-hyperdb` Type: Standard -Status: Undefined (as of 2018-02-XX) +Status: Undefined (as of 2018-03-XX) Github PR: (add HTTPS link here after PR is opened) -Authors: noffle (Stephen Whitmore), bnewbold (Bryan Newbold) +Authors: [Stephen Whitmore](https://github.com/noffle), [Bryan Newbold](https://github.com/bnewbold) # Summary @@ -36,9 +36,11 @@ directories. This is a real-world bottleneck for several current users, with basic local actions such as adding a directory taking an unacceptably long time to complete. -A secondary benefit is to refactor the trie-structured key/value API out of -hyperdrive, allowing third party code to build applications directly on this -abstraction layer. +A secondary benefit is to refactor the [trie][trie]-structured key/value API +out of hyperdrive, allowing third party code to build applications directly on +this abstraction layer. + +[trie]: https://en.wikipedia.org/wiki/Trie # Usage Documentation @@ -56,8 +58,10 @@ for changes on `/foo/bar` will report both changes to `/foo/bar/baz` and also Lower-level details of the hypercore append-only log, disk serialization, and networked synchronization features that HyperDB builds on top of are not -described in detail here. Multi-writer hypercore details are also not discussed -in this DEP. References to formal documentation of these details are TODO. +described in detail here; see the [DEP repository][deps]. Multi-writer +hypercore semantics are also not discussed in this DEP. + +[deps]: https://github.com/datprotocol/DEPs A HyperDB database instance can be represented by a single hypercore feed (or several feeds in a multi-writer context), and is named, referenced, and @@ -67,7 +71,10 @@ single node (holding the secret key) can mutate the database (eg, via `put` or `delete` actions). **Keys** can be any UTF-8 string. Path segments are separated by the forward -slash character (`/`). +slash character (`/`). Repeated slashes (`//`) are disallowed. Leading and +trailing `/` are optional in application code: `/hello` and `hello` are +equivalent. A key can be both a "path segment" and key at the same time; eg, +`/a/b/c` and `/a/b` can both be keys at the same time. **Values** can be any binary blob, including empty (of zero length). For example, values could be UTF-8 encoded strings, JSON encoded objects, protobuf @@ -75,48 +82,53 @@ messages, or a raw `uint64` integer (of either endian-ness). Length is the only form of type or metadata stored about the value; deserialization and validation are left to library and application developers. -TODO: key corner-cases: -- `///` same as `//`? `///a` vs. `//a`? - -TODO: size limits on keys and values? - -TODO: "prefix" end in a trailing slash, or optional? - -A leading `/` is optional: `db.get('/hello')` and `db.get('hello')` are -equivalent. - ## Core API Semantics +A `db` is instantiated by opening an existing hypercore with hyperdb content +(read-only, or optionally read-write if the secret key is available), or +creating a new one. A handle could represent any specific revision in history, +or the "latest" revision. + `db.put(key, value)`: inserts `value` (arbitrary bytes) under the path `key`. -Returns an error (eg, via callback) if there was a problem. +Requires read-write access. Returns an error (eg, via callback) if there was a +problem. -`db.get(key)`: Reading a non-existant `key` is an error. +`db.get(key)`: Reading a non-existant `key` is an error. Read-only. `db.delete(key)`: Removes the key from the database. Deleting a non-existant -key is an error. +key is an error. Requires read-write access. `db.list(prefix)`: returns a flat (not nested) list of all keys currently in -the database. - -TODO: in what order? +the database under the given prefix. Prefixes operate on a path-segment basis: +`/ab` is not a valid prefix for key `/abcd`, but is valid for `/ab/cd`. If the +prefix does not exist, returns an empty list. The order of returned keys is +implementation (or configuration) specific. Read-only. -TODO: what if node is a file, not a prefix? +If the hypercore underlying a hyperdb is only partially replicated, behavior is +implementation-specific. For example, a `get()` call could block until the +relevant value is replicated, or the implementation could return an error. An example pseudo-code session working with a database might be: -TODO: - -TODO: Read Stream - -TODO: Changes Stream - -TODO: Diff Stream - + db.put('/life/animal/mammal/kitten', '{"cuteness": 500.3}') + db.put('/life/plant/bush/bananna', '{"delicious": 103.4}') + db.delete('/life/plant/bush/bananna') + db.put('/life/plant/tree/bananna', '{"delicious": 103.4}') + db.get('/life/animal/mammal/kitten') + => {"cuteness": 500.3} + db.list('/life/') + => ['/life/animal/mammal/kitten', '/life/plant/tree/bananna'] # Reference Documentation [reference-documentation]: #reference-documentation -The new Node protobuf message schema is: +A HyperDB hypercore feed typically consists of a sequence of entries, all of +which are protobuf-encoded `Node` messages. Higher-level protocols may make +exception to this: for example, hyperdrive reserves the first entry of the +(`metadata`) feed for a special entry that refers to the second (`content`) +feed. + +The `Node` protobuf message schema is: ``` message Node { @@ -124,10 +136,18 @@ The new Node protobuf message schema is: optional bytes value = 2; repeated uint64 clock = 3; optional bytes trie = 4; - optional uint64 feedSeq = 6; // TODO remove and merge into index (trie+feedSeq) + optional uint64 feedSeq = 6; } ``` +TODO(mafintosh): where is this schema actually defined for the `next` branch of +the hyperdb repo? + +`key` and `value` are as expected. A non-existant `value` indicates that this +message is a deletion (see below). The `clock` element is reserved for use in +the forthcoming `multi-writer` standard, not described here. An empty list is +the safe (and expected) value for `clock` in single-writer use cases. + ## Incremental Index HyperDB builds an *incremental index* with every new key/value pairs ("nodes") @@ -139,7 +159,7 @@ Each node stores the following basic information: - `key`: the key that is being created or modified. e.g. `/home/sww/dev.md` - `value`: the value stored at that key. -- `seq`: the sequence number of this entry in the hypercore. 0 is the first, 1 +- `feedSeq`: the sequence number of this entry in the hypercore. 0 is the first, 1 the second, and so forth. - `path`: a 2-bit hash sequence of the key's components - `trie`: a navigation structure used with `path` to find a desired key @@ -233,6 +253,14 @@ This is how finding a node works, starting at any other node: Repeat until you reach step 4 (match) or there is no entry in the node's trie for the key you're after (no match). +# Examples +[examples]: #examples + +TODO: lookup a "deep" key + +TODO: deleting a key + +TODO: listing a prefix # Drawbacks [drawbacks]: #drawbacks @@ -246,24 +274,16 @@ For the specific use case of Dat archives, HyperDB will trivially increase metadata size (and thus disk and network consumption) for archives with few files. + # Rationale and alternatives [alternatives]: #alternatives -TODO: +TODO: scaling issues with existing system; handling many entries -- Why is this design the best in the space of possible designs? -- What other designs have been considered and what is the rationale for not choosing them? -- What is the impact of not doing this? +TODO: why multi-writer fields included here -# Unresolved questions -[unresolved]: #unresolved-questions -TODO: -- What parts of the design do you expect to resolve through the DEP consensus process before this gets merged? -- What parts of the design do you expect to resolve through implementation and code review, or are left to independent library or application developers? -- What related issues do you consider out of scope for this DEP that could be addressed in the future independently of the solution that comes out of this DEP? - -# Dat Migration logistics +# Dat migration logistics [migration]: #migration HyperDB is not backwards compatible with the existing hyperdrive metadata, @@ -280,12 +300,32 @@ Upgrading a Dat (hyperdrive) archive to HyperDB will necessitate creating a new feed from scratch, meaning new public/private key pairs, and that public key URL links will need to change. + +# Unresolved questions +[unresolved]: #unresolved-questions + +Details of keys to work out (before Draft status): + +- must a prefix end in a trailing slash, or optional? +- can there be, eg, keys `/a/b` and `/a/b/c` at the same time? + +There are implied "reasonable" limits on the size (in bytes) of both keys and +values, but they are not formally specified. Protobuf messages have a hard +specified limit of 2 GByte (due to 32-bit signed arthimetic), and most +implementations impose a (configurable) 64 MByte limit. Should this DEP impose +specific limits on key and value sizes? Would be good to decide before Draft +status. + +Apart from leaving fields in the protobuf message specification, multi-writer +concerns are out of scope for this DEP. + + # Changelog [changelog]: #changelog -As of January 2018, @mafintosh is leading development of a hyperdb nodejs -module on [github](https://github.com/mafintosh/hyperdb), which is the basis -for this DEP. +As of March 2018, @mafintosh is leading development of a hyperdb nodejs module +on [github](https://github.com/mafintosh/hyperdb), which is the basis for this +DEP. - 2017-12-06: @noffle publishes `ARCHITECTURE.md` overview in the [hyperdb github repo][arch_md] -- cgit v1.2.3