From 639fa24f0e1f95a2a1d9f5c4f920a08fd3541d9a Mon Sep 17 00:00:00 2001
From: Bryan Newbold
Date: Sun, 18 Mar 2018 15:34:25 -0700
Subject: start refactoring for 'version 3' hyperdb changes

Will need copy editing, and there are a number of new changes that need to be
understood and integrated.
---
 proposals/0000-hyperdb.md | 344 ++++++++++++++++++++++++++--------------------
 1 file changed, 196 insertions(+), 148 deletions(-)

(limited to 'proposals')

diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 1f2866f..3e4fb07 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -19,10 +19,11 @@ Authors:
 [summary]: #summary
 
 HyperDB is an abstraction layer providing a general purpose distributed
-key/value store over the hypercore protocol. It is an iteration on the hyperdrive
-directory tree implementation, building top of the hypercore append-only log
-abstraction layer. Keys are path-like strings (eg, `/food/fruit/kiwi`), and
-values are arbitrary binary blobs (generally under a megabyte).
+key/value store over the hypercore protocol. It is an iteration on the
+hyperdrive directory tree implementation, building on top of the hypercore
+append-only log abstraction layer. Keys are path-like strings (eg,
+`/food/fruit/kiwi`), and values are arbitrary binary blobs (generally under a
+megabyte).
 
 Hyperdrive (used by the Dat application) is expected to be re-implemented on
 top of HyperDB for improved performance with many files (eg, millions). The
@@ -128,65 +129,82 @@ An example pseudo-code session working with a database might be:
 # Reference Documentation
 [reference-documentation]: #reference-documentation
 
-A HyperDB hypercore feed typically consists of a sequence of entries, all of
-which are protobuf-encoded Node messages. Higher-level protocols may make
-exception to this: for example, hyperdrive reserves the first entry of the
-(`metadata`) feed for a special entry that refers to the second (`content`)
-feed.
+A HyperDB hypercore feed typically consists of a sequence of protobuf-encoded
+messages of either "Entry" or "InflatedEntry" type. Higher-level protocols may
+make exception to this, for example by prepending an application-specific
+metadata message as the first entry in the feed. There is sometimes a second
+"content" feed associated with the primary hyperdb key/value feed, to store
+data that does not fit within the (limited) `value` size constraint.
 
 The sequence of entries includes an incremental index: the most recent entry
 in the feed contains metadata pointers that can be followed to efficiently
 look up any key in the database without needing to linear scan the entire
 history or
-generate an independent index data structure. Of course implementations are
-free to maintain such an index if they prefer.
-
-The Node protobuf message schema is:
-
-```
-    message Node {
-      optional string key = 1;
-      optional bytes value = 2;
-      repeated uint64 clock = 3;
-      optional bytes trie = 4; // TODO: actually a {feed, seq} pair
-      optional bytes path = 5; // TODO: actual type?
-      optional uint64 seq = 6;
-      optional bytes feed = 7; // TODO: actual type?
-    }
-```
-
-TODO(mafintosh): where is this schema actually defined for the `next` branch of
-the hyperdb repo?
+generate an independent index data structure. Implementations are, of course,
+free to maintain their own index if they prefer.
+
+The Entry and InflatedEntry protobuf message schemas are:
+
+    message Entry {
+      required string key = 1;
+      optional bytes value = 2;
+      required bytes trie = 3;
+      repeated uint64 clock = 4;
+      optional uint64 inflate = 5;
+    }
+
+    message InflatedEntry {
+      message Feed {
+        required bytes key = 1;
+      }
+
+      required string key = 1;
+      optional bytes value = 2;
+      required bytes trie = 3;
+      repeated uint64 clock = 4;
+      optional uint64 inflate = 5;
+      repeated Feed feeds = 6;
+      optional bytes contentFeed = 7;
+    }
+
+Some fields are specific to the multi-writer features described in their own
+DEP and mentioned only partially here. The fields common to both message
+types are:
 
 - `key`: UTF-8 key that this node describes. Leading and trailing forward
   slashes (`/`) are always stripped before storing in protobuf.
 - `value`: arbitrary byte array. A non-existent `value` entry indicates that
-  this Node indicates a deletion of the key; this is distinct from specifying
+  this Entry marks a deletion of the key; this is distinct from specifying
   an empty (zero-length) value.
-- `clock`: reserved for use in the forthcoming `multi-writer` standard, not
-  described here. An empty list is the safe (and expected) value for `clock` in
-  single-writer use cases.
-- `trie`: a structured array of pointers to other Node entries in the feed,
+- `trie`: a structured array of pointers to other Entries in the feed,
   used for navigating the tree of keys.
-- `path`: a 2-bit hash sequence of `key`'s components.
-- `seq`: the zero-indexed sequence number of this Node (hyperdb) entry in
-  the feed. Note that this may be offset from the hypercore entry index if
-  there are any leading non-hyperdb entries.
-- `feed`: reserved for use with `multi-writer`. The safe single-writer value is
+- `clock`: reserved for use in the forthcoming `multi-writer` standard. An
+  empty list is the safe (and expected) value for `clock` in single-writer use
+  cases.
+- `inflate`: TODO(mafintosh)
+
+The fields specific to InflatedEntry are:
+
+- `feeds`: reserved for use with `multi-writer`. The safe single-writer value is
   to use the current feed's hypercore public key.
+- `contentFeed`: for applications which require a parallel "content" hypercore
+  feed for larger data, this field can be used to store the 32-byte public key
+  for that feed.
 
-TODO(mafintosh): should `feed` actually be the current hypercore publickey?
+For the case of a single-writer feed, not using multi-writer features, it is
+sufficient to write a single InflatedEntry message as the first entry in the
+hypercore feed, with `feeds` containing a single entry (a pointer to the
+current feed itself), and `contentFeed` optionally set to a pointer to a paired
+content feed.
 
 ## Path Hashing
 [path_hashing]: #path_hashing
 
 Every key path has a component-wise fixed-size hash representation that is used
-by the trie. This is written to the `path` field each Node protobuf
-message. The concatenation of all path hashes yields a hash for the entire key.
-Note that analagously to a hash map data structure, there can be more than one
-key (string) with the same key hash in the same database with no problems: the
-hash points to a bucket of keys, which can be iterated over linearly to find
-the correct value.
+by the trie. The concatenation of all path hashes yields a "path hash array"
+for the entire key.
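+
+As a rough sketch of this construction (in python3, the same language used
+for the "expand" helper later in this section): the component hash function
+is deliberately left as a placeholder here (`component_hash`, returning 8
+bytes), and the least-significant-bits-first expansion order and terminating
+element are inferred from the worked examples below:
+
+```python
+def path_hash_array(key):
+    # Hash each path component to 8 bytes, expand every byte into four
+    # 2-bit values (least-significant bits first), then append the
+    # terminating element (4).
+    out = []
+    for component in key.strip('/').split('/'):
+        for byte in component_hash(component):  # placeholder: 8-byte hash
+            for shift in (0, 2, 4, 6):
+                out.append((byte >> shift) & 3)
+    out.append(4)
+    return out
+```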
+Note that analogously to a hash map data structure, there can be more than
+one key (string) with the same key hash in the same database with no
+problems: the hash points to a linked-list "bucket" of Entries, which can be
+iterated over linearly to find the correct value.
 
 The path hash is represented by an array of bytes. Elements are 2-bit encoded
 (values 0, 1, 2, 3), except for an optional terminating element which has value
@@ -212,26 +230,34 @@ the terminator (`4`) will have the highest byte index.
 
 For example, consider the key `/tree/willow`. `tree` has a hash `[0xAC, 0xDC,
 0x05, 0x6C, 0x63, 0x9D, 0x87, 0xCA]`, which converts into the array:
 
-    [ 0, 3, 2, 2, 0, 3, 1, 3, 1, 1, 0, 0, 0, 3, 2, 1, 3, 0, 2, 1, 1, 3, 1, 2, 3, 1, 0, 2, 2, 2, 0, 3 ]
+```
+[ 0, 3, 2, 2, 0, 3, 1, 3, 1, 1, 0, 0, 0, 3, 2, 1, 3, 0, 2, 1, 1, 3, 1, 2, 3, 1, 0, 2, 2, 2, 0, 3 ]
+```
 
 `willow` has a 64-bit hash `[0x72, 0x30, 0x34, 0x39, 0x35, 0xA8, 0x21, 0x44]`,
 which converts into the array:
 
-    [ 2, 0, 3, 1, 0, 0, 3, 0, 0, 1, 3, 0, 1, 2, 3, 0, 1, 1, 3, 0, 0, 2, 2, 2, 1, 0, 2, 0, 0, 1, 0, 1 ]
+```
+[ 2, 0, 3, 1, 0, 0, 3, 0, 0, 1, 3, 0, 1, 2, 3, 0, 1, 1, 3, 0, 0, 2, 2, 2, 1, 0, 2, 0, 0, 1, 0, 1 ]
+```
 
 These combine into the unified byte array with 65 elements:
 
-    [ 0, 3, 2, 2, 0, 3, 1, 3, 1, 1, 0, 0, 0, 3, 2, 1, 3, 0, 2, 1, 1, 3, 1, 2, 3, 1, 0, 2, 2, 2, 0, 3,
-      2, 0, 3, 1, 0, 0, 3, 0, 0, 1, 3, 0, 1, 2, 3, 0, 1, 1, 3, 0, 0, 2, 2, 2, 1, 0, 2, 0, 0, 1, 0, 1,
-      4 ]
+```
+[ 0, 3, 2, 2, 0, 3, 1, 3, 1, 1, 0, 0, 0, 3, 2, 1, 3, 0, 2, 1, 1, 3, 1, 2, 3, 1, 0, 2, 2, 2, 0, 3,
+  2, 0, 3, 1, 0, 0, 3, 0, 0, 1, 3, 0, 1, 2, 3, 0, 1, 1, 3, 0, 0, 2, 2, 2, 1, 0, 2, 0, 0, 1, 0, 1,
+  4 ]
+```
 
 As another example, the key `/a/b/c` converts into the 97-byte hash array:
 
-    [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
-      0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3,
-      0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
-      4 ]
+```
+[ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+  0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3,
+  0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
+  4 ]
+```
 
 Note that "hash collisions" are rare with this hashing scheme, but are likely
 to occur with large databases (millions of keys), and collisions have been
@@ -277,190 +303,216 @@ Then, to manually "expand" arrays in python3:
 
 Each node stores a *prefix [trie](https://en.wikipedia.org/wiki/Trie)* that
 can be used to look up other keys, or to list all keys with a given prefix.
-This is stored under the `trie` field of the Node protobuf message.
+This is stored under the `trie` field of the Entry protobuf message.
 
-The trie effectively mirrors the `path` hash array. Each element in the `trie`
-points to the newest Node which has an identical path up to that specific
+The trie effectively mirrors the path hash array. Each element in the `trie`
+points to the newest Entry which has an identical path up to that specific
 prefix location. Elements can be null; any trailing null elements can be left
 blank.
 
-The data structure of the trie is a sparse array of pointers to other Node
-entries. Each pointer indicates a feed index and a sequence number; for the
-non-multi-writer case, the feed index is always 0, so we consider only the
-sequence number (aka, entry index).
+The data structure of the trie is a sparse array of pointers to other
+Entries. Each pointer indicates a feed index and an entry index; for the
+non-multi-writer case, the feed index is always 0, so we consider only the
+entry index.
 
 To look up a key in the database, the recipe is to:
 
-1. Calculate the `path` array for the key you are looking for.
-2. Select the most-recent ("latest") Node for the feed.
-3. Compare path arrays. If the paths match exactly, compare keys; they match,
+1. Calculate the path hash array for the key you are looking for.
+2. Select the most-recent ("latest") Entry for the feed.
+3. Compare path hash arrays. If the paths match exactly, compare keys; if they match,
   you have found the Entry you were looking for! Check whether the `value` is
-   defined or not; if not, this Node represents that the key was deleted from
+   defined or not; if not, this Entry represents that the key was deleted from
   the database.
4. If the paths match, but not the keys, look for a pointer in the last `trie`
-   array index, and iterate from step #3 with the new Node.
+   array index, and iterate from step #3 with the new Entry.
5. If the paths don't entirely match, find the first index at which the two
-   arrays differ, and look up the corresponding element in this Node's `trie`
+   arrays differ, and look up the corresponding element in this Entry's `trie`
   array. If the element is empty, then your key does not exist in this
   hyperdb.
6. If the trie element is not empty, then follow that pointer to select the
-   next `Node`. Recursively repeat this process from step #3; you will be
-   descending the `trie` in a search, and will either terminate in the Node you
+   next Entry. Recursively repeat this process from step #3; you will be
+   descending the `trie` in a search, and will either terminate in the Entry you
   are looking for, or find that the key is not defined in this hyperdb.
 
 Similarly, to write a key to the database:
 
-1. Calculate the `path` array for the key, and start with an empty `trie` of
+1. Calculate the path hash array for the key, and start with an empty `trie` of
   the same length; you will write to the `trie` array from the current index,
   which starts at 0.
-2. Select the most-recent ("latest") Node for the feed.
-3. Compare path arrays. If the paths match exactly, and the keys match, then
-   you are overwriting the current Node, and can copy the "remainder" of it's
+2. Select the most-recent ("latest") Entry for the feed.
+3. Compare path hash arrays. If the paths match exactly, and the keys match, then
+   you are overwriting the current Entry, and can copy the "remainder" of its
   `trie` up to your current `trie` index.
4. If the paths match, but not the keys, you are adding a new key to an
   existing hash bucket. Copy the `trie` and extend it to the full length. Add
-   a pointer to the Node with the same hash at the final array index.
+   a pointer to the Entry with the same hash at the final array index.
5. If the paths don't entirely match, find the first index at which the two
   arrays differ. Copy all `trie` elements (empty or not) into the new `trie`
   for indices between the "current index" and the "differing index".
-6. Next look up the corresponding element in this Node's `trie` array at the
+6. Next look up the corresponding element in this Entry's `trie` array at the
   differing index. If this element is empty, then you have found the most
-   similar Node. Write a pointer to this node to the `trie` at
+   similar Entry. Write a pointer to that Entry into the `trie` at the
+   differing index, and you are done (all remaining `trie` elements are
+   empty, and can be omitted).
7. If the differing `trie` element has a pointer (is not empty), then follow that
-   pointer to select the next `Node`. Recursively repeat this process from step
+   pointer to select the next `Entry`. Recursively repeat this process from step
   #3.
 
 To delete a value, follow the same procedure as adding a key, but write the
-`Node` without a `value` (in protobuf, this is distinct from having a `value`
+`Entry` without a `value` (in protobuf, this is distinct from having a `value`
 field with zero bytes). Deletion nodes may persist in the database forever.
 
+## Binary Trie Encoding
+[trie_encoding]: #trie_encoding
+
+TODO(mafintosh)
+
+
 # Examples
 [examples]: #examples
 
 ## Simple Put and Get
 
 Starting with an empty HyperDB `db`, if we `db.put('/a/b', '24')` we expect to
-see a single `Node`:
+see a single `Entry` at entry index 0:
 
 ```
 { key: 'a/b',
   value: '24',
   trie:
-   [ ],
-  seq: 0,
-  path:
-   [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
-     0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3,
-     4 ] }
+   [ ] }
 ```
 
+For reference, the path hash array for this key (index 0) is:
+
+```
+[ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+  0, 1, 2, 3, 2, 2, 2, 0, 3, 1, 1, 3, 0, 3, 1, 3, 0, 1, 0, 1, 3, 2, 0, 2, 2, 3, 2, 2, 3, 3, 2, 3,
+  4 ]
+```
+
-Note that the first 64 bytes in `path` match those of the `/a/b/c` example from
+Note that the first 64 bytes of the path hash array match the `/a/b/c` example from
 the [path hashing][path_hash] section, because the first two path components
-are the same. Since this is the first entry, `seq` is 0.
+are the same. Since this is the first entry, the entry index is 0.
 
-Now we `db.put('/a/c', 'hello')` and expect a second Node:
+Now we `db.put('/a/c', 'hello')` and expect a second Entry:
 
 ```
 { key: 'a/c',
   value: 'hello',
   trie:
   [ , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
-     , , { feed: 0, seq: 0 } ],
-  seq: 1,
-  path:
-   [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
-     0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
-     4 ] }
+     , , { feed: 0, index: 0 } ] }
+```
+
+The path hash array for this key (index 1) is:
+
+```
+[ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+  0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
+  4 ]
 ```
 
-The `seq` is incremented to 1. The first 32 characters of `path` are common
-with the first Node (they share a common prefix `/a`).
+The first 32 elements of the path hash array are common with the first Entry
+(they share a common prefix, `/a`).
 
 `trie` is defined, but mostly sparse. The first 32 elements of common prefix
-match the first Node, and then two additional hash elements (`[0, 1]`) happen
+match the first Entry, and then two additional hash elements (`[0, 1]`) happen
 to match as well; there is not a differing entry until index 34 (zero-indexed).
-At this entry there is a reference pointing to the first Node. An additional 29
+At this entry there is a reference pointing to the first Entry. An additional 29
 trailing null entries have been trimmed to reduce metadata overhead.
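+
+As a quick check of that pointer placement (a python3 sketch; `path_a_b` and
+`path_a_c` stand for the two 65-element path hash arrays printed above):
+
+```python
+first_diff = next(i for i, (a, b) in enumerate(zip(path_a_b, path_a_c))
+                  if a != b)
+assert first_diff == 34  # the new Entry's trie points to Entry 0 at index 34
+```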
-Next we insert a third node with `db.put('/x/y', 'other')`, and get a third Node:
+Next we insert a third key with `db.put('/x/y', 'other')`, and get a third Entry:
 
 ```
 { key: 'x/y',
   value: 'other',
   trie:
-   [ , { feed: 0, seq: 1 } ],
-  seq: 2,
-  path:
-   [ 1, 1, 0, 0, 3, 1, 2, 3, 3, 1, 1, 1, 2, 2, 1, 1, 1, 0, 2, 3, 3, 0, 1, 2, 1, 1, 2, 3, 0, 0, 2, 1,
-     0, 2, 1, 0, 1, 1, 0, 1, 0, 1, 3, 1, 0, 0, 2, 3, 0, 1, 3, 2, 0, 3, 2, 0, 1, 0, 3, 2, 0, 2, 1, 1,
-     4 ] }
+   [ , { feed: 0, index: 1 } ] }
+```
+
+The path hash array for this key (index 2) is:
+
+```
+[ 1, 1, 0, 0, 3, 1, 2, 3, 3, 1, 1, 1, 2, 2, 1, 1, 1, 0, 2, 3, 3, 0, 1, 2, 1, 1, 2, 3, 0, 0, 2, 1,
+  0, 2, 1, 0, 1, 1, 0, 1, 0, 1, 3, 1, 0, 0, 2, 3, 0, 1, 3, 2, 0, 3, 2, 0, 1, 0, 3, 2, 0, 2, 1, 1,
+  4 ]
 ```
 
 Consider the lookup process for `db.get('/a/b')` (which we expect to
-successfully return `'24'`, as written in the first Node). First we calculate
-the `path` for the key `a/b`, which will be the same as the first Node. Then we
-take the "latest" Node, with `seq=2`. We compare the `path` arrays, starting at
-the first element, and find the first difference at index 1 (`1 == 1`, then `1
-!= 2`). We look at index 1 in the current Node's `trie` and find a pointer to
-`seq = 1`, so we fetch that Node and recurse. Comparing `path` arrays, we now
-get all the way to index 34 before there is a difference. We again look in the
-`trie`, find a pointer to `seq = 0`, and fetch the first Node and recurse. Now
-the path elements match exactly; we have found the Node we are looking for, and
-it has an existant `value`, so we return the `value`.
+successfully return `'24'`, as written in the first Entry). First we calculate
+the path hash array for the key `a/b`, which will be the same as the first
+Entry. Then we take the "latest" Entry, with entry index 2. We compare the path
+hash arrays, starting at the first element, and find the first difference at
+index 1 (`1 == 1`, then `1 != 2`). We look at index 1 in the current Entry's
+`trie` and find a pointer to entry index 1, so we fetch that Entry and recurse.
+Comparing path hash arrays, we now get all the way to index 34 before there is
+a difference. We again look in the `trie`, find a pointer to entry index 0, and
+fetch the first Entry and recurse. Now the path elements match exactly; we have
+found the Entry we are looking for, and it has an existent `value`, so we
+return the `value`.
 
 Consider a lookup for `db.get('/a/z')`; this key does not exist, so we expect
-to return with "key not found". We calculate the `path` for this key:
+to return with "key not found". We calculate the path hash array for this key:
 
-    [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
-      1, 2, 3, 0, 1, 0, 1, 1, 1, 1, 2, 1, 1, 1, 0, 1, 0, 3, 3, 2, 0, 3, 3, 1, 1, 0, 2, 1, 0, 1, 1, 2,
-      4 ]
+```
+[ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+  1, 2, 3, 0, 1, 0, 1, 1, 1, 1, 2, 1, 1, 1, 0, 1, 0, 3, 3, 2, 0, 3, 3, 1, 1, 0, 2, 1, 0, 1, 1, 2,
+  4 ]
+```
 
-Similar to the first lookup, we start with `seq = 2` and follow the pointer to
-`seq = 1`. This time, when we compare `path` arrays, the first differing entry
-is at index `32`. There is no `trie` entry at this index, which tells us that
-the key does not exist in the database.
+Similar to the first lookup, we start with entry index 2 and follow the pointer to
+entry index 1. This time, when we compare path hash arrays, the first differing
+entry is at array index `32`.
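+
+Both the successful and failed walkthroughs follow the same loop. A compact
+python3 sketch of that loop is below; this is illustrative pseudocode, not
+hyperdb's actual API. It assumes `entries` is the feed as an in-memory list,
+each `trie` is a sparse list holding `(feed, index)` tuples or `None`, and
+`path_hash_array` is the helper sketched in the Path Hashing section:
+
+```python
+def get(entries, key):
+    target = path_hash_array(key)
+    entry = entries[-1]                     # start at the latest Entry
+    while True:
+        current = path_hash_array(entry.key)
+        # first index at which the two path hash arrays differ, if any
+        diff = next((i for i, (a, b) in enumerate(zip(target, current))
+                     if a != b), None)
+        if diff is None:                    # path hash arrays match exactly
+            if entry.key == key:
+                return entry.value          # None means the key was deleted
+            diff = len(target) - 1          # hash collision: follow the bucket
+        slot = entry.trie[diff] if diff < len(entry.trie) else None
+        if slot is None:
+            return None                     # key not found
+        feed, index = slot                  # feed is always 0 (single-writer)
+        entry = entries[index]              # follow the pointer and repeat
+```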
+There is no `trie` entry at that index (`32`), which tells us that the key
+does not exist in the database.
 
 ## Listing a Prefix
 
 Continuing with the state of the database above, we call `db.list('/a')` to
 list all keys with the prefix `/a`.
 
-We generate a `path` array for the key `/a`, without the terminating symbol
+We generate a path hash array for the key `/a`, without the terminating symbol
 (`4`):
 
-    [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2 ]
+```
+[ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2 ]
+```
 
-Using the same process as a `get()` lookup, we find the first Node that
-entirely matches this prefix, which will be Node `seq = 1`. If we had failed to
-find any Node with a complete prefix match, then we would return an empty list
+Using the same process as a `get()` lookup, we find the first Entry that
+entirely matches this prefix, which will be entry index 1. If we had failed to
+find any Entry with a complete prefix match, then we would return an empty list
 of matching keys.
 
 Starting with the first prefix-matching node, we save that key as a match
-(unless the Node is a deletion), then select all `trie` pointers with an index
-higher than the prefix length, and recursively inspect all pointed-to Nodes.
+(unless the Entry is a deletion), then select all `trie` pointers with an index
+higher than the prefix length, and recursively inspect all pointed-to Entries.
 
 ## Deleting a Key
 
 Continuing with the state of the database above, we call `db.delete('/a/c')`
 to remove that key from the database.
 
-The process is almost entirely the same as inserting a new Node at that key,
-except that the `value` field is undefined. The new Node (`seq = 3`) is:
+The process is almost entirely the same as inserting a new Entry at that key,
+except that the `value` field is undefined. The new Entry (at entry index 3)
+is:
 
 ```
 { key: 'a/c',
   value: ,
-  trie: [ , { feed: 0, seq: 2 }, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
-   , , { feed: 0, seq: 0 } ],
-  seq: 3,
-  path:
-   [ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
-     0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
-     4 ] }
+  trie: [ , { feed: 0, index: 2 }, , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , , ,
+   , , { feed: 0, index: 0 } ] }
+```
+
+The path hash array for this Entry (key) is:
+
+```
+[ 1, 2, 0, 1, 2, 0, 2, 2, 3, 0, 1, 2, 1, 3, 0, 3, 0, 0, 2, 1, 0, 2, 0, 0, 2, 0, 0, 3, 2, 1, 1, 2,
+  0, 1, 1, 0, 1, 2, 3, 2, 2, 2, 0, 0, 3, 1, 2, 1, 3, 3, 3, 3, 3, 3, 0, 3, 3, 2, 3, 2, 3, 0, 1, 0,
+  4 ]
 ```
+
+
 
 # Drawbacks
 [drawbacks]: #drawbacks
 
@@ -479,21 +531,17 @@ files.
 The metadata overhead for a single database entry varies based on the size of
 the database.
In a "heavy" case, considering a two-path-segment key with an
-entirely saturated `trie` and `uint32` size feed and sequence pointers, and
+entirely saturated `trie` and `uint32`-sized feed and entry index pointers, and
 ignoring multi-writer fields:
 
 - `trie`: 4 * 2 * 64 bytes = 512 bytes
-- `seq`: 4 bytes
-- `path`: 65 bytes
-- total: 581 bytes
+- total: 512 bytes
 
-In a "light" case, with few `trie` entries and single-byte varint sequence
-numbers:
+In a "light" case, with few `trie` entries and single-byte varint feed and
+entry index pointers:
 
 - `trie`: 2 * 2 * 4 bytes = 16 bytes
-- `seqq: 1 byte
-- `path`: 65 bytes
-- total: 82
+- total: 16 bytes
 
 For a database with most keys having N path segments, the cost of a `get()`
 scales with the number of entries M as `O(log(M))` with best case 1 lookup and
-- 
cgit v1.2.3