aboutsummaryrefslogtreecommitdiffstats
path: root/proposals/0000-hyperdb.md
diff options
context:
space:
mode:
Diffstat (limited to 'proposals/0000-hyperdb.md')
-rw-r--r--proposals/0000-hyperdb.md35
1 files changed, 18 insertions, 17 deletions
diff --git a/proposals/0000-hyperdb.md b/proposals/0000-hyperdb.md
index 4663d3e..6fd1872 100644
--- a/proposals/0000-hyperdb.md
+++ b/proposals/0000-hyperdb.md
@@ -20,7 +20,7 @@ Authors:
Hyperdb is an abstraction layer providing a general purpose distributed
key/value store over the hypercore protocol. It is an iteration on the
-hyperdrive directory tree implementation, building top of the hypercore
+hyperdrive directory tree implementation, building on top of the hypercore
append-only log abstraction layer. Keys are path-like strings (eg,
`/food/fruit/kiwi`), and values are arbitrary binary blobs (generally under a
megabyte).
@@ -99,9 +99,9 @@ or the "latest" revision.
Requires read-write access. Returns an error (eg, via callback) if there was a
problem.
-`db.get(key)`: Reading a non-existant `key` is an error. Read-only.
+`db.get(key)`: Reading a non-existent `key` is an error. Read-only.
-`db.delete(key)`: Removes the key from the database. Deleting a non-existant
+`db.delete(key)`: Removes the key from the database. Deleting a non-existent
key is an error. Requires read-write access.
`db.list(prefix)`: returns a flat (not nested) list of all keys currently in
@@ -163,14 +163,15 @@ own DEP and mentioned only partially here. The fields common to both message
types are:
- `key`: UTF-8 key that this node describes. Leading and trailing forward
- slashes (`/`) are always striped before storing in protobuf.
-- `value`: arbitrary byte array. A non-existant `value` entry indicates that
+ slashes (`/`) are always stripped before storing in protobuf.
+- `value`: arbitrary byte array. A non-existent `value` entry indicates that
this Entry indicates a deletion of the key; this is distinct from specifying
an empty (zero-length) value.
- `trie`: a structured array of pointers to other Entry entries in the feed,
used for navigating the tree of keys.
-- `clock`: reserved for use in the forthcoming `multi-writer` standard An empty
- list is the safe (and expected) value for `clock` in single-writer use cases.
+- `clock`: reserved for use in the forthcoming `multi-writer` standard. An
+ empty list is the safe (and expected) value for `clock` in single-writer use
+ cases.
- `inflate`: a "pointer" (reference to a feed index number) to the most recent
entry in the feed that has the optional `feeds` and `contentFeed` fields set.
Not defined if `feeds` and `contentFeed` are defined.
@@ -218,7 +219,7 @@ case we use all zeros.
When converting the 8-bytehash to an array of 2-bit bytes, the ordering is
proceed byte-by-byte, and for each take the two lowest-value bits (aka, `hash &
0x3`) as byte index `0`, the next two bits (aka, `hash & 0xC`) as byte index
-`1`, etc. When concatanating path hashes into a longer array, the first
+`1`, etc. When concatenating path hashes into a longer array, the first
("left-most") path element hash will correspond to byte indexes 0 through 31;
the terminator (`4`) will have the highest byte index.
@@ -323,7 +324,7 @@ always 0, so we consider only the entry index.
For a `trie` with `N` buckets, each may have zero or more pointers. Typically
there are a maximum of 3 pointers per bucket, corresponding to the 4 possible
values minus the current Entry's value, but in the event of hash collisions (in
-the path array space) , there may be multiple pointers in the same bucket
+the path array space), there may be multiple pointers in the same bucket
corresponding to the same value.
To lookup a key in the database, the recipe is to:
@@ -359,7 +360,7 @@ Similarly, to write a key to the database:
a pointer to the Entry with the same hash at the final array index.
5. If the paths don't entirely match, find the first index at which the two
arrays differ. Copy all `trie` elements (empty or not) into the new `trie`
- for indicies between the "current index" and the "differing index".
+ for indices between the "current index" and the "differing index".
6. Next look up the corresponding element in this Entry's `trie` array at the
differing index. If this element is empty, then you have found the most
similar Entry. Write a pointer to this node to the `trie` at
@@ -382,7 +383,7 @@ arrays of pointers to entries) into a variable-length bytestring as the `trie`
field of an Entry protobuf message.
Consider a trie array with `N` buckets and `M` non-empty buckets (`0 <= M <=
-N`). In the encoded field, there will be `M` concatanated bytestrings of the
+N`). In the encoded field, there will be `M` concatenated bytestrings of the
form:
- trie index (varint), followed by...
@@ -449,9 +450,9 @@ Now `M` is 2. The first bytestring chunk will be:
the second bytestring chunk would be:
- index varint is `64` (65th element in the trie array; the terminating value)
-- btifield is `0b10000` (varint 1); there is a single pointer set... but it
+- bitfield is `0b10000` (varint 1); there is a single pointer set... but it
contains a hash collision, so there are two pointers
-- first pointer is (`feed=0` << 1 | 1`, `index=23`) or (varint 1, varint=23);
+- first pointer is (`feed=0 << 1 | 1`, `index=23`) or (varint 1, varint=23);
note the `more` bit is set high!
- second pointer is (`feed=1 << 1 | 0`, `index=17`) or (varint 2, varint 17);
note the `more` bit is low, as usual. In the extremely unlikely case of
@@ -547,7 +548,7 @@ pointer to entry index 1, so we fetch that Entry and recurse. Comparing path
hash arrays, we now get all the way to index 34 before there is a difference.
We again look in the `trie`, find a pointer to entry index 0, and fetch the
first Entry and recurse. Now the path elements match exactly; we have found the
-Entry we are looking for, and it has an existant `value`, so we return the
+Entry we are looking for, and it has an existent `value`, so we return the
`value`.
Consider a lookup for `db.get('/a/z')`; this key does not exist, so we expect
@@ -658,7 +659,7 @@ The total metadata overhead for a database with M entries scales with `O(M
The basic key/value semantics of hyperdb (as discussed in this DEP, not
considering multi-writer changes) are not known to introduce new privacy issues
-when compared with, eg, storing binary values at key-like paths using the
+when compared with, e.g., storing binary values at key-like paths using the
current ("legacy") hyperdrive system.
A malicious writer could cause trouble for readers, even readers who do not
@@ -699,7 +700,7 @@ for these fields.
Hyperdb is not backwards compatible with the existing hyperdrive metadata,
meaning dat clients may need to support both versions during a transition
-period. This applies both to archives saved to disk (eg, in SLEEP) and to
+period. This applies both to archives saved to disk (e,g., in SLEEP) and to
archives received and published to peers over the network.
No changes to the Dat network wire protocol itself are necessary, only changes
@@ -721,7 +722,7 @@ Should we declare that `contendFeed` pointers *not* change over the history of
a feed? See also <https://github.com/datprotocol/DEPs/issues/13>
Need to think through deletion process with respect to listing a path prefix;
-will previously deleted nodes be occulded, or potentially show up in list
+will previously deleted nodes be occluded, or potentially show up in list
results? Should be reviewed (by a non-author of this document) before accepted
as a Draft.