author Bryan Newbold <bnewbold@robocracy.org> 2019-01-04 17:41:27 -0800
committer Bryan Newbold <bnewbold@robocracy.org> 2019-01-04 17:41:27 -0800
commit 084e476957ce80b456dcf0575de4efc7331d34f9
tree 5377ef32c140ef6a0b73d67296eeacf17edcabc9 /notes/ideas
parent d7b0a156d2a3a21e2bf5afc3e4b97e7cf1044248
clean up notes a tiny bit
Diffstat (limited to 'notes/ideas')
-rw-r--r-- notes/ideas/bot_tools.txt 17
-rw-r--r-- notes/ideas/domains.txt 5
-rw-r--r-- notes/ideas/more_api_patterns.txt 15
-rw-r--r-- notes/ideas/thoughts.txt 32
4 files changed, 69 insertions, 0 deletions
diff --git a/notes/ideas/bot_tools.txt b/notes/ideas/bot_tools.txt
new file mode 100644
index 00000000..cf465bde
--- /dev/null
+++ b/notes/ideas/bot_tools.txt
@@ -0,0 +1,17 @@
+
+Could be helpful for writing bots for import:
+
+metafacture: large/popular java framework for pipelines and munging library
+metadata.
+
+ https://github.com/metafacture/metafacture-core/wiki
+
+catmandu: large/popular set of perl libraries for munging bibliographic
+metadata, including a DSL ("Fix"). Can also push/pull to backends.
+
+miku/siskin: luigi and higher-level tool for running regular tasks.
+
+ https://github.com/miku/span
+
+miku/span: golang lower-level tools for parsing and normalizing specific
+formats (including KBART, DOAJ).
diff --git a/notes/ideas/domains.txt b/notes/ideas/domains.txt
new file mode 100644
index 00000000..8556494e
--- /dev/null
+++ b/notes/ideas/domains.txt
@@ -0,0 +1,5 @@
+
+Many obvious domains and hacks are taken. Would love to get fatcat.org; for now
+registered fatcat.wiki.
+
+fatca.tt is available.
diff --git a/notes/ideas/more_api_patterns.txt b/notes/ideas/more_api_patterns.txt
new file mode 100644
index 00000000..ca61ac81
--- /dev/null
+++ b/notes/ideas/more_api_patterns.txt
@@ -0,0 +1,15 @@
+
+If returning a long list (eg, all releases for a container):
+
+ "releases": {
+ "data": [
+ <release>,
+ <release>,
+ ...
+ ],
+ "has_more": true,
+ "total_count": 100,
+ "url": "/v0/container/asdf/releases"
+ }
+
+This pattern is from the Stripe API.
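
Reviewer sketch (not part of the commit): one way the list envelope in more_api_patterns.txt could be built server-side. The function name `list_envelope` and its `offset`/`limit` parameters are hypothetical and do not come from the fatcat code base. The flag is spelled `has_more`, which is the name Stripe uses in its list objects.

```python
from typing import Any, Dict, List, Sequence


def list_envelope(
    items: Sequence[Any], url: str, offset: int = 0, limit: int = 10
) -> Dict[str, Any]:
    """Wrap one page of `items` in a Stripe-style list object (hypothetical helper)."""
    page: List[Any] = list(items[offset:offset + limit])
    return {
        "data": page,
        # True when there are items beyond the end of this page.
        "has_more": offset + len(page) < len(items),
        "total_count": len(items),
        "url": url,
    }


if __name__ == "__main__":
    releases = [f"release-{i}" for i in range(100)]
    body = {"releases": list_envelope(releases, "/v0/container/asdf/releases", limit=3)}
    print(body)
```

A client would keep requesting the `url` with a larger offset until `has_more` is false. Whether the real API pages by offset or by cursor is left open here.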
diff --git a/notes/ideas/thoughts.txt b/notes/ideas/thoughts.txt
new file mode 100644
index 00000000..c01c0d37
--- /dev/null
+++ b/notes/ideas/thoughts.txt
@@ -0,0 +1,32 @@
+
+Instead of having a separate id pointer table, could have an extra "mutable"
+public ID column (unique, indexed) on entity rows. Backend would ensure the
+right thing happens. Changelog tables (or special redirect/deletion tables)
+would record changes and be "fallen through" to.
+
+Instead of having merge redirects, could just point all identifiers to the same
+revision (and update them all in the future). Don't need to recurse! Need to
+keep this forever though, could scale badly if "aggregations" get merged.
+
+Redirections of redirections should probably simply be disallowed.
+
+"Deletion" is really just pointing to a special or null entity.
+
+Trade-off: easy querying for common case (wanting "active" rows) vs. robust
+handling of redirects (likely to be pretty common). Also, having UUID handling
+across more than one table.
+
+## Scaling database
+
+Two scaling issues: size of database due to edits (likely billions of rows) and
+desire to do complex queries/reports ("analytics"). The latter is probably not a
+concern, and could be handled by dumping and working on a cluster (or secondary
+views, etc). So just a distraction? Simpler to have all rolled up.
+
+Cockroach is postgres-like; might be able to use that for HA and scaling?
+Bottlenecks are probably complex joins (mitigated by "interleave"?) and bulk
+import performance (one-time?).
+
+Using elastic for most (eg, non-logged-in) views could keep things fast.
+
+Cockroach seems more resourced/polished than TiDB?
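
Reviewer sketch (not part of the commit): a toy in-memory model of the merge idea in thoughts.txt, where merged identifiers all point at the same revision instead of redirecting to each other. The dict layout and the names `merge`, `update`, and `delete` are assumptions made for illustration only and say nothing about the actual schema.

```python
from typing import Dict, List, Optional

# Maps public identifier -> current revision id. None marks a "deleted" identifier.
Idents = Dict[str, Optional[str]]


def merge(idents: Idents, duplicates: List[str], target: str) -> None:
    """Point every duplicate identifier at the target's current revision."""
    rev = idents[target]
    for ident in duplicates:
        idents[ident] = rev


def update(idents: Idents, ident: str, new_rev: str) -> None:
    """Move `ident` and every identifier merged with it to `new_rev`."""
    old = idents[ident]
    if old is None:
        raise ValueError(f"{ident} is deleted")
    # Full scan: this is the part that could scale badly for large merges.
    for key, rev in idents.items():
        if rev == old:
            idents[key] = new_rev


def delete(idents: Idents, ident: str) -> None:
    """Deletion is just pointing the identifier at the null entity."""
    idents[ident] = None


if __name__ == "__main__":
    idents: Idents = {"a": "rev1", "b": "rev2", "c": "rev3"}
    merge(idents, ["b"], "a")
    update(idents, "b", "rev4")
    delete(idents, "c")
    print(idents)
```

Lookups stay a single dictionary access with no recursion, which matches the "Don't need to recurse!" note. The cost moves to `update`, which must touch every merged identifier.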