From 4ce751f000285bc97adef27bab0873ae2690859e Mon Sep 17 00:00:00 2001
From: Bryan Newbold <bnewbold@robocracy.org>
Date: Thu, 22 Mar 2018 21:31:05 -0700
Subject: bunch of unstructured notes

---
 README.md                 |  2 +-
 next_thoughts.txt         | 19 +++++++++++++++++++
 notes/bot_tools.txt       | 17 +++++++++++++++++
 notes/initial_sources.txt |  9 ++++++++-
 notes/test_cases.txt      |  7 +++++++
 plan.txt                  |  3 +++
 6 files changed, 55 insertions(+), 2 deletions(-)
 create mode 100644 next_thoughts.txt
 create mode 100644 notes/bot_tools.txt
 create mode 100644 notes/test_cases.txt
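
Reviewer sketch (not part of the committed files): the identifier scheme
described in next_thoughts.txt can be modeled in a few lines. This is a minimal
in-memory sketch, assuming public ids are UUID strings that each point directly
at a revision, that a merge repoints every identifier sharing the duplicate's
revision, and that deletion points a single identifier at a null revision. The
class and method names (EntityStore, merge, lookup, and so on) are hypothetical
and do not come from the fatcat code.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class EntityStore:
    """Toy model of "mutable public id -> revision"; all names are hypothetical."""

    # revision id -> revision content (treated as immutable once stored)
    revisions: Dict[int, dict] = field(default_factory=dict)
    # public UUID -> current revision id; None stands for "deleted"
    idents: Dict[str, Optional[int]] = field(default_factory=dict)
    # append-only log of (ident, old revision id, new revision id)
    changelog: List[Tuple[str, Optional[int], Optional[int]]] = field(default_factory=list)

    def _point(self, ident: str, rev_id: Optional[int]) -> None:
        # Record the change first so the log can be "fallen through" to later.
        self.changelog.append((ident, self.idents.get(ident), rev_id))
        self.idents[ident] = rev_id

    def _add_revision(self, content: dict) -> int:
        rev_id = len(self.revisions) + 1
        self.revisions[rev_id] = dict(content)
        return rev_id

    def _live_revision(self, ident: str) -> int:
        rev_id = self.idents[ident]
        if rev_id is None:
            raise ValueError("identifier points at a deleted entity")
        return rev_id

    def _repoint_all(self, old_rev: int, new_rev: int) -> None:
        # Every identifier sharing old_rev moves together, so lookups never recurse.
        for ident, rev in list(self.idents.items()):
            if rev == old_rev:
                self._point(ident, new_rev)

    def create(self, content: dict) -> str:
        ident = str(uuid.uuid4())
        self._point(ident, self._add_revision(content))
        return ident

    def update(self, ident: str, content: dict) -> None:
        old_rev = self._live_revision(ident)
        self._repoint_all(old_rev, self._add_revision(content))

    def merge(self, keep: str, duplicate: str) -> None:
        target = self._live_revision(keep)
        source = self._live_revision(duplicate)
        if source != target:
            self._repoint_all(source, target)

    def delete(self, ident: str) -> None:
        # Assumption: deletion affects only this identifier, not merged siblings.
        self._point(ident, None)

    def lookup(self, ident: str) -> Optional[dict]:
        # A single hop: identifier -> revision, with no redirect chain to follow.
        rev_id = self.idents[ident]
        return None if rev_id is None else self.revisions[rev_id]
```

After store.merge(a, b), a later store.update(a, ...) is visible through b as
well, which is the "update them all in the future" cost the notes mention: each
update touches every identifier that was ever merged into the entity.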

diff --git a/README.md b/README.md
index 184b6f26..5bea2290 100644
--- a/README.md
+++ b/README.md
@@ -20,4 +20,4 @@ Use `pipenv` (which you can install with `pip`).
 
 Run tests:
 
-    pipenv run nosetests3 backend/ webface/
+    pipenv run nosetests3 fatcat
diff --git a/next_thoughts.txt b/next_thoughts.txt
new file mode 100644
index 00000000..0e89249a
--- /dev/null
+++ b/next_thoughts.txt
@@ -0,0 +1,19 @@
+Should probably just UUID all the (public) ids.
+
+Instead of having a separate id pointer table, could have an extra "mutable"
+public ID column (unique, indexed) on entity rows. Backend would ensure the
+right thing happens. Changelog tables (or special redirect/deletion tables)
+would record changes and be "fallen through" to.
+
+Instead of having merge redirects, could just point all identifiers to the same
+revision (and update them all in the future). Don't need to recurse! Need to
+keep this forever though, could scale badly if "aggregations" get merged.
+
+Redirections of redirections should probably simply be disallowed.
+
+"Deletion" is really just pointing to a special or null entity.
+
+Trade-off: easy querying for common case (wanting "active" rows) vs. robust
+handling of redirects (likely to be pretty common). Also, having UUID handling
+across more than one table.
+
diff --git a/notes/bot_tools.txt b/notes/bot_tools.txt
new file mode 100644
index 00000000..cf465bde
--- /dev/null
+++ b/notes/bot_tools.txt
@@ -0,0 +1,17 @@
+
+Could be helpful for writing bots for import:
+
+metafacture: large/popular Java framework for pipelines and munging library
+metadata.
+
+    https://github.com/metafacture/metafacture-core/wiki
+
+catmandu: large/popular set of Perl libraries for munging bibliographic
+metadata, including a DSL ("Fix"). Can also push/pull to backends.
+
+miku/siskin: luigi-based, higher-level tool for running regular tasks.
+
+    https://github.com/miku/span
+
+miku/span: golang lower-level tools for parsing and normalizing specific
+formats (including KBART, DOAJ).
diff --git a/notes/initial_sources.txt b/notes/initial_sources.txt
index a68fb982..cc22019d 100644
--- a/notes/initial_sources.txt
+++ b/notes/initial_sources.txt
@@ -9,11 +9,18 @@ then merge in:
 
     dblp
     CORE
-    oaDOI
+    MSAG dump
+    VIAF
     archive.org paper/url manifest
     semantic scholar
+    oaDOI
 
 and later:
 
+    wikidata
     opencitations
     openlibrary
+
+national libraries:
+
+    http://www.dnb.de/EN/Service/DigitaleDienste/LinkedData/linkeddata_node.html
diff --git a/notes/test_cases.txt b/notes/test_cases.txt
new file mode 100644
index 00000000..bc6ea64a
--- /dev/null
+++ b/notes/test_cases.txt
@@ -0,0 +1,7 @@
+
+Many co-authors (group):
+
+    "Precision measurement of the top-quark mass in lepton+jets final states"
+    https://arxiv.org/abs/1405.1756
+
+
diff --git a/plan.txt b/plan.txt
index 9e8d957b..33b40663 100644
--- a/plan.txt
+++ b/plan.txt
@@ -1,4 +1,7 @@
 
+Avoiding an ORM and splitting into two apps feels like making water flow
+uphill. Going to just make this a generic flask-sqlalchemy thing for now.
+
 - backend test setup: generate temporary database, insert rows (?)
 
 backend/api:
-- 
cgit v1.2.3