author Bryan Newbold <bnewbold@robocracy.org> 2019-11-12 19:03:42 -0800
committer Bryan Newbold <bnewbold@robocracy.org> 2019-11-12 19:03:42 -0800
commit cc44c4aa3ed7a5a27653091ce448f9684c9145a9
tree c680fb2f2a6047f941010f6156bc44f736e1405e
parent 14bac364cd9478a3554e155d4bb7fe211977943d
old proposals for 'next' schema update
-rw-r--r-- proposals/20190911_v04_schema_tweaks.md 38
1 files changed, 38 insertions, 0 deletions
diff --git a/proposals/20190911_v04_schema_tweaks.md b/proposals/20190911_v04_schema_tweaks.md
new file mode 100644
index 00000000..3d1e04c1
--- /dev/null
+++ b/proposals/20190911_v04_schema_tweaks.md
@@ -0,0 +1,38 @@
+
+status: work-in-progress
+
+Proposed schema changes for the next fatcat iteration (v0.4? v0.5?).
+
+SQL (and API, and Elasticsearch):
+
+- container: `container_status` as a string enum, e.g. "stub",
+ "out-of-print"/"ended" (?), "active", "new"/"small" (?). Mainly intended to
+ disambiguate multiple containers that share a title but have separate
+ ISSN-Ls. For example, "The Lancet".
+- release: `release_month` (to complement `release_date` and `release_year`)
+- file: `file_scope` as a string enum indicating how much content this file
+ includes. E.g., `book`, `chapter`, `article`/`work`, `issue`, `volume`,
+ `abstract`, `component`. Unclear how to initialize this field; default to
+ `article`/`work`?
+- TODO: release: change how pages are represented? Separate first/last fields?
+
+API tweaks:
+
+- add regex restrictions on more `ext_ids`, especially `wikidata_qid`
+- add explicit enums for more keyword fields
+
+API endpoints:
+
+- `GET /auth/token/<editor_id>` endpoint to generate a new API token for the
+ given editor. Used by the web interface, or by bot wranglers.
+- editor creation endpoint, to allow bot account creation
+- `GET /editor/<ident>/bots` (?) endpoint to enumerate bots wrangled by a
+ specific editor
+
+Elasticsearch schema:
+
+- releases *may* need an "_all" field (or `biblio`?) containing most fields to
+ make some search experiences work
+- releases should include volume, issue, pages
+- releases *could* include reference and creator lists, as a faster/cheaper
+ mechanism for doing reverse lookups
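
The validation rules sketched in the proposal (a regex restriction on `wikidata_qid`, explicit string enums, and a `release_month` that agrees with `release_date` and `release_year`) could look roughly like the following in Python. This is a minimal sketch, not fatcat's actual implementation: the function names are hypothetical, the enum values are only the candidates listed in the proposal (with one spelling picked wherever the proposal offers alternatives), and the QID pattern assumes an uppercase `Q` followed by a positive integer.

```python
import re
from datetime import date
from typing import Optional, Set

# Assumption: a Wikidata item ID is "Q" followed by a positive integer
# with no leading zero (e.g. "Q42"). Not taken from the fatcat codebase.
WIKIDATA_QID_RE = re.compile(r"Q[1-9][0-9]*")

# Hypothetical enum values; the proposal marks several of these with "(?)".
CONTAINER_STATUS: Set[str] = {"stub", "ended", "active", "new"}
FILE_SCOPE: Set[str] = {
    "book", "chapter", "article", "issue", "volume", "abstract", "component",
}


def check_wikidata_qid(qid: str) -> bool:
    """Return True if the whole string looks like a Wikidata QID."""
    return WIKIDATA_QID_RE.fullmatch(qid) is not None


def check_enum(value: Optional[str], allowed: Set[str]) -> bool:
    """Return True if the value is unset or one of the allowed strings."""
    return value is None or value in allowed


def check_release_dates(
    release_year: Optional[int],
    release_month: Optional[int],
    release_date: Optional[date],
) -> bool:
    """Return True if year, month, and date do not contradict each other."""
    if release_month is not None:
        # A month only makes sense in range and alongside a year.
        if not 1 <= release_month <= 12 or release_year is None:
            return False
    if release_date is not None:
        if release_year is not None and release_date.year != release_year:
            return False
        if release_month is not None and release_date.month != release_month:
            return False
    return True
```

Checks like these would normally live in the API layer, with the same constraints mirrored in the SQL schema so that bulk imports cannot bypass them.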