author | Bryan Newbold <bnewbold@robocracy.org> | 2023-01-04 19:55:30 -0800 |
---|---|---|
committer | Bryan Newbold <bnewbold@robocracy.org> | 2023-01-04 20:18:25 -0800 |
commit | 276ac2aa24166660bc6ffe7601cee44b5d848dae (patch) | |
tree | 8a35ce06e7ab9e6755b24abc41dee1115cf62788 /proposals/2019-09-11_NEXT_schema_tweaks.md | |
parent | ee46c33544941a5104182a2e221e841a32cbbf78 (diff) | |
proposals: update status; add some old ones; consistent file names
Diffstat (limited to 'proposals/2019-09-11_NEXT_schema_tweaks.md')
-rw-r--r-- | proposals/2019-09-11_NEXT_schema_tweaks.md | 42 |
1 files changed, 42 insertions, 0 deletions
diff --git a/proposals/2019-09-11_NEXT_schema_tweaks.md b/proposals/2019-09-11_NEXT_schema_tweaks.md
new file mode 100644
index 00000000..dcbc2f5f
--- /dev/null
+++ b/proposals/2019-09-11_NEXT_schema_tweaks.md
@@ -0,0 +1,42 @@
+
+Status: planned
+
+## Schema Changes for Next Release
+
+Proposed schema changes for next fatcat iteration with SQL changes (v0.6? v1.0?).
+
+SQL (and API, and elasticsearch):
+
+- `db_get_range_for_editor` is slow when there are many editgroups for editor; add sorted index? meh.
+- release: `release_month` (to complement `release_date` and `release_year`)
+- file: `file_scope` as a string enum indicating how much content this file
+  includes. Eg, `book`, `chapter`, `article`/`work`, `issue`, `volume`,
+  `abstract`, `component`. Unclear how to initialize this field; default to
+  `article`/`work`?
+- file: some way of marking bad/bogus files... by scope? type? status?
+- TODO: webcapture: lookup by primary URL sha1?
+- TODO: release: switch how pages work? first/last?
+- TODO: indication of peer-review process? at release or container level?
+- TODO: container: separate canonical and disambiguating titles (?)
+- TODO: container: "imprint" field?
+- TODO: container: "series" field? eg for conferences
+- TODO: release inter-references using SCHOLIX/Datacite schema
+  https://zenodo.org/record/1120265
+  https://support.datacite.org/docs/connecting-research-outputs#section-related-identifiers
+- TODO: fileset: some sort of lookup; hashes of hashes?
+- TODO: fileset: some indication/handling of git repositories
+
+API tweaks:
+
+- add regex restrictions on more `ext_ids`, especially `wikidata_qid`
+- add explicit enums for more keyword fields
+
+API endpoints:
+
+- `GET /auth/token/<editor_id>` endpoint to generate new API token for given
+  editor. Used by web interface, or bot wranglers.
+- create editor endpoint, to allow bot account creation
+- `GET /editor/<ident>/bots` (?) endpoint to enumerate bots wrangled by a
+  specific editor
+
+See `2020_search_improvements` for elasticsearch-only schema updates.
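
Note on the "API tweaks" items: the proposal asks for a regex restriction on `wikidata_qid` and explicit enums for keyword fields such as `file_scope`, but does not say how either would be implemented. The following is only a rough sketch of what such checks might look like, not fatcat's actual code. The names `WIKIDATA_QID_RE`, `FileScope`, `check_wikidata_qid`, and `parse_file_scope` are hypothetical, the pattern is an assumption based on the usual Wikidata item form (`Q` followed by a positive integer), and treating `article` and `work` as two separate values is a guess, since the proposal writes them as `article`/`work`.

```python
import re
from enum import Enum
from typing import Optional

# Assumed pattern (not taken from the proposal): "Q" followed by a
# positive integer without leading zeros, e.g. "Q42".
WIKIDATA_QID_RE = re.compile(r"Q[1-9][0-9]*")


class FileScope(str, Enum):
    """Hypothetical enum for the proposed `file_scope` field.

    Values are the examples listed in the proposal; `article` and `work`
    are kept as separate members here, which is an assumption.
    """

    BOOK = "book"
    CHAPTER = "chapter"
    ARTICLE = "article"
    WORK = "work"
    ISSUE = "issue"
    VOLUME = "volume"
    ABSTRACT = "abstract"
    COMPONENT = "component"


def check_wikidata_qid(raw: str) -> bool:
    """Return True if `raw` matches the assumed QID pattern in full."""
    return WIKIDATA_QID_RE.fullmatch(raw) is not None


def parse_file_scope(raw: Optional[str]) -> Optional[FileScope]:
    """Map a raw string to a FileScope member.

    None is passed through, because the proposal leaves open how the
    field would be initialized. Unknown strings raise ValueError.
    """
    if raw is None:
        return None
    return FileScope(raw)
```

With an explicit enum, an unknown scope string fails loudly at parse time instead of being stored as an arbitrary keyword, which seems to be the intent behind "add explicit enums for more keyword fields".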