## Next Up - basic webface creation, editing, merging, editgroup approval - elastic schema/transform for releases; bulk and continuous scripts ## QA Blockers - refactors and correctness in rust/TODO - importers have editor accounts and include editgroup metadata ## Production blockers - enforce single-ident-edit-per-editgroup => entity_edit: entity_ident/entity_editgroup should be UNIQ index => UPDATE/REPLACE edits? - crossref importer sets release_type as "stub" when appropriate - re-implement old python tests - real auth - metrics, jwt, config, sentry ## Metadata Import - manifest: multiple URLs per SHA1 - crossref: relations ("is-preprint-of") - crossref: two phse: no citations, then matched citations (via DOI table) - container import (extra?): lang, region, subject - crossref: filter works => content-type whitelist => title length and title/slug blacklist => at least one author (?) => make this a method on Release object => or just set release_stub as "stub"? new importers: - pubmed (medline) (filtered) => and/or, use pubmed ID lookups on crossref import - CORE (filtered) - semantic scholar (up to 39 million; author de-dupe) ## Entity/Edit Lifecycle - redirects and merges (API, webface, etc) - test: release pointing to a collection that has been deleted/redirected => UI crash? - commenting and accepting editgroups - editgroup state machine? - enforce "single ident edit per editgroup" => how to "edit an edit"? clobber existing? ## Guide / Book / Style - release_type, release_status, url.rel schemas (and enforce in API?) - more+better terms+policies: https://tosdr.org/index.html ## Fun Features - "save paper now" => is it in GWB? if not, SPN => get hash + url from GWB, verify mimetype acceptable => is file in fatcat? => what about HBase? GROBID? => create edit, redirect user to editgroup submit page - python client tool and library in pypi => or maybe rust? - bibtext (etc) export ## Schema / Entity Fields - `doi` field for containers (at least for "journal" type; maybe for "series" as well?) - `retracted`, `translation`, and perhaps `corrected` as flags on releases, instead of release_status? ## Other - refactor openapi schema to use shared response types - consider using "HTTP 202: Accepted" for entity-mutating calls - basic python hbase/elastic matcher => takes sha1 keys => checks fatcat API + hbase => if not matched yet, tries elastic search => simple ~exact match heuristic => proof-of-concept, no tests - add_header Strict-Transport-Security "max-age=3600"; => 12 hours? 24? - elastic pipeline - kong or oauth2_proxy for auth, rate-limit, etc - feature flags: consul? - secrets: vault? - "authn" microservice: https://keratin.tech/ better API docs - https://sourcey.com/spectacle/ - https://github.com/DapperDox/dapperdox CSL: - https://citationstyles.org/ - https://github.com/citation-style-language/documentation/blob/master/primer.txt - https://citeproc-js.readthedocs.io/en/latest/csl-json/markup.html - https://github.com/citation-style-language/schema/blob/master/csl-types.rnc - perhaps a "create from CSL" endpoint?