Diffstat (limited to 'TODO')
-rw-r--r-- | TODO | 22 |
1 files changed, 22 insertions, 0 deletions
@@ -1,6 +1,7 @@
 
 ## Next Up
 
+- some significant slow-down has happened? transactions, or regexes?
 summer roadmap:
 - PUT/UPDATE, DELETE, and merge code paths
 - faster UPDATE-free bulk import code path
@@ -19,12 +20,28 @@ importers:
 - manifest: multiple URLs per SHA1
 - pubmed (medline), if not in CORE
     => and/or, use pubmed ID lookups on crossref import
+- core
 - semantic scholar (up to 39 million; author de-dupe)
+- wikidata (if they have a dump)
+- crossref: relations ("is-preprint-of")
+- crossref: filter works
+    => content-type whitelist
+    => title length and title/slug blacklist
+    => at least one author (?)
+    => make this a method on Release object
+    => or just set release_stub as "stub"?
 
 bugs:
 - test: release pointing to a collection that has been deleted/redirected
     => UI crash?
 
+july roadmap:
+- complete and test this round of schema changes
+- container import (extra?): lang, region, subject
+- re-run imports
+- basic API+webface creation, editing, merging, editgroup approval
+- elastic schema/transform for releases; bulk and continuous scripts
+
 ## Schema / Alignment / Scope
 
 - "container" -> "venue"?
@@ -55,6 +72,11 @@ name ref: https://www.w3.org/International/questions/qa-personal-names
 
 - batch inserts automerge: create editgroup and changelog, mark all edits as accepted, all in a single transaction
 
+## API
+
+- hydrate entities in API
+    ? "expand" query param
+
 ## Other
 
 - basic python hbase/elastic matcher
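
The "crossref: filter works" item added above could be sketched roughly as follows. This is only an illustration of the three listed checks as a method on a release object, not the project's code: the `Release` dataclass, the `is_stub` name, the whitelist and blacklist values, and the minimum title length are all assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed values for illustration only; the TODO does not specify them.
CONTENT_TYPE_WHITELIST = {"journal-article", "proceedings-article", "book-chapter"}
TITLE_BLACKLIST = {"front matter", "table of contents", "editorial board"}
MIN_TITLE_LENGTH = 4


@dataclass
class Release:
    """Hypothetical stand-in for the Release object named in the TODO."""

    title: Optional[str]
    content_type: Optional[str]
    authors: List[str] = field(default_factory=list)

    def is_stub(self) -> bool:
        """Return True if the work fails any of the filters listed in the TODO."""
        # content-type whitelist
        if self.content_type not in CONTENT_TYPE_WHITELIST:
            return True
        # title length and title blacklist (compared case-insensitively)
        title = (self.title or "").strip().lower()
        if len(title) < MIN_TITLE_LENGTH or title in TITLE_BLACKLIST:
            return True
        # at least one author
        if not self.authors:
            return True
        return False
```

An importer could either skip works where `is_stub()` is true, or keep them and mark them as stubs, which are the two alternatives the TODO leaves open.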