Diffstat (limited to 'TODO')
-rw-r--r-- TODO 26
1 files changed, 24 insertions, 2 deletions
diff --git a/TODO b/TODO
index d5e10629..35f81500 100644
--- a/TODO
+++ b/TODO
@@ -2,18 +2,34 @@
## Next Up
- some significant slow-down has happened? transactions, or regexes?
+summer roadmap:
+- PUT/UPDATE, DELETE, and merge code paths
+- faster UPDATE-free bulk import code path
+- container import (extra?): lang, region, subject
+- basic API+webface creation, editing, merging, editgroup approval
+- elastic schema/transform for releases; bulk and continuous scripts
features:
- fast database dump command: both changelog-based and entity-based (rust)
=> lighter, more complete dumps for each entity type?
+- guide skeleton (mdbook; guide.fatcat.wiki)
importers:
+- CORE
+- wikidata cross-ref (if they have a dump)
- manifest: multiple URLs per SHA1
-- pubmed (medline)
+- pubmed (medline), if not in CORE
=> and/or, use pubmed ID lookups on crossref import
- core
- semantic scholar (up to 39 million; author de-dupe)
- wikidata (if they have a dump)
+- crossref: relations ("is-preprint-of")
+- crossref: filter works
+ => content-type whitelist
+ => title length and title/slug blacklist
+ => at least one author (?)
+ => make this a method on Release object
+ => or just set release_stub as "stub"?
bugs:
- test: release pointing to a collection that has been deleted/redirected
@@ -29,10 +45,16 @@ july roadmap:
## Schema / Alignment / Scope
- "container" -> "venue"?
-- release_type, release_status, url.rel enums (and others?)
+- release_type, release_status, url.rel write-time schema (and others?)
name ref: https://www.w3.org/International/questions/qa-personal-names
+## API
+
+- how to send edit "extra" metadata?
+- hydrate entities in API
+ ? "expand" query param
+
## High-Level Priorities
- full database dump (export)