author Bryan Newbold <bnewbold@robocracy.org> 2018-07-18 18:55:17 -0700
committer Bryan Newbold <bnewbold@robocracy.org> 2018-07-18 18:55:17 -0700
commit 2d2c0668ebb5e6277766d0b14f1a5ac306fc8062 (patch)
tree ad1435760b891e0565f25659421b2c0930026d1d /TODO
parent 7233c2b12ada5cccda0a30fd74d127df711aa279 (diff)
update TODO
Diffstat (limited to 'TODO')
-rw-r--r-- TODO 46
1 files changed, 46 insertions, 0 deletions
diff --git a/TODO b/TODO
index 591c91b8..ea856c4c 100644
--- a/TODO
+++ b/TODO
@@ -1,6 +1,43 @@
+## Next Up
+
+bugs:
+- UI: handle file null size field
+- not pulling in orcid given names correctly (?)
+- test: release pointing to a collection that has been deleted/redirected
+ => UI crash?
+- multiple URLs per file
+
+schema:
+- encoding of ident UUIDs (but not other UUIDs) (no schema change)
+- revisions, edits, editgroups, editor_id as UUIDs
+- external idents: citation IDs, medline/pubmed (PMID), pubmed central (PMCID), wikidata, CORE
+ => http://opencitations.net/index/coci
+ => but should just appear in regular dumps? shrug
+
+features:
+- fast database dump command: both changelog-based and entity-based (rust)
+
+importers:
+- citations
+- medline
+- core
+- wikidata (if they have a dump)
+
+other:
+- update RFC
+- basic python hbase/elastic matcher
+ => takes sha1 keys
+ => checks fatcat API + hbase
+ => if not matched yet, tries elastic search
+ => simple ~exact match heuristic
+ => proof-of-concept, no tests
+
+
## Schema / Alignment / Scope
+- add Open Citation Identifiers... and COCI importer script instead of refs
+ during crossref import?
- wikidata IDs are first-class identifiers (release, container, creator)
- switch a bunch more primary keys to UUID: revs, editor ids, edit numbers
- multiple URLs
@@ -28,8 +65,17 @@ name ref: https://www.w3.org/International/questions/qa-personal-names
- batch inserts automerge: create editgroup and changelog, mark all edits as
accepted, all in a single transaction
+## API
+
+- hydrate entities in API
+ ? "expand" query param
+ ? "full entity" field
+ ? refactor file_releases to have objects as type
+
## Other
+- bulk endpoint auto-merge mode (huge postgres speedup on import)
+- elastic pipeline
- kong or oauth2_proxy for auth, rate-limit, etc
- "authn" microservice: https://keratin.tech/
- PUT for mid-edit revisions