python/TODO




next/high-level:
- quick python ORCID and ISSN import scripts
- client export:
    => one json-nl file per entity type
- flask-apispec
- swagger API docs?
- naive API-based import scripts for: journals (norwegian), orcid, crossref
- switch to marshmallow in create APIs (at least for revs)

- kong or oauth2_proxy for auth, rate-limit, etc
- "authn" microservice: https://keratin.tech/
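
Rough sketch of the "one json-nl file per entity type" client export above. Assumptions, not existing code: the `export_jsonl` name, the `(entity_type, record)` input shape, and the `.jsonl` file extension are all made up for illustration.

```python
import json
from collections import defaultdict
from pathlib import Path


def export_jsonl(entities, out_dir):
    """Write one newline-delimited JSON file per entity type.

    `entities` is an iterable of (entity_type, record) pairs (hypothetical
    shape); returns a dict mapping entity type to the file written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Group records by entity type, preserving input order within a type.
    by_type = defaultdict(list)
    for entity_type, record in entities:
        by_type[entity_type].append(record)

    paths = {}
    for entity_type, records in by_type.items():
        path = out_dir / f"{entity_type}.jsonl"
        with path.open("w", encoding="utf-8") as f:
            for record in records:
                # One JSON object per line, no embedded newlines.
                f.write(json.dumps(record, sort_keys=True) + "\n")
        paths[entity_type] = path
    return paths
```

Grouping in memory is fine for small dumps; a full export would probably want to stream per entity type instead.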

api:
- PUT for mid-edit revisions
/ use marshmallow in POST for all entities
/ consider refactoring into per-method view classes (eg, flask MethodView)

model:
- 'parent rev' for revisions (vs. container parent)
- "submit" status for editgroups?

tests:
- full object fields actually getting passed e2e (for rich_app)
- implicit editor.active_edit_group behavior
- modify existing release via edit mechanism (and commit)
- redirect a release to another (merge)
- update (via edit) a redirect release
- api: try to reuse an accepted edit group
- api: try to modify an accepted release
- api: multiple edits, same entity, same editgroup

review:
- what does openlibrary API look like?
- hydrate in files for releases... nested good enough?
- add a 'live' (or 'immutable') flag to revision tables
- how to encode proposed redirects? history goes in changelog
    => proposed_ident_action table, which points to edits
    => ident in edit as a partial solution (not redirects)
    => extend edit object to have "to/from" info, and be per-entity

views:
- oldest un-merged edits/edit-groups

later:
- switch extra_json to just be JSONB column
- public IDs are UUID (sqlite hack, or just require postgres)
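
One possible reading of the "sqlite hack" for UUID public IDs: store the UUID as a TEXT primary key, since sqlite has no native UUID type. The table and function names here (`release_ident`, `new_ident`, `create_release`) are hypothetical, not the actual schema.

```python
import sqlite3
import uuid


def new_ident():
    """Generate a random (version 4) UUID, rendered as a string."""
    return str(uuid.uuid4())


def create_release(conn, title):
    """Insert a row keyed by a fresh UUID string and return that ident."""
    ident = new_ident()
    conn.execute(
        "INSERT INTO release_ident (id, title) VALUES (?, ?)",
        (ident, title),
    )
    return ident


# In-memory database, just to exercise the idea.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE release_ident (id TEXT PRIMARY KEY, title TEXT)")
ident = create_release(conn, "example release")
```

With postgres the column could be a native `UUID` type instead, which is the "just require postgres" alternative.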

## High-Level Priorities

- bulk loading of releases, files, containers, creators
- manual editing of containers and releases
- accurate auto-matching of containers (eg, via ISSN)
- full database dump and reload
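
Sketch of ISSN-based container matching from the list above: normalize and checksum-validate the ISSN first, then look it up. The ISSN check digit rule (weights 8 down to 2, mod 11, `X` for 10) is standard; the `issn_index` dict and function names are assumptions for illustration.

```python
import re

# Accept "NNNN-NNNC" or "NNNNNNNC", where C is a digit or X.
ISSN_RE = re.compile(r"^([0-9]{4})-?([0-9]{3})([0-9Xx])$")


def normalize_issn(raw):
    """Return the ISSN as 'NNNN-NNNC', or None if malformed or bad checksum."""
    m = ISSN_RE.match(raw.strip())
    if not m:
        return None
    digits = m.group(1) + m.group(2)
    check = m.group(3).upper()
    # Weighted sum of the first seven digits, weights 8, 7, ..., 2.
    total = sum(int(d) * w for d, w in zip(digits, range(8, 1, -1)))
    expected = (11 - total % 11) % 11
    expected_char = "X" if expected == 10 else str(expected)
    if check != expected_char:
        return None
    return f"{m.group(1)}-{m.group(2)}{check}"


def match_container(issn_index, raw_issn):
    """Look up a container id by ISSN in a {issn: container_id} dict."""
    issn = normalize_issn(raw_issn)
    return issn_index.get(issn) if issn else None
```

Rejecting bad checksums before lookup keeps typos in source metadata from silently matching the wrong container.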

## Planning...

before switching to golang:
x swap extra_json to simple text field
x profile slow bulk imports
  client:
    78% waiting for POST
  api:
    56% / 22ms api_release_create
    36% / 13ms api_work_create
    7% / 4ms container lookup
- flesh out web interface (POST, etc)
    x create release
    => edit existing release
    => edit editgroup (remove edits)
    => approve editgroup
- "model" issues above
- look at "review" issues above
- try cockroach
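
For reference, a minimal way to get per-function timings like the profile numbers above, using the stdlib `cProfile`. This is not necessarily how those numbers were collected; `profile_call` and `fake_import` are stand-in names.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Sort by cumulative time and show the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buf.getvalue()


def fake_import(n):
    """Stand-in for a bulk import loop (hypothetical workload)."""
    return sum(i * i for i in range(n))
```

Usage would be something like `result, report = profile_call(fake_import, 100000)` followed by `print(report)`.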

after switching:
- UUID identifiers
- faster bulk importers (API client; parallel)
- editor accounts
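
Since the profile shows the client spending most of its time waiting on POST, a parallel importer could overlap those waits with a thread pool. A sketch under assumptions: `post_batch` is a hypothetical callable that sends one batch to the API and returns how many records it accepted.

```python
from concurrent.futures import ThreadPoolExecutor


def batched(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def import_parallel(records, post_batch, batch_size=50, workers=4):
    """Send batches concurrently; return the total count reported."""
    batches = list(batched(records, batch_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps results in batch order and re-raises any
        # exception from a worker when its result is consumed.
        results = list(pool.map(post_batch, batches))
    return sum(results)
```

Threads are enough here because the work is I/O-bound; batch size and worker count are guesses and would need tuning against the real API.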