| field | value | date |
|---|---|---|
| author | Bryan Newbold <bnewbold@robocracy.org> | 2019-05-15 10:16:40 -0700 |
| committer | Bryan Newbold <bnewbold@robocracy.org> | 2019-05-15 10:16:40 -0700 |
| commit | ffee70b116f2683ca24e8046144fa078f2964774 | |
| tree | 628539c703de198b556bf368b21d7f7b8f3fb574 | |
| parent | 4f0be9ae6447073ffe252376d88228083f01f837 | |
TODO progress (v0.3)
| -rw-r--r-- | TODO.md | 49 | 
1 files changed, 34 insertions, 15 deletions
@@ -1,10 +1,26 @@

 ## In Progress

-- update existing 1.5 mil longtail OA PDFs with container/ISSN-L
+x webcapture `size_bytes`/`size (consistency with file and fileset)
+x final decision on `version` field
+    => useful for repositories with multiple versions as incrementing integers
+    => also useful for "unstructuring" some identifiers (arxiv, zenodo DOIs)
+    => but CSL wants to use it (only?) for software versions
+    => what about book editions, or draft revisions?
+    => let's keep, but carefully document scope
+x verifiers for all extid types (including new ark, mag)
+x creation of editgroup via auto_batch needs extra checks
+- test: edit_extra set for each entity type
+- merge new importers branch
+    => fix schema changes
+    => use new schema fields
+    => tests
+- update guide with new schema
+- elasticsearch schema changes (and transforms)

 ## Next Up

+- update existing 1.5 mil longtail OA PDFs with container/ISSN-L

 ## Bugs

@@ -17,24 +33,29 @@

 Changes to SQL (and swagger):

-- structured names in contribs (given/sur)
-- `release_status` => `release_stage`
-- `withdrawn_date`, `withdrawn_state`, and retraction as a release stage
-- subtitle as a string field
+X missing SQL indices: `ENTITY_edit.editgroup_id, ENTITY_edit.ident_id`
+X structured names in contribs (given/sur)
+X `release_status` => `release_stage`
+X size on webcapture CDX lines (we fetch for sha256 anyways, so easy to calculate)
+X `ark_id` release identifier
+X `mag_id` (microsoft academic graph) release identifier
+
+X `withdrawn_date`, `withdrawn_state`, and retraction as a release stage
+    => and `withdrawn_year`?
+X subtitle as a string field
     => but what about translation? `original_subtitle`? just combine them?
     => combine in elasticsearch 'title' field
-- size on webcapture CDX lines (we fetch for sha256 anyways, so easy to calculate)
-- `ark_id` release identifier
-- `mag_id` (microsoft academic graph) release identifier
-- releases: 'number' (eg, report numbers) and 'version' (for numbered variants) fields
-- missing SQL indices: `ENTITY_edit.editgroup_id, ENTITY_edit.ident_id`
+X releases: 'number' (eg, report numbers) and 'version' (for numbered variants) fields

 Changes to swagger only:

-- edit URLs: `editgroup_id` in URL, not a query param
+- refactor entity mutation (CUD) endpoints to be like `/editgroup/{editgroup_id}/release/{ident}`
+    => changes editgroup_id from query param to URL param
 - changelog API endpoint should needs expand=editors option
+    => editors in a bunch of other return types also?
 - include 'created' in editgroup object (already in SQL)
-- FileEntityUrls => FileEntityUrl (and similar)
+x FileEntityUrls => FileEntityUrl (and similar)
+? refactor bulk POST to include editgroup plus array of entity objects (instead of just a couple fields as query params)

 ## Next Full Release "Touch"

@@ -195,9 +216,7 @@ new importers:

 ## API Schema / Design

-- refactor entity mutation (CUD) endpoints to be like `/editgroup/{editgroup_id}/release/{ident}`
-    => changes editgroup_id from query param to URL param
-- refactor bulk POST to include editgroup plus array of entity objects (instead of just a couple fields as query params)
+- `release_month` field. for journals, having the year and month but not day is relatively common (citation needed)

 ## Web Interface
