Diffstat (limited to 'proposals')
-rw-r--r--  proposals/20200807_dblp.md  17
1 files changed, 10 insertions, 7 deletions
diff --git a/proposals/20200807_dblp.md b/proposals/20200807_dblp.md
index b955268f..8569712e 100644
--- a/proposals/20200807_dblp.md
+++ b/proposals/20200807_dblp.md
@@ -35,17 +35,20 @@ Fulltext ingest:
 
 ## Plan
 
-- get martin review of this plan
+x get martin review of this plan
 x read full XML DTD
-- scrape container metadata (for ~6k containers): ISSN, Wikidata QID, name
+x scrape container metadata (for ~6k containers): ISSN, Wikidata QID, name
     => selectolax?
-    => title, issn, wikidata, "is OA"
-- implement basic release import, with tests (no container/creator linking)
+    => title, issn, wikidata
+x implement basic release import, with tests (no container/creator linking)
     => surface any unexpected issues
-- estimate number of entities with/without external identifier (DOI)
+x estimate number of entities with/without external identifier (DOI)
+    Counter({'total': 7953365, 'has-doi': 4277307, 'skip': 2953841, 'skip-key-type': 2640968, 'skip-arxiv-corr': 312872, 'skip-title': 1, 'insert': 0, 'update': 0, 'exists': 0})
+/ update container and creator schemas to have lookup-able dblp identifiers (creator:`dblp_pid`, container:`dblp_prefix`)
+. run orcid import/update of creators
+- container creator/update for `dblp_prefix`
+    => chocula import first?
 - investigate journal+conference ISSN mapping
-- run orcid import/update of creators
-- update container and creator schemas to have lookup-able dblp identifiers (creator:`dblp_pid`, container:`dblp_prefix`)
 
 ## Creator Metadata
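The `Counter` output added to the plan can be sanity-checked with a few lines of Python. This is only a sketch of the arithmetic: the numbers are copied verbatim from the diff, while the variable names and the reading of `skip` as "records excluded before import" are assumptions, not something the proposal states.

```python
from collections import Counter

# Counts copied verbatim from the Counter line added in the diff.
counts = Counter({
    "total": 7953365,
    "has-doi": 4277307,
    "skip": 2953841,
    "skip-key-type": 2640968,
    "skip-arxiv-corr": 312872,
    "skip-title": 1,
    "insert": 0,
    "update": 0,
    "exists": 0,
})

# The skip-* sub-counters should add up to the overall "skip" count.
skip_parts = sum(v for k, v in counts.items() if k.startswith("skip-"))
assert skip_parts == counts["skip"]

# Assumption: records not counted under "skip" are the import candidates.
candidates = counts["total"] - counts["skip"]

# Share of all records seen that carry a DOI.
doi_share_of_total = counts["has-doi"] / counts["total"]

print(f"candidates after skips: {candidates}")
print(f"share of all records with a DOI: {doi_share_of_total:.1%}")
```

Whether `has-doi` is tallied before or after the skip filters is not visible in the diff, so the share is computed against `total` only.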
