field | value | date
---|---|---
author | Bryan Newbold <bnewbold@robocracy.org> | 2020-01-07 15:23:22 -0800
committer | Bryan Newbold <bnewbold@robocracy.org> | 2020-01-07 15:23:22 -0800
commit | 4b7ceb92eb684f4e69286352bd9cf2a9f24099d3 |
tree | b5a9bdb46ae643b7c1ddd699cdf08a6ae57bf2ee /notes |
parent | b289da087453f13571c5570d6be4a3fb4ac08acd |
chocula bulk edit note
Diffstat (limited to 'notes')
-rw-r--r-- | notes/bulk_edits/2019-12-20_updates.md | 10
-rw-r--r-- | notes/bulk_edits/CHANGELOG.md | 5
2 files changed, 15 insertions, 0 deletions
diff --git a/notes/bulk_edits/2019-12-20_updates.md b/notes/bulk_edits/2019-12-20_updates.md
index a8f62ea9..83c8d9da 100644
--- a/notes/bulk_edits/2019-12-20_updates.md
+++ b/notes/bulk_edits/2019-12-20_updates.md
@@ -80,3 +80,13 @@ x fix bad DOI error (real error, skip these)
 x remove newline after "unparsable medline date" error
 x remove extra line like "existing.ident, existing.ext_ids.pmid, re.ext_ids.pmid))" in warning
 
+## Chocula
+
+Command:
+
+    export FATCAT_AUTH_WORKER_JOURNAL_METADATA=[...]
+    ./fatcat_import.py chocula /srv/fatcat/datasets/export_fatcat.2019-12-26.json
+
+Result:
+
+    Counter({'total': 144455, 'exists': 139807, 'insert': 2384, 'skip': 2264, 'skip-unknown-new-issnl': 2264, 'exists-by-issnl': 306, 'update': 0})
diff --git a/notes/bulk_edits/CHANGELOG.md b/notes/bulk_edits/CHANGELOG.md
index 773d09ef..2db0c72d 100644
--- a/notes/bulk_edits/CHANGELOG.md
+++ b/notes/bulk_edits/CHANGELOG.md
@@ -9,6 +9,11 @@ this file should probably get merged into the guide at some point.
 
 This file should not turn in to a TODO list!
 
+## 2020-01
+
+Imported around 2,500 new containers (journals, by ISSN-L) from chocula
+analysis script.
+
 ## 2019-12
 
 Started continuous harvesting Datacite DOI metadata; first date harvested was
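
The counters in the recorded import result can be cross-checked with a few lines of Python. This is a minimal sketch, not part of the commit. It assumes that `exists`, `insert`, and `skip` are mutually exclusive top-level outcomes that add up to `total`, and that the hyphenated keys (`skip-unknown-new-issnl`, `exists-by-issnl`) are sub-reasons of those outcomes. The importer's actual counting semantics are not stated in the note, so treat that grouping as a guess.

```python
from collections import Counter

# Counts copied verbatim from the "Result:" line in the bulk edit note.
result = Counter({
    'total': 144455,
    'exists': 139807,
    'insert': 2384,
    'skip': 2264,
    'skip-unknown-new-issnl': 2264,
    'exists-by-issnl': 306,
    'update': 0,
})

# Assumption: these three keys partition 'total'. The hyphenated keys are
# treated as sub-reasons and are deliberately left out of the sum.
top_level = ('exists', 'insert', 'skip')
accounted = sum(result[key] for key in top_level)

print("top-level outcomes account for total:", accounted == result['total'])
print("all skips are unknown new ISSN-L:",
      result['skip'] == result['skip-unknown-new-issnl'])
print(f"share of records inserted: {result['insert'] / result['total']:.2%}")
```

Under that assumption the three outcomes do sum to the total. The 2,384 inserts are the figure the CHANGELOG entry rounds to "around 2,500 new containers".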