author: Bryan Newbold <bnewbold@robocracy.org> 2020-01-07 15:23:22 -0800
committer: Bryan Newbold <bnewbold@robocracy.org> 2020-01-07 15:23:22 -0800
commit: 4b7ceb92eb684f4e69286352bd9cf2a9f24099d3
tree: b5a9bdb46ae643b7c1ddd699cdf08a6ae57bf2ee
parent: b289da087453f13571c5570d6be4a3fb4ac08acd
chocula bulk edit note
Diffstat (limited to 'notes')
-rw-r--r-- notes/bulk_edits/2019-12-20_updates.md | 10
-rw-r--r-- notes/bulk_edits/CHANGELOG.md | 5
2 files changed, 15 insertions, 0 deletions
diff --git a/notes/bulk_edits/2019-12-20_updates.md b/notes/bulk_edits/2019-12-20_updates.md
index a8f62ea9..83c8d9da 100644
--- a/notes/bulk_edits/2019-12-20_updates.md
+++ b/notes/bulk_edits/2019-12-20_updates.md
@@ -80,3 +80,13 @@ x fix bad DOI error (real error, skip these)
x remove newline after "unparsable medline date" error
x remove extra line like "existing.ident, existing.ext_ids.pmid, re.ext_ids.pmid))" in warning
+## Chocula
+
+Command:
+
+ export FATCAT_AUTH_WORKER_JOURNAL_METADATA=[...]
+ ./fatcat_import.py chocula /srv/fatcat/datasets/export_fatcat.2019-12-26.json
+
+Result:
+
+ Counter({'total': 144455, 'exists': 139807, 'insert': 2384, 'skip': 2264, 'skip-unknown-new-issnl': 2264, 'exists-by-issnl': 306, 'update': 0})
diff --git a/notes/bulk_edits/CHANGELOG.md b/notes/bulk_edits/CHANGELOG.md
index 773d09ef..2db0c72d 100644
--- a/notes/bulk_edits/CHANGELOG.md
+++ b/notes/bulk_edits/CHANGELOG.md
@@ -9,6 +9,11 @@ this file should probably get merged into the guide at some point.
This file should not turn in to a TODO list!
+## 2020-01
+
+Imported around 2,500 new containers (journals, by ISSN-L) from chocula
+analysis script.
+
## 2019-12
Started continuous harvesting Datacite DOI metadata; first date harvested was