author | Bryan Newbold <bnewbold@robocracy.org> | 2020-12-29 19:22:33 -0800 |
---|---|---|
committer | Bryan Newbold <bnewbold@robocracy.org> | 2020-12-29 19:22:33 -0800 |
commit | 3b0a8b8f9d94fbdbbb0034e46e725b138e7bd712 (patch) | |
tree | 6ea56165f6dbeebaceaaa0054756756ee3ad06aa /notes/bulk_edits/CHANGELOG.md | |
parent | 85732f776c38db7c181f628993a29dcd6776ffde (diff) | |
dblp import notes; bulk edit changelog update
Diffstat (limited to 'notes/bulk_edits/CHANGELOG.md')
-rw-r--r-- | notes/bulk_edits/CHANGELOG.md | 9 |
1 file changed, 8 insertions, 1 deletion
diff --git a/notes/bulk_edits/CHANGELOG.md b/notes/bulk_edits/CHANGELOG.md
index 5f25d769..c5f133f8 100644
--- a/notes/bulk_edits/CHANGELOG.md
+++ b/notes/bulk_edits/CHANGELOG.md
@@ -13,7 +13,14 @@ This file should not turn in to a TODO list!
 
 Updated ORCIDs from 2020 dump. About 2.4 million new `creator` entities.
 
-Imported DOAJ article metadata from a 2020-11 dump.
+Imported DOAJ article metadata from a 2020-11 dump. Crawled and imported
+several hundred thousand file entities matched by DOAJ identifier. Updated
+journal metadata using chocula took (before the release ingest). Filtered out
+fuzzy-matching papers before importing.
+
+Imported dblp from a 2020 snapshot, both containers (primarily for conferences
+lacking an ISSN) and release entities (primarily conference papers). Filtered
+out fuzzy-matching papers before importing.
 
 ## 2020-03
 