author Bryan Newbold <bnewbold@robocracy.org> 2020-12-29 19:22:33 -0800
committer Bryan Newbold <bnewbold@robocracy.org> 2020-12-29 19:22:33 -0800
commit 3b0a8b8f9d94fbdbbb0034e46e725b138e7bd712
tree 6ea56165f6dbeebaceaaa0054756756ee3ad06aa /notes/bulk_edits/CHANGELOG.md
parent 85732f776c38db7c181f628993a29dcd6776ffde
dblp import notes; bulk edit changelog update
Diffstat (limited to 'notes/bulk_edits/CHANGELOG.md')
-rw-r--r-- notes/bulk_edits/CHANGELOG.md 9
1 files changed, 8 insertions, 1 deletions
diff --git a/notes/bulk_edits/CHANGELOG.md b/notes/bulk_edits/CHANGELOG.md
index 5f25d769..c5f133f8 100644
--- a/notes/bulk_edits/CHANGELOG.md
+++ b/notes/bulk_edits/CHANGELOG.md
@@ -13,7 +13,14 @@ This file should not turn in to a TODO list!
Updated ORCIDs from 2020 dump. About 2.4 million new `creator` entities.
-Imported DOAJ article metadata from a 2020-11 dump.
+Imported DOAJ article metadata from a 2020-11 dump. Crawled and imported
+several hundred thousand file entities matched by DOAJ identifier. Updated
+journal metadata using chocula tool (before the release ingest). Filtered out
+fuzzy-matching papers before importing.
+
+Imported dblp from a 2020 snapshot, both containers (primarily for conferences
+lacking an ISSN) and release entities (primarily conference papers). Filtered
+out fuzzy-matching papers before importing.
## 2020-03