Diffstat (limited to 'notes/bulk_edits/CHANGELOG.md')
-rw-r--r-- | notes/bulk_edits/CHANGELOG.md | 24 |
1 files changed, 11 insertions, 13 deletions
diff --git a/notes/bulk_edits/CHANGELOG.md b/notes/bulk_edits/CHANGELOG.md
index 80760938..2db0c72d 100644
--- a/notes/bulk_edits/CHANGELOG.md
+++ b/notes/bulk_edits/CHANGELOG.md
@@ -9,8 +9,19 @@ this file should probably get merged into the guide at some point.
 
 This file should not turn in to a TODO list!
 
+## 2020-01
+
+Imported around 2,500 new containers (journals, by ISSN-L) from chocula
+analysis script.
+
 ## 2019-12
 
+Started continuous harvesting Datacite DOI metadata; first date harvested was
+`2019-12-13`. No importer running yet.
+
+Imported about 3.3m new ORCID identifiers from 2019 bulk dump (after converting
+from XML to JSON): <https://archive.org/details/orcid-dump-2019>
+
 Inserted about 154k new arxiv release entities. Still no automatic daily
 harvesting.
 
@@ -45,22 +56,9 @@ invalid ISSN checksum).
 
 Imported files (matched to releases by DOI) from Semantic Scholar
 (`DIRECT-OA-CRAWL-2019` crawl).
 
- Arabesque importer
- crawl-bot
- `s2_doi.sqlite`
- TODO: archive.org link
- TODO: rough count
- TODO: date
-
 Imported files (matched to releases by DOI) from pre-1923/pre-1909 items
 uploaded by a user to archive.org.
 
- Matched importer
- internetarchive-bot (TODO:)
- TODO: archive.org link
- TODO: counts
- TODO: date
-
 Imported files (matched to releases by DOI) from CORE.ac.uk
 (`DIRECT-OA-CRAWL-2019` crawl).