Diffstat (limited to 'extra/bulk_edits')
-rw-r--r-- | extra/bulk_edits/2022-04-20_isiarticles.md | 26
-rw-r--r-- | extra/bulk_edits/CHANGELOG.md | 8
2 files changed, 34 insertions, 0 deletions
diff --git a/extra/bulk_edits/2022-04-20_isiarticles.md b/extra/bulk_edits/2022-04-20_isiarticles.md
new file mode 100644
index 00000000..ca2cc6f9
--- /dev/null
+++ b/extra/bulk_edits/2022-04-20_isiarticles.md
@@ -0,0 +1,26 @@
+
+See metadata cleanups for context. Basically a couple tens of thousands of sample/spam articles hosted on the domain isiarticles.com.
+
+## Prod Updates
+
+Start small:
+
+    export FATCAT_API_HOST=https://api.fatcat.wiki
+    export FATCAT_AUTH_WORKER_CLEANUP=[...]
+    export FATCAT_API_AUTH_TOKEN=$FATCAT_AUTH_WORKER_CLEANUP
+
+    fatcat-cli search file domain:isiarticles.com --entity-json -n0 \
+        | rg -v '"content_scope"' \
+        | rg 'isiarticles.com/' \
+        | head -n50 \
+        | pv -l \
+        | fatcat-cli batch update file release_ids= content_scope=sample --description 'Un-link and mark isiarticles PDFs as content_scope=sample' --auto-accept
+    # editgroup_ihx75kzsebgzfisgjrv67zew5e
+
+The full batch:
+
+    fatcat-cli search file domain:isiarticles.com --entity-json -n0 \
+        | rg -v '"content_scope"' \
+        | rg 'isiarticles.com/' \
+        | pv -l \
+        | fatcat-cli batch update file release_ids= content_scope=sample --description 'Un-link and mark isiarticles PDFs as content_scope=sample' --auto-accept
diff --git a/extra/bulk_edits/CHANGELOG.md b/extra/bulk_edits/CHANGELOG.md
index b6bfcb96..94a32947 100644
--- a/extra/bulk_edits/CHANGELOG.md
+++ b/extra/bulk_edits/CHANGELOG.md
@@ -9,6 +9,14 @@ this file should probably get merged into the guide at some point.
 
 This file should not turn in to a TODO list!
 
+## 2022-04
+
+Imported some initial fileset entities.
+
+Updated about 25k file entities from isiarticles.com, which are samples (spam
+for translation service) to remove release linkage and set
+`content_scope=sample` (similar to the springer "page one" case).
+
 ## 2022-03
 
 Ran a journal-level metadata update, using chocula.
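
Reviewer note: the two `rg` stages in the pipeline match on raw text, so it may help to see the same selection written against parsed JSON. The following is a minimal sketch only, not part of the commit. It assumes each line of `--entity-json` output is one file entity with a `urls` list of objects carrying a `url` field, and an optional top-level `content_scope` key. The names `needs_update` and `select_candidates` are hypothetical.

```python
import json

# Substring the second `rg` stage looks for in each line.
DOMAIN_FRAGMENT = "isiarticles.com/"


def needs_update(entity: dict) -> bool:
    """Approximate the two `rg` filters on a parsed file entity.

    Assumption: entities carry a top-level `content_scope` key only when
    a scope has already been set, and a `urls` list of {"url": ...} dicts.
    """
    # Mirrors `rg -v '"content_scope"'`: skip entities where the key is present.
    if "content_scope" in entity:
        return False
    # Mirrors `rg 'isiarticles.com/'`, but restricted to the URL fields.
    urls = entity.get("urls") or []
    return any(DOMAIN_FRAGMENT in (u.get("url") or "") for u in urls)


def select_candidates(lines):
    """Yield parsed entities from JSON lines that still need the update."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the stream
        entity = json.loads(line)
        if needs_update(entity):
            yield entity
```

Feeding the search output through `select_candidates` and counting the results before running the batch update would give a dry-run estimate comparable to what `pv -l` reports. Unlike the text match, this version only inspects the `urls` field, so it would not select an entity that mentions the domain elsewhere in its metadata.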