path: root/extra/bulk_edits/2022-04-20_isiarticles.md

See metadata cleanups for context. In short, a few tens of thousands of sample/spam article PDFs are hosted on the domain isiarticles.com. The edits below un-link these files from their releases and mark them `content_scope=sample`.

## Prod Updates

Start small, with a batch of 50 files:

    export FATCAT_API_HOST=https://api.fatcat.wiki
    export FATCAT_AUTH_WORKER_CLEANUP=[...]
    export FATCAT_API_AUTH_TOKEN=$FATCAT_AUTH_WORKER_CLEANUP

    fatcat-cli search file domain:isiarticles.com --entity-json -n0 \
        | rg -v '"content_scope"' \
        | rg 'isiarticles.com/' \
        | head -n50 \
        | pv -l \
        | fatcat-cli batch update file release_ids= content_scope=sample --description 'Un-link and mark isiarticles PDFs as content_scope=sample' --auto-accept
    # editgroup_ihx75kzsebgzfisgjrv67zew5e
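
Before moving on to a larger run, it can help to check how many entities the two filters actually select. The following is a minimal sketch, not part of the original procedure: it uses `grep` as a stand-in for `rg`, and the function name `filter_unscoped_isiarticles` is hypothetical. The commented-out dry run reuses only the search flags already shown above and makes no edits.

```shell
# Hypothetical helper: reads one JSON entity per line on stdin and keeps only
# those that have no "content_scope" key and mention an isiarticles.com URL.
# Mirrors the two rg filters above; grep -F matches the domain literally.
filter_unscoped_isiarticles() {
    grep -v '"content_scope"' | grep -F 'isiarticles.com/'
}

# Dry-run count (assumes fatcat-cli is installed and the API is reachable):
# fatcat-cli search file domain:isiarticles.com --entity-json -n0 \
#     | filter_unscoped_isiarticles \
#     | wc -l
```

Running the same count again after a batch gives a rough check that the number of remaining unscoped files has dropped, keeping in mind that the search index may lag behind the edits.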

The full batch:

    fatcat-cli search file domain:isiarticles.com --entity-json -n0 \
        | rg -v '"content_scope"' \
        | rg 'isiarticles.com/' \
        | pv -l \
        | fatcat-cli batch update file release_ids= content_scope=sample --description 'Un-link and mark isiarticles PDFs as content_scope=sample' --auto-accept