diff options
author | Martin Czygan <martin@archive.org> | 2020-11-24 19:29:07 +0000 |
---|---|---|
committer | Martin Czygan <martin@archive.org> | 2020-11-24 19:29:07 +0000 |
commit | cfd13852d7cb58fcc3387373960adaf3680f0faf (patch) | |
tree | 675954b8b34324fe22fc5a00f3fbb99a21a77a21 /python/README_import.md | |
parent | fcfcd3224a113fa90da2045a3c7fe90127088ebe (diff) | |
parent | 1fca5a9822944d0646d2dcba6cf54f27a0ffe5c0 (diff) | |
download | fatcat-cfd13852d7cb58fcc3387373960adaf3680f0faf.tar.gz fatcat-cfd13852d7cb58fcc3387373960adaf3680f0faf.zip |
Merge branch 'bnewbold-doaj-metadata' into 'master'
DOAJ article metadata import
See merge request webgroup/fatcat!89
Diffstat (limited to 'python/README_import.md')
-rw-r--r-- | python/README_import.md | 7 |
1 files changed, 7 insertions, 0 deletions
diff --git a/python/README_import.md b/python/README_import.md index 65c08f8b..71b15eee 100644 --- a/python/README_import.md +++ b/python/README_import.md @@ -126,3 +126,10 @@ Run import in parallel: zcat /srv/fatcat/datasets/crossref-pre-1923-scholarly-works.matched.json.gz | time parallel -j12 --round-robin --pipe ./fatcat_import.py matched - --default-mime 'application/pdf' +## DOAJ + +Takes a few hours. + + export FATCAT_API_AUTH_TOKEN=... (FATCAT_AUTH_WORKER_DOAJ) + + zcat /srv/fatcat/datasets/doaj_article_data_2020-11-13_all.json.gz | pv -l | parallel -j12 --round-robin --pipe ./fatcat_import.py doaj-article --issn-map-file /srv/fatcat/datasets/ISSN-to-ISSN-L.txt - |