path: root/python/fatcat_tools/harvest/pubmed.py
Commit message | Author | Age | Files | Lines
* pubmed: log to stderr | Martin Czygan | 2020-03-10 | 1 | -1/+1
* pubmed: move mapping generation out of fetch_date | Martin Czygan | 2020-03-10 | 1 | -7/+8
    * fetch_date will fail on missing mapping
    * adjust tests (test will require access to pubmed ftp)
* pubmed: citations is a bit more precise | Martin Czygan | 2020-03-09 | 1 | -1/+1
    > Each day, NLM produces update files that include new, revised and deleted citations.
    -- ftp://ftp.ncbi.nlm.nih.gov/pubmed/updatefiles/README.txt
* pubmed: we sync from FTP | Martin Czygan | 2020-03-09 | 1 | -1/+1
* more pubmed adjustments | Martin Czygan | 2020-02-22 | 1 | -70/+117
    * regenerate map in continuous mode
    * add tests
* pubmed ftp: fix url | Martin Czygan | 2020-02-19 | 1 | -4/+6
* pubmed ftp harvest and KafkaBs4XmlPusher | Martin Czygan | 2020-02-19 | 1 | -0/+199
    * add PubmedFTPWorker
    * utils are currently stored alongside pubmed (e.g. ftpretr, xmlstream) but may live elsewhere, as they are more generic
    * add KafkaBs4XmlPusher