| Commit message | Author | Age | Files | Lines |
* | pubmed: workaround a networking issue | Martin Czygan | 2021-09-09 | 1 | -24/+21 |
* | pubmed: add option to ftp download with lftp | Martin Czygan | 2021-09-08 | 1 | -2/+31 |
* | pubmed harvester: add basic retry logic | Martin Czygan | 2021-08-20 | 1 | -8/+21 |
* | pubmed: update docs | Martin Czygan | 2021-07-17 | 1 | -2/+3 |
* | pubmed: do not fail when accessing missing file | Martin Czygan | 2021-07-17 | 1 | -2/+8 |
* | pubmed: reconnect on error | Martin Czygan | 2021-07-16 | 1 | -4/+30 |
* | small python lint fixes (no behavior change) | Bryan Newbold | 2021-05-25 | 1 | -1/+1 |
* | harvest: datacite API yields HTTP 200 with broken JSON | Martin Czygan | 2020-08-10 | 1 | -1/+8 |
* | arxiv: do retry five times of HTTP 503 | Martin Czygan | 2020-07-10 | 1 | -1/+1 |
* | lint (flake8) tool python files | Bryan Newbold | 2020-07-01 | 4 | -19/+6 |
* | harvest: fail on HTTP 400 | Martin Czygan | 2020-05-29 | 1 | -4/+0 |
* | rename HarvestState.next() to HarvestState.next_span() | Bryan Newbold | 2020-05-26 | 4 | -5/+5 |
* | HACK: skip pylint errors on lines that seem to be fine | Bryan Newbold | 2020-05-22 | 3 | -3/+3 |
* | crossref: switch from index-date to update-date | Bryan Newbold | 2020-03-30 | 1 | -1/+1 |
* | crossref: longer comment about crossref API date fields | Bryan Newbold | 2020-03-30 | 1 | -2/+22 |
* | Merge pull request #53 from EdwardBetts/spelling | bnewbold | 2020-03-27 | 1 | -2/+2 |
|\ |
| * | Correct spelling mistakes | Edward Betts | 2020-03-27 | 1 | -2/+2 |
* | | pubmed: log to stderr | Martin Czygan | 2020-03-10 | 1 | -1/+1 |
* | | pubmed: move mapping generation out of fetch_date | Martin Czygan | 2020-03-10 | 1 | -7/+8 |
* | | harvest: fix imports from HarvestPubmedWorker cleanup | Martin Czygan | 2020-03-10 | 1 | -2/+2 |
* | | pubmed: citations is a bit more precise | Martin Czygan | 2020-03-09 | 1 | -1/+1 |
* | | pubmed: we sync from FTP | Martin Czygan | 2020-03-09 | 1 | -1/+1 |
* | | oaipmh: HarvestPubmedWorker obsoleted by PubmedFTPWorker | Martin Czygan | 2020-03-09 | 1 | -34/+0 |
* | | more pubmed adjustments | Martin Czygan | 2020-02-22 | 2 | -70/+118 |
* | | pubmed ftp: fix url | Martin Czygan | 2020-02-19 | 1 | -4/+6 |
* | | pubmed ftp harvest and KafkaBs4XmlPusher | Martin Czygan | 2020-02-19 | 2 | -0/+214 |
|/ |
* | harvest: log state on startup and use stderr for diagnostics | Martin Czygan | 2020-02-14 | 3 | -17/+22 |
* | datacite: extend range search query | Martin Czygan | 2019-12-27 | 1 | -1/+1 |
* | avoid usage of short links | Martin Czygan | 2019-12-27 | 1 | -2/+2 |
* | Datacite API v2 throws 400, we cannot recover from, currently. | Martin Czygan | 2019-12-27 | 1 | -0/+4 |
* | datacite: update documentation, add links to issues | Martin Czygan | 2019-12-27 | 1 | -10/+5 |
* | datacite: use v2 of the API (flaky) | Martin Czygan | 2019-12-27 | 1 | -5/+28 |
* | refactor kafka producer in crossref harvester | Bryan Newbold | 2019-12-06 | 1 | -21/+26 |
* | crossref is_update isn't what I thought | Bryan Newbold | 2019-12-03 | 1 | -6/+2 |
* | review/fix all confluent-kafka produce code | Bryan Newbold | 2019-09-20 | 3 | -14/+49 |
* | small fixes to confluent-kafka importers/workers | Bryan Newbold | 2019-09-20 | 2 | -2/+2 |
* | small kafka tweaks for robustness | Bryan Newbold | 2019-09-20 | 1 | -0/+2 |
* | bump max message size to ~20 MBytes | Bryan Newbold | 2019-09-20 | 2 | -0/+2 |
* | fixes to confluent-kafka harvesters | Bryan Newbold | 2019-09-20 | 3 | -20/+21 |
* | first draft harvesters using confluent-kafka | Bryan Newbold | 2019-09-20 | 3 | -48/+104 |
* | increase default harvest window to 14 days | Bryan Newbold | 2019-04-01 | 1 | -2/+2 |
* | HACK: force pylint to ignore urllib3 Retry import | Bryan Newbold | 2019-03-15 | 1 | -1/+3 |
* | MEDLINE/Pubmed note | Bryan Newbold | 2019-03-15 | 1 | -2/+6 |
* | fix harvester session.get() params | Bryan Newbold | 2019-03-06 | 1 | -5/+8 |
* | retry/backoff for Crossref harvester | Bryan Newbold | 2019-03-06 | 2 | -2/+24 |
* | bunch of lint/whitespace cleanups | Bryan Newbold | 2019-02-22 | 3 | -9/+6 |
* | check request status codes idiomatically | Bryan Newbold | 2018-12-29 | 1 | -2/+2 |
* | clean up harvester comments/docs | Bryan Newbold | 2018-11-21 | 3 | -50/+31 |
* | use isoformat() to format dates | Bryan Newbold | 2018-11-21 | 2 | -4/+4 |
* | fix loop_sleep typo | Bryan Newbold | 2018-11-21 | 2 | -2/+2 |