path: root/python/fatcat_tools/workers
Commit message  Author  Date  Files  Lines
* Merge branch 'bnewbold-container-web' into 'master'  bnewbold  2022-03-10  1  -2/+2
|\
| * move container_status ES query code from fatcat_web to fatcat_tools  Bryan Newbold  2022-02-09  1  -2/+2
* | entity updates: don't try to ingest arxiv DOIs (for now)  Bryan Newbold  2022-02-28  1  -0/+2
|/
* entity worker: expand creators in release entities  Bryan Newbold  2021-12-15  1  -1/+1
* small default config typo fixes for elasticsearch workers  Bryan Newbold  2021-12-15  1  -2/+2
* file elasticsearch index worker  Bryan Newbold  2021-12-15  2  -1/+35
* typing: add assertions to fatcat_tool code to make type assumptions explicit  Bryan Newbold  2021-11-03  1  -0/+1
* typing: add annotations to remaining fatcat_tools code  Bryan Newbold  2021-11-03  3  -51/+70
* re-fix some lint issues after big 'fmt'  Bryan Newbold  2021-11-02  1  -2/+3
* fmt (black): fatcat_tools/  Bryan Newbold  2021-11-02  3  -196/+263
* python: isort everything  Bryan Newbold  2021-11-02  1  -1/+2
* hacks to work around new pylint false positives  Bryan Newbold  2021-11-02  1  -2/+3
* cleanup imports after fatcat_tools.transforms change  Bryan Newbold  2021-11-02  1  -5/+8
* re-fmt all the fatcat_tools __init__ files for readability  Bryan Newbold  2021-11-02  1  -3/+6
* changelog worker: fix file/fileset typo, caught by lint  Bryan Newbold  2021-05-25  1  -1/+1
* es worker: ensure kafka messages get cleared  Bryan Newbold  2021-04-12  1  -0/+2
* es indexing: more 'wip' fixes  Bryan Newbold  2021-04-12  1  -1/+5
* ES indexing: skip 'wip' entities with a warning  Bryan Newbold  2021-04-12  1  -11/+16
* container ES index worker: support for querying status  Bryan Newbold  2021-04-06  1  -5/+32
* indexing: don't use document names  Bryan Newbold  2021-04-06  1  -14/+4
* entity update worker: treat fileset and webcapture updates like file updates  Bryan Newbold  2020-12-16  1  -3/+25
* entity updates: don't ingest JSTOR DOI prefixes  Bryan Newbold  2020-10-23  1  -0/+2
* entity updater: new work update feed (ident and changelog metadata only)  Bryan Newbold  2020-10-16  1  -2/+24
* ingest: default to crawl protocols.io DOIs  Bryan Newbold  2020-09-10  1  -0/+2
* entity updater: handle doi=None case better  Bryan Newbold  2020-08-14  1  -1/+1
* entity updater: es['publisher_type'] not always set  Bryan Newbold  2020-08-14  1  -1/+1
* entity update: change big5 ingest behavior  Bryan Newbold  2020-08-11  1  -9/+15
* entity update: default to ingest non-OA works  Bryan Newbold  2020-08-11  1  -9/+10
* entity update: skip ingest of figshare+zenodo 'group' DOIs  Bryan Newbold  2020-08-11  1  -0/+15
* update crawl blocklist for SPNv2 requests which mostly fail  Bryan Newbold  2020-08-10  1  -2/+10
* lint (flake8) tool python files  Bryan Newbold  2020-07-01  3  -12/+0
* more changelog ES fixes  Bryan Newbold  2020-04-17  1  -4/+6
* ES changelog worker: fixes for ident; fetch update from API if needed  Bryan Newbold  2020-04-17  1  -2/+9
* Merge branch 'martin-changelog-to-es' into 'master'  bnewbold  2020-04-17  2  -2/+23
|\
| * derive changelog worker from release worker  Martin Czygan  2020-04-17  2  -2/+23
* | changelog: limit types  Martin Czygan  2020-04-16  1  -5/+1
* | changelog: extend release_types considered documents  Martin Czygan  2020-04-16  1  -10/+19
|/
* ingest: more DOI patterns to treat as OA  Bryan Newbold  2020-03-28  1  -0/+26
* ingest: always try some lancet journals  Bryan Newbold  2020-03-19  1  -0/+3
* entity worker: ingest more releases  Bryan Newbold  2020-02-22  1  -1/+37
* always crawl researchgate DOIs  Bryan Newbold  2020-02-18  1  -0/+2
* add acceptlist override for biorxiv/medrxiv  Bryan Newbold  2020-02-10  1  -2/+12
* fix KafkaError worker reporting for partition errors  Bryan Newbold  2020-01-29  2  -2/+2
* additional DOI prefix filters  Bryan Newbold  2020-01-28  1  -0/+8
* apply ingest request filtering in entity worker  Bryan Newbold  2020-01-28  1  -3/+34
* update ingest request schema  Bryan Newbold  2019-12-13  1  -1/+1
* project -> ingest_request_source  Bryan Newbold  2019-11-15  1  -1/+1
* add ingest request feature to entity_updates worker  Bryan Newbold  2019-11-15  1  -4/+20
* review/fix all confluent-kafka produce code  Bryan Newbold  2019-09-20  2  -12/+26
* small fixes to confluent-kafka importers/workers  Bryan Newbold  2019-09-20  3  -12/+41