path: root/fatcat_scholar/worker.py
Commit message | Author | Age | Files | Lines
* lint fixes, and run fmt | Bryan Newbold | 2021-06-02 | 1 | -8/+2
* schema: add 'crossref' to bundle schema, and add from_json() helper | Bryan Newbold | 2021-06-02 | 1 | -13/+2
    from_json() refactor was an earlier TODO, to reduce duplication when updating fields on this class
* increase indexing timeout to 50sec (60sec for kafka) | Bryan Newbold | 2021-02-15 | 1 | -2/+2
* enable sentry exceptions for workers and pipelines | Bryan Newbold | 2021-01-30 | 1 | -1/+10
    It is otherwise difficult to debug multi-million record pipelines.
* refactor ES configuration setting names | Bryan Newbold | 2021-01-25 | 1 | -2/+2
* worker: switch to ES helper for bulk indexing | Bryan Newbold | 2021-01-18 | 1 | -8/+10
    This seems to resolve the problems with index workers failing after a couple hundred docs.
* worker: check for error responses from ES | Bryan Newbold | 2021-01-05 | 1 | -1/+4
* basic HTML transform/index support | Bryan Newbold | 2020-11-18 | 1 | -0/+1
* initial fetch and index workers | Bryan Newbold | 2020-10-16 | 1 | -0/+238