path: root/python/sandcrawler/pdftrio.py
Commit message | Author | Age | Files | Lines
* mypy lint fixes | Bryan Newbold | 2023-01-04 | 1 | -2/+2
* codespell typos in python (comments) | Bryan Newbold | 2021-11-24 | 1 | -1/+1
* pdftrio client: use HTTP session for POSTs | Bryan Newbold | 2021-11-03 | 1 | -1/+1
* make fmt (black 21.9b0) | Bryan Newbold | 2021-10-27 | 1 | -36/+42
* fix type annotations for petabox body fetch helper | Bryan Newbold | 2021-10-26 | 1 | -1/+2
* more progress on type annotations | Bryan Newbold | 2021-10-26 | 1 | -1/+1
* more progress on type annotations and linting | Bryan Newbold | 2021-10-26 | 1 | -9/+20
* start handling trivial lint cleanups: unused imports, 'is None', etc | Bryan Newbold | 2021-10-26 | 1 | -1/+0
* make fmt | Bryan Newbold | 2021-10-26 | 1 | -8/+2
* python: isort all imports | Bryan Newbold | 2021-10-26 | 1 | -1/+2
* differential wayback-error from wayback-content-error | Bryan Newbold | 2020-10-21 | 1 | -1/+0
* workers: refactor to pass key to process() | Bryan Newbold | 2020-06-17 | 1 | -2/+2
* refactor worker fetch code into wrapper class | Bryan Newbold | 2020-06-16 | 1 | -80/+14
* pdftrio: tweaks to avoid connection errors | Bryan Newbold | 2020-02-24 | 1 | -1/+9
* unpaywall2ingestrequest transform script | Bryan Newbold | 2020-02-18 | 1 | -1/+1
* pdftrio: mode controlled by CLI arg | Bryan Newbold | 2020-02-18 | 1 | -4/+5
* pdftrio: fix error nesting in pdftrio key | Bryan Newbold | 2020-02-18 | 1 | -12/+20
* pdftrio fixes from testing | Bryan Newbold | 2020-02-13 | 1 | -3/+9
* move pdf_trio results back under key in JSON/Kafka | Bryan Newbold | 2020-02-13 | 1 | -6/+22
* pdftrio: small fixes from testing | Bryan Newbold | 2020-02-12 | 1 | -2/+2
* pdftrio basic python code | Bryan Newbold | 2020-02-12 | 1 | -0/+158