index : sandcrawler
path: root/python/sandcrawler/pdftrio.py
Commit message | Author | Age | Files | Lines
mypy lint fixes | Bryan Newbold | 2023-01-04 | 1 | -2/+2
codespell typos in python (comments) | Bryan Newbold | 2021-11-24 | 1 | -1/+1
pdftrio client: use HTTP session for POSTs | Bryan Newbold | 2021-11-03 | 1 | -1/+1
make fmt (black 21.9b0) | Bryan Newbold | 2021-10-27 | 1 | -36/+42
fix type annotations for petabox body fetch helper | Bryan Newbold | 2021-10-26 | 1 | -1/+2
more progress on type annotations | Bryan Newbold | 2021-10-26 | 1 | -1/+1
more progress on type annotations and linting | Bryan Newbold | 2021-10-26 | 1 | -9/+20
start handling trivial lint cleanups: unused imports, 'is None', etc | Bryan Newbold | 2021-10-26 | 1 | -1/+0
make fmt | Bryan Newbold | 2021-10-26 | 1 | -8/+2
python: isort all imports | Bryan Newbold | 2021-10-26 | 1 | -1/+2
differential wayback-error from wayback-content-error | Bryan Newbold | 2020-10-21 | 1 | -1/+0
workers: refactor to pass key to process() | Bryan Newbold | 2020-06-17 | 1 | -2/+2
refactor worker fetch code into wrapper class | Bryan Newbold | 2020-06-16 | 1 | -80/+14
pdftrio: tweaks to avoid connection errors | Bryan Newbold | 2020-02-24 | 1 | -1/+9
unpaywall2ingestrequest transform script | Bryan Newbold | 2020-02-18 | 1 | -1/+1
pdftrio: mode controlled by CLI arg | Bryan Newbold | 2020-02-18 | 1 | -4/+5
pdftrio: fix error nesting in pdftrio key | Bryan Newbold | 2020-02-18 | 1 | -12/+20
pdftrio fixes from testing | Bryan Newbold | 2020-02-13 | 1 | -3/+9
move pdf_trio results back under key in JSON/Kafka | Bryan Newbold | 2020-02-13 | 1 | -6/+22
pdftrio: small fixes from testing | Bryan Newbold | 2020-02-12 | 1 | -2/+2
pdftrio basic python code | Bryan Newbold | 2020-02-12 | 1 | -0/+158