| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | fetch pdftotext and pdf_meta from blobs, postgrest | Bryan Newbold | 2020-06-29 | 1 | -18/+45 |
* | flake8 fixes (partial) | Bryan Newbold | 2020-06-03 | 1 | -5/+2 |
* | reformat python code with black | Bryan Newbold | 2020-06-03 | 1 | -68/+120 |
* | more petabox timeout handling | Bryan Newbold | 2020-05-21 | 1 | -0/+3 |
* | handle petabox read timeouts a bit | Bryan Newbold | 2020-05-21 | 1 | -1/+6 |
* | fix typo with UnicodeDecodeError catch | Bryan Newbold | 2020-05-21 | 1 | -1/+1 |
* | skip pdftotext loading on unicode error | Bryan Newbold | 2020-05-20 | 1 | -0/+2 |
* | skip SIM items w/o page_numbers (instead of asserting) | Bryan Newbold | 2020-05-20 | 1 | -1/+3 |
* | fixes from manual testing | Bryan Newbold | 2020-05-20 | 1 | -8/+13 |
* | local pdftotext cache dir hack | Bryan Newbold | 2020-05-20 | 1 | -1/+18 |
* | fixes to release+sim pipeline | Bryan Newbold | 2020-05-20 | 1 | -10/+16 |
* | first pass transform from pipelines to ES schema | Bryan Newbold | 2020-05-20 | 1 | -16/+1 |
* | WIP on SIM pipeline | Bryan Newbold | 2020-05-19 | 1 | -2/+2 |
* | WIP on release-to-sim fetching | Bryan Newbold | 2020-05-19 | 1 | -12/+49 |
* | initial progress on work pipeline | Bryan Newbold | 2020-05-16 | 1 | -0/+305 |