| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | pipenv: add pymupdf; update trafilatura | Bryan Newbold | 2021-12-15 | 1 | -1/+2 |
* | Revert "pipenv: update deps" | Bryan Newbold | 2021-12-01 | 1 | -1/+1 |
| | This reverts commit 7a5b203dbb37958a452eb1be3bd1bf8ed94cbbce. There is a problem with `internetarchive` 2.2.0, so reverting for now. | | | | |
* | pipenv: update deps | Bryan Newbold | 2021-12-01 | 1 | -1/+1 |
* | pipenv: bump grobid_tei_xml version to 0.1.2 | Bryan Newbold | 2021-11-04 | 1 | -1/+1 |
* | pipenv: flipflop from yapf back to black; more type packages; bump grobid_tei_xml | Bryan Newbold | 2021-10-27 | 1 | -3/+9 |
* | pipenv: import type annotations for requests and dateparser | Bryan Newbold | 2021-10-26 | 1 | -0/+2 |
* | pipenv: general update; add isort, yapf (over black), grobid_tei_xml | Bryan Newbold | 2021-10-26 | 1 | -7/+3 |
* | pipenv: lock minio S3 library to <7.0.0 | Bryan Newbold | 2021-01-14 | 1 | -1/+1 |
| | In this upstream commit: https://github.com/minio/minio-py/commit/b81883a98e6f8a09e2903609caabbf0956dd0ec9 the API for errors changes, which makes it harder for us to catch specific exceptions (such as "NoSuchKey" as a Not Found / 404 error). Instead of refactoring, just going to pin the library. We should probably replace this library with a non-implementation-specific S3 client at some point; minio seems simpler than, e.g., boto3, but there is probably something even simpler out there. | | | | |
* | update to python3.8 | Bryan Newbold | 2021-01-05 | 1 | -1/+1 |
* | remove unused pytype tool | Bryan Newbold | 2020-11-06 | 1 | -1/+3 |
| | Having trouble getting this to install on Xenial, and we aren't even using it in tests/lint yet. Can revisit after Focal upgrade. | | | | |
* | pipenv: fix lock file; add zstandard; update wayback+gwb deps | Bryan Newbold | 2020-11-04 | 1 | -2/+3 |
* | pipenv: braveblocker, dynaconf, sentry-sdk | Bryan Newbold | 2020-10-29 | 1 | -4/+3 |
* | new dependencies for HTML metadata parsing | Bryan Newbold | 2020-10-27 | 1 | -0/+8 |
* | pipenv: python-poppler 0.2.1 | Bryan Newbold | 2020-06-25 | 1 | -1/+1 |
* | pipenv: correct poppler; update lockfile | Bryan Newbold | 2020-06-16 | 1 | -1/+1 |
* | pipenv: flake8, pytype, black | Bryan Newbold | 2020-06-16 | 1 | -0/+7 |
* | pipenv: pillow and poppler (for PDF extraction) | Bryan Newbold | 2020-06-16 | 1 | -0/+2 |
* | pipenv: remove old python3.5 cruft; add mypy | Bryan Newbold | 2020-05-26 | 1 | -7/+2 |
* | pipenv: update to python3.7 | Bryan Newbold | 2020-04-15 | 1 | -1/+1 |
* | pipenv: work around zipp issue | Bryan Newbold | 2020-03-10 | 1 | -0/+3 |
* | pipenv: add urlcanon; update Pipfile.lock | Bryan Newbold | 2020-03-10 | 1 | -0/+1 |
* | update Pipfile to be xenial-compatible | Bryan Newbold | 2020-01-09 | 1 | -1/+3 |
* | pipenv: mock testing library | Bryan Newbold | 2020-01-09 | 1 | -0/+1 |
* | pipfile update | Bryan Newbold | 2019-09-25 | 1 | -8/+13 |
| | remove hadoop stuff (mrjob, happybase, etc); add flask; add pytest-pylint plugin; reformat (automatic by newer pipenv) | | | | |
* | update Pipfile with additional libraries | Bryan Newbold | 2019-09-23 | 1 | -2/+8 |
* | update Pipfile | Bryan Newbold | 2019-02-21 | 1 | -1/+1 |
* | add GWB-to-S3 delivery pipeline script | Bryan Newbold | 2019-02-19 | 1 | -0/+1 |
* | add python-snappy dep | Bryan Newbold | 2018-12-10 | 1 | -0/+1 |
* | initial work on kafka_grobid worker | Bryan Newbold | 2018-11-20 | 1 | -0/+1 |
* | rename ./mapreduce to ./python | Bryan Newbold | 2018-08-24 | 1 | -0/+30 |