Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | pipenv: add pymupdf; update trafilatura | Bryan Newbold | 2021-12-15 | 1 | -419/+642 |
* | Revert "pipenv: update deps" | Bryan Newbold | 2021-12-01 | 1 | -573/+381 |
| | This reverts commit 7a5b203dbb37958a452eb1be3bd1bf8ed94cbbce. There is a problem with `internetarchive` 2.2.0, so reverting for now. | ||||
* | pipenv: update deps | Bryan Newbold | 2021-12-01 | 1 | -381/+573 |
* | pipenv: bump grobid_tei_xml version to 0.1.2 | Bryan Newbold | 2021-11-04 | 1 | -10/+10 |
* | pipenv: flipflop from yapf back to black; more type packages; bump grobid_tei_xml | Bryan Newbold | 2021-10-27 | 1 | -24/+103 |
* | pipenv: import type annotations for requests and dateparser | Bryan Newbold | 2021-10-26 | 1 | -1/+17 |
* | pipenv: general update; add isort, yapf (over black), grobid_tei_xml | Bryan Newbold | 2021-10-26 | 1 | -723/+877 |
* | pipenv: lock minio S3 library to <7.0.0 | Bryan Newbold | 2021-01-14 | 1 | -241/+195 |
| | In upstream commit https://github.com/minio/minio-py/commit/b81883a98e6f8a09e2903609caabbf0956dd0ec9, the API for errors changes, which makes it harder for us to catch specific exceptions (such as "NoSuchKey" as a Not Found / 404 error). Instead of refactoring, just going to pin the library. We should probably replace this library with a non-implementation-specific S3 client at some point; minio seems simpler than, e.g., boto3, but there is probably something even simpler out there. | ||||
* | update to python3.8 | Bryan Newbold | 2021-01-05 | 1 | -399/+412 |
* | pipenv: updates (mostly for trafilatura 0.6.0) | Bryan Newbold | 2020-11-10 | 1 | -25/+32 |
* | remove unused pytype tool | Bryan Newbold | 2020-11-06 | 1 | -74/+22 |
| | Having trouble getting this to install on Xenial, and we aren't even using it in tests/lint yet. Can revisit after Focal upgrade. | ||||
* | pipenv: fix lock file; add zstandard; update wayback+gwb deps | Bryan Newbold | 2020-11-04 | 1 | -25/+1066 |
* | pipenv: braveblocker, dynaconf, sentry-sdk | Bryan Newbold | 2020-10-29 | 1 | -25/+31 |
* | new dependencies for HTML metadata parsing | Bryan Newbold | 2020-10-27 | 1 | -926/+176 |
* | pipenv: python-poppler 0.2.1 | Bryan Newbold | 2020-06-25 | 1 | -48/+50 |
* | pipenv: correct poppler; update lockfile | Bryan Newbold | 2020-06-16 | 1 | -75/+254 |
* | pipenv: remove old python3.5 cruft; add mypy | Bryan Newbold | 2020-05-26 | 1 | -178/+194 |
* | pipenv: update to python3.7 | Bryan Newbold | 2020-04-15 | 1 | -196/+201 |
* | pipenv: work around zipp issue | Bryan Newbold | 2020-03-10 | 1 | -4/+13 |
* | pipenv: add urlcanon; update pipefile.lock | Bryan Newbold | 2020-03-10 | 1 | -209/+220 |
* | update Pipfile to be xenial-compatible | Bryan Newbold | 2020-01-09 | 1 | -15/+23 |
* | pipenv: mock testing library | Bryan Newbold | 2020-01-09 | 1 | -309/+324 |
* | pipfile update | Bryan Newbold | 2019-09-25 | 1 | -233/+57 |
| | Changes: remove hadoop stuff (mrjob, happybase, etc.); add flask; add pytest-pylint plugin; reformat (automatic by newer pipenv) | ||||
* | update Pipfile with additional libraries | Bryan Newbold | 2019-09-23 | 1 | -75/+285 |
* | large pipfile update | Bryan Newbold | 2019-09-04 | 1 | -375/+402 |
| | Covers some security changes, but might need to revert if this breaks things. Should make better use of version locking in the Pipfile to prevent unintentional large upgrades, especially when we don't have good test coverage in this repo. | ||||
* | update Pipefile | Bryan Newbold | 2019-02-21 | 1 | -265/+219 |
* | add python-snappy dep | Bryan Newbold | 2018-12-10 | 1 | -84/+95 |
* | updated Pipfile.lock (VERY SLOW) | Bryan Newbold | 2018-11-21 | 1 | -548/+431 |
* | rename ./mapreduce to ./python | Bryan Newbold | 2018-08-24 | 1 | -0/+1142 |