| | Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|---|
| * | ingest file HTTP API: fixes from type checking | Bryan Newbold | 2021-10-26 | 1 | -3/+3 | 
| | This code is deprecated and should be removed anyway, but it is still interesting to see the fixes. | | | | |
| * | more progress on type annotations | Bryan Newbold | 2021-10-26 | 8 | -34/+55 | 
| | | |||||
| * | grobid: fix a bug with consolidate_mode header, exposed by type annotations | Bryan Newbold | 2021-10-26 | 1 | -1/+2 | 
| | | |||||
| * | grobid: type annotations | Bryan Newbold | 2021-10-26 | 1 | -9/+19 | 
| | | |||||
| * | type annotations on SandcrawlerWorker | Bryan Newbold | 2021-10-26 | 1 | -46/+57 | 
| | These annotations have a broad impact! Being conservative to start: Any-to-Any for process(), etc. | | | | |
| * | more progress on type annotations and linting | Bryan Newbold | 2021-10-26 | 11 | -55/+87 | 
| | | |||||
| * | live tests: FTP wayback replay now returns 200, not 226 | Bryan Newbold | 2021-10-26 | 1 | -2/+2 | 
| | | |||||
| * | ia: more tweaks to delicate code to satisfy type checker | Bryan Newbold | 2021-10-26 | 1 | -10/+12 | 
| | Ran the 'live' wayback tests after this commit as a check, and they passed (once the FTP status code behavior change is fixed). | | | | |
| * | ia helpers: enforce max_redirects count correctly | Bryan Newbold | 2021-10-26 | 1 | -1/+1 | 
| | AKA, should run fetch even if max_redirects = 0; the first loop iteration is not a redirect. | | | | |
| * | set CDX request params as str, not int or datetime | Bryan Newbold | 2021-10-26 | 1 | -3/+6 | 
| | This might be a bugfix, changing CDX lookup behavior? | | | | |
| * | bugfix: was setting 'from' parameter as a tuple, not a string | Bryan Newbold | 2021-10-26 | 1 | -1/+1 | 
| | | |||||
| * | start type annotating IA helper code | Bryan Newbold | 2021-10-26 | 1 | -37/+65 | 
| | | |||||
| * | start adding python type annotations to db and persist code | Bryan Newbold | 2021-10-26 | 2 | -97/+124 | 
| | | |||||
| * | Makefile: don't fail on isort error (consider these minor) | Bryan Newbold | 2021-10-26 | 1 | -1/+1 | 
| | | |||||
| * | tweak flake8 config | Bryan Newbold | 2021-10-26 | 1 | -2/+11 | 
| | | |||||
| * | flake8 clean (with current settings) | Bryan Newbold | 2021-10-26 | 9 | -25/+24 | 
| | | |||||
| * | pipenv: import type annotations for requests and dateparser | Bryan Newbold | 2021-10-26 | 2 | -1/+19 | 
| | | |||||
| * | start handling trivial lint cleanups: unused imports, 'is None', etc | Bryan Newbold | 2021-10-26 | 30 | -149/+86 | 
| | | |||||
| * | make fmt | Bryan Newbold | 2021-10-26 | 59 | -1225/+1582 | 
| | | |||||
| * | tweak lint/fmt settings | Bryan Newbold | 2021-10-26 | 2 | -4/+6 | 
| | | |||||
| * | update pytest warning filters (they are pretty expansive) | Bryan Newbold | 2021-10-26 | 1 | -0/+3 | 
| | | |||||
| * | ingest_html: update trafilatura TEI-XML output kwarg | Bryan Newbold | 2021-10-26 | 1 | -1/+1 | 
| | | |||||
| * | python: isort all imports | Bryan Newbold | 2021-10-26 | 57 | -178/+207 | 
| | | |||||
| * | add pyproject.toml (for isort and yapf config), and update 'lint' and 'fmt' make targets | Bryan Newbold | 2021-10-26 | 2 | -3/+13 | 
| * | pipenv: general update; add isort, yapf (over black), grobid_tei_xml | Bryan Newbold | 2021-10-26 | 2 | -730/+880 | 
| | | |||||
| * | more small fileset ingest tweaks | Bryan Newbold | 2021-10-26 | 2 | -6/+21 | 
| | | |||||
| * | python: more aggressive gitignore | Bryan Newbold | 2021-10-15 | 1 | -0/+3 | 
| | | |||||
| * | persist support for ingest platform table, using existing persist worker | Bryan Newbold | 2021-10-15 | 2 | -2/+129 | 
| | | |||||
| * | improve fileset ingest integration with file ingest | Bryan Newbold | 2021-10-15 | 4 | -5/+25 | 
| | | |||||
| * | more fileset iteration | Bryan Newbold | 2021-10-15 | 5 | -45/+81 | 
| | | |||||
| * | move SPNv2 'simple_get' logic to SPN client | Bryan Newbold | 2021-10-15 | 3 | -52/+31 | 
| | | |||||
| * | filesets: iteration of implementation and docs | Bryan Newbold | 2021-10-15 | 4 | -82/+148 | 
| | | |||||
| * | fileset ingest: improve platform parsing | Bryan Newbold | 2021-10-15 | 1 | -12/+196 | 
| | | |||||
| * | fileset ingest: improve error handling | Bryan Newbold | 2021-10-15 | 4 | -48/+106 | 
| | | |||||
| * | initial implementation of zenodo platform import | Bryan Newbold | 2021-10-15 | 1 | -0/+100 | 
| | | |||||
| * | initial figshare platform helper | Bryan Newbold | 2021-10-15 | 1 | -0/+95 | 
| | | |||||
| * | improvements to platform helpers | Bryan Newbold | 2021-10-15 | 3 | -34/+44 | 
| | | |||||
| * | component ingest support for dataverse files (individual) | Bryan Newbold | 2021-10-15 | 2 | -13/+31 | 
| | | |||||
| * | progress on web ingest strategy | Bryan Newbold | 2021-10-15 | 3 | -12/+121 | 
| | | |||||
| * | fileset ingest progress for dataverse | Bryan Newbold | 2021-10-15 | 4 | -23/+291 | 
| | | |||||
| * | local-file version of gen_file_metadata | Bryan Newbold | 2021-10-15 | 3 | -3/+56 | 
| | | |||||
| * | progress on dataset ingest | Bryan Newbold | 2021-10-15 | 4 | -122/+333 | 
| | | |||||
| * | ingest tool: always require ingest type as part of 'single' command | Bryan Newbold | 2021-10-15 | 1 | -3/+3 | 
| | | |||||
| * | wrap up previous renaming work | Bryan Newbold | 2021-10-15 | 4 | -6/+4 | 
| | | |||||
| * | progress on fileset/dataset ingest | Bryan Newbold | 2021-10-15 | 4 | -0/+403 | 
| | | |||||
| * | scripts: example archiveorg-to-fileset importer | Bryan Newbold | 2021-10-15 | 1 | -0/+138 | 
| | | |||||
| * | refactoring; progress on filesets | Bryan Newbold | 2021-10-15 | 3 | -9/+27 | 
| | | |||||
| * | rename some python files for clarity | Bryan Newbold | 2021-10-15 | 3 | -0/+0 | 
| | | |||||
| * | pdf ingest: journals.uchicago.edu pattern | Bryan Newbold | 2021-10-11 | 1 | -0/+8 | 
| | | |||||
| * | spn: avoid 'None' job_id | Bryan Newbold | 2021-10-11 | 1 | -2/+2 | 
| | Thanks Vanglis for reporting these. Not sure this commit fixes *all* instances of the problem. | | | | |
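
The "ia helpers: enforce max_redirects count correctly" entry above describes an off-by-one: the first loop iteration is a plain fetch, not a redirect, so a limit of 0 should still fetch once. The following is a minimal sketch of that idea only, not the actual sandcrawler code. The names `fetch_with_redirects` and `Fetcher`, the default limit of 25, and the shape of the fetch callback are all assumptions made for illustration.

```python
from typing import Callable, Optional, Tuple

# Hypothetical fetcher type (an assumption, not the project's API):
# takes a URL and returns (status_code, redirect_target_or_None).
Fetcher = Callable[[str], Tuple[int, Optional[str]]]

REDIRECT_CODES = (301, 302, 303, 307, 308)


def fetch_with_redirects(url: str, fetch: Fetcher, max_redirects: int = 25) -> Tuple[int, str]:
    """Fetch `url`, following at most `max_redirects` redirects."""
    # Loop max_redirects + 1 times: the first fetch is not itself a redirect,
    # so max_redirects=0 must still perform exactly one fetch.
    for _ in range(max_redirects + 1):
        status, location = fetch(url)
        if status in REDIRECT_CODES and location:
            url = location
            continue
        return status, url
    raise RuntimeError("too many redirects")
```

With a loop bound of `range(max_redirects)` instead, a call with `max_redirects=0` would never fetch at all, which appears to be the kind of behavior the commit message describes fixing.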
