Commit message | Author | Date | Files | Lines |
---|---|---|---|---|
switch from 'raven' to 'sentry-sdk' | Bryan Newbold | 2022-02-24 | 3 | -17/+11 |
add CDX sha1hex lookup/fetch helper script | Bryan Newbold | 2021-11-30 | 1 | -0/+170 |
remove grobid2json helper file, replace with grobid_tei_xml | Bryan Newbold | 2021-10-27 | 1 | -2/+4 |
make fmt (black 21.9b0) | Bryan Newbold | 2021-10-27 | 18 | -525/+601 |
make fmt | Bryan Newbold | 2021-10-26 | 19 | -186/+230 |
python: isort all imports | Bryan Newbold | 2021-10-26 | 19 | -43/+51 |
scripts: example archiveorg-to-fileset importer | Bryan Newbold | 2021-10-15 | 1 | -0/+138 |
cdx_collection.py: minor lint issue | Bryan Newbold | 2021-10-04 | 1 | -1/+1 |
another lowercase DOI in an (unused?) script | Bryan Newbold | 2021-07-13 | 1 | -1/+1 |
add cdx_collection.py python script (from scratch repo) | Bryan Newbold | 2021-05-04 | 1 | -0/+80 |
doaj ingest request updates (from prod) | Bryan Newbold | 2021-01-05 | 1 | -1/+5 |
blacklist -> denylist | Bryan Newbold | 2020-11-10 | 1 | -9/+9 |
DOAJ and HTML ingest tweaks from QA run | Bryan Newbold | 2020-11-10 | 1 | -1/+1 |
basic DOAJ ingest request conversion script | Bryan Newbold | 2020-11-08 | 1 | -0/+139 |
poppler: correct RGBA buffer endian-ness | Bryan Newbold | 2020-06-25 | 1 | -1/+1 |
pdf_thumbnail script: demonstrate PDF thumbnail generation | Bryan Newbold | 2020-06-16 | 1 | -0/+35 |
first iteration of oai2ingestrequest script | Bryan Newbold | 2020-05-05 | 1 | -0/+137 |
COVID-19 chinese paper ingest | Bryan Newbold | 2020-04-15 | 1 | -0/+83 |
unpaywall2ingestrequest: canonicalize URL | Bryan Newbold | 2020-04-07 | 1 | -1/+9 |
use local env in python scripts. Without this correct/canonical shebang invocation, virtualenvs (pipenv) don't work. | Bryan Newbold | 2020-03-10 | 3 | -3/+3 |
ingestrequest_row2json: skip on unicode errors | Bryan Newbold | 2020-03-05 | 1 | -1/+4 |
unpaywall2ingestrequest transform script | Bryan Newbold | 2020-02-18 | 1 | -0/+103 |
add ingestrequest_row2json.py | Bryan Newbold | 2020-02-05 | 1 | -0/+48 |
arabesque2ingestrequest: ingest type flag | Bryan Newbold | 2020-01-14 | 1 | -1/+4 |
basic arabesque2ingestrequest script | Bryan Newbold | 2019-12-24 | 1 | -0/+69 |
grobid_affiliations fix from prod, and usage example | Bryan Newbold | 2019-10-02 | 1 | -0/+5 |
deliver_dumpgrobid_to_s3: typo fix from old prod | Bryan Newbold | 2019-10-02 | 1 | -3/+4 |
grobid affiliation extractor (script) | Bryan Newbold | 2019-10-02 | 1 | -0/+47 |
move a bunch of random old scripts to subdir | Bryan Newbold | 2019-09-25 | 9 | -0/+1088 |