| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | db (postgrest): actually use an HTTP session | Bryan Newbold | 2021-11-04 | 1 | -12/+24 |
| | Not as important with GET as POST, I think, but still best practice. | | | | |
* | glue, utils, and worker code for crossref and grobid_refs | Bryan Newbold | 2021-11-04 | 1 | -3/+106 |
* | make fmt (black 21.9b0) | Bryan Newbold | 2021-10-27 | 1 | -112/+151 |
* | lint collection membership (last lint for now) | Bryan Newbold | 2021-10-26 | 1 | -1/+1 |
* | more progress on type annotations | Bryan Newbold | 2021-10-26 | 1 | -3/+3 |
* | more progress on type annotations and linting | Bryan Newbold | 2021-10-26 | 1 | -11/+11 |
* | start adding python type annotations to db and persist code | Bryan Newbold | 2021-10-26 | 1 | -95/+120 |
* | make fmt | Bryan Newbold | 2021-10-26 | 1 | -82/+68 |
* | python: isort all imports | Bryan Newbold | 2021-10-26 | 1 | -1/+2 |
* | persist support for ingest platform table, using existing persist worker | Bryan Newbold | 2021-10-15 | 1 | -1/+67 |
* | add crossref postgrest fetch support to python db helpers | Bryan Newbold | 2021-06-02 | 1 | -0/+9 |
* | update default postgrest ('db') API endpoint | Bryan Newbold | 2021-04-09 | 1 | -1/+1 |
* | tweak html_meta SQL schema | Bryan Newbold | 2020-11-03 | 1 | -12/+19 |
* | html: start on SQL table | Bryan Newbold | 2020-11-03 | 1 | -0/+44 |
* | fixes and tweaks from testing locally | Bryan Newbold | 2020-06-17 | 1 | -0/+47 |
* | pdf_trio persist fixes from prod | Bryan Newbold | 2020-02-19 | 1 | -4/+4 |
* | include rel and oa_status in ingest request 'extra' | Bryan Newbold | 2020-02-18 | 1 | -1/+1 |
* | pdftrio basic python code | Bryan Newbold | 2020-02-12 | 1 | -0/+57 |
| | This is basically just a copy/paste of GROBID code, only simpler! | | | | |
* | fix bug where ingest_request extra fields not persisted | Bryan Newbold | 2020-02-05 | 1 | -1/+2 |
* | persist grobid: actually, status_code is required | Bryan Newbold | 2020-01-21 | 1 | -1/+1 |
| | Instead of working around when missing, force it to exist but skip in database insert section. Disk mode still needs to check if blank. | | | | |
* | persist: work around GROBID timeouts with no status_code | Bryan Newbold | 2020-01-21 | 1 | -1/+1 |
* | persist: fix dupe field copying | Bryan Newbold | 2020-01-15 | 1 | -1/+8 |
| | In testing hit: `AttributeError: 'str' object has no attribute 'get'` | | | | |
* | persist worker: implement updated ingest result semantics | Bryan Newbold | 2020-01-15 | 1 | -1/+1 |
* | small fixups to SandcrawlerPostgrestClient | Bryan Newbold | 2020-01-14 | 1 | -1/+10 |
* | db: move duplicate row filtering into DB insert helpers | Bryan Newbold | 2020-01-02 | 1 | -0/+25 |
* | fix DB import counting | Bryan Newbold | 2020-01-02 | 1 | -4/+5 |
* | fix small errors found by pylint | Bryan Newbold | 2020-01-02 | 1 | -1/+1 |
* | db: fancy insert/update separation using postgres xmax | Bryan Newbold | 2020-01-02 | 1 | -15/+30 |
* | improve DB helpers | Bryan Newbold | 2020-01-02 | 1 | -26/+81 |
| | - return insert/update row counts<br>- implement ON CONFLICT ... DO UPDATE on some tables | | | | |
* | start work on DB connector and minio client | Bryan Newbold | 2020-01-02 | 1 | -0/+141 |
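
Several of the 2020-01-02 commits above mention three related ideas: filtering duplicate rows inside the DB insert helpers, using `ON CONFLICT ... DO UPDATE`, and separating inserts from updates via the PostgreSQL `xmax` system column. The sketch below is one possible shape for that logic, not the code from these commits. It assumes a hypothetical table `example_table (key, value)`, and the helper names `dedupe_rows` and `count_inserts_and_updates` are invented for illustration. Filtering matters because PostgreSQL rejects a single `INSERT ... ON CONFLICT DO UPDATE` statement that would touch the same row twice. The `xmax` check relies on a widely used observation (freshly inserted rows report `xmax = 0`, rows rewritten by the conflict branch report a nonzero value), which is an implementation detail rather than a documented guarantee.

```python
from typing import Dict, Hashable, Iterable, List, Sequence, Tuple

# Hypothetical table and column names, for illustration only.
# A driver helper that expands "VALUES %s" into many rows is assumed.
UPSERT_SQL = """
    INSERT INTO example_table (key, value)
    VALUES %s
    ON CONFLICT (key) DO UPDATE SET value = EXCLUDED.value
    RETURNING xmax;
"""


def dedupe_rows(rows: Iterable[Sequence], key_index: int = 0) -> List[Sequence]:
    """Keep one row per key (the last one seen), in first-seen key order."""
    by_key: Dict[Hashable, Sequence] = {}
    for row in rows:
        # Re-assigning an existing key keeps its original dict position.
        by_key[row[key_index]] = row
    return list(by_key.values())


def count_inserts_and_updates(xmax_values: Iterable) -> Tuple[int, int]:
    """Split values from RETURNING xmax into (inserts, updates).

    Assumes xmax == 0 means the row was newly inserted.
    """
    values = [int(v) for v in xmax_values]
    inserts = sum(1 for v in values if v == 0)
    return inserts, len(values) - inserts
```

In use, a batch would first pass through `dedupe_rows`, then be sent with `UPSERT_SQL`, and the returned `xmax` column would be passed to `count_inserts_and_updates` to report row counts.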