| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | grobid persist: if status_code is not set, default to 0 [bnewbold-persist-grobid-errors] | Bryan Newbold | 2020-01-28 | 1 | -0/+1 |
| | We have to set something currently because of a NOT NULL constraint on the table. Originally I thought we would just not record rows if there was an error, and that is still sort of a valid stance. However, when doing bulk GROBID-ing from cdx table, there exist some "bad" CDX rows which cause wayback or petabox errors. We should fix bugs or delete these rows as a cleanup, but until that happens we should record the error state so we don't loop forever. One danger of this commit is that we can clobber existing good rows with new errors rapidly if there is wayback downtime or something like that. | | | | |
* | sql stats: typo fix | Bryan Newbold | 2020-01-28 | 1 | -1/+1 |
* | sql howto: database dumps | Bryan Newbold | 2020-01-28 | 1 | -0/+7 |
* | clarify ingest result schema and semantics | Bryan Newbold | 2020-01-15 | 1 | -0/+16 |
* | database stats | Bryan Newbold | 2020-01-14 | 2 | -0/+289 |
* | sql: more cool random queries | Bryan Newbold | 2020-01-02 | 1 | -0/+5 |
* | SQL docs update for diesel change | Bryan Newbold | 2020-01-02 | 2 | -0/+48 |
* | move SQL schema to diesel migration pattern | Bryan Newbold | 2020-01-02 | 5 | -70/+157 |
* | add some GROBID metadata schema docs to SQL schema | Bryan Newbold | 2019-12-11 | 1 | -0/+11 |
* | add note to CDX backfill script that we should be filtering (oops) | Bryan Newbold | 2019-11-12 | 1 | -0/+1 |
* | SQL stats and commands (mostly from sept 2019) | Bryan Newbold | 2019-11-12 | 4 | -0/+96 |
* | rename postgrest directory sql | Bryan Newbold | 2019-09-23 | 9 | -0/+768 |