path: root/sql
Commit message  Author  Age  Files  Lines
* grobid persist: if status_code is not set, default to 0 (bnewbold-persist-grobid-errors)  Bryan Newbold  2020-01-28  1  -0/+1
|
|     We have to set something currently because of a NOT NULL constraint on
|     the table. Originally I thought we would just not record rows if there
|     was an error, and that is still sort of a valid stance.
|
|     However, when doing bulk GROBID-ing from cdx table, there exist some
|     "bad" CDX rows which cause wayback or petabox errors. We should fix bugs
|     or delete these rows as a cleanup, but until that happens we should
|     record the error state so we don't loop forever.
|
|     One danger of this commit is that we can clobber existing good rows with
|     new errors rapidly if there is wayback downtime or something like that.
|
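The defaulting behavior described in the commit message above can be sketched roughly as follows. This is an illustrative sketch, not the code from the commit: only the `status_code` field name and the default of 0 come from the message, while the function name `prepare_grobid_rows` and the `sha1hex` key used in the usage lines are hypothetical.

```python
from typing import Any, Dict, List


def prepare_grobid_rows(batch: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return copies of the records with a missing status_code set to 0.

    Hypothetical helper: the real persist code may be structured differently.
    """
    rows = []
    for record in batch:
        row = dict(record)  # copy so the caller's record is not mutated
        # The column is NOT NULL, so error records that never received an
        # HTTP status need some value before insertion; 0 is the default
        # named in the commit message.
        if row.get("status_code") is None:
            row["status_code"] = 0
        rows.append(row)
    return rows


# Usage: one successful record, two error records without a usable status.
batch = [
    {"sha1hex": "a", "status_code": 200},
    {"sha1hex": "b"},
    {"sha1hex": "c", "status_code": None},
]
rows = prepare_grobid_rows(batch)
```

A sketch like this does nothing about the clobbering risk the message mentions; whether an error row should be allowed to overwrite an existing good row is a separate decision made at insert time.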
* sql stats: typo fix  Bryan Newbold  2020-01-28  1  -1/+1
|
* sql howto: database dumps  Bryan Newbold  2020-01-28  1  -0/+7
|
* clarify ingest result schema and semantics  Bryan Newbold  2020-01-15  1  -0/+16
|
* database stats  Bryan Newbold  2020-01-14  2  -0/+289
|
* sql: more cool random queries  Bryan Newbold  2020-01-02  1  -0/+5
|
* SQL docs update for diesel change  Bryan Newbold  2020-01-02  2  -0/+48
|
* move SQL schema to diesel migration pattern  Bryan Newbold  2020-01-02  5  -70/+157
|
* add some GROBID metadata schema docs to SQL schema  Bryan Newbold  2019-12-11  1  -0/+11
|
* add note to CDX backfill script that we should be filtering (oops)  Bryan Newbold  2019-11-12  1  -0/+1
|
* SQL stats and commands (mostly from sept 2019)  Bryan Newbold  2019-11-12  4  -0/+96
|
* rename postgrest directory sql  Bryan Newbold  2019-09-23  9  -0/+768