| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | tweak html_meta SQL schema | Bryan Newbold | 2020-11-03 | 1 | -2/+2 |
* | SQL: unmatched glutton query (old) | Bryan Newbold | 2020-11-03 | 1 | -0/+19 |
* | monitoring: past-7-days summary query | Bryan Newbold | 2020-11-03 | 1 | -0/+26 |
* | html: start on SQL table | Bryan Newbold | 2020-11-03 | 1 | -0/+15 |
* | SQL: update weekly/quarterly ingest retry scripts | Bryan Newbold | 2020-10-21 | 5 | -18/+119 |
* | sql stats: larger limits (more complete lists) | Bryan Newbold | 2020-10-21 | 1 | -8/+8 |
* | update SQL ingest monitoring commands to be past-month by default | Bryan Newbold | 2020-10-17 | 1 | -5/+5 |
* | dump_file_meta helper | Bryan Newbold | 2020-10-01 | 1 | -0/+12 |
* | updated sandcrawler-db stats | Bryan Newbold | 2020-09-15 | 2 | -6/+346 |
* | WIP weekly re-ingest script | Bryan Newbold | 2020-08-17 | 2 | -0/+97 |
* | grobid+pdftext missing catch-up commands | Bryan Newbold | 2020-08-05 | 4 | -10/+49 |
* | commit stats from a couple weeks back | Bryan Newbold | 2020-08-05 | 1 | -0/+347 |
* | sql stats commands updates | Bryan Newbold | 2020-08-05 | 1 | -2/+2 |
* | commented special modes for dump_unextracted_pdf.sql | Bryan Newbold | 2020-06-25 | 1 | -1/+4 |
* | pdftrio SQL queries | Bryan Newbold | 2020-06-25 | 1 | -0/+65 |
* | SQL commands for re-trying PDF ingests | Bryan Newbold | 2020-06-25 | 1 | -0/+158 |
* | unextracted PDF job dump command | Bryan Newbold | 2020-06-25 | 1 | -0/+16 |
* | tweak pdf_meta SQL schema | Bryan Newbold | 2020-06-17 | 1 | -0/+26 |
* | update sandcrawler stats for early may | Bryan Newbold | 2020-05-04 | 1 | -0/+418 |
* | more monitoring queries | Bryan Newbold | 2020-03-30 | 1 | -5/+29 |
* | make monitoring commands ingest_request local, not ingest_file_result | Bryan Newbold | 2020-03-17 | 1 | -2/+2 |
* | DOI prefix example queries (SQL) | Bryan Newbold | 2020-03-10 | 1 | -3/+17 |
* | helpful daily/weekly monitoring SQL queries | Bryan Newbold | 2020-03-10 | 1 | -0/+94 |
* | sandcrawler schema: add MD5 index | Bryan Newbold | 2020-03-05 | 1 | -0/+1 |
* | more SQL queries | Bryan Newbold | 2020-03-02 | 1 | -0/+57 |
* | recent sandcrawler-db / ingest stats (interesting) | Bryan Newbold | 2020-02-24 | 2 | -0/+488 |
* | dump_regrobid_pdf_petabox.sql script | Bryan Newbold | 2020-02-12 | 1 | -0/+15 |
* | sandcrawler-db extra stats | Bryan Newbold | 2020-02-12 | 1 | -0/+42 |
* | pdftrio proposal and start on schema+kafka | Bryan Newbold | 2020-02-12 | 1 | -0/+13 |
* | more random sandcrawler-db queries | Bryan Newbold | 2020-02-03 | 2 | -32/+62 |
* | more SQL commands | Bryan Newbold | 2020-02-02 | 1 | -0/+15 |
* | sql stats: typo fix | Bryan Newbold | 2020-01-28 | 1 | -1/+1 |
* | sql howto: database dumps | Bryan Newbold | 2020-01-28 | 1 | -0/+7 |
* | clarify ingest result schema and semantics | Bryan Newbold | 2020-01-15 | 1 | -0/+16 |
* | database stats | Bryan Newbold | 2020-01-14 | 2 | -0/+289 |
* | sql: more cool random queries | Bryan Newbold | 2020-01-02 | 1 | -0/+5 |
* | SQL docs update for diesel change | Bryan Newbold | 2020-01-02 | 2 | -0/+48 |
* | move SQL schema to diesel migration pattern | Bryan Newbold | 2020-01-02 | 5 | -70/+157 |
* | add some GROBID metadata schema docs to SQL schema | Bryan Newbold | 2019-12-11 | 1 | -0/+11 |
* | add note to CDX backfill script that we should be filtering (oops) | Bryan Newbold | 2019-11-12 | 1 | -0/+1 |
* | SQL stats and commands (mostly from sept 2019) | Bryan Newbold | 2019-11-12 | 4 | -0/+96 |
* | rename postgrest directory sql | Bryan Newbold | 2019-09-23 | 9 | -0/+768 |