index : sandcrawler
path: root/sql
| Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|
| 2021-12-02 database table size stats | Bryan Newbold | 2021-12-07 | 1 | -0/+22 |
| sandcrawler SQL dump and upload updates | Bryan Newbold | 2021-12-07 | 1 | -4/+12 |
| update fatcat_file SQL table schema, and add backfill notes | Bryan Newbold | 2021-12-07 | 1 | -1/+3 |
| update fatcat_file SQL table schema, and add backfill notes | Bryan Newbold | 2021-12-01 | 1 | -0/+13 |
| sandcrawler SQL stats | Bryan Newbold | 2021-11-27 | 2 | -12/+425 |
| sql: grobid_refs table JSON as 'JSON' not 'JSONB' | Bryan Newbold | 2021-11-04 | 1 | -1/+1 |
| record SQL table sizes at start of crossref re-ingest | Bryan Newbold | 2021-11-04 | 1 | -0/+19 |
| add grobid_refs and crossref_with_refs to sandcrawler-db SQL schema | Bryan Newbold | 2021-11-04 | 1 | -0/+21 |
| SPN reingest: 6 hour minimum, 6 month max | Bryan Newbold | 2021-11-03 | 1 | -2/+2 |
| sql: fix typo in quarterly (not weekly) script | Bryan Newbold | 2021-11-03 | 1 | -1/+1 |
| sql: fixes to ingest_fileset_platform schema (from table creation) | Bryan Newbold | 2021-11-01 | 1 | -6/+6 |
| commit old ingest domain summary | Bryan Newbold | 2021-10-15 | 1 | -0/+345 |
| sql fileset ingest table iteration | Bryan Newbold | 2021-10-15 | 1 | -12/+11 |
| sql: initial ingest fileset table | Bryan Newbold | 2021-10-15 | 1 | -0/+38 |
| sql: fix typo in CHECK statement | Bryan Newbold | 2021-10-15 | 1 | -1/+1 |
| new SQL recent SPN request monitoring query | Bryan Newbold | 2021-10-04 | 1 | -0/+32 |
| refactor reingest scripts | Bryan Newbold | 2021-09-30 | 6 | -150/+90 |
| new 'daily' and 'priority' ingest request topics | Bryan Newbold | 2021-09-30 | 2 | -2/+2 |
| reingest: skip spn2 'unknown' errors | Bryan Newbold | 2021-07-21 | 2 | -0/+2 |
| crossref DB proposal, and include in SQL schema | Bryan Newbold | 2021-06-02 | 1 | -0/+7 |
| sql: do periodically retry spn2-wayback-error | Bryan Newbold | 2021-04-27 | 2 | -2/+0 |
| reingest scripts to run as sandcrawler | Bryan Newbold | 2021-04-09 | 2 | -12/+12 |
| sql: notes on sql restore | Bryan Newbold | 2021-04-09 | 1 | -0/+9 |
| sql: update paths to work with svc506 machine | Bryan Newbold | 2021-04-09 | 12 | -49/+49 |
| sql: before/after pg13 table size stats | Bryan Newbold | 2021-04-09 | 2 | -1/+43 |
| sql: update periodic retry/reingest scripts | Bryan Newbold | 2021-04-09 | 4 | -6/+14 |
| SQL snapshot doc update | Bryan Newbold | 2021-04-07 | 1 | -2/+5 |
| 2021-04-07 sandcrawler DB stats | Bryan Newbold | 2021-04-07 | 1 | -0/+428 |
| SQL: more ingest monitoring | Bryan Newbold | 2020-11-16 | 3 | -1/+660 |
| tweak html_meta SQL schema | Bryan Newbold | 2020-11-03 | 1 | -2/+2 |
| SQL: unmatched glutton query (old) | Bryan Newbold | 2020-11-03 | 1 | -0/+19 |
| monitoring: past-7-days summary query | Bryan Newbold | 2020-11-03 | 1 | -0/+26 |
| html: start on SQL table | Bryan Newbold | 2020-11-03 | 1 | -0/+15 |
| SQL: update weekly/quarterly ingest retry scripts | Bryan Newbold | 2020-10-21 | 5 | -18/+119 |
| sql stats: larger limits (more complete lists) | Bryan Newbold | 2020-10-21 | 1 | -8/+8 |
| update SQL ingest monitoring commands to be past-month by default | Bryan Newbold | 2020-10-17 | 1 | -5/+5 |
| dump_file_meta helper | Bryan Newbold | 2020-10-01 | 1 | -0/+12 |
| updated sandcrawler-db stats | Bryan Newbold | 2020-09-15 | 2 | -6/+346 |
| WIP weekly re-ingest script | Bryan Newbold | 2020-08-17 | 2 | -0/+97 |
| grobid+pdftext missing catch-up commands | Bryan Newbold | 2020-08-05 | 4 | -10/+49 |
| commit stats from a couple weeks back | Bryan Newbold | 2020-08-05 | 1 | -0/+347 |
| sql stats commands updates | Bryan Newbold | 2020-08-05 | 1 | -2/+2 |
| commented special modes for dump_unextracted_pdf.sql | Bryan Newbold | 2020-06-25 | 1 | -1/+4 |
| pdftrio SQL queries | Bryan Newbold | 2020-06-25 | 1 | -0/+65 |
| SQL commands for re-trying PDF ingests | Bryan Newbold | 2020-06-25 | 1 | -0/+158 |
| unextracted PDF job dump command | Bryan Newbold | 2020-06-25 | 1 | -0/+16 |
| tweak pdf_meta SQL schema | Bryan Newbold | 2020-06-17 | 1 | -0/+26 |
| update sandcrawler stats for early may | Bryan Newbold | 2020-05-04 | 1 | -0/+418 |
| more monitoring queries | Bryan Newbold | 2020-03-30 | 1 | -5/+29 |
| make monitoring commands ingest_request local, not ingest_file_result | Bryan Newbold | 2020-03-17 | 1 | -2/+2 |