index : sandcrawler
path: root/sql
| Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|
| commit old ingest domain summary | Bryan Newbold | 2021-10-15 | 1 | -0/+345 |
| sql fileset ingest table iteration | Bryan Newbold | 2021-10-15 | 1 | -12/+11 |
| sql: initial ingest fileset table | Bryan Newbold | 2021-10-15 | 1 | -0/+38 |
| sql: fix typo in CHECK statement | Bryan Newbold | 2021-10-15 | 1 | -1/+1 |
| new SQL recent SPN request monitoring query | Bryan Newbold | 2021-10-04 | 1 | -0/+32 |
| refactor reingest scripts | Bryan Newbold | 2021-09-30 | 6 | -150/+90 |
| new 'daily' and 'priority' ingest request topics | Bryan Newbold | 2021-09-30 | 2 | -2/+2 |
| reingest: skip spn2 'unknown' errors | Bryan Newbold | 2021-07-21 | 2 | -0/+2 |
| crossref DB proposal, and include in SQL schema | Bryan Newbold | 2021-06-02 | 1 | -0/+7 |
| sql: do periodically retry spn2-wayback-error | Bryan Newbold | 2021-04-27 | 2 | -2/+0 |
| reingest scripts to run as sandcrawler | Bryan Newbold | 2021-04-09 | 2 | -12/+12 |
| sql: notes on sql restore | Bryan Newbold | 2021-04-09 | 1 | -0/+9 |
| sql: update paths to work with svc506 machine | Bryan Newbold | 2021-04-09 | 12 | -49/+49 |
| sql: before/after pg13 table size stats | Bryan Newbold | 2021-04-09 | 2 | -1/+43 |
| sql: update periodic retry/reingest scripts | Bryan Newbold | 2021-04-09 | 4 | -6/+14 |
| SQL snapshot doc update | Bryan Newbold | 2021-04-07 | 1 | -2/+5 |
| 2021-04-07 sandcrawler DB stats | Bryan Newbold | 2021-04-07 | 1 | -0/+428 |
| SQL: more ingest monitoring | Bryan Newbold | 2020-11-16 | 3 | -1/+660 |
| tweak html_meta SQL schema | Bryan Newbold | 2020-11-03 | 1 | -2/+2 |
| SQL: unmatched glutton query (old) | Bryan Newbold | 2020-11-03 | 1 | -0/+19 |
| monitoring: past-7-days summary query | Bryan Newbold | 2020-11-03 | 1 | -0/+26 |
| html: start on SQL table | Bryan Newbold | 2020-11-03 | 1 | -0/+15 |
| SQL: update weekly/quarterly ingest retry scripts | Bryan Newbold | 2020-10-21 | 5 | -18/+119 |
| sql stats: larger limits (more complete lists) | Bryan Newbold | 2020-10-21 | 1 | -8/+8 |
| update SQL ingest monitoring commands to be past-month by default | Bryan Newbold | 2020-10-17 | 1 | -5/+5 |
| dump_file_meta helper | Bryan Newbold | 2020-10-01 | 1 | -0/+12 |
| updated sandcrawler-db stats | Bryan Newbold | 2020-09-15 | 2 | -6/+346 |
| WIP weekly re-ingest script | Bryan Newbold | 2020-08-17 | 2 | -0/+97 |
| grobid+pdftext missing catch-up commands | Bryan Newbold | 2020-08-05 | 4 | -10/+49 |
| commit stats from a couple weeks back | Bryan Newbold | 2020-08-05 | 1 | -0/+347 |
| sql stats commands updates | Bryan Newbold | 2020-08-05 | 1 | -2/+2 |
| commented special modes for dump_unextracted_pdf.sql | Bryan Newbold | 2020-06-25 | 1 | -1/+4 |
| pdftrio SQL queries | Bryan Newbold | 2020-06-25 | 1 | -0/+65 |
| SQL commands for re-trying PDF ingests | Bryan Newbold | 2020-06-25 | 1 | -0/+158 |
| unextracted PDF job dump command | Bryan Newbold | 2020-06-25 | 1 | -0/+16 |
| tweak pdf_meta SQL schema | Bryan Newbold | 2020-06-17 | 1 | -0/+26 |
| update sandcrawler stats for early may | Bryan Newbold | 2020-05-04 | 1 | -0/+418 |
| more monitoring queries | Bryan Newbold | 2020-03-30 | 1 | -5/+29 |
| make monitoring commands ingest_request local, not ingest_file_result | Bryan Newbold | 2020-03-17 | 1 | -2/+2 |
| DOI prefix example queries (SQL) | Bryan Newbold | 2020-03-10 | 1 | -3/+17 |
| helpful daily/weekly monitoring SQL queries | Bryan Newbold | 2020-03-10 | 1 | -0/+94 |
| sandcrawler schema: add MD5 index | Bryan Newbold | 2020-03-05 | 1 | -0/+1 |
| more SQL queries | Bryan Newbold | 2020-03-02 | 1 | -0/+57 |
| recent sandcrawler-db / ingest stats (interesting) | Bryan Newbold | 2020-02-24 | 2 | -0/+488 |
| dump_regrobid_pdf_petabox.sql script | Bryan Newbold | 2020-02-12 | 1 | -0/+15 |
| sandcrawler-db extra stats | Bryan Newbold | 2020-02-12 | 1 | -0/+42 |
| pdftrio proposal and start on schema+kafka | Bryan Newbold | 2020-02-12 | 1 | -0/+13 |
| more random sandcrawler-db queries | Bryan Newbold | 2020-02-03 | 2 | -32/+62 |
| more SQL commands | Bryan Newbold | 2020-02-02 | 1 | -0/+15 |
| sql stats: typo fix | Bryan Newbold | 2020-01-28 | 1 | -1/+1 |