aboutsummaryrefslogtreecommitdiffstats
path: root/notes/ingest/2020-05_oai_pmh.md
diff options
context:
space:
mode:
Diffstat (limited to 'notes/ingest/2020-05_oai_pmh.md')
-rw-r--r--notes/ingest/2020-05_oai_pmh.md13
1 files changed, 13 insertions, 0 deletions
diff --git a/notes/ingest/2020-05_oai_pmh.md b/notes/ingest/2020-05_oai_pmh.md
index de9bfba..fe22c75 100644
--- a/notes/ingest/2020-05_oai_pmh.md
+++ b/notes/ingest/2020-05_oai_pmh.md
@@ -192,6 +192,19 @@ And went from about 42,826,313 rows to 31,773,874 unique URLs to crawl, so
expecting at least 11,052,439 `no-capture` ingest results (and should probably
filter for these or even delete from the ingest request table).
+Ingest progress:
+
+ 2020-08-05 14:02: 32,571,018
+ 2020-08-06 13:49: 31,195,169
+ 2020-08-07 10:11: 29,986,169
+ 2020-08-10 10:43: 26,497,196
+ 2020-08-12 11:02: 23,811,845
+ 2020-08-17 13:34: 19,460,502
+ 2020-08-20 09:49: 15,069,507
+ 2020-08-25 09:56: 9,397,035
+ 2020-09-02 15:02: 305,889 (72k longest queue)
+ 2020-09-03 14:30: done
+
## Post-ingest stats
SELECT ingest_file_result.status, COUNT(*)