Diffstat (limited to 'proposals/2020_pdf_meta_thumbnails.md')
-rw-r--r-- proposals/2020_pdf_meta_thumbnails.md | 4
1 files changed, 2 insertions, 2 deletions
diff --git a/proposals/2020_pdf_meta_thumbnails.md b/proposals/2020_pdf_meta_thumbnails.md
index 793d6b5..141ece8 100644
--- a/proposals/2020_pdf_meta_thumbnails.md
+++ b/proposals/2020_pdf_meta_thumbnails.md
@@ -1,5 +1,5 @@
-status: work-in-progress
+status: deployed
New PDF derivatives: thumbnails, metadata, raw text
===================================================
@@ -133,7 +133,7 @@ Deployment will involve:
Plan for processing/catchup is:
- test with COVID-19 PDF corpus
-- run extraction on all current fatcat files avaiable via IA
+- run extraction on all current fatcat files available via IA
- integrate with ingest pipeline for all new files
- run a batch catchup job over all GROBID-parsed files with no pdf meta
extracted, on basis of SQL table query