author Bryan Newbold <bnewbold@archive.org> 2021-11-24 16:01:47 -0800
committer Bryan Newbold <bnewbold@archive.org> 2021-11-24 16:01:51 -0800
commit d93d542adf9d26633b0f3cfa361277ca677c46f3
tree c133d3030746afe25300a2e12a7645407a89b623
parent b4ca684c83d77a9fc6e7844ea8c45dfcb72aacb4
codespell fixes in proposals
Diffstat (limited to 'proposals/2020_pdf_meta_thumbnails.md')
-rw-r--r-- proposals/2020_pdf_meta_thumbnails.md | 2
1 file changed, 1 insertion, 1 deletion
diff --git a/proposals/2020_pdf_meta_thumbnails.md b/proposals/2020_pdf_meta_thumbnails.md
index 793d6b5..f231a7f 100644
--- a/proposals/2020_pdf_meta_thumbnails.md
+++ b/proposals/2020_pdf_meta_thumbnails.md
@@ -133,7 +133,7 @@ Deployment will involve:
Plan for processing/catchup is:
- test with COVID-19 PDF corpus
-- run extraction on all current fatcat files avaiable via IA
+- run extraction on all current fatcat files available via IA
- integrate with ingest pipeline for all new files
- run a batch catchup job over all GROBID-parsed files with no pdf meta
extracted, on basis of SQL table query