path: root/notes/tasks
Commit message | Author | Age | Files | Lines
* enqueue PLATFORM PDFs for crawl | Bryan Newbold | 2022-01-07 | 1 | -0/+23
* document progress on re-GROBID-ing | Bryan Newbold | 2022-01-05 | 1 | -0/+89
* notes on re-GROBID-ing (and re-extracting) some files trawler | Bryan Newbold | 2021-12-09 | 1 | -0/+289
* wrap up crossref refs backfill notes | Bryan Newbold | 2021-11-10 | 1 | -0/+47
* update crossref/grobid refs generation notes | Bryan Newbold | 2021-11-04 | 1 | -4/+96
* grobid refs backfill progress | Bryan Newbold | 2021-11-04 | 1 | -1/+43
* start notes on crossref refs backfill | Bryan Newbold | 2021-11-04 | 1 | -0/+54
* old (2020) notes on pdfextract cleanup | Bryan Newbold | 2021-10-04 | 1 | -0/+74
* notes on dumping PDF URL lists for partners | Bryan Newbold | 2021-10-04 | 1 | -0/+66
* notes on file_meta task (from august) | Bryan Newbold | 2020-10-01 | 1 | -0/+66
* follow-up notes on processing 'holes' | Bryan Newbold | 2020-09-02 | 1 | -0/+19
* grobid+pdftext missing catch-up commands | Bryan Newbold | 2020-08-05 | 1 | -0/+101
* commit old notes on a one-off CDX table cleanup | Bryan Newbold | 2020-06-25 | 1 | -0/+34
* commit old (2020-02) pdftrio commands | Bryan Newbold | 2020-06-25 | 1 | -0/+162
* update (and move) ingest notes | Bryan Newbold | 2020-03-03 | 3 | -294/+0
* ingest backfill notes | Bryan Newbold | 2020-02-24 | 3 | -0/+150
* add notes on recent ingest and backfill tasks | Bryan Newbold | 2020-02-05 | 3 | -0/+221