path: root/notes/tasks
Commit message | Author | Age | Files | Lines
finished re-GROBID-ing | Bryan Newbold | 2022-05-03 | 1 | -5/+7
PDF URL lists update | Bryan Newbold | 2022-05-03 | 2 | -0/+76
.ua crawling follow-up stats | Bryan Newbold | 2022-04-26 | 1 | -2/+2
.ua ingest notes | Bryan Newbold | 2022-04-04 | 1 | -0/+29
various ingest/task notes | Bryan Newbold | 2022-03-22 | 1 | -4/+4
partial notes on .ua urgent crawling | Bryan Newbold | 2022-03-11 | 1 | -0/+196
enqueue PLATFORM PDFs for crawl | Bryan Newbold | 2022-01-07 | 1 | -0/+23
document progress on re-GROBID-ing | Bryan Newbold | 2022-01-05 | 1 | -0/+89
notes on re-GROBID-ing (and re-extracting) some files [trawler] | Bryan Newbold | 2021-12-09 | 1 | -0/+289
wrap up crossref refs backfill notes | Bryan Newbold | 2021-11-10 | 1 | -0/+47
update crossref/grobid refs generation notes | Bryan Newbold | 2021-11-04 | 1 | -4/+96
grobid refs backfill progress | Bryan Newbold | 2021-11-04 | 1 | -1/+43
start notes on crossref refs backfill | Bryan Newbold | 2021-11-04 | 1 | -0/+54
old (2020) notes on pdfextract cleanup | Bryan Newbold | 2021-10-04 | 1 | -0/+74
notes on dumping PDF URL lists for partners | Bryan Newbold | 2021-10-04 | 1 | -0/+66
notes on file_meta task (from august) | Bryan Newbold | 2020-10-01 | 1 | -0/+66
follow-up notes on processing 'holes' | Bryan Newbold | 2020-09-02 | 1 | -0/+19
grobid+pdftext missing catch-up commands | Bryan Newbold | 2020-08-05 | 1 | -0/+101
commit old notes on a one-off CDX table cleanup | Bryan Newbold | 2020-06-25 | 1 | -0/+34
commit old (2020-02) pdftrio commands | Bryan Newbold | 2020-06-25 | 1 | -0/+162
update (and move) ingest notes | Bryan Newbold | 2020-03-03 | 3 | -294/+0
ingest backfill notes | Bryan Newbold | 2020-02-24 | 3 | -0/+150
add notes on recent ingest and backfill tasks | Bryan Newbold | 2020-02-05 | 3 | -0/+221