| Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|
| ingest: OAI-PMH count table | Bryan Newbold | 2020-05-28 | 1 | -0/+24 |
| ingest notes | Bryan Newbold | 2020-05-26 | 2 | -6/+76 |
| potential future backfill ingests | Bryan Newbold | 2020-05-26 | 1 | -0/+52 |
| ingests: normalize file names; commit updates | Bryan Newbold | 2020-05-26 | 10 | -63/+279 |
| summarize datacite and MAG 2020 crawls | Bryan Newbold | 2020-05-05 | 2 | -0/+200 |
| update MAG crawl notes | Bryan Newbold | 2020-04-28 | 1 | -0/+71 |
| COVID-19 chinese paper ingest | Bryan Newbold | 2020-04-15 | 1 | -0/+73 |
| 2020-04 unpaywall ingest (in progress) | Bryan Newbold | 2020-04-15 | 1 | -0/+63 |
| 2020-04 datacite ingest (in progress) | Bryan Newbold | 2020-04-15 | 1 | -0/+18 |
| partial notes on S2 crawl ingest | Bryan Newbold | 2020-04-15 | 1 | -0/+35 |
| MAG import notes | Bryan Newbold | 2020-04-13 | 1 | -0/+13 |
| MAG 2020-03-04 ingest notes to date | Bryan Newbold | 2020-04-06 | 1 | -0/+395 |
| unpaywall ingest notes update | Bryan Newbold | 2020-03-30 | 1 | -0/+138 |
| unpaywall large ingest notes | Bryan Newbold | 2020-03-17 | 1 | -0/+10 |
| more unpaywall ingest notes | Bryan Newbold | 2020-03-05 | 1 | -0/+416 |
| update (and move) ingest notes | Bryan Newbold | 2020-03-03 | 6 | -0/+480 |
| ingest backfill notes | Bryan Newbold | 2020-02-24 | 3 | -0/+150 |
| jan 2020 bulk ingest notes | Bryan Newbold | 2020-02-12 | 1 | -0/+26 |
| add notes on recent ingest and backfill tasks | Bryan Newbold | 2020-02-05 | 3 | -0/+221 |
| hadoop job log rename and update | Bryan Newbold | 2019-12-27 | 1 | -0/+25 |
| update job log with pig runs | Bryan Newbold | 2019-12-26 | 1 | -0/+10 |
| updated re-GROBID job log entry | Bryan Newbold | 2019-11-15 | 1 | -0/+31 |
| ingest/backfill notes | Bryan Newbold | 2019-11-13 | 3 | -0/+47 |
| notes about running 'regrobid' batches manually (not kafka) | Bryan Newbold | 2019-11-13 | 1 | -0/+41 |
| commit old notes about munging GROBID output | Bryan Newbold | 2019-11-13 | 1 | -0/+70 |
| old groupworks job log | Bryan Newbold | 2019-09-20 | 1 | -0/+8 |
| petabox journal files ingest updates | Bryan Newbold | 2019-06-20 | 1 | -0/+25 |
| clearer CDX munge notes | Bryan Newbold | 2019-05-09 | 1 | -1/+1 |
| give sort way more RAM by default | Bryan Newbold | 2019-02-01 | 3 | -6/+6 |
| match_filter_enrich notes | Bryan Newbold | 2019-01-03 | 1 | -0/+12 |
| notes on file-level metadata dump | Bryan Newbold | 2018-12-19 | 1 | -0/+31 |
| update notes | Bryan Newbold | 2018-12-10 | 1 | -1/+14 |
| match_filter_enrich: fix typo | Bryan Newbold | 2018-09-22 | 1 | -1/+1 |
| match and enrich notes+script | Bryan Newbold | 2018-09-14 | 1 | -0/+19 |
| crude job stats/metrics in a text file | Bryan Newbold | 2018-08-27 | 1 | -0/+95 |
| update TODO | Bryan Newbold | 2018-08-24 | 1 | -0/+10 |
| commit notes from my laptop | Bryan Newbold | 2018-08-24 | 6 | -0/+256 |
