path: root/backfill
Commit message                                     Author         Age         Files  Lines
* merge backfill into extraction directory         Bryan Newbold  2018-04-04  9      -933/+0
* trivial whitespace                               Bryan Newbold  2018-04-04  2      -1/+2
* more TODO                                        Bryan Newbold  2018-04-04  1      -0/+5
* actually running hadoop job on cluster           Bryan Newbold  2018-04-03  2      -0/+18
* fix silly bugs in backfiller (need more tests)   Bryan Newbold  2018-04-03  1      -3/+4
* add setuptools (can probably remove)             Bryan Newbold  2018-04-03  2      -7/+8
* heritrix expects ints, not strings, for numbers  Bryan Newbold  2018-04-02  1      -7/+7
* backfill: sha1 prefix, cluster example           Bryan Newbold  2018-03-30  3      -8/+19
* clean up backfill code/tests                     Bryan Newbold  2018-03-30  2      -24/+42
* refactor backfill for mrjob                      Bryan Newbold  2018-03-30  4      -64/+145
* pytest helpers                                   Bryan Newbold  2018-03-30  4      -32/+564
* renames                                          Bryan Newbold  2018-03-30  3      -0/+265