Commit message [Author, Age; Files, Lines]
...
* cleanup tests; add one for double-processing [Bryan Newbold, 2018-04-10; 2 files, -20/+43]
|
* TODO updates [Bryan Newbold, 2018-04-10; 3 files, -18/+3]
|
* wayback 404 test [Bryan Newbold, 2018-04-10; 2 files, -5/+49]
|
* extraction test fixes [Bryan Newbold, 2018-04-10; 2 files, -27/+50]
|
* grobid2json test fixes [Bryan Newbold, 2018-04-10; 2 files, -1/+3]
|
* failing tests! [Bryan Newbold, 2018-04-10; 2 files, -16/+51]
|
* configs and README updates [Bryan Newbold, 2018-04-07; 4 files, -5/+27]
|
* nits [Bryan Newbold, 2018-04-06; 2 files, -1/+2]
|
* bug fixes [Bryan Newbold, 2018-04-06; 1 file, -7/+14]
|
* updates to running [Bryan Newbold, 2018-04-06; 1 file, -5/+14]
|
* disable pig tests for now [Bryan Newbold, 2018-04-06; 2 files, -7/+10]
|
* try pig env again [Bryan Newbold, 2018-04-06; 2 files, -2/+4]
|
* use IA mirror for pig download [Bryan Newbold, 2018-04-06; 1 file, -1/+2]
|
* lint fixes [Bryan Newbold, 2018-04-06; 6 files, -19/+11]
|
* fetch deps in pig script [Bryan Newbold, 2018-04-06; 1 file, -0/+1]
|
* show coverage [Bryan Newbold, 2018-04-06; 1 file, -1/+1]
|
* renamed do_tei [Bryan Newbold, 2018-04-06; 1 file, -3/+3]
|
* switch to newer test image [Bryan Newbold, 2018-04-06; 1 file, -1/+1]
|
* temporarily skip pylint on extraction [Bryan Newbold, 2018-04-06; 1 file, -0/+3]
|
* add pylint to CI [Bryan Newbold, 2018-04-06; 5 files, -41/+123]
|
* iterate gitlab-ci.yml [Bryan Newbold, 2018-04-06; 1 file, -3/+5]
|
* add test for grobid2json [Bryan Newbold, 2018-04-06; 1 file, -0/+14]
|
* coverage defaults [Bryan Newbold, 2018-04-06; 1 file, -0/+3]
|
* gitlab test script [Bryan Newbold, 2018-04-06; 2 files, -2/+20]
|
* small grobid2json test [Bryan Newbold, 2018-04-06; 4 files, -2/+164]
|
* make happybase mock injection slightly less horrible [Bryan Newbold, 2018-04-05; 4 files, -36/+31]
|
* progress on extractor [Bryan Newbold, 2018-04-05; 3 files, -56/+93]
|
* improve test coverage [Bryan Newbold, 2018-04-05; 6 files, -6/+39]
|
* test coverage info [Bryan Newbold, 2018-04-05; 4 files, -7/+67]
|
* README/TODO updates [Bryan Newbold, 2018-04-04; 3 files, -9/+20]
|
* refactor out some common code [Bryan Newbold, 2018-04-04; 5 files, -184/+133]
|
* extraction -> mapreduce [Bryan Newbold, 2018-04-04; 14 files, -0/+0]
|
* merge backfill into extraction directory [Bryan Newbold, 2018-04-04; 11 files, -653/+27]
|
* pep8 [Bryan Newbold, 2018-04-04; 2 files, -3/+3]
|
* testing stuff as dev deps [Bryan Newbold, 2018-04-04; 2 files, -73/+109]
|
* more testing deps [Bryan Newbold, 2018-04-04; 2 files, -8/+133]
|
* trivial whitespace [Bryan Newbold, 2018-04-04; 2 files, -1/+2]
|
* more TODO [Bryan Newbold, 2018-04-04; 2 files, -0/+17]
|
* more WIP on extractor [Bryan Newbold, 2018-04-04; 5 files, -52/+427]
|
* add example XML output (open access) [Bryan Newbold, 2018-04-03; 1 file, -0/+2004]
|
* WIP on extractor-with-mrjob [Bryan Newbold, 2018-04-03; 4 files, -0/+954]
|
* fix very important typo [Bryan Newbold, 2018-04-03; 1 file, -1/+1]
|
* shift docs around a bit [Bryan Newbold, 2018-04-03; 2 files, -9/+12]
|
* actually running hadoop job on cluster [Bryan Newbold, 2018-04-03; 2 files, -0/+18]
|
* fix silly bugs in backfiller (need more tests) [Bryan Newbold, 2018-04-03; 1 file, -3/+4]
|
* add setuptools (can probably remove) [Bryan Newbold, 2018-04-03; 2 files, -7/+8]
|
* heritrix expects ints, not strings, for numbers [Bryan Newbold, 2018-04-02; 1 file, -7/+7]
|
* backfill: sha1 prefix, cluster example [Bryan Newbold, 2018-03-30; 3 files, -8/+19]
|
* clean up backfill code/tests [Bryan Newbold, 2018-03-30; 2 files, -24/+42]
|
* refactor backfill for mrjob [Bryan Newbold, 2018-03-30; 4 files, -64/+145]
|