Branch | Commit message | Author | Age
master | pytest: skip warning in gwb | Bryan Newbold | 16 months
bnewbold-refactor-loggging | WIP: refactor logging calls in ingest pipelines | Bryan Newbold | 21 months
trawler | notes on re-GROBID-ing (and re-extracting) some files | Bryan Newbold | 2 years
bnewbold-persist-grobid-errors | grobid persist: if status_code is not set, default to 0 | Bryan Newbold | 4 years
bnewbold-args | make hbase_table and zookeeper_hosts CLI args | Bryan Newbold | 6 years
bnewbold-backfill | make hbase_table and zookeeper_hosts CLI args | Bryan Newbold | 6 years
 
 
Age | Commit message | Author | Files | Lines
2018-06-08 | make hbase_table and zookeeper_hosts CLI args (bnewbold-args) | Bryan Newbold | 4 | -17/+32
2018-06-07 | Merge branch 'groupby' into 'master' | bnewbold | 4 | -0/+178
2018-06-07 | Added status count. | Ellen Spertus | 2 | -0/+89
2018-06-07 | Merge branch 'spertus-packages' into 'master' | bnewbold | 4 | -16/+13
2018-06-06 | Made test data more robust. | Ellen Spertus | 1 | -2/+2
2018-06-06 | Removed copied comment. | Ellen Spertus | 1 | -8/+1
2018-06-06 | Added job and test for counting mime types. | Ellen Spertus | 2 | -0/+96
2018-06-05 | Made package names match directory names. Cleaned up imports. | Ellen Spertus | 4 | -16/+13
2018-06-04 | Merge branch 'refactoring' into 'master' | bnewbold | 4 | -20/+101
2018-06-04 | Merge branch 'bnewbold-scala-build-fixes' into 'master' | bnewbold | 3 | -21/+19
[...]
 
Clone
git@git.bnewbold.net:sandcrawler
https://git.bnewbold.net/sandcrawler
git://git.bnewbold.net/sandcrawler