Branch | Commit message | Author | Age
master | pytest: skip warning in gwb | Bryan Newbold | 16 months
bnewbold-refactor-loggging | WIP: refactor logging calls in ingest pipelines | Bryan Newbold | 21 months
trawler | notes on re-GROBID-ing (and re-extracting) some files | Bryan Newbold | 2 years
bnewbold-persist-grobid-errors | grobid persist: if status_code is not set, default to 0 | Bryan Newbold | 4 years
bnewbold-args | make hbase_table and zookeeper_hosts CLI args | Bryan Newbold | 6 years
bnewbold-backfill | make hbase_table and zookeeper_hosts CLI args | Bryan Newbold | 6 years
 
 
Age | Commit message | Author | Files | Lines
2018-06-08 | make hbase_table and zookeeper_hosts CLI args (bnewbold-backfill) | Bryan Newbold | 4 | -17/+32
2018-06-06 | Made test data more robust. | Ellen Spertus | 1 | -2/+2
2018-06-06 | Removed copied comment. | Ellen Spertus | 1 | -8/+1
2018-06-06 | Added job and test for counting mime types. | Ellen Spertus | 2 | -0/+96
2018-06-05 | Made package names match directory names. Cleaned up imports. | Ellen Spertus | 4 | -16/+13
2018-06-04 | Merge branch 'refactoring' into 'master' | bnewbold | 4 | -20/+101
2018-06-04 | Merge branch 'bnewbold-scala-build-fixes' into 'master' | bnewbold | 3 | -21/+19
2018-06-04 | Made changes suggested in merge request review. | Ellen Spertus | 3 | -15/+10
2018-06-04 | try to run scala tests in gitlab CI | Bryan Newbold | 1 | -2/+12
2018-06-04 | fetch SpyGlass jar from archive.org (not local) | Bryan Newbold | 2 | -19/+7
[...]
 
Clone
git@git.bnewbold.net:sandcrawler
https://git.bnewbold.net/sandcrawler
git://git.bnewbold.net/sandcrawler