path: root/python/sandcrawler/grobid.py
Commit message | Author | Date | Files | Lines (-/+)
* grobid: disable biblio-glutton consolidation | Bryan Newbold | 2021-04-07 | 1 | -3/+3
* differential wayback-error from wayback-content-error | Bryan Newbold | 2020-10-21 | 1 | -1/+0
* workers: refactor to pass key to process() | Bryan Newbold | 2020-06-17 | 1 | -2/+2
* refactor worker fetch code into wrapper class | Bryan Newbold | 2020-06-16 | 1 | -60/+9
* timeout message implementation for GROBID and ingest workers | Bryan Newbold | 2020-04-27 | 1 | -0/+9
* grobid petabox: fix fetch body/content | Bryan Newbold | 2020-02-03 | 1 | -1/+1
* grobid worker: catch PetaboxError also | Bryan Newbold | 2020-01-28 | 1 | -2/+2
* grobid worker: always set a key in response | Bryan Newbold | 2020-01-28 | 1 | -4/+25
* grobid: fix error_msg typo; set status_code for timeouts | Bryan Newbold | 2020-01-21 | 1 | -1/+2
* add 200 second timeout to GROBID requests | Bryan Newbold | 2020-01-17 | 1 | -8/+15
* grobid worker fixes for newer ia lib refactors | Bryan Newbold | 2020-01-14 | 1 | -3/+9
* fix grobid tests for new wayback refactors | Bryan Newbold | 2020-01-09 | 1 | -3/+3
* be more parsimonious with GROBID metadata | Bryan Newbold | 2020-01-02 | 1 | -2/+4
* fixes for large GROBID result skip | Bryan Newbold | 2019-12-02 | 1 | -2/+2
* count empty blobs as 'failed' instead of crashing | Bryan Newbold | 2019-12-01 | 1 | -1/+2
* cleanup unused import | Bryan Newbold | 2019-12-01 | 1 | -1/+0
* filter out very large GROBID XML bodies | Bryan Newbold | 2019-12-01 | 1 | -0/+6
* much progress on file ingest path | Bryan Newbold | 2019-10-22 | 1 | -0/+14
* we do actually want consolidateHeader=2, not 1 | Bryan Newbold | 2019-10-04 | 1 | -3/+3
* grobid: consolidateHeaders typo | Bryan Newbold | 2019-10-04 | 1 | -1/+1
* disable citation consolidation by default | Bryan Newbold | 2019-10-04 | 1 | -1/+1
* fix GROBID POST flags | Bryan Newbold | 2019-10-04 | 1 | -1/+3
* handle GROBID fetch empty blob condition | Bryan Newbold | 2019-10-03 | 1 | -1/+2
* have grobidworker error status indicate issues instead of bailing | Bryan Newbold | 2019-10-02 | 1 | -4/+13
* more counts and bugfixes in grobid_tool | Bryan Newbold | 2019-09-26 | 1 | -4/+0
* small improvements to GROBID tool | Bryan Newbold | 2019-09-26 | 1 | -0/+4
* lots of grobid tool implementation (still WIP) | Bryan Newbold | 2019-09-26 | 1 | -3/+63
* start refactoring sandcrawler python common code | Bryan Newbold | 2019-09-23 | 1 | -0/+44