path: root/python/tests/test_grobid.py
author: Bryan Newbold <bnewbold@archive.org> 2020-03-18 18:49:05 -0700
committer: Bryan Newbold <bnewbold@archive.org> 2020-03-18 18:49:09 -0700
commit: cb16d18137c936a634b75bf0eb6acb43c77d9290
tree: 4b8b72aa7cd1d5a9da81c6233ea10b6cdc837d2a /python/tests/test_grobid.py
parent: e1b3edd7af59fe0fd4272a4696387ea09a22a6c0
implement (unused) force_get flag for SPN2
I hoped this feature would make it possible to crawl journals.lww.com PDFs, because the token URLs work with `wget`, but it still doesn't seem to work, possibly because of the user agent. Anyway, this feature might be useful for crawling efficiency, so I am adding it to master.
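
For illustration only, here is a minimal sketch of how an optional `force_get` flag could be threaded into the form fields of a capture request. The helper name `build_spn2_payload`, the `url` field, and the choice to send the flag as the integer `1` are assumptions made for this example; the actual change in this commit is not shown on this page, and the sketch makes no network call.

```python
from typing import Dict, Union


def build_spn2_payload(url: str, force_get: bool = False) -> Dict[str, Union[str, int]]:
    """Build form fields for a capture request (hypothetical helper)."""
    # "url" as the field name for the target is an assumption of this sketch.
    payload: Dict[str, Union[str, int]] = {"url": url}
    if force_get:
        # Field name taken from the commit subject; sending it as 1 is assumed.
        payload["force_get"] = 1
    return payload


if __name__ == "__main__":
    print(build_spn2_payload("https://example.com/paper.pdf", force_get=True))
```

Leaving the field out entirely when the flag is off keeps the default request unchanged, which fits a flag described as unused.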
Diffstat (limited to 'python/tests/test_grobid.py')
0 files changed, 0 insertions, 0 deletions