author | Bryan Newbold <bnewbold@archive.org> | 2020-11-08 19:31:14 -0800 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2020-11-08 19:35:15 -0800 |
commit | ecd36863e607e3c9e71fd91ece44a422f88dbe2e (patch) | |
tree | c9f06dcb7b6a3b1b24fa03b79088110cee811a8b /python_hadoop | |
parent | 0850b7fe7d5266ee0c4153b3e333d93eff164857 (diff) | |
ingest: default to html_biblio for PDF URL extraction
Diffstat (limited to 'python_hadoop')
0 files changed, 0 insertions, 0 deletions