author | Bryan Newbold <bnewbold@archive.org> | 2020-11-08 19:23:31 -0800 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2020-11-08 19:23:36 -0800 |
commit | 0850b7fe7d5266ee0c4153b3e333d93eff164857 (patch) | |
tree | 08eaa9cb6420a67c6375d6fb1c8eaf27cd204f79 /notes/url_pattern_heuristic_verification.txt | |
parent | a8ff73617a16a8b8b524c454247bde2399f34bf1 (diff) | |
ingest: shortened scope+platform keys; use html_biblio extraction for PDFs
Diffstat (limited to 'notes/url_pattern_heuristic_verification.txt')
0 files changed, 0 insertions, 0 deletions