author | Bryan Newbold <bnewbold@archive.org> | 2021-09-03 10:37:37 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2021-09-03 10:37:37 -0700 |
commit | 2ebef36c083b59d158fae7098da49bf972141f1c | |
tree | c66f18e72312ce5598c8355164a9dfbe241ef5bc /match_test_data/NOTES.txt | |
parent | d963a61ea3e4bf278fd62047b258722967cd20c9 | |
HTML ingest: several more PDF fulltext URL patterns
Diffstat (limited to 'match_test_data/NOTES.txt')
0 files changed, 0 insertions, 0 deletions