author | Bryan Newbold <bnewbold@archive.org> | 2020-10-30 17:33:37 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2020-10-30 17:33:37 -0700 |
commit | cefbc6fa46e6586d8735f40b3b5432a759edd5f1 (patch) | |
tree | 8f9d0aaa8ac4ab09a3fa7b8891bede586aa953db /scalding/src/main/resources/slug-denylist.txt | |
parent | e61d6e8cc3b6824816a83dff56ffbdbbb6329e57 (diff) | |
html: syntax fixes; resolve relative URLs; extract more XML fulltext URLs
Diffstat (limited to 'scalding/src/main/resources/slug-denylist.txt')
0 files changed, 0 insertions, 0 deletions