author | Bryan Newbold <bnewbold@archive.org> | 2020-08-11 17:16:39 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2020-08-11 17:16:39 -0700 |
commit | d5f0602e80847adf3d359a7fd06cc131c07cb6dd | |
tree | fa89dadd5f19d7a1c26069254748f5142f5fce06 /python_hadoop/tests/files/small.xml | |
parent | 5c7f9bc60b372006adac8e47ee2f4f1f73b84897 | |
ingest: check for URL blocklist and cookie URL patterns on every hop
Diffstat (limited to 'python_hadoop/tests/files/small.xml')
0 files changed, 0 insertions, 0 deletions