| | | |
|---|---|---|
| author | Bryan Newbold <bnewbold@archive.org> | 2020-08-11 17:16:39 -0700 |
| committer | Bryan Newbold <bnewbold@archive.org> | 2020-08-11 17:16:39 -0700 | 
| commit | d5f0602e80847adf3d359a7fd06cc131c07cb6dd (patch) | |
| tree | fa89dadd5f19d7a1c26069254748f5142f5fce06 /python_hadoop/tests/files/example_grobid_metadata.json | |
| parent | 5c7f9bc60b372006adac8e47ee2f4f1f73b84897 (diff) | |
ingest: check for URL blocklist and cookie URL patterns on every hop
Diffstat (limited to 'python_hadoop/tests/files/example_grobid_metadata.json')
0 files changed, 0 insertions, 0 deletions
