author | Bryan Newbold <bnewbold@archive.org> | 2020-08-08 16:00:36 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2020-08-08 16:55:08 -0700 |
commit | c19b73f13b021a6d3026d0526b7dfa7a9fdda3a6 (patch) | |
tree | f3c593557fbce0e712b1e8d98f0ddcf663da9a4f /pig/filter-cdx-tarball.pig | |
parent | 0aa723392c1c72a354731aa21c06c55adeacab30 (diff) | |
rwth-aachen.de HTML extract, and a generic URL guess method
Diffstat (limited to 'pig/filter-cdx-tarball.pig')
0 files changed, 0 insertions, 0 deletions