path: root/pig/filter-cdx-join-urls.pig
author: Bryan Newbold <bnewbold@archive.org> 2020-11-04 18:10:00 -0800
committer: Bryan Newbold <bnewbold@archive.org> 2020-11-04 18:10:00 -0800
commit: de71aa92d4c7c9d14dfccc0188032d4e7b10090f
tree: 45e231fab99d4e5f576323dae8734ae71568c8f7 /pig/filter-cdx-join-urls.pig
parent: 2fdba24da0e0bf3d300cfb959514bf57a3cf6701
download: sandcrawler-de71aa92d4c7c9d14dfccc0188032d4e7b10090f.tar.gz
download: sandcrawler-de71aa92d4c7c9d14dfccc0188032d4e7b10090f.zip
html: actually publish HTML TEI-XML to body; fix dataflow through ingest a bit
Diffstat (limited to 'pig/filter-cdx-join-urls.pig')
0 files changed, 0 insertions, 0 deletions