author | Bryan Newbold <bnewbold@archive.org> | 2020-11-04 18:10:00 -0800 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2020-11-04 18:10:00 -0800 |
commit | de71aa92d4c7c9d14dfccc0188032d4e7b10090f | |
tree | 45e231fab99d4e5f576323dae8734ae71568c8f7 /pig/filter-cdx-join-urls.pig | |
parent | 2fdba24da0e0bf3d300cfb959514bf57a3cf6701 | |
html: actually publish HTML TEI-XML to body; fix dataflow through ingest a bit
Diffstat (limited to 'pig/filter-cdx-join-urls.pig')
0 files changed, 0 insertions, 0 deletions