author | Bryan Newbold <bnewbold@archive.org> | 2020-10-19 15:46:37 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2020-10-19 15:46:39 -0700 |
commit | b672a6fe5b0e51f9d2844443bf9f7e82e1fd41b1 (patch) | |
tree | 82e03127ff94c9fb1c0d1807f9f76f367a0f37de /pig/hbase-count-rows.pig | |
parent | cc26ea975e29eefa2e2d3565c55ba0ac0a491bb7 (diff) | |
download | sandcrawler-b672a6fe5b0e51f9d2844443bf9f7e82e1fd41b1.tar.gz sandcrawler-b672a6fe5b0e51f9d2844443bf9f7e82e1fd41b1.zip |
CDX fetch: more permissive fuzzy/normalization check
This might be the source of some `spn2-cdx-lookup-failure` errors.
Wayback/CDX does this check via full-on SURT, with many more changes,
and potentially we should be doing that here as well.
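
As a rough illustration of what a "more permissive" check could look like, here is a minimal sketch. It is not the code from this commit: the helper name `loose_url_match` and the specific normalizations (ignore scheme, host case, a leading `www.`, a trailing slash, and any fragment) are assumptions chosen for the example, and they are far less thorough than a real SURT canonicalization.

```python
from urllib.parse import urlsplit


def loose_url_match(left: str, right: str) -> bool:
    """Hypothetical helper: compare two URLs after light normalization.

    Assumed normalizations (illustrative only, not full SURT):
    scheme ignored, host lowercased, leading 'www.' dropped,
    trailing slash on the path dropped, fragment ignored.
    """

    def normalize(url: str) -> str:
        parts = urlsplit(url.strip())
        host = parts.netloc.lower()
        if host.startswith("www."):
            host = host[len("www."):]
        path = parts.path.rstrip("/")
        query = f"?{parts.query}" if parts.query else ""
        return f"{host}{path}{query}"

    return normalize(left) == normalize(right)


if __name__ == "__main__":
    # The requested URL and the URL in the CDX row differ only in ways
    # that the sketch treats as equivalent.
    print(loose_url_match("http://example.com/thing", "https://www.example.com/thing/"))
    # Different paths are still treated as a mismatch.
    print(loose_url_match("http://example.com/a", "http://example.com/b"))
```

A stricter or looser rule set changes which lookups count as failures, which is why matching whatever Wayback/CDX itself does (SURT) would be the more robust option.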
Diffstat (limited to 'pig/hbase-count-rows.pig')
0 files changed, 0 insertions, 0 deletions