path: root/python_hadoop/backfill_hbase_from_cdx.py
author: Bryan Newbold <bnewbold@archive.org> 2020-01-15 12:12:02 -0800
committer: Bryan Newbold <bnewbold@archive.org> 2020-01-15 12:19:02 -0800
commit: 2d052b610ed02341aebab865f174671f8381146e
tree: 847f51dee899b9e4dd0930668f8f127c4fc9fa60 /python_hadoop/backfill_hbase_from_cdx.py
parent: 9c97db0ffcb2350a7231ab388c643d953d77274f
fix revisit resolution
Returns the *original* CDX record, but keeps the terminal_url and terminal_sha1hex info.
Diffstat (limited to 'python_hadoop/backfill_hbase_from_cdx.py')
0 files changed, 0 insertions, 0 deletions