path: root/python_hadoop/README.md
author Bryan Newbold <bnewbold@archive.org> 2020-01-14 15:30:42 -0800
committer Bryan Newbold <bnewbold@archive.org> 2020-01-14 15:38:20 -0800
commit 648f04bfdcf441ce4a396d09bdd0443b2a2ca51e
tree 58553c0854e81e46df934b011be7e2d817c14319 /python_hadoop/README.md
parent 49c4f4a4050a76e772f6ef9bf9ca544e2d54e2ab
basic FTP ingest support; revisit record resolution
- supporting revisits means more wayback hits (fewer crawls) => faster
- ... but this is only partial support. will also need to work through sandcrawler db schema, etc. current status should be safe to merge/use.
- ftp support via treating an ftp hit as a 200
Diffstat (limited to 'python_hadoop/README.md')
0 files changed, 0 insertions, 0 deletions