| author | Bryan Newbold <bnewbold@archive.org> | 2020-01-14 15:30:42 -0800 | 
|---|---|---|
| committer | Bryan Newbold <bnewbold@archive.org> | 2020-01-14 15:38:20 -0800 | 
| commit | 648f04bfdcf441ce4a396d09bdd0443b2a2ca51e (patch) | |
| tree | 58553c0854e81e46df934b011be7e2d817c14319 /python_hadoop/tests/files/small.json | |
| parent | 49c4f4a4050a76e772f6ef9bf9ca544e2d54e2ab (diff) | |
basic FTP ingest support; revisit record resolution
- resolving revisit records means more wayback hits (and fewer new crawls),
  so ingest is faster
- ... but this is only partial support: the sandcrawler db schema, etc. still
  need to be worked through. the current state should be safe to merge/use.
- FTP support works by treating an FTP hit as an HTTP 200
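
The two ideas above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this commit: the function names, the dict-shaped capture rows with `mimetype` and `sha1` keys, and the choice of FTP reply code 226 ("transfer complete" in RFC 959) as the success marker are all assumptions made for the example.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumption: a successful FTP capture is recorded with FTP reply code 226.
FTP_SUCCESS_CODES = {226}
# Revisit records are conventionally listed with this mimetype in CDX output.
REVISIT_MIMETYPE = "warc/revisit"


def normalize_status(url: str, status_code: int) -> int:
    """Map a successful FTP capture onto HTTP 200; leave other codes unchanged."""
    scheme = urlparse(url).scheme.lower()
    if scheme == "ftp" and status_code in FTP_SUCCESS_CODES:
        return 200
    return status_code


def resolve_revisit(row: dict, rows: list) -> Optional[dict]:
    """Return the capture that actually holds the payload for `row`.

    Hypothetical row shape: {"mimetype": str, "sha1": str, ...}.
    A non-revisit row resolves to itself; a revisit row resolves to the
    first non-revisit row with the same payload digest, or None.
    """
    if row["mimetype"] != REVISIT_MIMETYPE:
        return row
    for other in rows:
        if other["mimetype"] != REVISIT_MIMETYPE and other["sha1"] == row["sha1"]:
            return other
    return None
```

With both helpers in place, downstream code that only checks for a 200 status can accept FTP captures unchanged, and a revisit row no longer forces a fresh crawl when an earlier capture with the same digest exists.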
Diffstat (limited to 'python_hadoop/tests/files/small.json')
0 files changed, 0 insertions, 0 deletions
