author Bryan Newbold <bnewbold@archive.org> 2020-02-22 14:05:03 -0800
committer Bryan Newbold <bnewbold@archive.org> 2020-02-22 14:05:03 -0800
commit 04cb1a01cbd1bc4f017ebd61d8b6732ea060ee44
tree 1a7b906c6d88355bc20759196dea6ca915da598a /notes/url_pattern_heuristic_backfill.txt
parent d08aac7381a392cecfe8931821df5e149b58f32a
ingest: skip more non-pdf, non-paper domains
Diffstat (limited to 'notes/url_pattern_heuristic_backfill.txt')
0 files changed, 0 insertions, 0 deletions