| author | Bryan Newbold <bnewbold@archive.org> | 2021-08-16 20:17:30 -0700 |
|---|---|---|
| committer | Bryan Newbold <bnewbold@archive.org> | 2021-08-16 20:17:30 -0700 |
| commit | e1cde3c95e5176f232ecbc22a8619149078dc91f (patch) | |
| tree | 2624b700015663272e5d9edd21d7bf180e3803b6 /kafka | |
| parent | 26d90505bda2d1dfcc25af6b8a0270faa11729e7 (diff) | |
html ingest: detect some blog platforms, and allow lower wordcount threshold
Diffstat (limited to 'kafka')
0 files changed, 0 insertions, 0 deletions
