-rw-r--r-- python/TODO | 8
1 file changed, 7 insertions, 1 deletion
diff --git a/python/TODO b/python/TODO
index 6b05646..89cec83 100644
--- a/python/TODO
+++ b/python/TODO
@@ -1 +1,7 @@
-- refactor extractor common code into a shared file
+
+ingest crawler:
+- SPNv2 only
+ - remove most SPNv1/v2 path selection
+- landing page + fulltext hops only (short recursion depth)
+- use wayback client library instead of requests to fetch content
+