path: root/python/sandcrawler/fileset_strategies.py
author: Bryan Newbold <bnewbold@archive.org> 2021-10-27 13:41:21 -0700
committer: Bryan Newbold <bnewbold@archive.org> 2021-10-27 13:41:23 -0700
commit: 50270a9152c8e88e66187ce755920e35c31bd0b5
tree: a412a64b8b0ac138155cdae805f3603c87a3c720 /python/sandcrawler/fileset_strategies.py
parent: 69cfb2c38f68fc009d6c7f5107fc36cd7168e69e
fileset: refactor out tables of helpers
Instantiating these objects in module-level tables caused a whole bunch of objects (including children) to be initialized at import time, which seems like the wrong thing to do. Defer this until the actual ingest fileset worker is initialized.
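The deferral described above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this commit: `StubStrategy`, `SketchFilesetWorker`, and the `strategy_helpers` attribute are hypothetical names, the enum values are placeholders, and the strategy classes are reduced to stubs that only count how often they are constructed.

```python
from enum import Enum
from typing import Dict


class IngestStrategy(str, Enum):
    # Stand-in for sandcrawler's enum; the string values are placeholders.
    ArchiveorgFileset = "archiveorg-fileset"
    ArchiveorgFile = "archiveorg-file"
    WebFileset = "web-fileset"
    WebFile = "web-file"


class StubStrategy:
    # Hypothetical base class: counts constructions so the deferral is observable.
    created = 0

    def __init__(self, **kwargs):
        StubStrategy.created += 1


class ArchiveorgFilesetStrategy(StubStrategy):
    pass


class ArchiveorgFileStrategy(ArchiveorgFilesetStrategy):
    pass


class WebFilesetStrategy(StubStrategy):
    pass


class WebFileStrategy(WebFilesetStrategy):
    pass


class SketchFilesetWorker:
    # Hypothetical worker: the table is built per instance, so importing
    # this module constructs no strategy objects at all.
    def __init__(self, **kwargs):
        self.strategy_helpers: Dict[IngestStrategy, StubStrategy] = {
            IngestStrategy.ArchiveorgFileset: ArchiveorgFilesetStrategy(),
            IngestStrategy.ArchiveorgFile: ArchiveorgFileStrategy(),
            IngestStrategy.WebFileset: WebFilesetStrategy(),
            IngestStrategy.WebFile: WebFileStrategy(),
        }
```

With this layout, code that only imports the strategy module (for example to reference `IngestStrategy`) pays no construction cost; the four helpers come into existence only when a worker is created.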
Diffstat (limited to 'python/sandcrawler/fileset_strategies.py')
-rw-r--r-- python/sandcrawler/fileset_strategies.py | 8
1 files changed, 0 insertions, 8 deletions
diff --git a/python/sandcrawler/fileset_strategies.py b/python/sandcrawler/fileset_strategies.py
index 4e44d97..9d3bae3 100644
--- a/python/sandcrawler/fileset_strategies.py
+++ b/python/sandcrawler/fileset_strategies.py
@@ -294,11 +294,3 @@ class WebFileStrategy(WebFilesetStrategy):
         super().__init__(**kwargs)
         self.ingest_strategy = IngestStrategy.WebFile
         self.success_status = "success-file"
-
-
-FILESET_STRATEGY_HELPER_TABLE = {
-    IngestStrategy.ArchiveorgFileset: ArchiveorgFilesetStrategy(),
-    IngestStrategy.ArchiveorgFile: ArchiveorgFileStrategy(),
-    IngestStrategy.WebFileset: WebFilesetStrategy(),
-    IngestStrategy.WebFile: WebFileStrategy(),
-}