author | Bryan Newbold <bnewbold@archive.org> | 2022-05-03 17:12:48 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2022-05-03 17:12:48 -0700 |
commit | 00ae74378413e87f230c88113ff8163a6f969d63 | |
tree | 16cdcbde7a002704e80f494b7fd13fc5c19dd695 /blobs | |
parent | ef0421567dd67a248d0f92f32ad4e14ae0776920 | |
switch default kafka-broker host from wbgrp-svc263 to wbgrp-svc350
Diffstat (limited to 'blobs')
-rw-r--r-- | blobs/tasks.md | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/blobs/tasks.md b/blobs/tasks.md
index 34dec8f..beb765f 100644
--- a/blobs/tasks.md
+++ b/blobs/tasks.md
@@ -19,7 +19,7 @@ didn't try to connect to postgresql.
 Commands:
- ./sandcrawler_worker.py --kafka-hosts wbgrp-svc263.us.archive.org:9092 --env prod --s3-bucket sandcrawler --s3-url wbgrp-svc169.us.archive.org:8333 persist-grobid --s3-only
+ ./sandcrawler_worker.py --kafka-hosts wbgrp-svc350.us.archive.org:9092 --env prod --s3-bucket sandcrawler --s3-url wbgrp-svc169.us.archive.org:8333 persist-grobid --s3-only
 => Consuming from kafka topic sandcrawler-prod.grobid-output-pg, group persist-grobid-seaweed
 => run briefly, then kill
@@ -29,7 +29,7 @@ On kafka-broker worker:
 Then run 2x instances of worker (same command as above):
- ./sandcrawler_worker.py --kafka-hosts wbgrp-svc263.us.archive.org:9092 --env prod --s3-bucket sandcrawler --s3-url wbgrp-svc169.us.archive.org:8333 persist-grobid --s3-only
+ ./sandcrawler_worker.py --kafka-hosts wbgrp-svc350.us.archive.org:9092 --env prod --s3-bucket sandcrawler --s3-url wbgrp-svc169.us.archive.org:8333 persist-grobid --s3-only
 At this point CPU-limited on this worker by the python processes (only 4 cores on this machine).
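The commands in this diff pass the broker explicitly with `--kafka-hosts`. As a rough illustration of how a worker CLI might carry a default broker so that notes like these stay short, here is a minimal argparse sketch. It is not the actual `sandcrawler_worker.py` code: the function name `build_parser`, the `DEFAULT_KAFKA_HOSTS` constant, and the `dev` default for `--env` are assumptions made for the example. Only the flag names and the host value are taken from the commands above.

```python
import argparse

# Assumption: the default broker is kept in one constant so a host migration
# is a one-line change. The value is copied from the commands in the diff.
DEFAULT_KAFKA_HOSTS = "wbgrp-svc350.us.archive.org:9092"


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags seen in the documented commands."""
    parser = argparse.ArgumentParser(description="worker CLI sketch")
    parser.add_argument(
        "--kafka-hosts",
        default=DEFAULT_KAFKA_HOSTS,
        help="comma-separated list of Kafka brokers (host:port)",
    )
    # Assumption: 'dev' as the default environment is illustrative only.
    parser.add_argument("--env", default="dev", help="environment, e.g. prod")
    parser.add_argument("--s3-bucket", default=None, help="S3 bucket name")
    parser.add_argument("--s3-url", default=None, help="S3 endpoint (host:port)")

    subparsers = parser.add_subparsers(dest="command", required=True)
    persist = subparsers.add_parser("persist-grobid")
    persist.add_argument(
        "--s3-only",
        action="store_true",
        help="write to S3 only and skip the database",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"{args.command}: brokers={args.kafka_hosts} env={args.env}")
```

With a default like this, the documented invocation could omit `--kafka-hosts` entirely, and passing the flag would still override the default for one-off runs against another broker.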