author | Bryan Newbold <bnewbold@archive.org> | 2022-05-03 17:12:48 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2022-05-03 17:12:48 -0700 |
commit | 00ae74378413e87f230c88113ff8163a6f969d63 (patch) | |
tree | 16cdcbde7a002704e80f494b7fd13fc5c19dd695 /python/pdftrio_tool.py | |
parent | ef0421567dd67a248d0f92f32ad4e14ae0776920 (diff) | |
download | sandcrawler-00ae74378413e87f230c88113ff8163a6f969d63.tar.gz sandcrawler-00ae74378413e87f230c88113ff8163a6f969d63.zip |
switch default kafka-broker host from wbgrp-svc263 to wbgrp-svc350
Diffstat (limited to 'python/pdftrio_tool.py')
-rwxr-xr-x | python/pdftrio_tool.py | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/python/pdftrio_tool.py b/python/pdftrio_tool.py
index 9d3010e..24b749d 100755
--- a/python/pdftrio_tool.py
+++ b/python/pdftrio_tool.py
@@ -5,7 +5,7 @@ text extraction.
 
 Example of large parallel run, locally:
 
-cat /srv/sandcrawler/tasks/something.cdx | pv -l | parallel -j30 --pipe ./pdftrio_tool.py --kafka-env prod --kafka-hosts wbgrp-svc263.us.archive.org:9092,wbgrp-svc284.us.archive.org:9092,wbgrp-svc285.us.archive.org:9092 --kafka-mode --pdftrio-host http://localhost:3939 -j0 classify-pdf-json -
+cat /srv/sandcrawler/tasks/something.cdx | pv -l | parallel -j30 --pipe ./pdftrio_tool.py --kafka-env prod --kafka-hosts wbgrp-svc350.us.archive.org:9092,wbgrp-svc284.us.archive.org:9092,wbgrp-svc285.us.archive.org:9092 --kafka-mode --pdftrio-host http://localhost:3939 -j0 classify-pdf-json -
 """
 
 import argparse
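
For context, the `--kafka-hosts` value in the example command is a comma-separated list of `host:port` brokers. The following is a minimal sketch of how such an option could be declared and split with `argparse`. It is not taken from `pdftrio_tool.py`: the names `DEFAULT_KAFKA_HOSTS`, `build_parser`, and `split_hosts` are hypothetical, and the default simply reuses the three hosts shown in the updated example.

```python
import argparse
from typing import List

# Assumption: default broker list copied from the updated docstring example;
# the real tool may define its default differently.
DEFAULT_KAFKA_HOSTS = (
    "wbgrp-svc350.us.archive.org:9092,"
    "wbgrp-svc284.us.archive.org:9092,"
    "wbgrp-svc285.us.archive.org:9092"
)


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with a single --kafka-hosts option (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="kafka-hosts option sketch")
    parser.add_argument(
        "--kafka-hosts",
        default=DEFAULT_KAFKA_HOSTS,
        help="comma-separated list of Kafka brokers (host:port)",
    )
    return parser


def split_hosts(value: str) -> List[str]:
    """Split a comma-separated broker string into a list, dropping empty entries."""
    return [host.strip() for host in value.split(",") if host.strip()]


if __name__ == "__main__":
    args = build_parser().parse_args()
    for host in split_hosts(args.kafka_hosts):
        print(host)
```

Keeping the default in one constant means a future broker change touches a single line, although usage examples in docstrings (as in this commit) would still need to be updated by hand.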