author | Bryan Newbold <bnewbold@archive.org> | 2018-06-15 00:41:33 +0000 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2018-06-15 00:41:33 +0000 |
commit | c23ccd1f2d03ad65ee83b8eca8c407d12ecd54e1 (patch) | |
tree | d70394e2b57e824abbcb7fff2c960c812d09da6d /mapreduce | |
parent | 5f4904158c07061edb6b3afd210d3b15dc946dab (diff) | |
download | sandcrawler-c23ccd1f2d03ad65ee83b8eca8c407d12ecd54e1.tar.gz sandcrawler-c23ccd1f2d03ad65ee83b8eca8c407d12ecd54e1.zip |
doc improvements and fixes to 'please' helper
Diffstat (limited to 'mapreduce')
-rw-r--r-- | mapreduce/README.md | 7 |
1 files changed, 4 insertions, 3 deletions
diff --git a/mapreduce/README.md b/mapreduce/README.md
index b63e84b..aebc160 100644
--- a/mapreduce/README.md
+++ b/mapreduce/README.md
@@ -33,6 +33,7 @@ running on a devbox and GROBID running on a dedicated machine:
 Running from the cluster:
 
     # Create tarball of virtualenv
+    export PIPENV_VENV_IN_PROJECT=1
     pipenv shell
     export VENVSHORT=`basename $VIRTUAL_ENV`
     tar -czf $VENVSHORT.tar.gz -C /home/bnewbold/.local/share/virtualenvs/$VENVSHORT .
@@ -60,9 +61,9 @@ Actual invocation to run on Hadoop cluster (running on an IA devbox, where
 hadoop environment is configured):
 
     # Create tarball of virtualenv
-    pipenv shell
-    export VENVSHORT=`basename $VIRTUAL_ENV`
-    tar -czf $VENVSHORT.tar.gz -C /home/bnewbold/.local/share/virtualenvs/$VENVSHORT .
+    export PIPENV_VENV_IN_PROJECT=1
+    pipenv install --deploy
+    tar -czf venv-current.tar.gz -C .venv .
 
     ./backfill_hbase_from_cdx.py \
         --hbase-host wbgrp-svc263.us.archive.org \
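The second hunk replaces the user-specific virtualenv path with an in-project `.venv` directory, so the tarball step no longer depends on `$VIRTUAL_ENV` or a home directory. A minimal sketch of that step as a reusable function follows. `make_venv_tarball` is a hypothetical helper and is not part of the sandcrawler repository. It assumes `pipenv install --deploy` has already been run with `PIPENV_VENV_IN_PROJECT=1` set.

```shell
#!/bin/sh
# Sketch only: make_venv_tarball is a hypothetical helper, not sandcrawler code.
# Assumes the virtualenv was created in-project (./.venv) by running
# `pipenv install --deploy` with PIPENV_VENV_IN_PROJECT=1 exported.
make_venv_tarball() {
    venv_dir="${1:-.venv}"              # virtualenv directory to archive
    out="${2:-venv-current.tar.gz}"     # output tarball name, as in the README

    if [ ! -d "$venv_dir" ]; then
        echo "no virtualenv at $venv_dir; run 'pipenv install --deploy' first" >&2
        return 1
    fi

    # -C switches into the venv before archiving, so paths inside the
    # tarball are relative to the venv root rather than the project root.
    tar -czf "$out" -C "$venv_dir" .
}
```

With the defaults, running `export PIPENV_VENV_IN_PROJECT=1`, then `pipenv install --deploy`, then `make_venv_tarball` should produce the same `venv-current.tar.gz` as the updated README command. The directory check only adds an early failure when `.venv` is missing.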