From c23ccd1f2d03ad65ee83b8eca8c407d12ecd54e1 Mon Sep 17 00:00:00 2001
From: Bryan Newbold
Date: Fri, 15 Jun 2018 00:41:33 +0000
Subject: doc improvements and fixes to 'please' helper

---
 scalding/README.md       | 22 ++++++++++++----------
 scalding/ia_cluster.conf |  0
 2 files changed, 12 insertions(+), 10 deletions(-)
 create mode 100644 scalding/ia_cluster.conf

(limited to 'scalding')

diff --git a/scalding/README.md b/scalding/README.md
index c40da5c..45b62d0 100644
--- a/scalding/README.md
+++ b/scalding/README.md
@@ -3,12 +3,19 @@ the JVM) using the Scalding framework.
 
 See the other markdown files in this directory for more background and tips.
 
-## Building and Running
+## Dependencies
 
 Locally, you need to have the JVM (eg, OpenJDK 1.8), `sbt` build tool, and
 might need (exactly) Scala version 2.11.8.
 
-See section below on building and installing custom SpyGlass jar.
+On a debian/ubuntu machine:
+
+    echo "deb https://dl.bintray.com/sbt/debian /" | sudo tee -a /etc/apt/sources.list.d/sbt.list
+    sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2EE0EA64E40A89B84B2DF73499E82A75642AC823
+    sudo apt-get update
+    sudo apt install scala sbt
+
+## Building and Running
 
 Run tests:
 
@@ -26,17 +33,12 @@ Run on cluster:
         com.twitter.scalding.Tool sandcrawler.HBaseRowCountJob --hdfs \
         --app.conf.path thing.conf \
         --output hdfs:///user/bnewbold/spyglass_out_test
-    
+
+## Troubleshooting
+
 If your `sbt` task fails with this error:
 
     java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Metaspace
 
 try restarting `sbt` with more memory (e.g., `sbt -mem 2048`).
 
-## SpyGlass Jar
-
-SpyGlass is a "scalding-to-HBase" connector. It isn't maintained, so we needed
-to rebuild to support our versions of HBase/scalding/etc. Our fork (including
-build instructions) is at
-(`bnewbold-scala2.11` branch); compiled .jar files are available from
-.
diff --git a/scalding/ia_cluster.conf b/scalding/ia_cluster.conf
new file mode 100644
index 0000000..e69de29
-- 
cgit v1.2.3