author | Bryan Newbold <bnewbold@archive.org> | 2018-05-21 10:56:13 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@archive.org> | 2018-05-21 10:56:13 -0700 |
commit | faad676833e1c4f39c0152fb53c23eb2dc83876c (patch) | |
tree | 99fad9ff5ad8797dc397f9152dbe55506b5cb3a5 | |
parent | 5d5b828730fdf34dcd2a6aeba64c7df2c1be23c5 (diff) | |
copy in jvm ecosystem notes
-rw-r--r-- | jvm-mapreduce/learning.txt | 46 |
1 files changed, 46 insertions, 0 deletions
diff --git a/jvm-mapreduce/learning.txt b/jvm-mapreduce/learning.txt
new file mode 100644
index 0000000..1fe64bd
--- /dev/null
+++ b/jvm-mapreduce/learning.txt
@@ -0,0 +1,46 @@
+
+## proof of concept on hadoop:
+
+This seemed to work:
+
+    yarn jar tutorial/execution-tutorial/target/scala-2.11/execution-tutorial-assembly-0.18.0-SNAPSHOT.jar Tutorial1 --hdfs --input test_cdx --output test_scalding_out1
+
+Or, with actual files on hadoop:
+
+    yarn jar tutorial/execution-tutorial/target/scala-2.11/execution-tutorial-assembly-0.18.0-SNAPSHOT.jar Tutorial1 --hdfs --input hdfs:///user/bnewbold/dummy.txt --output hdfs:///user/bnewbold/test_scalding_out2
+
+Hooray! One issue with this was that building scalding took *forever* (meaning
+30+ minutes).
+
+## sbt
+
+Uncommenting this line in scalding:build.sbt sped things way up (don't need to
+run *all* the tests):
+
+    // Uncomment if you don't want to run all the tests before building assembly
+    // test in assembly := {},
+
+Also get the following error (in a different context):
+
+    bnewbold@orithena$ sbt new typesafehub/scala-sbt
+    [info] Loading project definition from /home/bnewbold/src/scala-sbt.g8/project/project
+    [info] Compiling 1 Scala source to /home/bnewbold/src/scala-sbt.g8/project/project/target/scala-2.9.1/sbt-0.11.2/classes...
+    [error] error while loading CharSequence, class file '/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/CharSequence.class)' is broken
+    [error] (bad constant pool tag 18 at byte 10)
+    [error] one error found
+    [error] {file:/home/bnewbold/src/scala-sbt.g8/project/project/}default-46da7b/compile:compile: Compilation failed
+    Project loading failed: (r)etry, (q)uit, (l)ast, or (i)gnore?
+
+## resources/tutorials
+
+Whole bunch of example commands (sbt, maven, gradle) to build scalding:
+
+    https://medium.com/@gayani.nan/how-to-run-a-scalding-job-567160fa193
+
+Also looks good:
+
+    https://blog.matthewrathbone.com/2015/10/20/scalding-tutorial.html
+
+Possibly related:
+
+    http://sujitpal.blogspot.com/2012/08/scalding-for-impatient.html
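
For reference, the sbt note in the diff above could look roughly like the fragment below once the line is uncommented. This is only a sketch, assuming the build uses the sbt-assembly plugin with sbt 0.13-style syntax (which is what the commented-out line suggests). The notes do not show where the line sits in scalding's build.sbt, so the standalone placement here is an assumption; the trailing comma in the original implies it lives inside a larger settings sequence.

```scala
// build.sbt fragment (assumes the sbt-assembly plugin is already enabled).
// Overriding the `test` task in the `assembly` scope with an empty block
// means `sbt assembly` builds the fat jar without running the test suite first.
test in assembly := {}
```

With this in place, tests still run under a plain `sbt test`; only the assembly step skips them.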