author: Bryan Newbold <bnewbold@archive.org> 2018-05-23 12:27:59 -0700
committer: Bryan Newbold <bnewbold@archive.org> 2018-05-24 00:02:36 -0700
commit: 4ba428db30593b67283dd90b92141f99840dc78e
tree: f63c8e146e7f90a530abfebdb993ab45d57426d5
parent: 29e4a83ff76da07bc6ad5d3f49d746ee0bc72023
rename jvm/scalding directories
-rw-r--r-- jvm-mapreduce/TODO 16
-rw-r--r-- jvm-mapreduce/learning.txt 55
-rw-r--r-- scalding/.gitignore (renamed from scald-mvp/.gitignore) 0
-rw-r--r-- scalding/README.md (renamed from scald-mvp/README.md) 0
-rw-r--r-- scalding/build.sbt (renamed from scald-mvp/build.sbt) 0
-rw-r--r-- scalding/project/Dependencies.scala (renamed from scald-mvp/project/Dependencies.scala) 0
-rw-r--r-- scalding/project/build.properties (renamed from scald-mvp/project/build.properties) 0
-rw-r--r-- scalding/project/plugins.sbt (renamed from scald-mvp/project/plugins.sbt) 0
-rw-r--r-- scalding/src/main/scala/example/SimpleHBaseSourceExample.scala (renamed from scald-mvp/src/main/scala/example/SimpleHBaseSourceExample.scala) 0
-rw-r--r-- scalding/src/main/scala/example/WordCountJob.scala (renamed from scald-mvp/src/main/scala/example/WordCountJob.scala) 0
-rw-r--r-- scalding/src/main/scala/sandcrawler/HBaseRowCountJob.scala (renamed from scald-mvp/src/main/scala/sandcrawler/HBaseRowCountJob.scala) 0
-rw-r--r-- scalding/src/test/scala/example/SimpleHBaseSourceExampleTest.scala (renamed from scald-mvp/src/test/scala/example/SimpleHBaseSourceExampleTest.scala) 0
-rw-r--r-- scalding/src/test/scala/example/WordCountTest.scala (renamed from scald-mvp/src/test/scala/example/WordCountTest.scala) 0
-rw-r--r-- scalding/src/test/scala/sandcrawler/HBaseRowCountTest.scala (renamed from scald-mvp/src/test/scala/sandcrawler/HBaseRowCountTest.scala) 0
14 files changed, 0 insertions, 71 deletions
diff --git a/jvm-mapreduce/TODO b/jvm-mapreduce/TODO
deleted file mode 100644
index 46b3b15..0000000
--- a/jvm-mapreduce/TODO
+++ /dev/null
@@ -1,16 +0,0 @@
-
-Libraries:
-- sbt? or gradle? (build tool)
- => debian packages: https://www.scala-sbt.org/download.html
- (or just a single .deb...)
-- scalding (mapreduce framework)
-- scala (java also fine?)
- => will scala work with java 1.7?
- => scala 2.11 (~2014) works with java 7; scala 2.12 and up require 8
- => debian stretch: scala 2.11.8-1
- => ubuntu xenial: scala/xenial 2.11.6-6
- => "Scalding works with Scala 2.10 and 2.11 is recommended"
-- testing
-- hbase connector library
- => maybe spyglass?
-- hbase mock
diff --git a/jvm-mapreduce/learning.txt b/jvm-mapreduce/learning.txt
deleted file mode 100644
index 6fe1442..0000000
--- a/jvm-mapreduce/learning.txt
+++ /dev/null
@@ -1,55 +0,0 @@
-
-## proof of concept on hadoop:
-
-This seemed to work:
-
- yarn jar tutorial/execution-tutorial/target/scala-2.11/execution-tutorial-assembly-0.18.0-SNAPSHOT.jar Tutorial1 --hdfs --input test_cdx --output test_scalding_out1
-
-Or, with actual files on hadoop:
-
- yarn jar tutorial/execution-tutorial/target/scala-2.11/execution-tutorial-assembly-0.18.0-SNAPSHOT.jar Tutorial1 --hdfs --input hdfs:///user/bnewbold/dummy.txt --output hdfs:///user/bnewbold/test_scalding_out2
-
-Horray! One issue with this was that building scalding took *forever* (meaning
-30+ minutes).
-
-potentially instead:
-
- hadoop jar scald-mvp-assembly-0.1.0-SNAPSHOT.jar com.twitter.scalding.Tool main.scala.example.WordCountJob --hdfs --input hdfs:///user/bnewbold/dummy.txt --output hdfs:///user/bnewbold/test_scalding_out2
-
-Hypothesis: class name should be same as file name. Don't need `main` function
-if using Scalding Tool wrapper jar. Don't need scald.rb.
-
- hadoop jar scald-mvp-assembly-0.1.0-SNAPSHOT.jar com.twitter.scalding.Tool example.WordCount --hdfs --input hdfs:///user/bnewbold/dummy.txt --output hdfs:///user/bnewbold/test_scalding_out2
-
-## sbt
-
-Uncommenting this line in scalding:build.sbt sped things way up (don't need to
-run *all* the tests):
-
- // Uncomment if you don't want to run all the tests before building assembly
- // test in assembly := {},
-
-Also get the following error (in a different context):
-
- bnewbold@orithena$ sbt new typesafehub/scala-sbt
- [info] Loading project definition from /home/bnewbold/src/scala-sbt.g8/project/project
- [info] Compiling 1 Scala source to /home/bnewbold/src/scala-sbt.g8/project/project/target/scala-2.9.1/sbt-0.11.2/classes...
- [error] error while loading CharSequence, class file '/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/rt.jar(java/lang/CharSequence.class)' is broken
- [error] (bad constant pool tag 18 at byte 10)
- [error] one error found
- [error] {file:/home/bnewbold/src/scala-sbt.g8/project/project/}default-46da7b/compile:compile: Compilation failed
- Project loading failed: (r)etry, (q)uit, (l)ast, or (i)gnore?
-
-## resources/tutorials
-
-Whole bunch of example commands (sbt, maven, gradle) to build scalding:
-
- https://medium.com/@gayani.nan/how-to-run-a-scalding-job-567160fa193
-
-Also looks good:
-
- https://blog.matthewrathbone.com/2015/10/20/scalding-tutorial.html
-
-Possibly related:
-
- http://sujitpal.blogspot.com/2012/08/scalding-for-impatient.html
diff --git a/scald-mvp/.gitignore b/scalding/.gitignore
index 7798ee0..7798ee0 100644
--- a/scald-mvp/.gitignore
+++ b/scalding/.gitignore
diff --git a/scald-mvp/README.md b/scalding/README.md
index e41e9ec..e41e9ec 100644
--- a/scald-mvp/README.md
+++ b/scalding/README.md
diff --git a/scald-mvp/build.sbt b/scalding/build.sbt
index aae8506..aae8506 100644
--- a/scald-mvp/build.sbt
+++ b/scalding/build.sbt
diff --git a/scald-mvp/project/Dependencies.scala b/scalding/project/Dependencies.scala
index 558929d..558929d 100644
--- a/scald-mvp/project/Dependencies.scala
+++ b/scalding/project/Dependencies.scala
diff --git a/scald-mvp/project/build.properties b/scalding/project/build.properties
index 31334bb..31334bb 100644
--- a/scald-mvp/project/build.properties
+++ b/scalding/project/build.properties
diff --git a/scald-mvp/project/plugins.sbt b/scalding/project/plugins.sbt
index 084d4bf..084d4bf 100644
--- a/scald-mvp/project/plugins.sbt
+++ b/scalding/project/plugins.sbt
diff --git a/scald-mvp/src/main/scala/example/SimpleHBaseSourceExample.scala b/scalding/src/main/scala/example/SimpleHBaseSourceExample.scala
index fe2a120..fe2a120 100644
--- a/scald-mvp/src/main/scala/example/SimpleHBaseSourceExample.scala
+++ b/scalding/src/main/scala/example/SimpleHBaseSourceExample.scala
diff --git a/scald-mvp/src/main/scala/example/WordCountJob.scala b/scalding/src/main/scala/example/WordCountJob.scala
index 0e63fed..0e63fed 100644
--- a/scald-mvp/src/main/scala/example/WordCountJob.scala
+++ b/scalding/src/main/scala/example/WordCountJob.scala
diff --git a/scald-mvp/src/main/scala/sandcrawler/HBaseRowCountJob.scala b/scalding/src/main/scala/sandcrawler/HBaseRowCountJob.scala
index 5df6b2e..5df6b2e 100644
--- a/scald-mvp/src/main/scala/sandcrawler/HBaseRowCountJob.scala
+++ b/scalding/src/main/scala/sandcrawler/HBaseRowCountJob.scala
diff --git a/scald-mvp/src/test/scala/example/SimpleHBaseSourceExampleTest.scala b/scalding/src/test/scala/example/SimpleHBaseSourceExampleTest.scala
index cf068c1..cf068c1 100644
--- a/scald-mvp/src/test/scala/example/SimpleHBaseSourceExampleTest.scala
+++ b/scalding/src/test/scala/example/SimpleHBaseSourceExampleTest.scala
diff --git a/scald-mvp/src/test/scala/example/WordCountTest.scala b/scalding/src/test/scala/example/WordCountTest.scala
index c42770f..c42770f 100644
--- a/scald-mvp/src/test/scala/example/WordCountTest.scala
+++ b/scalding/src/test/scala/example/WordCountTest.scala
diff --git a/scald-mvp/src/test/scala/sandcrawler/HBaseRowCountTest.scala b/scalding/src/test/scala/sandcrawler/HBaseRowCountTest.scala
index 598f45d..598f45d 100644
--- a/scald-mvp/src/test/scala/sandcrawler/HBaseRowCountTest.scala
+++ b/scalding/src/test/scala/sandcrawler/HBaseRowCountTest.scala