Diffstat (limited to 'scratch/README.md')
-rw-r--r-- | scratch/README.md | 10 |
1 files changed, 0 insertions, 10 deletions
diff --git a/scratch/README.md b/scratch/README.md
deleted file mode 100644
index 4c3fa65..0000000
--- a/scratch/README.md
+++ /dev/null
@@ -1,10 +0,0 @@
-# PySpark Test Run
-
-* 2020-04-02
-
-Goal: We want to understand, which URLs of the citation corpus have been
-preserved. Also we want the GWB URL if possible. We'll try pyspark.
-
-Our cluster runs Hadoop 2.6, so we'll try:
-
-    $ PYSPARK_HADOOP_VERSION=2.7 pip install pyspark