author Bryan Newbold <bnewbold@archive.org> 2018-03-29 16:04:44 -0700
committer Bryan Newbold <bnewbold@archive.org> 2018-03-29 16:04:44 -0700
commit d2203182c9ed6e1ff13fa70fb25f049ef87c75a0
tree a888f769b6580f84225d7e7b3e88effe4e982acd
parent e336b389b489bcefd601eba631c395d8a37d5ab3
sandcrawler
-rw-r--r-- README.md 11
1 files changed, 9 insertions, 2 deletions
diff --git a/README.md b/README.md
index 6ea387f..1a251eb 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,15 @@
+ _ _
+ _________ ___ __ _ _ __ __| | ___ _ __ __ ___ _| | ___ _ __
+ \ | / __|/ _` | '_ \ / _` |/ __| '__/ _` \ \ /\ / / |/ _ \ '__|
+ \ | \__ \ (_| | | | | (_| | (__| | | (_| |\ V V /| | __/ |
+ \@@@@@@| |___/\__,_|_| |_|\__,_|\___|_| \__,_| \_/\_/ |_|\___|_|
+
+
This repo contains hadoop tasks (mapreduce and pig), luigi jobs, and other
-scripts and code for the journal ingest pipeline.
+scripts and code for the internet archive (web group) journal ingest pipeline.
-This repository is potentially public. Maybe we'll rename it "sandcrawler"?
+This repository is potentially public.
Archive-specific deployment/production guides and ansible scripts at:
[journal-infra](https://git.archive.org/bnewbold/journal-infra)