From 63186a218e2e10848d4b014eacbc4ad3a51a20ca Mon Sep 17 00:00:00 2001
From: Bryan Newbold
Date: Thu, 29 Mar 2018 11:59:28 -0700
Subject: init repo

---
 .gitignore | 21 +++++++++++++++++++++
 README.md  |  9 +++++++++
 2 files changed, 30 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 README.md

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..81a4762
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,21 @@
+*.o
+*.a
+*.pyc
+#*#
+*~
+*.swp
+.*
+*.tmp
+*.old
+*.profile
+*.bkp
+*.bak
+[Tt]humbs.db
+*.DS_Store
+build/
+_build/
+src/build/
+*.log
+
+# Don't ignore this file itself
+!.gitignore
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..6ea387f
--- /dev/null
+++ b/README.md
@@ -0,0 +1,9 @@
+
+This repo contains hadoop tasks (mapreduce and pig), luigi jobs, and other
+scripts and code for the journal ingest pipeline.
+
+This repository is potentially public. Maybe we'll rename it "sandcrawler"?
+
+Archive-specific deployment/production guides and ansible scripts at:
+[journal-infra](https://git.archive.org/bnewbold/journal-infra)
+
-- 
cgit v1.2.3