-rw-r--r-- | .gitignore | 21
-rw-r--r-- | README.md  |  9
2 files changed, 30 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..81a4762
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,21 @@
+*.o
+*.a
+*.pyc
+#*#
+*~
+*.swp
+.*
+*.tmp
+*.old
+*.profile
+*.bkp
+*.bak
+[Tt]humbs.db
+*.DS_Store
+build/
+_build/
+src/build/
+*.log
+
+# Don't ignore this file itself
+!.gitignore
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..6ea387f
--- /dev/null
+++ b/README.md
@@ -0,0 +1,9 @@
+
+This repo contains hadoop tasks (mapreduce and pig), luigi jobs, and other
+scripts and code for the journal ingest pipeline.
+
+This repository is potentially public. Maybe we'll rename it "sandcrawler"?
+
+Archive-specific deployment/production guides and ansible scripts at:
+[journal-infra](https://git.archive.org/bnewbold/journal-infra)
+