scalding:
- less verbose sbt test output (set log level to WARN)
- auto-formatting: addSbtPlugin("com.geirsson" % "sbt-scalafmt" % "1.6.0-RC3")

pig:
- may *not* want to de-dupe CDX lines by unique sha1 in all cases; run the
  de-dupe as a second-stage filter instead? for example, fatcat may want many
  URLs for a single file (different links, different policies)
- fix pig gitlab-ci tests (JAVA_HOME)
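
sketch for the de-dupe item above (not the actual pig script): a rough Python
illustration of keeping the strict sha1 de-dupe as an optional second stage,
with a looser first stage that keeps one line per (sha1, URL) pair. Assumes
space-separated CDX lines with the URL in column 2 and the sha1 in column 5
(0-indexed); the column positions and function names are hypothetical.

```python
from typing import Iterable, Iterator, Sequence

# Assumed column positions in a space-separated CDX line (0-indexed).
# These are illustrative; adjust to the CDX flavor actually in use.
URL_COL = 2
SHA1_COL = 5


def dedupe(lines: Iterable[str], key_cols: Sequence[int]) -> Iterator[str]:
    """Yield the first line seen for each distinct combination of key columns."""
    seen = set()
    for line in lines:
        fields = line.split()
        if len(fields) <= max(key_cols):
            continue  # skip malformed or truncated lines
        key = tuple(fields[col] for col in key_cols)
        if key not in seen:
            seen.add(key)
            yield line


def dedupe_by_sha1_and_url(lines: Iterable[str]) -> Iterator[str]:
    """First stage: keep every distinct URL for a given file (sha1)."""
    return dedupe(lines, (SHA1_COL, URL_COL))


def dedupe_by_sha1(lines: Iterable[str]) -> Iterator[str]:
    """Optional second stage: collapse to a single line per file (sha1)."""
    return dedupe(lines, (SHA1_COL,))
```

the second stage can be chained after the first (dedupe_by_sha1 over the
output of dedupe_by_sha1_and_url) only for consumers that want one line per
file; fatcat-style consumers would read the first-stage output directly.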

python:
- include input file name (and chunk? and CDX?) in sentry context
- how to get an argument (like --hbase-table) into mrjob.conf, or similar?
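
sketch for the sentry context item above: one possible way to collect the
input file name, chunk offset, and current CDX line into a plain dict that
could then be attached to an error report through whichever Sentry client the
jobs use. Assumes the job runs under Hadoop streaming, which exports job
configuration as environment variables with dots replaced by underscores; the
function name and dict keys are hypothetical.

```python
import os
from typing import Dict, Mapping, Optional

# Environment variable names assumed to be set by Hadoop streaming
# (newer name first, older name as a fallback).
INPUT_FILE_VARS = ("mapreduce_map_input_file", "map_input_file")
INPUT_START_VARS = ("mapreduce_map_input_start", "map_input_start")


def _first_set(env: Mapping[str, str], names) -> Optional[str]:
    """Return the value of the first non-empty variable in names, if any."""
    for name in names:
        value = env.get(name)
        if value:
            return value
    return None


def build_error_context(
    environ: Optional[Mapping[str, str]] = None,
    cdx_line: Optional[str] = None,
) -> Dict[str, str]:
    """Collect whatever input details are available; missing ones are omitted."""
    env = os.environ if environ is None else environ
    context: Dict[str, str] = {}

    input_file = _first_set(env, INPUT_FILE_VARS)
    if input_file is not None:
        context["input_file"] = input_file

    chunk_start = _first_set(env, INPUT_START_VARS)
    if chunk_start is not None:
        context["chunk_start"] = chunk_start

    if cdx_line is not None:
        context["cdx_line"] = cdx_line.strip()

    return context
```

outside of Hadoop (local test runs) the environment variables are absent and
the dict only carries the CDX line, if one is passed.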