Mode | Name | Size
---|---|---
-rw-r--r-- | .coveragerc | 32
-rw-r--r-- | .gitignore | 29
-rw-r--r-- | .pylintrc | 327
-rw-r--r-- | Pipfile | 508
-rw-r--r-- | Pipfile.lock | 55017
-rw-r--r-- | README.md | 3563
-rw-r--r-- | TODO | 52
-rwxr-xr-x | backfill_hbase_from_cdx.py | 2896
-rw-r--r-- | common.py | 2618
-rwxr-xr-x | enrich_scored_matches.py | 938
-rwxr-xr-x | extraction_cdx_grobid.py | 11023
-rwxr-xr-x | extraction_ungrobided.py | 10653
-rwxr-xr-x | filter_scored_matches.py | 3421
-rwxr-xr-x | grobid2json.py | 5122
-rwxr-xr-x | manifest_converter.py | 1594
-rw-r--r-- | mrjob.conf | 466
-rw-r--r-- | pytest.ini | 171
d--------- | tests | 294
-rw-r--r-- | xml2json.py | 199