Mode | Name | Size |
---|---|---|
-rw-r--r-- | .coveragerc | 46 |
-rw-r--r-- | .flake8 | 809 |
-rw-r--r-- | .gitignore | 124 |
-rw-r--r-- | .pylintrc | 551 |
-rw-r--r-- | Makefile | 877 |
-rw-r--r-- | Pipfile | 1276 |
-rw-r--r-- | Pipfile.lock | 102847 |
-rw-r--r-- | TODO | 236 |
-rw-r--r-- | example.env | 238 |
-rwxr-xr-x | grobid_tool.py | 6984 |
-rwxr-xr-x | ia_pdf_match.py | 3036 |
-rwxr-xr-x | ingest_tool.py | 6526 |
-rwxr-xr-x | pdfextract_tool.py | 5162 |
-rwxr-xr-x | pdftrio_tool.py | 4877 |
-rwxr-xr-x | persist_tool.py | 8822 |
-rw-r--r-- | pyproject.toml | 150 |
-rw-r--r-- | pytest.ini | 823 |
d--------- | sandcrawler | 751 |
-rwxr-xr-x | sandcrawler_worker.py | 14819 |
d--------- | scripts | 956 |
d--------- | tests | 605 |
l--------- | title_slug_denylist.txt -> ../scalding/src/main/resources/slug-denylist.txt | 48 |