path: root/python/fatcat_ingest.py
Commit message | Author | Age | Files | Lines
* add ingest-container command (new CLI tool) | Bryan Newbold | 2019-12-10 | 1 | -0/+136
The intent of this tool is to make it easy to enqueue ingest requests into Kafka, to be processed by a worker pool and eventually inserted into fatcat (for ingest hits that pass various checks).

As a specific example use-case, we have pretty good coverage of eLife (a prominent OA publisher), but have missed some publications in the past, and have a large gap for the year 2019:

https://fatcat.wiki/container/en4qj5ijrbf5djxx7p5zzpjyoq/coverage

This tool would make it trivial to enqueue all the missing releases to be crawled. Future variants on this tool could query for, e.g., long-tail OA works.
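
The enqueue flow described in the commit message could look roughly like the sketch below. It is not the code from this commit: the request fields (`ingest_type`, `base_url`, `release_ident`), the `has_fulltext` flag, the topic name, and the `ListProducer` stand-in are all assumptions made for illustration. A real Kafka client would replace `ListProducer`; the sketch only assumes an object with a `produce(topic, value)` method.

```python
import json
from typing import Any, Dict, Iterable, List, Tuple


def release_to_ingest_request(release: Dict[str, Any]) -> Dict[str, Any]:
    """Build an ingest request from a release-like dict.

    The field names here are hypothetical, not the actual request schema.
    """
    return {
        "ingest_type": "pdf",
        "base_url": release["url"],
        "release_ident": release["ident"],
    }


class ListProducer:
    """In-memory stand-in for a Kafka producer (hypothetical helper)."""

    def __init__(self) -> None:
        self.messages: List[Tuple[str, bytes]] = []

    def produce(self, topic: str, value: bytes) -> None:
        # A real producer would send this to a broker; here we just record it.
        self.messages.append((topic, value))


def enqueue_missing(
    releases: Iterable[Dict[str, Any]], producer: Any, topic: str
) -> int:
    """Enqueue one ingest request per release that lacks fulltext.

    Releases that already have fulltext, or that have no URL to crawl,
    are skipped. Returns the number of requests enqueued.
    """
    count = 0
    for release in releases:
        if release.get("has_fulltext"):
            continue
        if not release.get("url"):
            continue
        request = release_to_ingest_request(release)
        payload = json.dumps(request, sort_keys=True).encode("utf-8")
        producer.produce(topic, payload)
        count += 1
    return count
```

Keeping the producer behind a single `produce` call means the selection logic can be exercised without a running broker, and the worker pool on the other side only ever sees serialized JSON requests.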