path: root/python/fatcat_tools
Commit message | Author | Age | Files | Lines
* fix file extraction (and transforms) | Bryan Newbold | 2018-11-26 | 1 | -6/+6
* clean up harvester comments/docs | Bryan Newbold | 2018-11-21 | 3 | -50/+31
* crossref importer doesn't require author/title attributes | Bryan Newbold | 2018-11-21 | 1 | -6/+6
* crossref importer checks for existing DOIs | Bryan Newbold | 2018-11-21 | 2 | -4/+19
* use isoformat() to format dates | Bryan Newbold | 2018-11-21 | 3 | -5/+6
    This shouldn't change behavior; it's just more consistent.
* grobid importer: release_date as a date | Bryan Newbold | 2018-11-21 | 1 | -1/+1
* fix loop_sleep typo | Bryan Newbold | 2018-11-21 | 2 | -2/+2
* fix datacite DOI extraction | Bryan Newbold | 2018-11-21 | 1 | -1/+1
* fix OAI-PMH name/finished message | Bryan Newbold | 2018-11-21 | 1 | -1/+6
* fix oai-pmh issue again | Bryan Newbold | 2018-11-21 | 1 | -13/+14
* oaipmh: handle NoRecordsMatch | Bryan Newbold | 2018-11-21 | 1 | -5/+8
* start supporting kafka importers | Bryan Newbold | 2018-11-19 | 2 | -1/+18
    A nice feature would be some/any log output as to progress.
* fix some broken importer args | Bryan Newbold | 2018-11-19 | 1 | -5/+7
* monograph isn't a CSL type | Bryan Newbold | 2018-11-19 | 1 | -1/+1
* not as strong a todo (timestamps) | Bryan Newbold | 2018-11-19 | 1 | -1/+1
* initial OAI-PMH harvesters | Bryan Newbold | 2018-11-19 | 3 | -5/+167
* better DOI registrar harvesters | Bryan Newbold | 2018-11-19 | 3 | -48/+145
* bunch of pylint cleanup | Bryan Newbold | 2018-11-15 | 6 | -24/+38
* large refactor of python names/paths | Bryan Newbold | 2018-11-15 | 13 | -30/+78
    - Add __init__.py files for fatcat_tools submodules, and use them in imports
    - Add a bunch of comments to files.
    - rename a number of classes and functions to be less verbose
* have recent message helper cleanup consumer | Bryan Newbold | 2018-11-15 | 1 | -1/+5
* refactoring harvesters | Bryan Newbold | 2018-11-15 | 5 | -196/+210
* initial work on metadata harvest bots | Bryan Newbold | 2018-11-14 | 4 | -0/+197
* fix worker code | Bryan Newbold | 2018-11-14 | 2 | -2/+5
* most_recent_message as reusable function | Bryan Newbold | 2018-11-14 | 2 | -26/+26
* update crossref controlled vocab | Bryan Newbold | 2018-11-14 | 2 | -3/+32
* python tweaks for date/datetime rust fix | Bryan Newbold | 2018-11-14 | 2 | -10/+3
* switch to auto consumer offset updates | Bryan Newbold | 2018-11-13 | 2 | -2/+11
    This is the classic/correct way to do consumer group updates for higher
    throughput, when "at least once" semantics are acceptable (as they are
    here; double processing should be safe/fine).
* to_elastic_dict -> release_elastic_dict | Bryan Newbold | 2018-11-13 | 1 | -1/+2
* use Counter object instead of per-metric ints | Bryan Newbold | 2018-11-13 | 6 | -17/+17
* more simple fatcat_client imports | Bryan Newbold | 2018-11-13 | 2 | -3/+2
* shuffle around fatcat_tools layout | Bryan Newbold | 2018-11-13 | 10 | -73/+7
* more python module refactoring | Bryan Newbold | 2018-11-12 | 8 | -8/+8
* refactor python modules | Bryan Newbold | 2018-11-12 | 12 | -0/+1243
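
Note on the "switch to auto consumer offset updates" commit: its message relies on "at least once" delivery being safe because double processing is harmless. The sketch below illustrates that reasoning in plain Python and is not the fatcat_tools code. It uses no Kafka client. The names `consume`, `upsert`, `store`, and the `ident`/`title` fields are hypothetical. Periodic auto-commit is modeled as committing the offset after every `commit_every`-th message.

```python
from collections import Counter


def consume(messages, start, handler, commit_every, crash_at=None):
    """Run handler over messages[start:] and return the committed offset.

    The offset is committed only after every `commit_every`-th message,
    a stand-in for periodic auto-commit. `crash_at` simulates the worker
    dying just before it handles that offset.
    """
    committed = start
    for offset in range(start, len(messages)):
        if offset == crash_at:
            return committed  # progress since the last commit is lost
        handler(messages[offset])
        if (offset + 1) % commit_every == 0:
            committed = offset + 1
    return len(messages)  # clean finish: everything is committed


store = {}          # hypothetical sink, keyed by a stable identifier
counts = Counter()  # simple metrics


def upsert(msg):
    # Idempotent: handling the same message twice leaves one row.
    store[msg["ident"]] = msg["title"]
    counts["processed"] += 1


messages = [{"ident": f"rel{i}", "title": f"Title {i}"} for i in range(5)]

# First run dies before offset 3; only offsets 0-1 were committed.
resume_from = consume(messages, 0, upsert, commit_every=2, crash_at=3)
# Second run restarts at the committed offset, so offset 2 is handled again.
final_offset = consume(messages, resume_from, upsert, commit_every=2)

print(resume_from, final_offset, counts["processed"], len(store))  # 2 5 6 5
```

The handler runs six times for five messages, yet the store ends up with exactly five entries. That is the trade the commit message describes: fewer offset commits (higher throughput) in exchange for occasional reprocessing, which is only acceptable when the handler is idempotent.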