path: root/python/fatcat_tools/importers/common.py
Commit message | Author | Age | Files | Lines
* better importer 'total' counting | Bryan Newbold | 2019-09-03 | 1 | -4/+2
* make importer extid lookups faster by hiding | Bryan Newbold | 2019-05-29 | 1 | -2/+2
* is_cjk() handles kanji better | Bryan Newbold | 2019-05-29 | 1 | -4/+6
* faster LargeFile XML importer for PubMed | Bryan Newbold | 2019-05-29 | 1 | -0/+50
* more MARC languages, and less verbose reporting | Bryan Newbold | 2019-05-24 | 1 | -3/+14
* missing MARC/ISO languages | Bryan Newbold | 2019-05-22 | 1 | -0/+2
* Gaelic! | Bryan Newbold | 2019-05-22 | 1 | -0/+3
* creative importer for bulk JSTOR imports | Bryan Newbold | 2019-05-22 | 1 | -0/+22
* bs4 XML parse cleanup | Bryan Newbold | 2019-05-22 | 1 | -0/+2
* JALC bulk file importer | Bryan Newbold | 2019-05-21 | 1 | -0/+20
* updates to pubmed importer | Bryan Newbold | 2019-05-21 | 1 | -1/+20
* tweaks to new imports/tests | Bryan Newbold | 2019-05-21 | 1 | -5/+78
* initial flesh out of JALC parser | Bryan Newbold | 2019-05-21 | 1 | -0/+36
* python impl | Bryan Newbold | 2019-05-14 | 1 | -3/+3
* add limits to match importers | Bryan Newbold | 2019-04-23 | 1 | -0/+3
* archive.org isn't really a repository | Bryan Newbold | 2019-04-22 | 1 | -1/+3
* mechanism to not double-update entities | Bryan Newbold | 2019-04-18 | 1 | -0/+3
* update URL rel list | Bryan Newbold | 2019-04-18 | 1 | -1/+10
* add SqlitePusher importer option | Bryan Newbold | 2019-04-12 | 1 | -0/+20
* bunch of lint/whitespace cleanups | Bryan Newbold | 2019-02-22 | 1 | -2/+2
* fix bug in clean() resulting in many consistency check fails | Bryan Newbold | 2019-01-29 | 1 | -2/+3
* add stub parse_record() to make pylint happy | Bryan Newbold | 2019-01-28 | 1 | -0/+4
* don't allow empty or single-character clean strings | Bryan Newbold | 2019-01-28 | 1 | -1/+1
* transform and import fixes/tweaks | Bryan Newbold | 2019-01-25 | 1 | -2/+2
* refactor _get_editgroup => get_editgroup_id | Bryan Newbold | 2019-01-24 | 1 | -4/+5
* refactor make_rel_url | Bryan Newbold | 2019-01-24 | 1 | -0/+60
* clean() checks if it returns null-length string | Bryan Newbold | 2019-01-23 | 1 | -1/+5
* ftfy all over (needs Pipfile.lock) | Bryan Newbold | 2019-01-23 | 1 | -0/+31
* more tests; fix some importer behavior | Bryan Newbold | 2019-01-23 | 1 | -10/+20
* improve changelog tests | Bryan Newbold | 2019-01-23 | 1 | -1/+0
* refactor remaining importers | Bryan Newbold | 2019-01-22 | 1 | -174/+90
* refactored crossref importer to new style | Bryan Newbold | 2019-01-22 | 1 | -17/+107
* new importer API interfaces | Bryan Newbold | 2019-01-22 | 1 | -0/+166
* use full-on autoaccept mode | Bryan Newbold | 2019-01-11 | 1 | -2/+3
    Now that editor_id is inferred from token, don't *need* to create ahead of time.
    This backend change simplifies things greatly (either update an existing
    editgroup, or create new and *only* include entities in the batch transaction),
    at the cost of being able to configure the editgroup in any way, including
    setting a description.
* importers and tests all use new api-passing | Bryan Newbold | 2019-01-08 | 1 | -0/+1
* start updating importer auth with crossref importer | Bryan Newbold | 2019-01-08 | 1 | -9/+23
* don't need to supply editor_id now | Bryan Newbold | 2018-12-31 | 1 | -6/+3
* python impl of API ident harmonization | Bryan Newbold | 2018-12-24 | 1 | -6/+6
* start supporting kafka importers | Bryan Newbold | 2018-11-19 | 1 | -0/+17
    A nice feature would be some/any log output as to progress.
* fix some broken importer args | Bryan Newbold | 2018-11-19 | 1 | -5/+7
* bunch of pylint cleanup | Bryan Newbold | 2018-11-15 | 1 | -3/+12
* large refactor of python names/paths | Bryan Newbold | 2018-11-15 | 1 | -0/+3
    - Add __init__.py files for fatcat_tools submodules, and use them in imports
    - Add a bunch of comments to files.
    - rename a number of classes and functions to be less verbose
* use Counter object instead of per-metric ints | Bryan Newbold | 2018-11-13 | 1 | -6/+6
* shuffle around fatcat_tools layout | Bryan Newbold | 2018-11-13 | 1 | -0/+137