path: root/python/fatcat_tools/importers/grobid_metadata.py
Commit message | Author | Date | Files | Lines
* refactor importer metadata tables into separate file; move some helpers around | Bryan Newbold | 2021-11-10 | 1 | -4/+2
    - MAX_ABSTRACT_LENGTH set in a single place (importer common)
    - merge datacite license slug table in to common table, removing some TDM-specific licenses (which do not apply in the context of preserving the full work)
* importers: refactor imports of clean() and other normalization helpers | Bryan Newbold | 2021-11-10 | 1 | -15/+15
* importers: use clean_doi() in many more (all?) importers | Bryan Newbold | 2021-11-09 | 1 | -3/+6
* typing: relatively simple type check fixes | Bryan Newbold | 2021-11-03 | 1 | -8/+5
    These mostly add new variable names so that existing variables aren't overwritten with a new type; delay coercing '{}' or '[]' to 'None' until the last minute; adding is-not-None checks to conditional clauses; and similar small changes.
* typing: initial annotations on importers | Bryan Newbold | 2021-11-03 | 1 | -11/+15
    This commit just adds the type annotations, doesn't do fixes to code to make type checking pass.
* fmt (black): fatcat_tools/ | Bryan Newbold | 2021-11-02 | 1 | -62/+74
* python: isort everything | Bryan Newbold | 2021-11-02 | 1 | -1/+3
* simple lint (flake8) fixes over python codebase | Bryan Newbold | 2020-07-23 | 1 | -1/+1
    These should not have any behavior changes, though a number of exception catches are now more general, and there may be long-tail exceptions getting thrown in these statements.
* lint (flake8) tool python files | Bryan Newbold | 2020-07-01 | 1 | -2/+0
* refactor all python source for client lib name | Bryan Newbold | 2019-09-05 | 1 | -12/+12
* python impl | Bryan Newbold | 2019-05-14 | 1 | -4/+5
* python impl | Bryan Newbold | 2019-05-14 | 1 | -4/+4
* importer code updates | Bryan Newbold | 2019-05-13 | 1 | -0/+2
* partial python impl of ext_id and release_stage refactors | Bryan Newbold | 2019-05-13 | 1 | -0/+1
* bunch of lint/whitespace cleanups | Bryan Newbold | 2019-02-22 | 1 | -1/+0
* tweaks to GROBID metadata import | Bryan Newbold | 2019-01-29 | 1 | -3/+2
* pass through kwargs (fixes bezerk imports) | Bryan Newbold | 2019-01-29 | 1 | -1/+2
* many fixes in GROBID importer | Bryan Newbold | 2019-01-28 | 1 | -14/+10
* fix GROBID null/short abstract additions | Bryan Newbold | 2019-01-28 | 1 | -1/+2
* enforce title len>1 for release imports | Bryan Newbold | 2019-01-28 | 1 | -1/+5
* grobid import extra metadata tweaks | Bryan Newbold | 2019-01-24 | 1 | -6/+7
* refactor make_rel_url | Bryan Newbold | 2019-01-24 | 1 | -15/+3
* importer bugfixes | Bryan Newbold | 2019-01-23 | 1 | -3/+5
* ftfy all over (needs Pipfile.lock) | Bryan Newbold | 2019-01-23 | 1 | -11/+12
* more tests; fix some importer behavior | Bryan Newbold | 2019-01-23 | 1 | -27/+23
* refactor remaining importers | Bryan Newbold | 2019-01-22 | 1 | -36/+68
* importers and tests all use new api-passing | Bryan Newbold | 2019-01-08 | 1 | -3/+10
* python impl of API ident harmonization | Bryan Newbold | 2018-12-24 | 1 | -5/+5
* implement release_year (and rustfmt) | Bryan Newbold | 2018-12-24 | 1 | -2/+4
* grobid importer: release_date as a date | Bryan Newbold | 2018-11-21 | 1 | -1/+1
* bunch of pylint cleanup | Bryan Newbold | 2018-11-15 | 1 | -7/+6
* large refactor of python names/paths | Bryan Newbold | 2018-11-15 | 1 | -2/+2
    - Add __init__.py files for fatcat_tools submodules, and use them in imports
    - Add a bunch of comments to files.
    - rename a number of classes and functions to be less verbose
* update crossref controlled vocab | Bryan Newbold | 2018-11-14 | 1 | -1/+1
* use Counter object instead of per-metric ints | Bryan Newbold | 2018-11-13 | 1 | -1/+1
* shuffle around fatcat_tools layout | Bryan Newbold | 2018-11-13 | 1 | -0/+168