path: root/python/fatcat_tools/importers/jalc.py
Commit message | Author | Age | Files | Lines
* typing: relatively simple type check fixes | Bryan Newbold | 2021-11-03 | 1 | -4/+10
  These mostly add new variable names so that existing variables aren't overwritten with a new type; delay coercing '{}' or '[]' to 'None' until the last minute; adding is-not-None checks to conditional clauses; and similar small changes.
* typing: initial annotations on importers | Bryan Newbold | 2021-11-03 | 1 | -14/+18
  This commit just adds the type annotations, doesn't do fixes to code to make type checking pass.
* importers: remove unused __main__ routine | Bryan Newbold | 2021-11-03 | 1 | -4/+0
  These perhaps were used in initial development or testing? fatcat_import.py is the correct way to do these imports, even for testing/development.
* fmt (black): fatcat_tools/ | Bryan Newbold | 2021-11-02 | 1 | -81/+112
* python: isort everything | Bryan Newbold | 2021-11-02 | 1 | -4/+6
* more consistent and defensive lower-casing of DOIs | Bryan Newbold | 2021-06-23 | 1 | -1/+2
  After noticing more upper/lower ambiguity in production. In particular, we have some old ingest requests in sandcrawler DB, which get re-submitted/re-tried, which have capitalized DOIs in the link source id field.
* simple lint (flake8) fixes over python codebase | Bryan Newbold | 2020-07-23 | 1 | -1/+1
  These should not have any behavior changes, though a number of exception catches are now more general, and there may be long-tail exceptions getting thrown in these statements.
* lint (flake8) tool python files | Bryan Newbold | 2020-07-01 | 1 | -3/+0
* Identity is not the same thing as equality in Python | Christian Clauss | 2020-05-14 | 1 | -2/+2
* importers: replace newlines in get_text() strings | Bryan Newbold | 2020-04-01 | 1 | -7/+7
* importers: more string/get_text swaps | Bryan Newbold | 2020-03-28 | 1 | -7/+7
  See previous pubmed commit for details.
* jalc: avoid meaningless pages values | Bryan Newbold | 2020-03-23 | 1 | -4/+8
* refactor all python source for client lib name | Bryan Newbold | 2019-09-05 | 1 | -8/+8
* JALC: handle empty publisher string | Bryan Newbold | 2019-05-30 | 1 | -3/+4
* remove stray JALC debug code | Bryan Newbold | 2019-05-29 | 1 | -2/+3
* improve JALC author handling | Bryan Newbold | 2019-05-29 | 1 | -59/+85
* all new importers need to set contrib index (order) | Bryan Newbold | 2019-05-22 | 1 | -1/+5
* jalc empty publisher string | Bryan Newbold | 2019-05-22 | 1 | -2/+2
* better JALC and arxiv DOI checks | Bryan Newbold | 2019-05-22 | 1 | -1/+1
* yet another JALC edge-case | Bryan Newbold | 2019-05-21 | 1 | -1/+1
* better JALC DOI de-mangling | Bryan Newbold | 2019-05-21 | 1 | -1/+10
* JALC importer requires a valid DOI | Bryan Newbold | 2019-05-21 | 1 | -0/+1
* handle bad JALC DOIs | Bryan Newbold | 2019-05-21 | 1 | -1/+3
* JALC more robust to partial names | Bryan Newbold | 2019-05-21 | 1 | -8/+19
* more JALC importer tweaks | Bryan Newbold | 2019-05-21 | 1 | -7/+10
* JALC importer: handle missing titles | Bryan Newbold | 2019-05-21 | 1 | -0/+2
* importers: create containers by default | Bryan Newbold | 2019-05-21 | 1 | -1/+3
* more JALC importer polish | Bryan Newbold | 2019-05-21 | 1 | -4/+17
* tweaks to new imports/tests | Bryan Newbold | 2019-05-21 | 1 | -1/+1
* clean up JALC importer a tiny bit | Bryan Newbold | 2019-05-21 | 1 | -8/+3
* initial flesh out of JALC parser | Bryan Newbold | 2019-05-21 | 1 | -0/+310
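
For illustration only: the typing fixes described in the first 2021-11-03 commit message (new variable names instead of re-binding a variable with a new type, delaying the coercion of '{}' or '[]' to 'None' until the last minute, and explicit is-not-None checks) might look roughly like the sketch below. The function names and record fields (parse_pages, build_extra, "year", "lang") are hypothetical and are not taken from jalc.py; this is a sketch of the general pattern, not the actual diff.

```python
from typing import Any, Dict, List, Optional


def parse_pages(raw: Optional[str]) -> Optional[str]:
    """Hypothetical helper: return a cleaned pages string, or None."""
    # Explicit is-not-None handling, so the type checker can narrow
    # Optional[str] to str before .strip() is called.
    if raw is None:
        return None
    cleaned = raw.strip()
    return cleaned or None


def build_extra(record: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Hypothetical helper: collect optional metadata from a record."""
    extra: Dict[str, Any] = {}
    langs: List[str] = []

    # Bind the converted value to a new name (year_int) rather than
    # overwriting raw_year, which would change the variable's type.
    raw_year = record.get("year")
    if raw_year is not None:
        year_int = int(raw_year)
        extra["year"] = year_int

    lang = record.get("lang")
    if lang is not None:
        langs.append(lang)
    if langs:
        extra["langs"] = langs

    # Coerce the empty dict to None only at the very end, so 'extra'
    # keeps a single type (Dict) for the whole function body.
    return extra or None
```

Keeping each variable at one type and deferring the empty-container-to-None coercion to the return statement is what lets a checker such as mypy accept the function without casts.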