path: root/python/fatcat_tools
Commit message  Author  Age  Files  Lines
* Merge branch 'bnewbold-cleanups-nov2021' into 'master'  bnewbold  2021-11-11  4  -0/+748
|\
| * file/release bugfix: handle files with multiple edits  Bryan Newbold  2021-11-09  1  -6/+6
| * cleanups: add more state=active checks  Bryan Newbold  2021-11-09  2  -0/+8
| * update link source filters in file/release bugfix  Bryan Newbold  2021-11-09  1  -2/+8
| * initial file/release bugfix cleanup worker and notes  Bryan Newbold  2021-11-09  1  -0/+231
| * updates to lowercase DOI cleanup  Bryan Newbold  2021-11-09  1  -7/+15
| * lowercase DOI lint and check entity status  Bryan Newbold  2021-11-09  1  -4/+5
| * more iteration on short wayback timestamp cleanup  Bryan Newbold  2021-11-09  1  -1/+1
| * cleanups: tweaks to wayback CDX cleanup scripts  Bryan Newbold  2021-11-09  1  -5/+13
| * cleanups: initial lowercase DOI cleanup script  Bryan Newbold  2021-11-09  1  -0/+145
| * wayback short ts: another regression test, and some small fmt/tweaks  Bryan Newbold  2021-11-09  1  -3/+38
| * wayback cleanup: actually update entity  Bryan Newbold  2021-11-09  1  -2/+4
| * imports: generic file cleanup removes exact duplicate URLs  Bryan Newbold  2021-11-09  1  -0/+9
| * wayback short ts: add regression test for dupe URLs  Bryan Newbold  2021-11-09  1  -0/+44
| * short wayback ts: initial cleanup script implementation  Bryan Newbold  2021-11-09  1  -0/+251
* | pubmed: allow updates if PMCID does not exist yet  Bryan Newbold  2021-11-10  1  -1/+6
|/
* cleanups: create a separate JsonLinePusher for cleanup workers (distinct base...  Bryan Newbold  2021-11-03  2  -2/+19
* datacite importer: remove unused 'year_only' variable  Bryan Newbold  2021-11-03  1  -2/+3
* pubmed harvester: remove unused variables  Bryan Newbold  2021-11-03  1  -2/+2
* pubmed harvester: explicit assertions to mark unreachable code paths  Bryan Newbold  2021-11-03  1  -0/+2
* typing: add assertions to fatcat_tool code to make type assumptions explicit  Bryan Newbold  2021-11-03  3  -0/+3
* typing: add annotations to remaining fatcat_tools code  Bryan Newbold  2021-11-03  9  -122/+186
* datacite: add comment about potential date parsing bug  Bryan Newbold  2021-11-03  1  -0/+1
* datacite importer: dateparser.date.DateDataParser()  Bryan Newbold  2021-11-03  1  -1/+1
* more involved type wrangling and fixes for importers  Bryan Newbold  2021-11-03  3  -12/+14
* typing: relatively simple type check fixes  Bryan Newbold  2021-11-03  14  -87/+82
* typing: initial annotations on importers  Bryan Newbold  2021-11-03  22  -274/+443
* typing: first batch of python bulk type annotations  Bryan Newbold  2021-11-03  9  -69/+129
* importers: remove unused __main__ routine  Bryan Newbold  2021-11-03  4  -19/+0
* lint: resolve existing mypy type errors  Bryan Newbold  2021-11-02  8  -50/+86
* re-fix some lint issues after big 'fmt'  Bryan Newbold  2021-11-02  2  -4/+5
* fmt (black): fatcat_tools/  Bryan Newbold  2021-11-02  43  -3194/+4020
* python: isort everything  Bryan Newbold  2021-11-02  32  -71/+116
* arabesque import 'hit' field is 1/0, not true/false  Bryan Newbold  2021-11-02  1  -2/+2
* lint: simple, safe inline lint fixes  Bryan Newbold  2021-11-02  18  -83/+82
* lint/fmt: remove all 'import *'  Bryan Newbold  2021-11-02  5  -21/+41
* entity transforms: add basic type annotations  Bryan Newbold  2021-11-02  1  -7/+19
* ftfy 'fix_entities' argument has been renamed  Bryan Newbold  2021-11-02  1  -4/+4
* hacks to work around new pylint false positives  Bryan Newbold  2021-11-02  1  -2/+3
* cleanup imports after fatcat_tools.transforms change  Bryan Newbold  2021-11-02  1  -5/+8
* re-fmt all the fatcat_tools __init__ files for readability  Bryan Newbold  2021-11-02  5  -30/+62
* remove 'import *' from fatcat_tools (for transforms)  Bryan Newbold  2021-11-02  1  -2/+2
* small python tweaks for annotations, imports  Bryan Newbold  2021-11-02  3  -3/+7
* try some type annotations  Bryan Newbold  2021-11-02  4  -70/+79
* reviewer: add annotations required by mypy  Bryan Newbold  2021-11-02  1  -2/+3
* fix missing variable in fileset ingest  Bryan Newbold  2021-11-02  1  -2/+1
* Merge branch 'bnewbold-import-fileset'  Bryan Newbold  2021-11-02  5  -4/+350
|\
| * WIP: more fileset ingest  Bryan Newbold  2021-10-18  1  -13/+21
| * WIP: rel fixes  Bryan Newbold  2021-10-14  1  -6/+6
| * fileset ingest small tweaks  Bryan Newbold  2021-10-14  1  -21/+36