| Commit message | Author | Age | Files | Lines |
* | short wayback ts: initial cleanup script implementation | Bryan Newbold | 2021-11-09 | 1 | -0/+251 |
* | cleanups: create a separate JsonLinePusher for cleanup workers (distinct base... | Bryan Newbold | 2021-11-03 | 2 | -2/+19 |
* | datacite importer: remove unused 'year_only' variable | Bryan Newbold | 2021-11-03 | 1 | -2/+3 |
* | pubmed harvester: remove unused variables | Bryan Newbold | 2021-11-03 | 1 | -2/+2 |
* | pubmed harvester: explicit assertions to mark unreachable code paths | Bryan Newbold | 2021-11-03 | 1 | -0/+2 |
* | typing: add assertions to fatcat_tool code to make type assumptions explicit | Bryan Newbold | 2021-11-03 | 3 | -0/+3 |
* | typing: add annotations to remaining fatcat_tools code | Bryan Newbold | 2021-11-03 | 9 | -122/+186 |
* | datacite: add comment about potential date parsing bug | Bryan Newbold | 2021-11-03 | 1 | -0/+1 |
* | datacite importer: dateparser.date.DateDataParser() | Bryan Newbold | 2021-11-03 | 1 | -1/+1 |
* | more involved type wrangling and fixes for importers | Bryan Newbold | 2021-11-03 | 3 | -12/+14 |
* | typing: relatively simple type check fixes | Bryan Newbold | 2021-11-03 | 14 | -87/+82 |
* | typing: initial annotations on importers | Bryan Newbold | 2021-11-03 | 22 | -274/+443 |
* | typing: first batch of python bulk type annotations | Bryan Newbold | 2021-11-03 | 9 | -69/+129 |
* | importers: remove unused __main__ routine | Bryan Newbold | 2021-11-03 | 4 | -19/+0 |
* | lint: resolve existing mypy type errors | Bryan Newbold | 2021-11-02 | 8 | -50/+86 |
* | re-fix some lint issues after big 'fmt' | Bryan Newbold | 2021-11-02 | 2 | -4/+5 |
* | fmt (black): fatcat_tools/ | Bryan Newbold | 2021-11-02 | 43 | -3194/+4020 |
* | python: isort everything | Bryan Newbold | 2021-11-02 | 32 | -71/+116 |
* | arabesque import 'hit' field is 1/0, not true/false | Bryan Newbold | 2021-11-02 | 1 | -2/+2 |
* | lint: simple, safe inline lint fixes | Bryan Newbold | 2021-11-02 | 18 | -83/+82 |
* | lint/fmt: remove all 'import *' | Bryan Newbold | 2021-11-02 | 5 | -21/+41 |
* | entity transforms: add basic type annotations | Bryan Newbold | 2021-11-02 | 1 | -7/+19 |
* | ftfy 'fix_entities' argument has been renamed | Bryan Newbold | 2021-11-02 | 1 | -4/+4 |
* | hacks to work around new pylint false positives | Bryan Newbold | 2021-11-02 | 1 | -2/+3 |
* | cleanup imports after fatcat_tools.transforms change | Bryan Newbold | 2021-11-02 | 1 | -5/+8 |
* | re-fmt all the fatcat_tools __init__ files for readability | Bryan Newbold | 2021-11-02 | 5 | -30/+62 |
* | remove 'import *' from fatcat_tools (for transforms) | Bryan Newbold | 2021-11-02 | 1 | -2/+2 |
* | small python tweaks for annotations, imports | Bryan Newbold | 2021-11-02 | 3 | -3/+7 |
* | try some type annotations | Bryan Newbold | 2021-11-02 | 4 | -70/+79 |
* | reviewer: add annotations required by mypy | Bryan Newbold | 2021-11-02 | 1 | -2/+3 |
* | fix missing variable in fileset ingest | Bryan Newbold | 2021-11-02 | 1 | -2/+1 |
* | Merge branch 'bnewbold-import-fileset' | Bryan Newbold | 2021-11-02 | 5 | -4/+350 |
|\ |
|
| * | WIP: more fileset ingest | Bryan Newbold | 2021-10-18 | 1 | -13/+21 |
| * | WIP: rel fixes | Bryan Newbold | 2021-10-14 | 1 | -6/+6 |
| * | fileset ingest small tweaks | Bryan Newbold | 2021-10-14 | 1 | -21/+36 |
| * | initial implementation of fileset ingest importers | Bryan Newbold | 2021-10-14 | 2 | -3/+224 |
| * | ingest: handle datasets, components, other ingest types | Bryan Newbold | 2021-10-14 | 1 | -1/+15 |
| * | generic fileset importer class, with test coverage | Bryan Newbold | 2021-10-14 | 3 | -0/+88 |
* | | Merge branch 'bnewbold-match-get' | Bryan Newbold | 2021-11-02 | 1 | -3/+9 |
|\ \ |
|
| * | | access: populate thumbnail_url for PDFs | Bryan Newbold | 2021-10-18 | 1 | -3/+9 |
| |/ |
|
* / | pubmed: switch default http site to retrieve update files | Martin Czygan | 2021-10-15 | 1 | -2/+4 |
|/ |
|
* | dblp import: basic support for handles as identifiers | Bryan Newbold | 2021-10-13 | 1 | -1/+5 |
* | python: normalization/validation support for handle identifiers (hdl) | Bryan Newbold | 2021-10-13 | 1 | -0/+33 |
* | dblp import: fix typos in identifier parsing | Bryan Newbold | 2021-10-13 | 1 | -2/+1 |
* | python: partial importer utilization of new schema changes | Bryan Newbold | 2021-10-13 | 3 | -6/+18 |
* | python: implement ES schema changes | Bryan Newbold | 2021-10-13 | 1 | -4/+17 |
* | Merge branch 'bnewbold-ingest-tweaks' into 'master' | bnewbold | 2021-10-02 | 3 | -39/+106 |
|\ |
|
| * | kafka import: optional 'force-flush' mode for some importers | Bryan Newbold | 2021-10-01 | 1 | -0/+13 |
| * | new SPN web (html) importer | Bryan Newbold | 2021-10-01 | 2 | -27/+81 |
| * | ingest importer behavior tweaks | Bryan Newbold | 2021-10-01 | 1 | -8/+8 |