| | Commit message | Author | Age | Files | Lines |
|---|---|---|---|---|---|
| * | re-fix some lint issues after big 'fmt' | Bryan Newbold | 2021-11-02 | 2 | -4/+5 | 
| | | |||||
| * | fmt (black): fatcat_tools/ | Bryan Newbold | 2021-11-02 | 43 | -3194/+4020 | 
| | | |||||
| * | python: isort everything | Bryan Newbold | 2021-11-02 | 32 | -71/+116 | 
| | | |||||
| * | arabesque import 'hit' field is 1/0, not true/false | Bryan Newbold | 2021-11-02 | 1 | -2/+2 | 
| | | |||||
| * | lint: simple, safe inline lint fixes | Bryan Newbold | 2021-11-02 | 18 | -83/+82 | 
| | | | | | '==' vs 'is'; 'not a in b' vs 'a not in b'; etc | ||||
| * | lint/fmt: remove all 'import *' | Bryan Newbold | 2021-11-02 | 5 | -21/+41 | 
| | | |||||
| * | entity transforms: add basic type annotations | Bryan Newbold | 2021-11-02 | 1 | -7/+19 | 
| | | |||||
| * | ftfy 'fix_entities' argument has been renamed | Bryan Newbold | 2021-11-02 | 1 | -4/+4 | 
| | | |||||
| * | hacks to work around new pylint false positives | Bryan Newbold | 2021-11-02 | 1 | -2/+3 | 
| | | |||||
| * | cleanup imports after fatcat_tools.transforms change | Bryan Newbold | 2021-11-02 | 1 | -5/+8 | 
| | | |||||
| * | re-fmt all the fatcat_tools __init__ files for readability | Bryan Newbold | 2021-11-02 | 5 | -30/+62 | 
| | | |||||
| * | remove 'import *' from fatcat_tools (for transforms) | Bryan Newbold | 2021-11-02 | 1 | -2/+2 | 
| | | |||||
| * | small python tweaks for annotations, imports | Bryan Newbold | 2021-11-02 | 3 | -3/+7 | 
| | | |||||
| * | try some type annotations | Bryan Newbold | 2021-11-02 | 4 | -70/+79 | 
| | | |||||
| * | reviewer: add annotations required by mypy | Bryan Newbold | 2021-11-02 | 1 | -2/+3 | 
| | | |||||
| * | fix missing variable in fileset ingest | Bryan Newbold | 2021-11-02 | 1 | -2/+1 | 
| | | |||||
| * | Merge branch 'bnewbold-import-fileset' | Bryan Newbold | 2021-11-02 | 5 | -4/+350 | 
| |\ | |||||
| | * | WIP: more fileset ingest | Bryan Newbold | 2021-10-18 | 1 | -13/+21 | 
| | | | |||||
| | * | WIP: rel fixes | Bryan Newbold | 2021-10-14 | 1 | -6/+6 | 
| | | | |||||
| | * | fileset ingest small tweaks | Bryan Newbold | 2021-10-14 | 1 | -21/+36 | 
| | | | |||||
| | * | initial implementation of fileset ingest importers | Bryan Newbold | 2021-10-14 | 2 | -3/+224 | 
| | | | |||||
| | * | ingest: handle datasets, components, other ingest types | Bryan Newbold | 2021-10-14 | 1 | -1/+15 | 
| | | | |||||
| | * | generic fileset importer class, with test coverage | Bryan Newbold | 2021-10-14 | 3 | -0/+88 | 
| | | | |||||
| * | | Merge branch 'bnewbold-match-get' | Bryan Newbold | 2021-11-02 | 1 | -3/+9 | 
| |\ \ | |||||
| | * | | access: populate thumbnail_url for PDFs | Bryan Newbold | 2021-10-18 | 1 | -3/+9 | 
| | |/ | |||||
| * / | pubmed: switch default http site to retrieve update files | Martin Czygan | 2021-10-15 | 1 | -2/+4 | 
| |/ | | | | | | | Proxy started to throw: "dial tcp: lookup ftp.ncbi.nlm.nih.gov on [::1]:53: read udp [::1]:45178->[::1]:53: read: connection refused" NIH has an http version of its own, try to use that. | ||||
| * | dblp import: basic support for handles as identifiers | Bryan Newbold | 2021-10-13 | 1 | -1/+5 | 
| | | |||||
| * | python: normalization/validation support for handle identifiers (hdl) | Bryan Newbold | 2021-10-13 | 1 | -0/+33 | 
| | | |||||
| * | dblp import: fix typos in identifier parsing | Bryan Newbold | 2021-10-13 | 1 | -2/+1 | 
| | | |||||
| * | python: partial importer utilization of new schema changes | Bryan Newbold | 2021-10-13 | 3 | -6/+18 | 
| | | |||||
| * | python: implement ES schema changes | Bryan Newbold | 2021-10-13 | 1 | -4/+17 | 
| | | |||||
| * | Merge branch 'bnewbold-ingest-tweaks' into 'master' | bnewbold | 2021-10-02 | 3 | -39/+106 | 
| |\ | | | | | | | | | ingest importer behavior tweaks See merge request webgroup/fatcat!120 | ||||
| | * | kafka import: optional 'force-flush' mode for some importers | Bryan Newbold | 2021-10-01 | 1 | -0/+13 | 
| | | | | | | | | | Behavior and motivation described in the kafka json import comment. | ||||
| | * | new SPN web (html) importer | Bryan Newbold | 2021-10-01 | 2 | -27/+81 | 
| | | | |||||
| | * | ingest importer behavior tweaks | Bryan Newbold | 2021-10-01 | 1 | -8/+8 | 
| | | | | | | | | | | | - change order of 'want()' checks, so that result counts are clearer - don't require GROBID success for file imports with SPN | ||||
| | * | importer common: more verbose logging (with counts) | Bryan Newbold | 2021-10-01 | 1 | -4/+4 | 
| | | | |||||
| * | | datacite: skip empty abstracts | Martin Czygan | 2021-10-01 | 1 | -1/+4 | 
| |/ | | | | | Do not add abstracts where `clean` results in the empty string - this violates a constraint: `either abstract_sha1 or content is required` | ||||
| * | pubmed: workaround a networking issue | Martin Czygan | 2021-09-09 | 1 | -24/+21 | 
| | | | | | | | use an http proxy (https://github.com/miku/ftpup) to fetch files from FTP, keep some retry logic; also, hardcoding the proxy path as this should be a temporary workaround | ||||
| * | pubmed: add option to ftp download with lftp | Martin Czygan | 2021-09-08 | 1 | -2/+31 | 
| | | | | | | lftp is a classic command line ftp client, and we hope that its retry capabilities are enough of a workaround for the current networking issue | ||||
| * | pubmed harvester: add basic retry logic | Martin Czygan | 2021-08-20 | 1 | -8/+21 | 
| | | | | | | | | | Related to a previous issue with seemingly random EOFError from FTP connections, this patch wraps the "ftpretr" helper function with a basic retry. Refs: fatcat-workers/issues/92151, fatcat-workers/issues/91102 | ||||
| * | refs: default to *not* consolidating works | Bryan Newbold | 2021-08-06 | 1 | -1/+1 | 
| | | | | | | | | We don't handle counts for consolidated refs yet, so just don't consolidate. This should fix, eg, "Showing 1-18 of 19" type UX confusion, with the trade-off that some works will be duplicated in inbound ref tables. | ||||
| * | refs: lint fixes | Bryan Newbold | 2021-07-27 | 1 | -0/+1 | 
| | | |||||
| * | refs: support for wikipedia outbound refs, and display in tables | Bryan Newbold | 2021-07-27 | 1 | -2/+2 | 
| | | |||||
| * | refs: generalize web endpoints; JSON content negotiation; openlibrary inbound view; etc | Bryan Newbold | 2021-07-23 | 2 | -22/+57 | 
| | | |||||
| * | refs: small refactors/tweaks | Bryan Newbold | 2021-07-23 | 1 | -11/+17 | 
| | | |||||
| * | remove unused imports (lint) | Bryan Newbold | 2021-07-23 | 2 | -3/+2 | 
| | | |||||
| * | pylint: skip pydantic import check (dynamic/extensions) | Bryan Newbold | 2021-07-23 | 1 | -8/+2 | 
| | | |||||
| * | refs: refactor web paths; enrich refs as generic; remove old refs link | Bryan Newbold | 2021-07-23 | 1 | -50/+35 | 
| | | |||||
| * | refs fetch: add some hacks; sort hits | Bryan Newbold | 2021-07-23 | 1 | -6/+16 | 
| | | |||||
| * | fixes for newer ref index | Bryan Newbold | 2021-07-23 | 1 | -1/+1 | 
| | | |||||
