Commit message (Author, Age, Files, Lines)
* dblp import: basic support for handles as identifiers (Bryan Newbold, 2021-10-13, 1 file, -1/+5)
|
* python: normalization/validation support for handle identifiers (hdl) (Bryan Newbold, 2021-10-13, 1 file, -0/+33)
|
* dblp import: fix typos in identifier parsing (Bryan Newbold, 2021-10-13, 1 file, -2/+1)
|
* guide updates for v0.4 schema changes (Bryan Newbold, 2021-10-13, 3 files, -12/+57)
|
* CHANGELOG updates for v0.4 release (Bryan Newbold, 2021-10-13, 1 file, -1/+18)
|
* python: partial importer utilization of new schema changes (Bryan Newbold, 2021-10-13, 3 files, -6/+18)
|
* python: test coverage of rust schema changes (Bryan Newbold, 2021-10-13, 4 files, -2/+59)
|
* python: implement ES schema changes (Bryan Newbold, 2021-10-13, 1 file, -4/+17)
|
* web: implement new schema changes (Bryan Newbold, 2021-10-13, 6 files, -11/+45)
|
* elasticsearch schema changes (Bryan Newbold, 2021-10-13, 2 files, -3/+13)
|
* rust: partial test coverage of schema changes (Bryan Newbold, 2021-10-13, 1 file, -2/+32)
|
* rust: prep for possible DOI lowercase enforcement (Bryan Newbold, 2021-10-13, 1 file, -1/+5)
|   See also: https://github.com/internetarchive/fatcat/issues/83
|   This commit is no behavior change, just leaving a note to self.
* rust: implement schema and API changes (Bryan Newbold, 2021-10-13, 5 files, -38/+353)
|
* rust: handle new migrations in test helper (Bryan Newbold, 2021-10-13, 1 file, -1/+1)
|
* rust: implement recent SQL changes (Bryan Newbold, 2021-10-13, 2 files, -0/+12)
|
* fatcatd: display version correctly, and at startup (Bryan Newbold, 2021-10-13, 1 file, -2/+8)
|
* python client: codegen for v0.4 (Bryan Newbold, 2021-10-13, 8 files, -25/+325)
|
* python client: bump version in codegen script (Bryan Newbold, 2021-10-13, 1 file, -1/+1)
|
* fatcat-api: enforce more release ext_id checks at create/update (Bryan Newbold, 2021-10-13, 1 file, -2/+15)
|   Not enforcing these was a serious bug!
* sql: v0.4 schema implementation (as diesel migration) (Bryan Newbold, 2021-10-13, 2 files, -0/+58)
|
* bump rust code version to v0.4.0 (Bryan Newbold, 2021-10-13, 3 files, -5/+7)
|
* rust codegen for v0.4 (Bryan Newbold, 2021-10-13, 8 files, -25/+307)
|
* schema: implement v0.4 tweaks, and bump version number (Bryan Newbold, 2021-10-13, 1 file, -2/+71)
|
* update proposals for v0.4 and (hypothetical) v0.5 (Bryan Newbold, 2021-10-13, 2 files, -4/+35)
|
* update stats (Bryan Newbold, 2021-10-11, 3 files, -0/+48)
|
* another vanished content example (Bryan Newbold, 2021-10-07, 1 file, -0/+7)
|
* Merge branch 'bnewbold-ingest-tweaks' into 'master' (bnewbold, 2021-10-02, 5 files, -39/+142)
|\
| |   ingest importer behavior tweaks
| |   See merge request webgroup/fatcat!120
| * update changelog with notable ingest importer tweaks (Bryan Newbold, 2021-10-01, 1 file, -0/+3)
| |
| * kafka import: optional 'force-flush' mode for some importers (Bryan Newbold, 2021-10-01, 2 files, -0/+16)
| |   Behavior and motivation described in the kafka json import comment.
| * new SPN web (html) importer (Bryan Newbold, 2021-10-01, 3 files, -27/+111)
| |
| * ingest importer behavior tweaks (Bryan Newbold, 2021-10-01, 1 file, -8/+8)
| |   - change order of 'want()' checks, so that result counts are clearer
| |   - don't require GROBID success for file imports with SPN
| * importer common: more verbose logging (with counts) (Bryan Newbold, 2021-10-01, 1 file, -4/+4)
| |
* | Merge branch 'martin-datacite-emtpy-abstract-sentry-94639' into 'master' (bnewbold, 2021-10-02, 4 files, -2/+95)
|\ \
| |/
|/|
| |   datacite: skip empty abstracts
| |   See merge request webgroup/fatcat!119
| * datacite: skip empty abstracts (Martin Czygan, 2021-10-01, 4 files, -2/+95)
|/
|   Do not add abstracts where `clean` results in the empty string - this violates a constraint: `either abstract_sha1 or content is required`
* default ingest request topic now '-daily'; configurable for ingest_tool.py (Bryan Newbold, 2021-09-30, 4 files, -4/+9)
|
* Merge branch 'martin-pubmed-ftp-extramuros' into 'master' (Martin Czygan, 2021-09-09, 1 file, -24/+21)
|\
| |   pubmed: workaround a networking issue
| |   See merge request webgroup/fatcat!118
| * pubmed: workaround a networking issue (Martin Czygan, 2021-09-09, 1 file, -24/+21)
| |   use an http proxy (https://github.com/miku/ftpup) to fetch files from FTP, keep some retry logic; also, hardcoding the proxy path as this should be a temporary workaround
* | trivial blank line lint (Bryan Newbold, 2021-09-08, 1 file, -1/+0)
| |
* | Merge branch 'master' of git.archive.org:webgroup/fatcat (Bryan Newbold, 2021-09-08, 1 file, -2/+31)
|\|
| * Merge branch 'martin-pubmed-use-lftp' into 'master' (Martin Czygan, 2021-09-08, 1 file, -2/+31)
| |\
| | |   pubmed: add option to ftp download with lftp
| | |   See merge request webgroup/fatcat!117
| | * pubmed: add option to ftp download with lftp (Martin Czygan, 2021-09-08, 1 file, -2/+31)
| |/
| |   lftp is a classic command line ftp client, and we hope that its retry capabilities are enough of a workaround for the current networking issue
* / sql_dumps: set collection at upload time (Bryan Newbold, 2021-09-02, 1 file, -2/+5)
|/
* Merge branch 'martin-pubmed-eof-sentry-92151' into 'master' (Martin Czygan, 2021-08-21, 1 file, -8/+21)
|\
| |   pubmed harvester: add basic retry logic
| |   See merge request webgroup/fatcat!116
| * pubmed harvester: add basic retry logic (Martin Czygan, 2021-08-20, 1 file, -8/+21)
|/
|   Related to a previous issue with seemingly random EOFError from FTP connections, this patch wraps the "ftpretr" helper function with a basic retry.
|   Refs: fatcat-workers/issues/92151, fatcat-workers/issues/91102
* guide: remove accidental duplicated background section (Bryan Newbold, 2021-08-18, 1 file, -9/+0)
|
* cgraph -> refcat (Bryan Newbold, 2021-08-13, 2 files, -2/+2)
|
* web: fix stats rowspan (oops) (Bryan Newbold, 2021-08-12, 1 file, -1/+1)
|
* web: remove confusing 'references' row from stats table (Bryan Newbold, 2021-08-12, 1 file, -3/+0)
|   Now that we have refcat, which is a different number
* Merge branch 'martin-guide-ref-minor-tweaks' into 'master' (bnewbold, 2021-08-09, 1 file, -3/+4)
|\
| |   guide: reference graph, minor tweaks
| |   See merge request webgroup/fatcat!115
| * guide: reference graph, minor tweaks (Martin Czygan, 2021-08-07, 1 file, -3/+4)
|/