Commit message  Author  Age  Files  Lines
* address spammy datacite titles  Martin Czygan  2020-09-23  2  -0/+25
* homepage: small grammar tweaks (The/the)  Bryan Newbold  2020-09-11  1  -3/+3
* ingest: default to crawl protocols.io DOIs  Bryan Newbold  2020-09-10  1  -0/+2
* Merge branch 'bnewbold-datacite-not-empty-version' into 'master'  bnewbold  2020-09-11  3  -2/+3
|\
| * datacite: handle case of empty-string version  Bryan Newbold  2020-09-10  3  -2/+3
|/
* file_meta import notes  Bryan Newbold  2020-09-04  1  -0/+75
* update stats snapshot  Bryan Newbold  2020-09-03  2  -0/+47
* remove spurious print statement  Bryan Newbold  2020-09-03  1  -1/+0
* Merge branch 'bnewbold-file-meta-cleanups' into 'master'  Martin Czygan  2020-09-03  3  -0/+149
|\
| * generic file entity clean-ups as part of file_meta importer  Bryan Newbold  2020-09-02  3  -0/+149
|/
* Merge branch 'bnewbold-filemeta'  Bryan Newbold  2020-08-27  5  -0/+162
|\
| * fix comment typo (thanks martin)  Bryan Newbold  2020-08-27  1  -1/+1
| * fixes and test coverage for file_meta importer  Bryan Newbold  2020-08-21  4  -6/+82
| * initial implementation of file_meta importer  Bryan Newbold  2020-08-21  3  -0/+86
* | Merge branch 'bnewbold-meta-tags' into 'master'  Martin Czygan  2020-08-25  1  -2/+1
|\ \
| |/
|/|
| * remove typo (isbn:) from metadata DC.language field  Bryan Newbold  2020-08-21  1  -1/+1
| * remove placeholder description meta tag  Bryan Newbold  2020-08-20  1  -1/+0
|/
* Merge branch 'bnewbold-sitemap' into 'master'  bnewbold  2020-08-20  10  -7/+206
|\
| * fix SearchAction nesting in WebSite (schema.org)  Bryan Newbold  2020-08-20  1  -5/+2
| * sitemap fixes from testing  Bryan Newbold  2020-08-19  4  -9/+20
| * update robots.txt and sitemap.xml  Bryan Newbold  2020-08-19  4  -2/+52
| * iterate on sitemap generation  Bryan Newbold  2020-08-19  6  -7/+119
| * initial sitemap.xml notes/template  Bryan Newbold  2020-08-19  2  -0/+29
|/
* bulk edit log: add notes on recent chocula import  Bryan Newbold  2020-08-17  1  -0/+17
* entity updater: handle doi=None case better  Bryan Newbold  2020-08-14  1  -1/+1
* entity updater: es['publisher_type'] not always set  Bryan Newbold  2020-08-14  1  -1/+1
* Merge branch 'bnewbold-ingest-improvements' into 'master'  Martin Czygan  2020-08-13  8  -38/+120
|\
| * entity update: change big5 ingest behavior  Bryan Newbold  2020-08-11  1  -9/+15
| * datacite importer: update test cases for 'Additional file' as component, not ...  Bryan Newbold  2020-08-11  5  -5/+5
| * entity update: default to ingest non-OA works  Bryan Newbold  2020-08-11  1  -9/+10
| * entity update: skip ingest of figshare+zenodo 'group' DOIs  Bryan Newbold  2020-08-11  1  -0/+15
| * datacite import: figshare-specific hacks  Bryan Newbold  2020-08-11  2  -3/+4
| * datacite import: refactor release_type detection into static method  Bryan Newbold  2020-08-11  1  -14/+51
| * datacite import: refactor publisher-specific hacks into static method  Bryan Newbold  2020-08-11  1  -15/+29
| * update crawl blocklist for SPNv2 requests which mostly fail  Bryan Newbold  2020-08-10  1  -2/+10
* | Merge branch 'martin-datacite-json-decode-err-sentry-38625' into 'master'  bnewbold  2020-08-10  1  -1/+8
|\ \
| |/
|/|
| * harvest: datacite API yields HTTP 200 with broken JSON  Martin Czygan  2020-08-10  1  -1/+8
|/
* release ES transform tweaks  Bryan Newbold  2020-08-07  1  -3/+5
* Merge branch 'bnewbold-work-dumps' into 'master'  bnewbold  2020-08-05  6  -19/+237
|\
| * fatcat export: flush after batch, not per-line  Bryan Newbold  2020-08-05  1  -1/+1
| * proposal for work grouping  Bryan Newbold  2020-08-04  1  -0/+60
| * include releases_by_work in ident tarball  Bryan Newbold  2020-08-04  1  -1/+2
| * update SQL dump docs with group-by-work command (by default)  Bryan Newbold  2020-08-04  1  -1/+1
| * group-by-work mode for fatcat-export  Bryan Newbold  2020-08-04  1  -15/+157
| * rust Makefile: fix test command  Bryan Newbold  2020-08-04  1  -2/+1
| * WIP: sorted release ident dumps  Bryan Newbold  2020-08-04  1  -0/+16
* | Merge branch 'bnewbold-chocula-import-tweaks' into 'master'  bnewbold  2020-08-05  1  -12/+22
|\ \
| |/
|/|
| * chocula import update tweaks  Bryan Newbold  2020-08-04  1  -10/+14
| * more update keys and cases for chocula importer  Bryan Newbold  2020-08-04  1  -5/+11
| * fix key name mismatch in chocula importer  Bryan Newbold  2020-08-04  1  -1/+1
|/