Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | update robots.txt and sitemap.xml | Bryan Newbold | 2020-08-19 | 4 | -2/+52 |
| | | | | | | - show minimal robots/sitemap if not in prod environment - default to allow all in robots.txt; link to sitemap index files - basic sitemap.xml without entity-level links | ||||
* | iterate on sitemap generation | Bryan Newbold | 2020-08-19 | 6 | -7/+119 |
| | |||||
* | initial sitemap.xml notes/template | Bryan Newbold | 2020-08-19 | 2 | -0/+29 |
| | |||||
* | bulk edit log: add notes on recent chocula import | Bryan Newbold | 2020-08-17 | 1 | -0/+17 |
| | |||||
* | entity updater: handle doi=None case better | Bryan Newbold | 2020-08-14 | 1 | -1/+1 |
| | |||||
* | entity updater: es['publisher_type'] not always set | Bryan Newbold | 2020-08-14 | 1 | -1/+1 |
| | | | | This is a small bugfix for a production issue. | ||||
* | Merge branch 'bnewbold-ingest-improvements' into 'master' | Martin Czygan | 2020-08-13 | 8 | -38/+120 |
|\ | | | | | | | | | ingest behavior changes; some datacite metadata tweaks. See merge request webgroup/fatcat!78 | ||||
| * | entity update: change big5 ingest behavior | Bryan Newbold | 2020-08-11 | 1 | -9/+15 |
| | | | | | | | | | | | | | | | | | | In addition to changing the OA default, this was the main intended behavior change in this group of commits: we want to make fewer ingest attempts that we *expect* to fail, but default to an ingest/crawl attempt if we are uncertain. This is because there is a long tail of journals that register DOIs and are de facto OA (fulltext is available), but we don't have metadata indicating them as such. | ||||
| * | datacite importer: update test cases for 'Additional file' as component, not stub | Bryan Newbold | 2020-08-11 | 5 | -5/+5 |
| | | |||||
| * | entity update: default to ingest non-OA works | Bryan Newbold | 2020-08-11 | 1 | -9/+10 |
| | | |||||
| * | entity update: skip ingest of figshare+zenodo 'group' DOIs | Bryan Newbold | 2020-08-11 | 1 | -0/+15 |
| | | |||||
| * | datacite import: figshare-specific hacks | Bryan Newbold | 2020-08-11 | 2 | -3/+4 |
| | | |||||
| * | datacite import: refactor release_type detection into static method | Bryan Newbold | 2020-08-11 | 1 | -14/+51 |
| | | |||||
| * | datacite import: refactor publisher-specific hacks into static method | Bryan Newbold | 2020-08-11 | 1 | -15/+29 |
| | | | | | | | | Also tweak title/publisher detection to use DOI prefixes | ||||
| * | update crawl blocklist for SPNv2 requests which mostly fail | Bryan Newbold | 2020-08-10 | 1 | -2/+10 |
| | | |||||
* | | Merge branch 'martin-datacite-json-decode-err-sentry-38625' into 'master' | bnewbold | 2020-08-10 | 1 | -1/+8 |
|\ \ | |/ |/| | | | | | harvest: datacite API yields HTTP 200 with broken JSON. See merge request webgroup/fatcat!77 | ||||
| * | harvest: datacite API yields HTTP 200 with broken JSON | Martin Czygan | 2020-08-10 | 1 | -1/+8 |
|/ | | | | As a first step: log response body for debugging. | ||||
* | release ES transform tweaks | Bryan Newbold | 2020-08-07 | 1 | -3/+5 |
| | | | | | | | pass-through publisher_type from container extra metadata (ES field already existed; this is from newer chocula metadata); count arxiv and PMCID papers which haven't been crawled (by IA) as "dark", not "bright" | ||||
* | Merge branch 'bnewbold-work-dumps' into 'master' | bnewbold | 2020-08-05 | 6 | -19/+237 |
|\ | | | | | | | | | release dumps grouped by work_id. See merge request webgroup/fatcat!75 | ||||
| * | fatcat export: flush after batch, not per-line | Bryan Newbold | 2020-08-05 | 1 | -1/+1 |
| | | | | | | | | Good catch, thanks Martin | ||||
| * | proposal for work grouping | Bryan Newbold | 2020-08-04 | 1 | -0/+60 |
| | | |||||
| * | include releases_by_work in ident tarball | Bryan Newbold | 2020-08-04 | 1 | -1/+2 |
| | | |||||
| * | update SQL dump docs with group-by-work command (by default) | Bryan Newbold | 2020-08-04 | 1 | -1/+1 |
| | | |||||
| * | group-by-work mode for fatcat-export | Bryan Newbold | 2020-08-04 | 1 | -15/+157 |
| | | |||||
| * | rust Makefile: fix test command | Bryan Newbold | 2020-08-04 | 1 | -2/+1 |
| | | |||||
| * | WIP: sorted release ident dumps | Bryan Newbold | 2020-08-04 | 1 | -0/+16 |
| | | |||||
* | | Merge branch 'bnewbold-chocula-import-tweaks' into 'master' | bnewbold | 2020-08-05 | 1 | -12/+22 |
|\ \ | |/ |/| | | | | | chocula import tweaks. See merge request webgroup/fatcat!74 | ||||
| * | chocula import update tweaks | Bryan Newbold | 2020-08-04 | 1 | -10/+14 |
| | | |||||
| * | more update keys and cases for chocula importer | Bryan Newbold | 2020-08-04 | 1 | -5/+11 |
| | | |||||
| * | fix key name mismatch in chocula importer | Bryan Newbold | 2020-08-04 | 1 | -1/+1 |
|/ | | | | chocula 'export-fatcat' uses 'ident', not 'fatcat_ident' | ||||
* | Merge branch 'bnewbold-editing' into 'master' | bnewbold | 2020-08-03 | 29 | -404/+1368 |
|\ | | | | | | | | | editing improvements. See merge request webgroup/fatcat!73 | ||||
| * | web: add links to deletion pages from edit pages | Bryan Newbold | 2020-07-31 | 4 | -0/+13 |
| | | |||||
| * | update top-level README (checklist done) | Bryan Newbold | 2020-07-31 | 1 | -1/+1 |
| | | |||||
| * | editing: withdrawn_status, release_year | Bryan Newbold | 2020-07-31 | 2 | -24/+44 |
| | | |||||
| * | release form validators and tweak labels | Bryan Newbold | 2020-07-31 | 1 | -8/+37 |
| | | |||||
| * | fix typo bug resulting in lost/bad ext_id web edits | Bryan Newbold | 2020-07-31 | 2 | -2/+16 |
| | | |||||
| * | implement webface entity deletion | Bryan Newbold | 2020-07-31 | 3 | -27/+308 |
| | | |||||
| * | routes: handle case of viewing deleted entity in editgroup context | Bryan Newbold | 2020-07-30 | 4 | -8/+35 |
| | | | | | | | | | | | E.g., consider deleting an entity. When viewing the editgroup, we want to be able to click the deleted entity and see the "deleted entity" page instead of a generic 404. | ||||
| * | TOML editing proposal | Bryan Newbold | 2020-07-30 | 1 | -0/+43 |
| | | |||||
| * | remove some meta-fields from TOML form (all entities) | Bryan Newbold | 2020-07-30 | 1 | -1/+5 |
| | | |||||
| * | fix search redirect codes in new tests | Bryan Newbold | 2020-07-30 | 1 | -4/+4 |
| | | |||||
| * | wire up new TOML views | Bryan Newbold | 2020-07-30 | 14 | -83/+256 |
| | | |||||
| * | generic HTML views for TOML editing | Bryan Newbold | 2020-07-30 | 4 | -0/+80 |
| | | |||||
| * | editing: more 'raise' status instead of 'abort()' | Bryan Newbold | 2020-07-30 | 1 | -1/+1 |
| | | |||||
| * | generic helpers for TOML editing routes | Bryan Newbold | 2020-07-30 | 2 | -10/+201 |
| | | |||||
| * | basic toml transform helper | Bryan Newbold | 2020-07-30 | 3 | -4/+42 |
| | | |||||
| * | pipenv: lock pycountry to 19.10 version | Bryan Newbold | 2020-07-30 | 2 | -7/+7 |
| | | | | | | | | datacite importer had errors otherwise | ||||
| * | pipenv: add toml library (and update lock) | Bryan Newbold | 2020-07-30 | 2 | -276/+327 |
| | | |||||
| * | lock loginpass version to prevent conflicting authlib version | Bryan Newbold | 2020-07-30 | 1 | -1/+1 |
| | | | | | | | | | It may be possible to upgrade both of these libraries together, but that isn't the purpose of current development. | ||||
* | | fix search redirect codes in new tests | Bryan Newbold | 2020-07-31 | 1 | -4/+4 |
|/ |