Commit message (Collapse) | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | finish backend refactoring of search code | Bryan Newbold | 2020-07-24 | 2 | -135/+185 |
| | |||||
* | update web_search tests to mock ES client | Bryan Newbold | 2020-07-24 | 2 | -45/+47 |
| | Instead of using 'responses' mock of 'requests' library. Tried using 'elasticmock' helper but it didn't work. | ||||
* | refactor release and container search | Bryan Newbold | 2020-07-24 | 6 | -136/+235 |
| | Based on fatcat-scholar refactoring. This doesn't include refactoring of stats, aggregates, or histograms yet, just the direct queries. Don't have any test coverage yet; intend to try elasticmock or figure out how to ingest mock JSON results directly. | ||||
* | web search: fix pylint error | Bryan Newbold | 2020-07-24 | 1 | -2/+2 |
| | |||||
* | WIP: refactoring search to use elasticsearch-dsl | Bryan Newbold | 2020-07-24 | 2 | -153/+137 |
| | |||||
* | Merge branch 'bnewbold-more-lint-fixes' into 'master' | Martin Czygan | 2020-07-24 | 14 | -34/+26 |
|\ | more lint fixes. See merge request webgroup/fatcat!69 | ||||
| * | fix issnl typo in pubmed | Bryan Newbold | 2020-07-23 | 1 | -1/+1 |
| | | Oh no! This bug may actually have had significant negative impact on metadata in fatcat, in terms of missing container_id associations with pubmed entities. There are about 500k release entities with a PMID but no container_id. Of those, 89k have at least a container_name. Unclear how many would have matched to ISSN-L and thus to a container. | ||||
| * | remove isascii() work around definition in importers/datacite.py | Bryan Newbold | 2020-07-23 | 1 | -7/+1 |
| | | We are on python3.7 now, so this isn't needed. | ||||
| * | simple lint (flake8) fixes over python codebase | Bryan Newbold | 2020-07-23 | 7 | -19/+18 |
| | | These should not have any behavior changes, though a number of exception catches are now more general, and there may be long-tail exceptions getting thrown in these statements. | ||||
| * | fix actual typo in tests (caught by lint) | Bryan Newbold | 2020-07-23 | 1 | -2/+2 |
| | | |||||
| * | simple lint (flake8) fixes in tests | Bryan Newbold | 2020-07-23 | 5 | -5/+4 |
| | | The pytest fixture syntax interacts weirdly with flake8 tests, so ignore the "redefinition" and "unused variable" errors more carefully for .py files under ./tests/ | ||||
* | | Merge branch 'bnewbold-preservation-year-offset' into 'master' | bnewbold | 2020-07-24 | 2 | -0/+55 |
|\ \ | |/ |/| | preservation year offset. See merge request webgroup/fatcat!67 | ||||
| * | simplify in_kbart check statement | Bryan Newbold | 2020-07-23 | 1 | -1/+1 |
| | | Thanks @martin | ||||
| * | make in_kbart transform inclusive of last year | Bryan Newbold | 2020-07-23 | 2 | -0/+55 |
|/ | Frequently when looking at preservation coverage of journals, the current year shows as "un-preserved" when in fact there is robust KBART (keepers, eg CLOCKSS/Portico) coverage. This is partially because we don't update containers with KBART year spans very frequently (which is on us), and partially because KBART reports are often a bit out of date (eg, they don't show coverage for the current year). For that matter, they probably take a few months to update the previous year as well, but that is a larger time span to fudge over. This patch means we will count Portico/LOCKSS/etc coverage for "last year" as coverage of publications dated "this year". Note that for this to be effective/correct, it is assumed that we will update containers with coverage year spans at least once a year, and that we will re-index all releases at least once a year. | ||||
* | example bad MAG match | Bryan Newbold | 2020-07-23 | 1 | -0/+6 |
| | |||||
* | update table/database size stats | Bryan Newbold | 2020-07-22 | 2 | -0/+48 |
| | |||||
* | Merge branch 'martin-datacite-duplicated-author-gh-59' into 'master' | bnewbold | 2020-07-11 | 13 | -251/+619 |
|\ | datacite: address duplicated contributor issue. See merge request webgroup/fatcat!65 | ||||
| * | datacite: resolve formatting issues in tests | Martin Czygan | 2020-07-10 | 103 | -341/+319 |
| |\ | |||||
| * | | datacite: adjust tests | Martin Czygan | 2020-07-10 | 4 | -10/+6 |
| | | | |||||
| * | | datacite: there should be no index gaps | Martin Czygan | 2020-07-10 | 1 | -2/+8 |
| | | | |||||
| * | | datacite: document contributor types | Martin Czygan | 2020-07-10 | 1 | -0/+25 |
| | | | |||||
| * | | wip: contrib, GH59 | Martin Czygan | 2020-07-10 | 2 | -245/+383 |
| | | | |||||
| * | | wip: contrib, GH59 | Martin Czygan | 2020-07-10 | 5 | -3/+105 |
| | | | |||||
| * | | datacite: address duplicated contributor issue | Martin Czygan | 2020-07-07 | 6 | -11/+110 |
| | | | Use string comparison. * https://fatcat.wiki/release/spjysmrnsrgyzgq6ise5o44rlu/contribs * https://api.datacite.org/dois/10.25940/roper-31098406 | ||||
* | | | Merge branch 'martin-datacite-bugfix-sentry-44035' into 'master' | bnewbold | 2020-07-11 | 1 | -0/+4 |
|\ \ \ | |_|/ |/| | | datacite: mitigate sentry #44035. See merge request webgroup/fatcat!66 | ||||
| * | | datacite: mitigate sentry #44035 | Martin Czygan | 2020-07-10 | 1 | -0/+4 |
| | | | According to sentry, running `c.get('nameIdentifiers', []) or []` on a c with value: ``` {'affiliation': [], 'familyName': 'Guidon', 'givenName': 'Manuel', 'nameIdentifiers': {'nameIdentifier': 'https://orcid.org/0000-0003-3543-6683', 'nameIdentifierScheme': 'ORCID', 'schemeUri': 'https://orcid.org'}, 'nameType': 'Personal'} ``` results in a string, which I cannot reproduce. The document in question at: https://api.datacite.org/dois/10.26275/kuw1-fdls seems fine, too. | ||||
* | | | Merge branch 'martin-arxiv-fix-http-503' into 'master' | bnewbold | 2020-07-10 | 1 | -1/+1 |
|\ \ \ | arxiv: address 503, "Retry after specified interval" error. See merge request webgroup/fatcat!64 | ||||
| * | | | arxiv: do retry five times of HTTP 503 | Martin Czygan | 2020-07-10 | 1 | -1/+1 |
| | | | | |||||
* | | | | get mediawiki username creation working with spaces | Bryan Newbold | 2020-07-09 | 1 | -1/+2 |
| | | | | |||||
* | | | | Merge branch 'martin-datacite-bugfix-sentry-44035' into 'master' | Martin Czygan | 2020-07-06 | 1 | -1/+1 |
|\ \ \ \ | |/ / / |/| / / | |/ / | datacite: fix attribute error. See merge request webgroup/fatcat!63 | ||||
| * / | datacite: fix attribute error | Martin Czygan | 2020-07-07 | 1 | -1/+1 |
|/ / | refs: #44035 | ||||
* | | Merge branch 'bnewbold-lint' into 'master' | Martin Czygan | 2020-07-06 | 94 | -351/+152 |
|\ \ | lint cleanups. See merge request webgroup/fatcat!62 | ||||
| * | | tweak flake8 params | Bryan Newbold | 2020-07-01 | 1 | -2/+8 |
| | | | |||||
| * | | lint (flake8) python test files | Bryan Newbold | 2020-07-01 | 45 | -168/+71 |
| | | | |||||
| * | | lint (flake8) tool python files | Bryan Newbold | 2020-07-01 | 33 | -130/+46 |
| | | | |||||
| * | | lint (flake8) web interface python files | Bryan Newbold | 2020-07-01 | 7 | -26/+16 |
| | | | |||||
| * | | lint (flake8) top-level python files | Bryan Newbold | 2020-07-01 | 8 | -25/+11 |
|/ / | |||||
* | | updates to Makefile | Bryan Newbold | 2020-07-01 | 3 | -6/+33 |
| | | |||||
* | | reviewer: fix bugs in common code found by mypy | Bryan Newbold | 2020-07-01 | 1 | -2/+3 |
| | | |||||
* | | update TODO with some old examples | Bryan Newbold | 2020-07-01 | 1 | -0/+10 |
| | | |||||
* | | commit old example notes | Bryan Newbold | 2020-07-01 | 3 | -0/+65 |
| | | |||||
* | | JALC bulk edit notes from 2020-03-23 | Bryan Newbold | 2020-07-01 | 1 | -0/+23 |
| | | |||||
* | | commit example of an elasticsearch SQL query | Bryan Newbold | 2020-07-01 | 1 | -0/+8 |
| | | |||||
* | | commit old README about bulk downloads | Bryan Newbold | 2020-07-01 | 1 | -0/+40 |
|/ | |||||
* | CLI proposal | Bryan Newbold | 2020-06-30 | 1 | -0/+124 |
| | |||||
* | add new license mappings | Bryan Newbold | 2020-06-30 | 2 | -0/+27 |
| | |||||
* | datacite: improve license mapping | Martin Czygan | 2020-06-30 | 2 | -9/+29 |
| | via "missed potential license", refs #58 | ||||
* | Merge branch 'martin-datacite-fix-strptime-36559' into 'master' | bnewbold | 2020-06-29 | 2 | -1/+2 |
|\ | datacite: hard cast possible date value to string. See merge request webgroup/fatcat!59 | ||||
| * | datacite: hard cast possible date value to string | Martin Czygan | 2020-06-29 | 2 | -1/+2 |
|/ | |||||
* | remove accidentally-committed lines from rust Makefile | Bryan Newbold | 2020-06-26 | 1 | -3/+0 |
| |
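
The "make in_kbart transform inclusive of last year" commit above describes a small rule: a release dated in the current year counts as preserved if the keeper coverage reaches the previous year. The following is a minimal sketch of that rule, not the code from the commit itself. The function names (`year_in_spans`, `in_kbart`), the representation of coverage as `(start_year, end_year)` pairs, and the `today` parameter are all assumptions made for illustration.

```python
import datetime
from typing import Iterable, Optional, Tuple

# Hypothetical representation: inclusive (start_year, end_year) coverage spans.
YearSpan = Tuple[int, int]


def year_in_spans(year: int, spans: Iterable[YearSpan]) -> bool:
    """Return True if `year` falls inside any inclusive coverage span."""
    return any(start <= year <= end for start, end in spans)


def in_kbart(
    release_year: int,
    spans: Iterable[YearSpan],
    today: Optional[datetime.date] = None,
) -> bool:
    """Check keeper coverage, treating last year's coverage as valid for this year.

    `today` is injectable so the rule can be tested deterministically.
    """
    spans = list(spans)  # allow generators, since we may scan twice
    current_year = (today or datetime.date.today()).year
    if year_in_spans(release_year, spans):
        return True
    # Coverage reports lag behind: for current-year releases only,
    # accept coverage of the previous year as a stand-in.
    if release_year == current_year:
        return year_in_spans(current_year - 1, spans)
    return False
```

As the commit message notes, a rule like this is only sound if coverage spans are refreshed and releases are re-indexed at least once a year; otherwise a stale "last year" span would keep marking new releases as preserved.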