Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | datacite: fill a few more release_type gaps | Martin Czygan | 2020-01-08 | 1 | -17/+18 |
| | * citeproc: http://docs.citationstyles.org/en/stable/specification.html#appendix-iii-types * resourceTypeGeneral: https://schema.datacite.org/meta/kernel-4.0/doc/DataCite-MetadataKernel_v4.0.pdf#page=32 * resourceType: uncontrolled, over 170000 distinct values, frequent: null, Dataset, JournalArticle, PGRFA Material, Journal Article, Dataset/UNITE Species Hypothesis, ... General frequency: * "attributes.types": 18210075 * "attributes.types.ris": 18058890 * "attributes.types.bibtex": 18058888 * "attributes.types.citeproc": 18058890 * "attributes.types.schemaOrg": 18058929 * "attributes.types.resourceType": 12737988 * "attributes.types.resourceTypeGeneral": 16576139 | ||||
* | datacite: adding datacite-specific extra metadata | Martin Czygan | 2020-01-07 | 31 | -1468/+1598 |
| | * attributes.metadataVersion * attributes.schemaVersion * attributes.version (source-dependent values, follows suggestions in https://schema.datacite.org/meta/kernel-4.3/doc/DataCite-MetadataKernel_v4.3.pdf#page=26, but values vary) Furthermore: * attributes.types.resourceTypeGeneral * attributes.types.resourceType | ||||
* | datacite: apply pylint suggestions | Martin Czygan | 2020-01-07 | 1 | -8/+10 |
* | datacite: fix typos | Martin Czygan | 2020-01-07 | 2 | -2/+2 |
* | datacite: set release_stage to published by default | Martin Czygan | 2020-01-06 | 1 | -4/+5 |
| | Set to `None` only if there is no publisher yet. Docs: https://support.datacite.org/docs/doi-states | ||||
* | datacite: month field should be top-level | Martin Czygan | 2020-01-06 | 12 | -16/+16 |
* | datacite: include month in extra | Martin Czygan | 2020-01-06 | 12 | -11/+15 |
| | > include release_month as a top-level extra field [...] to auto-populate the schema field from that | ||||
* | datacite: indicate mismatched file in test | Martin Czygan | 2020-01-06 | 1 | -1/+1 |
* | datacite: clean abstracts, use unknown value tokens | Martin Czygan | 2020-01-06 | 4 | -7/+29 |
| | DataCite defines placeholders for unknown values (https://support.datacite.org/docs/schema-values-unknown-information-v43). Clean abstracts. | ||||
* | datacite: clean abstract as well | Martin Czygan | 2020-01-06 | 1 | -1/+1 |
* | datacite: filter out 'Cites' relation as well | Martin Czygan | 2020-01-06 | 1 | -1/+1 |
* | pytest: explicitly indicate all in-scope test files | Bryan Newbold | 2020-01-04 | 1 | -3/+1 |
| | The purpose of this change is to avoid test errors when pytest tries to recursively update assertion statements in all dependent packages. The reason pytest does this is to add pretty printing, which is nice, but probably shouldn't be done in all dependency libraries. This fixes test problems with both CSL (citeproc_styles) and dateparser (when actually imported in code, which currently does not happen on master). | ||||
* | datacite: always include "datacite" key in extra | Martin Czygan | 2020-01-04 | 15 | -28/+28 |
| | > always include extra values for the respective DOI registrars (datacite, crossref, jalc), even if they are empty ({}), to be used as a flag so we know which DOI registrar supplied the metadata. | ||||
* | datacite: use normal.clean_doi | Martin Czygan | 2020-01-03 | 2 | -15/+1 |
* | datacite: parse_datacite_dates returns month | Martin Czygan | 2020-01-03 | 2 | -17/+51 |
| | As [...] we will soon add support for release_month field in the release schema. | ||||
* | datacite: prepare release_month (stub) | Martin Czygan | 2020-01-03 | 2 | -24/+24 |
* | datacite: lowercase only once | Martin Czygan | 2020-01-03 | 1 | -3/+4 |
* | add pycountry dependency | Martin Czygan | 2020-01-03 | 2 | -1/+9 |
* | add missing pathlib2 dependency | Martin Czygan | 2020-01-03 | 2 | -1/+18 |
| | First seen in CI (jobs/230137), slightly related: https://github.com/pytest-dev/pytest/issues/3953 | ||||
* | update potentially outdated Pipfile.lock | Martin Czygan | 2020-01-03 | 1 | -96/+86 |
| | Via: `$ pipenv lock`. CI complained with a slightly cryptic: `TypeError: __init__() missing 1 required positional argument: 'self'` | ||||
* | datacite: remove --lang-detect flag | Martin Czygan | 2020-01-03 | 7 | -25/+21 |
| | Estimated time for a single call is on the order of 50 ms. | ||||
* | datacite: add another test case | Martin Czygan | 2020-01-02 | 3 | -1/+71 |
* | datacite: open case for editing after creation | Martin Czygan | 2020-01-02 | 1 | -0/+2 |
* | datacite: add helper script to create new test case | Martin Czygan | 2020-01-02 | 1 | -0/+14 |
* | datacite: address raw_name index form comment | Martin Czygan | 2020-01-02 | 21 | -112/+171 |
| | > The convention for display_name and raw_name is to be how the name would normally be printed, not in index form (surname comma given_name). So we might need to un-encode names like "Tricart, Pierre". Use an additional `index_form_to_display_name` function to convert index form to display form, heuristically. | ||||
* | datacite: add two more skipable tokens | Martin Czygan | 2020-01-02 | 1 | -1/+1 |
* | datacite: add conversion fixtures | Martin Czygan | 2020-01-02 | 50 | -1/+3949 |
| | The `test_datacite_conversions` function will compare an input (datacite) document to an expected output (release entity as JSON). This way, it should not be too hard to add more cases: add an input and an output file, then increase the counter in the range loop within the test. To view input and result side by side with vim, change into the test directory and run: `tests/files/datacite $ ./caseview.sh 18` | ||||
* | datacite: names can be 'Unav', too | Martin Czygan | 2020-01-02 | 1 | -1/+4 |
* | datacite: avoid more None values | Martin Czygan | 2020-01-01 | 1 | -4/+4 |
* | datacite: address 'Unpublished' publisher | Martin Czygan | 2019-12-31 | 1 | -9/+10 |
* | datacite: ensure name schema is defined | Martin Czygan | 2019-12-31 | 1 | -1/+2 |
* | datacite: fix typo | Martin Czygan | 2019-12-31 | 1 | -1/+1 |
* | datacite: isascii was added in 3.7, only | Martin Czygan | 2019-12-31 | 1 | -1/+7 |
* | datacite: skip non-ascii doi for now | Martin Czygan | 2019-12-31 | 1 | -0/+4 |
| | Example of a non-ASCII DOI: https://doi.org/10.13125/américacrítica/3017 | ||||
* | datacite: clean doi | Martin Czygan | 2019-12-31 | 1 | -1/+13 |
| | Address issue with EN DASH DOI: > "external identifier doesn't match required pattern for a DOI (expected, eg, '10.1234/aksjdfh'): 10.25513/1812-3996.2017.1.34–42" | ||||
* | datacite: update docs | Martin Czygan | 2019-12-31 | 1 | -9/+9 |
* | datacite: perform additional checks on contrib | Martin Czygan | 2019-12-30 | 1 | -3/+9 |
* | datacite: check for empty title after clean | Martin Czygan | 2019-12-29 | 1 | -2/+5 |
* | datacite: update docs with observed values | Martin Czygan | 2019-12-29 | 1 | -1/+3 |
* | datacite: page number misses are too common | Martin Czygan | 2019-12-28 | 1 | -1/+2 |
| | Should be logged at debug level, not info. Examples: E675, n/a, 15D.2.1, 15D.2.1, A.1E.1, A.1E.1, ... | ||||
* | datacite: suppress debug-like language lookup miss message | Martin Czygan | 2019-12-28 | 1 | -1/+3 |
* | datacite: adjust tests | Martin Czygan | 2019-12-28 | 1 | -2/+1 |
* | datacite: treat untyped names as people | Martin Czygan | 2019-12-28 | 1 | -1/+1 |
* | datacite: include container_name top level key in extra | Martin Czygan | 2019-12-28 | 1 | -7/+21 |
* | datacite: use clean on field values | Martin Czygan | 2019-12-28 | 1 | -2/+28 |
* | datacite: include doi in error messages | Martin Czygan | 2019-12-28 | 1 | -8/+8 |
* | remove langcodes dependency | Martin Czygan | 2019-12-28 | 2 | -15/+0 |
* | datacite: limit abstract length | Martin Czygan | 2019-12-28 | 1 | -0/+6 |
* | datacite: use iso 639-1 codes | Martin Czygan | 2019-12-28 | 1 | -7/+4 |
* | datacite: use specific auth var | Martin Czygan | 2019-12-28 | 1 | -1/+1 |
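
Several commits in this log describe small normalization heuristics: converting names from index form to display form, cleaning a DOI that contains an EN DASH, and skipping non-ASCII DOIs (with a fallback because `str.isascii` only exists from Python 3.7). The following is a minimal sketch of how such helpers might look, not the code from these commits. The function bodies, the `clean_doi_sketch` name, the `DOI_PATTERN` regex, and the choice to replace the EN DASH rather than reject the DOI are all assumptions made for illustration; only the name `index_form_to_display_name` and the example values come from the commit messages.

```python
import re

# Hypothetical pattern, loosely modelled on the "10.1234/aksjdfh" example
# quoted in the log; the real validation pattern may differ.
DOI_PATTERN = re.compile(r"^10\.\d{3,6}/\S+$")


def index_form_to_display_name(name):
    """Heuristically turn 'Surname, Given' into 'Given Surname' (assumed logic)."""
    # Only rewrite names with exactly one comma; leave everything else alone.
    if name.count(",") != 1:
        return name
    surname, given = (part.strip() for part in name.split(","))
    if not surname or not given:
        return name
    return "{} {}".format(given, surname)


def is_ascii(value):
    """Return True if value contains only ASCII characters."""
    try:
        return value.isascii()  # available on Python 3.7 and later
    except AttributeError:
        # Fallback for older interpreters: try to encode as ASCII.
        try:
            value.encode("ascii")
        except UnicodeEncodeError:
            return False
        return True


def clean_doi_sketch(raw):
    """Return a lowercased, ASCII-only DOI, or None if it cannot be cleaned."""
    doi = raw.strip().lower()
    # Assumption: an EN DASH (U+2013) is a typographic stand-in for a hyphen.
    doi = doi.replace("\u2013", "-")
    if not is_ascii(doi):
        return None  # non-ASCII DOIs are skipped
    if not DOI_PATTERN.match(doi):
        return None
    return doi


print(index_form_to_display_name("Tricart, Pierre"))  # Pierre Tricart
print(clean_doi_sketch("10.25513/1812-3996.2017.1.34\u201342"))
print(clean_doi_sketch("10.13125/am\u00e9ricacr\u00edtica/3017"))  # None
```

Returning `None` instead of raising keeps the sketch easy to use in a bulk import loop, where a record with an unusable DOI would simply be skipped.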