path: root/python/tests
Commit message [Author, Age, Files, Lines]
* datacite: month field should be top-level [Martin Czygan, 2020-01-06, 11 files, -14/+14]
|
* datacite: include month in extra [Martin Czygan, 2020-01-06, 11 files, -11/+13]
|     > include release_month as a top-level extra field [...] to
|     > auto-populate the schema field from that
* datacite: indicate mismatched file in test [Martin Czygan, 2020-01-06, 1 file, -1/+1]
|
* datacite: clean abstracts, use unknown value tokens [Martin Czygan, 2020-01-06, 3 files, -3/+3]
|     Datacite defines placeholders for unknown values:
|
|     * https://support.datacite.org/docs/schema-values-unknown-information-v43
|
|     Clean abstracts.
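A minimal sketch of how such placeholder tokens might be filtered and an abstract tidied. The token list is taken from the DataCite documentation linked above; `is_unknown` and `clean_abstract` are hypothetical helper names, not the importer's actual functions, and both the parenthesis handling and the stripping of a leading "Abstract" label are assumptions.

```python
import re

# Placeholder tokens DataCite documents for unknown information.
# Assumption: matching is case-insensitive and tolerates surrounding
# parentheses, e.g. "(:unav)".
UNKNOWN_TOKENS = {
    ":unac", ":unal", ":unap", ":unas", ":unav",
    ":unkn", ":none", ":null", ":tba", ":etal",
}


def is_unknown(value):
    """Return True if value is missing, empty, or an unknown-value token."""
    if value is None:
        return True
    token = value.strip().strip("()").strip().lower()
    return token == "" or token in UNKNOWN_TOKENS


def clean_abstract(text):
    """Hypothetical cleaner: collapse whitespace, drop a leading label."""
    text = re.sub(r"\s+", " ", text).strip()
    # Remove a leading "Abstract" / "Abstract:" label, if present.
    return re.sub(r"^abstract[:.]?\s+", "", text, flags=re.IGNORECASE)
```

A field whose value satisfies `is_unknown` would simply be left unset on the release rather than stored verbatim.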
* datacite: always include "datacite" key in extra [Martin Czygan, 2020-01-04, 14 files, -26/+26]
|     > always include extra values for the respective DOI registrars
|     > (datacite, crossref, jalc), even if they are empty ({}), to be used
|     > as a flag so we know which DOI registrar supplied the metadata.
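The flag convention quoted above could look roughly like this. `build_extra` and `metadata_source` are hypothetical names used only for illustration; the only part taken from the commit message is that the registrar key is always present, even when its value is `{}`.

```python
# Registrar names as listed in the commit message.
REGISTRARS = ("datacite", "crossref", "jalc")


def build_extra(registrar, registrar_fields=None, **other):
    """Build an extra dict that always carries the registrar key."""
    if registrar not in REGISTRARS:
        raise ValueError("unknown registrar: {}".format(registrar))
    extra = dict(other)
    # Keep the key even if there is nothing to store under it.
    extra[registrar] = dict(registrar_fields or {})
    return extra


def metadata_source(extra):
    """Return the registrar that supplied the metadata, or None."""
    for name in REGISTRARS:
        if name in extra:
            return name
    return None
```

Checking for key presence (rather than truthiness) is what makes the empty dict usable as a flag.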
* datacite: use normal.clean_doi [Martin Czygan, 2020-01-03, 1 file, -4/+0]
|
* datacite: parse_datacite_dates returns month [Martin Czygan, 2020-01-03, 1 file, -7/+16]
|     As [...] we will soon add support for release_month field in the
|     release schema.
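To illustrate what returning the month alongside date and year can mean, here is a sketch for a single date string. It is not the project's `parse_datacite_dates`; `parse_date_parts` is a hypothetical helper, and the three accepted formats (full date, year-month, year) are an assumption.

```python
import datetime


def parse_date_parts(value):
    """Return (release_date, release_month, release_year) for one string.

    Parts that the input does not specify are returned as None.
    """
    patterns = (
        ("%Y-%m-%d", "day"),
        ("%Y-%m", "month"),
        ("%Y", "year"),
    )
    for fmt, precision in patterns:
        try:
            parsed = datetime.datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue  # try the next, coarser format
        release_date = parsed.date() if precision == "day" else None
        release_month = parsed.month if precision != "year" else None
        return release_date, release_month, parsed.year
    return None, None, None
```

With this shape, a value that is only precise to the month still yields a usable month and year without inventing a day.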
* datacite: prepare release_month (stub) [Martin Czygan, 2020-01-03, 1 file, -14/+14]
|
* datacite: remove --lang-detect flag [Martin Czygan, 2020-01-03, 5 files, -10/+15]
|     Estimated time for a single call is in the order of 50ms.
* datacite: add another test case [Martin Czygan, 2020-01-02, 3 files, -1/+71]
|
* datacite: open case for editing after creation [Martin Czygan, 2020-01-02, 1 file, -0/+2]
|
* datacite: add helper script to create new test case [Martin Czygan, 2020-01-02, 1 file, -0/+14]
|
* datacite: address raw_name index form comment [Martin Czygan, 2020-01-02, 20 files, -112/+128]
|     > The convention for display_name and raw_name is to be how the name
|     > would normally be printed, not in index form (surname comma
|     > given_name). So we might need to un-encode names like
|     > "Tricart, Pierre".
|
|     Use an additional `index_form_to_display_name` function to convert
|     index form to display form, heuristically.
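One possible heuristic for the conversion described above. Only the function name comes from the commit message; the body is a sketch under the assumption that a name with exactly one comma is in index form and anything else is left alone.

```python
def index_form_to_display_name(name):
    """Heuristically turn "Surname, Given" into "Given Surname"."""
    if name.count(",") != 1:
        # No comma, or several (e.g. "Doe, J., Jr."): leave untouched.
        return name
    surname, given = (part.strip() for part in name.split(","))
    if not surname or not given:
        # Dangling comma; nothing sensible to reorder.
        return name
    return "{} {}".format(given, surname)
```

A rule this simple will also reorder institutional names that happen to contain one comma, so a real implementation would likely need extra checks.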
* datacite: add conversion fixtures [Martin Czygan, 2020-01-02, 50 files, -1/+3949]
|     The `test_datacite_conversions` function compares an input (datacite)
|     document to an expected output (release entity as JSON). This way, it
|     should not be too hard to add more cases by adding an input file and an
|     output file, and by increasing the counter in the range loop within the
|     test.
|
|     To view input and result side by side with vim, change into the test
|     directory and run:
|
|     tests/files/datacite $ ./caseview.sh 18
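The fixture-driven comparison described above can be sketched as follows. The file naming scheme (`datacite_doc_NN.json`, `datacite_result_NN.json`), the `check_conversions` helper, and the `convert` callable are all assumptions for illustration; the actual test may be organized differently.

```python
import json
import os


def check_conversions(directory, num_cases, convert):
    """Compare convert(input) with the stored expected output per case.

    Returns the list of case numbers whose output does not match.
    """
    mismatched = []
    for i in range(num_cases):
        # Hypothetical naming scheme: one input and one result file per case.
        src = os.path.join(directory, "datacite_doc_{:02d}.json".format(i))
        dst = os.path.join(directory, "datacite_result_{:02d}.json".format(i))
        with open(src) as f:
            doc = json.load(f)
        with open(dst) as f:
            expected = json.load(f)
        if convert(doc) != expected:
            mismatched.append(i)
    return mismatched
```

Adding a case then means dropping in two files and raising `num_cases` by one; a test would assert that the returned list is empty.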
* datacite: adjust tests [Martin Czygan, 2019-12-28, 1 file, -2/+1]
|
* address first round of MR14 comments [Martin Czygan, 2019-12-28, 1 file, -2/+176]
|     * add missing langdetect
|     * use entity_to_dict for json debug output
|     * factor out code for fields in function and add table driven tests
|     * update citeproc types
|     * add author as default role
|     * add raw_affiliation
|     * include relations from datacite
|     * remove url (covered by doi already)
|
|     Using yapf for python formatting.
* improve datacite field mapping and import [Martin Czygan, 2019-12-28, 3 files, -17/+92]
|     The current version successfully imported a random sample of 100000
|     records (0.5%) from datacite. The --debug (write JSON to stdout) and
|     --insert-log-file (log batch before committing to db) flags are
|     temporarily added to help debugging. Add a few unit tests.
|
|     Some edge cases:
|
|     a) Existing keys without a value require a slightly awkward:
|
|     ```
|     titles = attributes.get('titles', []) or []
|     ```
|
|     b) There can be 0, 1, or more titles (the first one wins).
|
|     c) Date handling is probably not ideal. Datacite has a potentially
|     fine-grained list of dates.
|
|     The test case (tests/files/datacite_sample.jsonl) refers to
|     https://ssl.fao.org/glis/doi/10.18730/8DYM9, which has date (main
|     descriptor) 1986. The datacite record contains: 2017 (publicationYear,
|     probably the year of record creation with reference system),
|     1978-06-03 (collected, e.g. experimental sample), 1986 ("Accepted").
|     The online version of the resource knows even one more date
|     (2019-06-05 10:14:43 by WIEWS update).
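Edge cases (a) and (b) above can be sketched together. `first_title` is a hypothetical helper; it assumes each entry in `titles` is a dict with a `title` key, and skipping entries whose title is empty is an added assumption on top of "first one wins".

```python
def first_title(attributes):
    """Return the first usable title from a datacite attributes dict."""
    # The key may exist with a null value, so a .get() default alone is
    # not enough; "or []" covers both the missing and the null case.
    titles = attributes.get('titles', []) or []
    for entry in titles:
        title = (entry.get('title') or '').strip()
        if title:
            return title  # first usable title wins
    return None
```

The same `.get(key, default) or default` pattern applies to any other list-valued field that may be present but null.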
* datacite: importer skeleton [Martin Czygan, 2019-12-28, 1 file, -0/+25]
|     * contributors, title, date, publisher, container, license
|
|     Field and value analysis via https://github.com/miku/indigo.
* datacite: fix harvest test [Martin Czygan, 2019-12-27, 1 file, -1/+1]
|     Produced messages should match:
|
|     jq '.data|length' tests/files/datacite_api.json
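For reference, the jq expression above counts the entries of the top-level `data` array. A Python equivalent might look like this; `count_records` is a hypothetical helper, and it assumes `data` is a JSON array.

```python
import json


def count_records(path):
    """Return the number of entries in the top-level "data" array."""
    with open(path) as f:
        payload = json.load(f)
    return len(payload["data"])
```

A harvest test could compare the number of produced messages against `count_records("tests/files/datacite_api.json")` instead of a hard-coded value.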
* datacite: add simple test and fixture for datacite api interaction [Martin Czygan, 2019-12-27, 2 files, -0/+46]
|
* add regression test for medlinedate -> year parsing [Bryan Newbold, 2019-12-23, 2 files, -0/+102]
|
* regression test for deleted entity history view [Bryan Newbold, 2019-12-09, 1 file, -0/+25]
|
* add basic test for crossref harvest API call [Bryan Newbold, 2019-12-06, 2 files, -0/+46]
|
* add regression test for upper-case SHA-1 form submit [Bryan Newbold, 2019-12-02, 1 file, -0/+10]
|
* ingest file result importer [Bryan Newbold, 2019-11-15, 2 files, -0/+59]
|
* test for ingest transform [Bryan Newbold, 2019-11-15, 1 file, -0/+57]
|
* add ingest request transform (and test) [Bryan Newbold, 2019-11-15, 1 file, -1/+1]
|
* Merge branch 'martin-search-results-pagination' into 'master' [Martin Czygan, 2019-11-15, 1 file, -2/+3]
|\
| |     Add basic pagination to search results
| |
| |     See merge request webgroup/fatcat!4
| * address test issue [Martin Czygan, 2019-11-15, 1 file, -2/+3]
| |
| * adjust search test case for new wording [Martin Czygan, 2019-11-14, 1 file, -2/+2]
| |     > "Showing top " -> "Showing first "
* | fix crossref component test [Bryan Newbold, 2019-11-04, 1 file, -1/+1]
|/
* commit file cleaner tests [Bryan Newbold, 2019-10-08, 1 file, -0/+58]
|
* redirect direct entity underscore links [Bryan Newbold, 2019-10-03, 1 file, -0/+2]
|
* python webface impl token generation [Bryan Newbold, 2019-09-18, 1 file, -0/+8]
|
* skip test_crossref_importer_huge() by default [Bryan Newbold, 2019-09-13, 1 file, -0/+1]
|
* refactor all python source for client lib name [Bryan Newbold, 2019-09-05, 26 files, -74/+74]
|
* add kbart counts to container stats [Bryan Newbold, 2019-07-31, 1 file, -0/+1]
|
* complete generic entity rev views [Bryan Newbold, 2019-06-28, 1 file, -8/+46]
|     Was getting 500s in production from crawlers. Also expand test coverage.
* release elasticsearch results: stage not status [Bryan Newbold, 2019-06-13, 1 file, -1/+1]
|
* start adding some new web route tests [Bryan Newbold, 2019-06-13, 1 file, -0/+6]
|
* update tests for lookup views [Bryan Newbold, 2019-06-05, 1 file, -3/+3]
|
* release lookup view [Bryan Newbold, 2019-06-05, 1 file, -1/+1]
|
* tweak JALC tests for english swaperoo [Bryan Newbold, 2019-05-29, 1 file, -2/+2]
|
* faster LargeFile XML importer for PubMed [Bryan Newbold, 2019-05-29, 1 file, -3/+3]
|
* set superceded flag on 'old' arxiv releases [Bryan Newbold, 2019-05-23, 1 file, -0/+3]
|
* count linked refs (not just raw refs) in elasticsearch [Bryan Newbold, 2019-05-22, 1 file, -0/+6]
|
* arxiv license slug shorter; fix test [Bryan Newbold, 2019-05-22, 1 file, -2/+2]
|
* more JALC importer polish [Bryan Newbold, 2019-05-21, 1 file, -2/+29]
|
* JALC bulk file importer [Bryan Newbold, 2019-05-21, 1 file, -0/+100]
|
* arxiv importer polish [Bryan Newbold, 2019-05-21, 1 file, -1/+2]
|