path: root/python/tests
Commit message (Author, Age, Files, Lines)
* datacite: address raw_name index form comment (Martin Czygan, 2020-01-02, 20 files, -112/+128)
|     > The convention for display_name and raw_name is to be how the name would normally be printed, not in index form (surname comma given_name). So we might need to un-encode names like "Tricart, Pierre".
|
|     Use an additional `index_form_to_display_name` function to convert index form to display form, heuristically.
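A minimal sketch of what such a heuristic could look like, assuming a name counts as index form only when it contains exactly one comma. The function name comes from the commit message; the body below is an illustration under that assumption, not the code from this commit.

```python
def index_form_to_display_name(name: str) -> str:
    """Heuristically turn 'Surname, Given' into 'Given Surname'.

    Assumption for this sketch: anything without exactly one comma
    (including names with suffixes or lists of names) is returned unchanged.
    """
    if name.count(",") != 1:
        return name
    surname, given = (part.strip() for part in name.split(","))
    # An empty side means the comma was not separating two name parts.
    if not surname or not given:
        return name
    return f"{given} {surname}"
```

With this sketch, `index_form_to_display_name("Tricart, Pierre")` returns `"Pierre Tricart"`, while a name already in display form passes through untouched.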
* datacite: add conversion fixtures (Martin Czygan, 2020-01-02, 50 files, -1/+3949)
|     The `test_datacite_conversions` function will compare an input (datacite) document to an expected output (release entity as JSON). This way, it should not be too hard to add more cases by adding: input, output - and by increasing the counter in the range loop within the test.
|
|     To view input and result side by side with vim, change into the test directory and run:
|
|         tests/files/datacite $ ./caseview.sh 18
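The fixture-driven pattern described above can be sketched roughly as follows. The file naming scheme, the `convert` stand-in, and the helper names are assumptions made for illustration; the real test and converter in the repository may be organized differently.

```python
import json
import tempfile
from pathlib import Path


def convert(doc: dict) -> dict:
    # Hypothetical stand-in for the real datacite-to-release conversion.
    return {"title": doc["attributes"]["titles"][0]["title"]}


def check_conversions(fixture_dir: Path, count: int) -> int:
    """Compare each numbered input fixture with its expected output."""
    for i in range(count):
        # File names are an assumption of this sketch.
        src = json.loads((fixture_dir / f"case_{i:02d}_in.json").read_text())
        want = json.loads((fixture_dir / f"case_{i:02d}_out.json").read_text())
        got = convert(src)
        assert got == want, f"case {i}: {got!r} != {want!r}"
    return count


def demo() -> int:
    # Build one input/output pair in a temporary directory and check it.
    with tempfile.TemporaryDirectory() as tmp:
        d = Path(tmp)
        doc = {"attributes": {"titles": [{"title": "A"}]}}
        (d / "case_00_in.json").write_text(json.dumps(doc))
        (d / "case_00_out.json").write_text(json.dumps({"title": "A"}))
        return check_conversions(d, 1)
```

Adding a case then means dropping in one more input/output pair and raising `count` by one, which mirrors the "increase the counter in the range loop" step from the commit message.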
* datacite: adjust tests (Martin Czygan, 2019-12-28, 1 file, -2/+1)
|
* address first round of MR14 comments (Martin Czygan, 2019-12-28, 1 file, -2/+176)
|     * add missing langdetect
|     * use entity_to_dict for json debug output
|     * factor out code for fields in function and add table driven tests
|     * update citeproc types
|     * add author as default role
|     * add raw_affiliation
|     * include relations from datacite
|     * remove url (covered by doi already)
|
|     Using yapf for python formatting.
* improve datacite field mapping and import (Martin Czygan, 2019-12-28, 3 files, -17/+92)
|     Current version successfully imported a random sample of 100000 records (0.5%) from datacite.
|
|     The --debug (write JSON to stdout) and --insert-log-file (log batch before committing to db) flags are temporarily added to help debugging.
|
|     Add a few unit tests.
|
|     Some edge cases:
|
|     a) Existing keys without a value require a slightly awkward idiom:
|
|     ```
|     titles = attributes.get('titles', []) or []
|     ```
|
|     b) There can be 0, 1, or more (first one wins) titles.
|
|     c) Date handling is probably not ideal. Datacite has a potentially fine-grained list of dates. The test case (tests/files/datacite_sample.jsonl) refers to https://ssl.fao.org/glis/doi/10.18730/8DYM9, which has date (main descriptor) 1986. The datacite record contains: 2017 (publicationYear, probably the year of record creation with reference system), 1978-06-03 (collected, e.g. experimental sample), 1986 ("Accepted"). The online version of the resource knows even one more date (2019-06-05 10:14:43 by WIEWS update).
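Edge cases a) and b) can be illustrated with a short sketch. It assumes each entry in `titles` is a dict with a `title` key; the helper name `first_title` is hypothetical and not taken from the importer.

```python
from typing import Optional


def first_title(attributes: dict) -> Optional[str]:
    """Return the first title, or None when there is none."""
    # dict.get() only falls back to its default when the key is missing.
    # A key that is present with a None value still returns None, so the
    # trailing `or []` is what normalizes that case to an empty list.
    titles = attributes.get('titles', []) or []
    if not titles:
        return None
    # Case b): with several titles, the first one wins.
    return titles[0].get('title')
```

Without the `or []`, a record such as `{'titles': None}` would hand `None` to any code that iterates over or indexes the result.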
* datacite: importer skeleton (Martin Czygan, 2019-12-28, 1 file, -0/+25)
|     * contributors, title, date, publisher, container, license
|
|     Field and value analysis via https://github.com/miku/indigo.
* datacite: fix harvest test (Martin Czygan, 2019-12-27, 1 file, -1/+1)
|     Produced messages should match:
|
|         jq '.data|length' tests/files/datacite_api.json
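For readers without jq at hand, the same count can be computed in Python. This is a small sketch that assumes the fixture's top-level `data` value is a JSON array; the function name is made up for illustration.

```python
import json


def count_records(payload: str) -> int:
    """Python counterpart of `jq '.data|length'` for a list-valued `data`."""
    return len(json.loads(payload)["data"])
```

A harvest test could then compare this number against the number of messages the harvester produced for the same fixture.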
* datacite: add simple test and fixture for datacite api interaction (Martin Czygan, 2019-12-27, 2 files, -0/+46)
|
* add regression test for medlinedate -> year parsing (Bryan Newbold, 2019-12-23, 2 files, -0/+102)
|
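As an illustration of the kind of parsing such a regression test might exercise, here is a sketch that pulls the first plausible four-digit year out of a free-text date string. The function name, the regular expression, and the accepted year range are assumptions; the importer's actual logic is not shown in this log.

```python
import re
from typing import Optional

# Assumption: years between 1500 and 2099 are considered plausible.
YEAR_RE = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def medline_date_to_year(raw: str) -> Optional[int]:
    """Return the first four-digit year found in a free-text date, if any."""
    match = YEAR_RE.search(raw)
    return int(match.group(1)) if match else None
```

For a range such as `"1998 Dec-1999 Jan"` this sketch picks the first year, 1998, and it returns `None` when no year is present.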
* regression test for deleted entity history view (Bryan Newbold, 2019-12-09, 1 file, -0/+25)
|
* add basic test for crossref harvest API call (Bryan Newbold, 2019-12-06, 2 files, -0/+46)
|
* add regression test for upper-case SHA-1 form submit (Bryan Newbold, 2019-12-02, 1 file, -0/+10)
|
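One common way to make form handling tolerant of upper-case hashes is to normalize the value before validating it. The sketch below assumes that approach; the helper name is hypothetical and the web interface may handle this differently.

```python
import re
from typing import Optional

# A SHA-1 digest is 40 hexadecimal characters.
SHA1_RE = re.compile(r"[0-9a-f]{40}")


def clean_sha1(raw: str) -> Optional[str]:
    """Lower-case and validate a SHA-1 hex digest; None if it is malformed."""
    value = raw.strip().lower()
    return value if SHA1_RE.fullmatch(value) else None
```

A regression test would then submit an upper-case digest and check that it is accepted and stored in lower-case form.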
* ingest file result importer (Bryan Newbold, 2019-11-15, 2 files, -0/+59)
|
* test for ingest transform (Bryan Newbold, 2019-11-15, 1 file, -0/+57)
|
* add ingest request transform (and test) (Bryan Newbold, 2019-11-15, 1 file, -1/+1)
|
* Merge branch 'martin-search-results-pagination' into 'master' (Martin Czygan, 2019-11-15, 1 file, -2/+3)
|\
| |     Add basic pagination to search results
| |
| |     See merge request webgroup/fatcat!4
| * address test issue (Martin Czygan, 2019-11-15, 1 file, -2/+3)
| |
| * adjust search test case for new wording (Martin Czygan, 2019-11-14, 1 file, -2/+2)
| |     > "Showing top " -> "Showing first "
* | fix crossref component test (Bryan Newbold, 2019-11-04, 1 file, -1/+1)
|/
* commit file cleaner tests (Bryan Newbold, 2019-10-08, 1 file, -0/+58)
|
* redirect direct entity underscore links (Bryan Newbold, 2019-10-03, 1 file, -0/+2)
|
* python webface impl token generation (Bryan Newbold, 2019-09-18, 1 file, -0/+8)
|
* skip test_crossref_importer_huge() by default (Bryan Newbold, 2019-09-13, 1 file, -0/+1)
|
* refactor all python source for client lib name (Bryan Newbold, 2019-09-05, 26 files, -74/+74)
|
* add kbart counts to container stats (Bryan Newbold, 2019-07-31, 1 file, -0/+1)
|
* complete generic entity rev views (Bryan Newbold, 2019-06-28, 1 file, -8/+46)
|     Was getting 500s in production from crawlers.
|
|     Also expand test coverage.
* release elasticsearch results: stage not status (Bryan Newbold, 2019-06-13, 1 file, -1/+1)
|
* start adding some new web route tests (Bryan Newbold, 2019-06-13, 1 file, -0/+6)
|
* update tests for lookup views (Bryan Newbold, 2019-06-05, 1 file, -3/+3)
|
* release lookup view (Bryan Newbold, 2019-06-05, 1 file, -1/+1)
|
* tweak JALC tests for english swaperoo (Bryan Newbold, 2019-05-29, 1 file, -2/+2)
|
* faster LargeFile XML importer for PubMed (Bryan Newbold, 2019-05-29, 1 file, -3/+3)
|
* set superceded flag on 'old' arxiv releases (Bryan Newbold, 2019-05-23, 1 file, -0/+3)
|
* count linked refs (not just raw refs) in elasticsearch (Bryan Newbold, 2019-05-22, 1 file, -0/+6)
|
* arxiv license slug shorter; fix test (Bryan Newbold, 2019-05-22, 1 file, -2/+2)
|
* more JALC importer polish (Bryan Newbold, 2019-05-21, 1 file, -2/+29)
|
* JALC bulk file importer (Bryan Newbold, 2019-05-21, 1 file, -0/+100)
|
* arxiv importer polish (Bryan Newbold, 2019-05-21, 1 file, -1/+2)
|
* JSTOR importer polish (Bryan Newbold, 2019-05-21, 1 file, -5/+5)
|
* updates to pubmed importer (Bryan Newbold, 2019-05-21, 1 file, -4/+45)
|
* tweaks to new imports/tests (Bryan Newbold, 2019-05-21, 2 files, -3/+3)
|
* initial pubmed importer (Bryan Newbold, 2019-05-21, 1 file, -0/+80)
|
* minor jalc test cleanup (Bryan Newbold, 2019-05-21, 1 file, -6/+1)
|
* missing jstor import test (and fix typo) (Bryan Newbold, 2019-05-21, 1 file, -0/+77)
|
* initial arxivraw importer (from parser) (Bryan Newbold, 2019-05-21, 1 file, -0/+96)
|
* initial flesh out of JALC parser (Bryan Newbold, 2019-05-21, 1 file, -0/+88)
|
* basic JALC XML DOI metadata parser (Bryan Newbold, 2019-05-21, 1 file, -0/+176)
|
* basic JSTOR XML parser (Bryan Newbold, 2019-05-21, 1 file, -0/+58)
|
* basic arxivraw XML parser (Bryan Newbold, 2019-05-21, 1 file, -0/+31)
|
* basic pubmed parser (Bryan Newbold, 2019-05-21, 1 file, -0/+36822)
|