path: root/python/tests
Commit message [Author, Age, Files, Lines]
* datacite: more careful title string access; fixes sentry #88350 [Martin Czygan, 2021-06-11, 3 files, -1/+96]
|   Caused by a partial "title entry without title" coming *first* (e.g. one
|   holding only a language, like {'lang': 'da'}).
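To illustrate the failure mode described in this commit, here is a minimal sketch of title access that tolerates a leading entry without a `title` key. The function name `first_usable_title` and the exact shape of the `titles` list are assumptions made for illustration; this is not the importer's actual code.

```python
from typing import Any, Mapping, Optional, Sequence


def first_usable_title(titles: Sequence[Mapping[str, Any]]) -> Optional[str]:
    """Return the first non-empty title string, skipping partial entries.

    Hypothetical helper: entries such as {'lang': 'da'} carry no 'title'
    key, so indexing titles[0]['title'] directly would raise KeyError.
    """
    for entry in titles:
        title = entry.get("title")
        # Only accept real, non-blank strings; skip None, "", and other types.
        if isinstance(title, str) and title.strip():
            return title.strip()
    return None


example = [{"lang": "da"}, {"title": "En titel", "lang": "da"}]
print(first_usable_title(example))
```

Returning `None` when no entry qualifies leaves the decision about skipping the record to the caller.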
* dblp tests: skip redundant seek(0) [Bryan Newbold, 2021-06-03, 1 file, -6/+1]
|
* ingest: add per-container ingest type overrides [Bryan Newbold, 2021-05-21, 1 file, -0/+6]
|
* fix arabesque sqlite3 examples to have 14-digit timestamps [Bryan Newbold, 2021-05-21, 1 file, -0/+0]
|
* make dblp tests more robust [Bryan Newbold, 2021-04-12, 1 file, -2/+11]
|   These were causing a lot of spurious errors in local development. Not sure
|   these tweaks will entirely fix the problem.
* transform tool: container transform stats lookup support [Bryan Newbold, 2021-04-06, 1 file, -0/+1]
|
* search container stats: changes to be called from index code path [Bryan Newbold, 2021-04-06, 1 file, -0/+10]
|   E.g., allowing injection of more config values.
* container search schema: preservation stats, new fields [Bryan Newbold, 2021-04-06, 1 file, -5/+42]
|   Includes transform code updates and partial test coverage.
* datacite: a missing surname should be None, not the empty string [Martin Czygan, 2021-04-02, 2 files, -2/+0]
|   refs sentry #77700
* improve dblp release import [Bryan Newbold, 2020-12-17, 1 file, -3/+4]
|
* very simple dblp container importer [Bryan Newbold, 2020-12-17, 4 files, -5/+77]
|
* basic test coverage of dblp release importer [Bryan Newbold, 2020-12-17, 4 files, -0/+503]
|
* add 'lxml' mode for large XML file import, and multi-tags [Bryan Newbold, 2020-12-17, 1 file, -2/+2]
|
* fix sloppy is_preserved ES transform test failure [Bryan Newbold, 2020-12-17, 1 file, -1/+1]
|
* Merge branch 'bnewbold-doaj-fuzzy' into 'master' [bnewbold, 2020-12-18, 3 files, -2/+99]
|\
| |   DOAJ import fuzzy match filter
| |   See merge request webgroup/fatcat!92
| * update fuzzy helper to pass 'reason' through to import code [Bryan Newbold, 2020-12-17, 1 file, -2/+2]
| |   The motivation for this change is to enable passing the 'reason' through
| |   to edit extra metadata, in cases where we merge or cluster releases.
| * add fuzzy match filtering to DOAJ importer [Bryan Newbold, 2020-12-16, 1 file, -2/+14]
| |   In this default configuration, any entities with a fuzzy match (even
| |   "ambiguous") will be skipped at import time, to prevent creating
| |   duplicates. This is conservative towards not creating new/duplicate
| |   entities.
| |
| |   In the future, as we get more confidence in fuzzy match/verification, we
| |   can start to ignore AMBIGUOUS, handle EXACT as same release, and merge
| |   STRONG (and WEAK?) matches under the same work entity.
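The conservative default described in this commit can be sketched as follows. The `MatchStatus` enum, the `should_skip_import` function, and the `skip_on` parameter are hypothetical names introduced for illustration; they only mirror the status names mentioned in the message and are not the fuzzycat or importer API.

```python
from enum import Enum
from typing import FrozenSet, Iterable


class MatchStatus(Enum):
    """Hypothetical status names, taken from the commit message."""
    EXACT = "exact"
    STRONG = "strong"
    WEAK = "weak"
    AMBIGUOUS = "ambiguous"


# Assumed default: any fuzzy match at all, even AMBIGUOUS, blocks the import.
DEFAULT_SKIP: FrozenSet[MatchStatus] = frozenset(MatchStatus)


def should_skip_import(
    matches: Iterable[MatchStatus],
    skip_on: FrozenSet[MatchStatus] = DEFAULT_SKIP,
) -> bool:
    """Return True if the candidate entity should not be created."""
    return any(status in skip_on for status in matches)


# An entity with no fuzzy matches is imported; any match skips it.
print(should_skip_import([]))
print(should_skip_import([MatchStatus.AMBIGUOUS]))
```

Making the skip set a parameter leaves room for the future behavior the message mentions, such as ignoring AMBIGUOUS by passing a smaller `skip_on` set.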
| * add fuzzy matching helper to importer base class [Bryan Newbold, 2020-12-16, 2 files, -0/+85]
| |   Using fuzzycat. Add basic test coverage.
* | improve release elasticsearch transform test coverage [Bryan Newbold, 2020-12-16, 3 files, -11/+86]
|/
* DOAJ: remove accidentally committed 'skip' of a test [Bryan Newbold, 2020-11-20, 1 file, -1/+0]
|
* doaj: fix update code path (getattr not __dict__) [Bryan Newbold, 2020-11-20, 2 files, -11/+67]
|   Also add missing code coverage for update path (disabled by default).
* implement remainder of DOAJ article importer [Bryan Newbold, 2020-11-19, 1 file, -11/+6]
|
* initial implementation of DOAJ importer [Bryan Newbold, 2020-11-19, 2 files, -0/+97]
|   Several things to finish implementing and polish.
* ingest: fix XML ingest test file [Bryan Newbold, 2020-11-05, 1 file, -1/+1]
|
* ingest: progress on HTML ingest [Bryan Newbold, 2020-11-05, 2 files, -2/+44]
|
* ingest: tests for basic XML ingest [Bryan Newbold, 2020-11-05, 2 files, -0/+18]
|
* ingest: basic checks for ingest_type [Bryan Newbold, 2020-11-05, 2 files, -1/+7]
|
* Merge branch 'bnewbold-202009-polish' into 'master' [Martin Czygan, 2020-09-29, 2 files, -6/+6]
|\
| |   fatcat.wiki 2020-09 polish
| |   See merge request webgroup/fatcat!84
| * lint cleanups [Bryan Newbold, 2020-09-17, 1 file, -2/+0]
| |
| * web: route constraints on fcids and UUIDs [Bryan Newbold, 2020-09-17, 1 file, -4/+6]
| |   Instead of accepting any string for these parameters and throwing a 400
| |   error if not the correct type, implement better route matching at the
| |   framework level and return more 404s. This resolves several outstanding
| |   sentry exceptions.
| |
| |   The "flask-uuid" was imported and seems to have been configured for this
| |   purpose previously, but I guess I never finished configuring it.
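The idea behind this commit, rejecting malformed identifiers at the routing stage with a 404 rather than failing later with a 400, can be sketched without any framework. The function names and the fcid pattern below are assumptions: the 26-character lowercase base32 shape is inferred only from the example identifier that appears elsewhere in this log, and in a Flask application such patterns would normally be attached to routes through URL converters instead of checked by hand.

```python
import re
import uuid

# Assumed fcid shape: 26 characters from the lowercase base32 alphabet.
FCID_RE = re.compile(r"[a-z2-7]{26}")


def is_fcid(value: str) -> bool:
    """True if the whole string looks like an fcid (hypothetical check)."""
    return FCID_RE.fullmatch(value) is not None


def is_uuid(value: str) -> bool:
    """True if the standard library can parse the string as a UUID."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


def route_status(ident: str) -> int:
    """Mimic route matching: unmatched identifiers never reach a view."""
    if is_fcid(ident) or is_uuid(ident):
        return 200  # a view would now look the entity up
    return 404  # no route matched, so no view code runs


print(route_status("rzcpjwukobd4pj36ipla22cnoi"))
print(route_status("not-an-ident"))
```

Note that `uuid.UUID` is fairly lenient (it also accepts forms without hyphens), so a stricter pattern may be preferable where only the canonical form should match.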
* | address spammy datacite titles [Martin Czygan, 2020-09-23, 1 file, -0/+6]
|/    Seemingly from Zenodo:
|     * https://fatcat.wiki/release/rzcpjwukobd4pj36ipla22cnoi
|     * https://doi.org/10.5281/zenodo.4041777
|     About 3400 records with "FULL MOVIE" in title, currently.
* datacite: handle case of empty-string version [Bryan Newbold, 2020-09-10, 2 files, -1/+2]
|   Includes a tiny tweak to the datacite import sample file to test this code
|   path.
* generic file entity clean-ups as part of file_meta importer [Bryan Newbold, 2020-09-02, 1 file, -0/+99]
|
* fixes and test coverage for file_meta importer [Bryan Newbold, 2020-08-21, 2 files, -0/+68]
|
* datacite importer: update test cases for 'Additional file' as component, not stub [Bryan Newbold, 2020-08-11, 5 files, -5/+5]
|
* datacite import: figshare-specific hacks [Bryan Newbold, 2020-08-11, 1 file, -0/+1]
|
* fix typo bug resulting in lost/bad ext_id web edits [Bryan Newbold, 2020-07-31, 1 file, -0/+14]
|
* implement webface entity deletion [Bryan Newbold, 2020-07-31, 1 file, -0/+57]
|
* fix search redirect codes in new tests [Bryan Newbold, 2020-07-30, 1 file, -4/+4]
|
* wire up new TOML views [Bryan Newbold, 2020-07-30, 2 files, -20/+62]
|
* basic toml transform helper [Bryan Newbold, 2020-07-30, 1 file, -0/+22]
|
* simple search route increased coverage [Bryan Newbold, 2020-07-30, 1 file, -0/+27]
|
* minor lint fixes [Bryan Newbold, 2020-07-30, 1 file, -1/+0]
|
* coverage search: 'recent' endpoint test (minimal) [Bryan Newbold, 2020-07-30, 1 file, -1/+32]
|
* expand test coverage of new preservation views [Bryan Newbold, 2020-07-30, 1 file, -15/+122]
|
* refactor coverage tests/mocks [Bryan Newbold, 2020-07-30, 5 files, -39/+80]
|
* coverage test mock fixes [Bryan Newbold, 2020-07-30, 1 file, -14/+51]
|
* lint coverage changes (so far) [Bryan Newbold, 2020-07-30, 2 files, -15/+3]
|
* include new-style preservation+release_type aggs in container stats [Bryan Newbold, 2020-07-30, 1 file, -1/+12]
|
* add regression test for broken container coverage [Bryan Newbold, 2020-07-30, 2 files, -57/+98]
|   Also shuffle around search/coverage test files.