| | Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|---|
... | ||||||
* | reduce default import batch size to 50 | Bryan Newbold | 2019-01-29 | 1 | -1/+1 | |
| | ||||||
* | yet another required field bug | Bryan Newbold | 2019-01-29 | 1 | -4/+5 | |
| | ||||||
* | fix null name for container (required) | Bryan Newbold | 2019-01-29 | 1 | -1/+5 | |
| | ||||||
* | tweaks to GROBID metadata import | Bryan Newbold | 2019-01-29 | 1 | -3/+2 | |
| | ||||||
* | crossref import tweaks/fixes | Bryan Newbold | 2019-01-29 | 3 | -8/+12 | |
| | refs: article-title not title; save unstructured; authors not author. save 'language' field (already an ISO code). | |||||
* | fix bug in clean() resulting in many consistency check fails | Bryan Newbold | 2019-01-29 | 2 | -12/+12 | |
| | ||||||
* | fix refs extra ordering bug | Bryan Newbold | 2019-01-29 | 1 | -6/+6 | |
| | ||||||
* | pass through kwargs (fixes bezerk imports) | Bryan Newbold | 2019-01-29 | 5 | -5/+10 | |
| | ||||||
* | ensure raw_name is not stub | Bryan Newbold | 2019-01-29 | 1 | -1/+4 | |
| | ||||||
* | partial shell.py update | Bryan Newbold | 2019-01-29 | 1 | -4/+4 | |
| | ... but should refactor to use .env and auth_api helper | |||||
* | ensure abstracts aren't stubs | Bryan Newbold | 2019-01-29 | 1 | -2/+3 | |
| | ||||||
* | add stub parse_record() to make pylint happy | Bryan Newbold | 2019-01-28 | 1 | -0/+4 | |
| | ||||||
* | elastic doesn't do well with nullables | Bryan Newbold | 2019-01-28 | 1 | -14/+14 | |
| | ||||||
* | fix title length checks in crossref | Bryan Newbold | 2019-01-28 | 1 | -2/+2 | |
| | ||||||
* | fix rel/url order swap | Bryan Newbold | 2019-01-28 | 1 | -1/+1 | |
| | ||||||
* | remove accidental print in release transform | Bryan Newbold | 2019-01-28 | 1 | -1/+0 | |
| | ||||||
* | don't allow empty or single-character clean strings | Bryan Newbold | 2019-01-28 | 1 | -1/+1 | |
| | ||||||
* | fix tests/cli.sh | Bryan Newbold | 2019-01-28 | 1 | -1/+1 | |
| | ||||||
* | filter short/stub original_title | Bryan Newbold | 2019-01-28 | 1 | -3/+7 | |
| | ||||||
* | fix typo in container transform | Bryan Newbold | 2019-01-28 | 1 | -1/+1 | |
| | ||||||
* | fixes to transform code | Bryan Newbold | 2019-01-28 | 1 | -9/+11 | |
| | ||||||
* | add quick test for WARN rust logging | Bryan Newbold | 2019-01-28 | 1 | -0/+12 | |
| | ||||||
* | many fixes in GROBID importer | Bryan Newbold | 2019-01-28 | 1 | -14/+10 | |
| | ||||||
* | fix matched test vector | Bryan Newbold | 2019-01-28 | 1 | -1/+1 | |
| | this was resulting in a collision with default/example database objects. | |||||
* | fix GROBID null/short abstract additions | Bryan Newbold | 2019-01-28 | 1 | -1/+2 | |
| | ||||||
* | batch size as a general import param | Bryan Newbold | 2019-01-28 | 1 | -13/+4 | |
| | ||||||
* | add missing bezerk-mode flag to GROBID import | Bryan Newbold | 2019-01-28 | 1 | -3/+8 | |
| | ||||||
* | enforce title len>1 for release imports | Bryan Newbold | 2019-01-28 | 2 | -1/+8 | |
| | ||||||
* | fix typo in crossref importer | Bryan Newbold | 2019-01-28 | 1 | -1/+1 | |
| | ||||||
* | drop creators with no display name at all | Bryan Newbold | 2019-01-28 | 1 | -3/+3 | |
| | ||||||
* | make ORCID importer skip no-names, not assert | Bryan Newbold | 2019-01-28 | 1 | -1/+2 | |
| | ||||||
* | more ES index fixes | Bryan Newbold | 2019-01-28 | 3 | -3/+4 | |
| | ||||||
* | vastly improve entity_to_dict() speed | Bryan Newbold | 2019-01-28 | 1 | -1/+9 | |
| | ||||||
* | add filesets and webcaptures to dumps | Bryan Newbold | 2019-01-28 | 1 | -1/+2 | |
| | ||||||
* | fatcat -> fatcat_release ES index | Bryan Newbold | 2019-01-28 | 3 | -20/+21 | |
| | ||||||
* | transform and import fixes/tweaks | Bryan Newbold | 2019-01-25 | 5 | -22/+92 | |
| | ||||||
* | update journal meta import/transform | Bryan Newbold | 2019-01-25 | 6 | -154/+226 | |
| | ||||||
* | grobid import extra metadata tweaks | Bryan Newbold | 2019-01-24 | 1 | -6/+7 | |
| | ||||||
* | refactor _get_editgroup => get_editgroup_id | Bryan Newbold | 2019-01-24 | 2 | -5/+6 | |
| | ||||||
* | refactor make_rel_url | Bryan Newbold | 2019-01-24 | 3 | -29/+66 | |
| | ||||||
* | tweak crossref import, and update tests | Bryan Newbold | 2019-01-24 | 4 | -29/+74 | |
| | ||||||
* | empty fields test | Bryan Newbold | 2019-01-24 | 1 | -0/+13 | |
| | ||||||
* | allow importing contrib/refs lists | Bryan Newbold | 2019-01-24 | 3 | -9/+25 | |
| | The motivation here isn't really to support these gigantic lists on principle, but to be able to ingest large corpuses without having to decide whether to filter out or crop such lists. | |||||
* | notes on refactoring container 'extra' | Bryan Newbold | 2019-01-24 | 1 | -0/+79 | |
| | ||||||
* | importer bugfixes | Bryan Newbold | 2019-01-23 | 3 | -8/+14 | |
| | ||||||
* | more import script fixes | Bryan Newbold | 2019-01-23 | 1 | -1/+4 | |
| | ||||||
* | start changes to release ES schema | Bryan Newbold | 2019-01-23 | 4 | -119/+195 | |
| | ||||||
* | bunch of crossref import tweaks (need tests) | Bryan Newbold | 2019-01-23 | 1 | -50/+43 | |
| | ||||||
* | try to fix any_abstract | Bryan Newbold | 2019-01-23 | 1 | -1/+1 | |
| | ||||||
* | clean() checks if it returns null-length string | Bryan Newbold | 2019-01-23 | 1 | -1/+5 | |
| |