Commit message | Author | Age | Files | Lines | ||
---|---|---|---|---|---|---|
... | ||||||
* | initial flesh out of JALC parser | Bryan Newbold | 2019-05-21 | 3 | -1/+348 | |
| | ||||||
* | include structured contrib names in CDL/dash importer | Bryan Newbold | 2019-05-20 | 1 | -2/+2 | |
| | ||||||
* | fix default mimetype (impacted pre-1923 files) | Bryan Newbold | 2019-05-15 | 2 | -4/+9 | |
| | ||||||
* | python impl | Bryan Newbold | 2019-05-14 | 9 | -32/+38 | |
| | ||||||
* | python impl | Bryan Newbold | 2019-05-14 | 6 | -16/+16 | |
| | ||||||
* | python: impl size_bytes -> size | Bryan Newbold | 2019-05-13 | 1 | -1/+1 | |
| | ||||||
* | importer code updates | Bryan Newbold | 2019-05-13 | 4 | -3/+18 | |
| | ||||||
* | partial python impl of ext_id and release_stage refactors | Bryan Newbold | 2019-05-13 | 3 | -15/+20 | |
| | ||||||
* | add limits to match importers | Bryan Newbold | 2019-04-23 | 3 | -2/+27 | |
| | ||||||
* | archive.org isn't really a repository | Bryan Newbold | 2019-04-22 | 1 | -1/+3 | |
| | ||||||
* | editgroup description override | Bryan Newbold | 2019-04-22 | 1 | -2/+2 | |
| | ||||||
* | arabesque importer does require timestamp/wayback | Bryan Newbold | 2019-04-22 | 1 | -0/+3 | |
| | ||||||
* | matched importer shouldn't require wayback | Bryan Newbold | 2019-04-22 | 1 | -5/+7 | |
| | ||||||
* | handle API 400 in arabesque import (invalid extid) | Bryan Newbold | 2019-04-19 | 1 | -7/+14 | |
| | ||||||
* | fix arabesque importer crawl_id None bug | Bryan Newbold | 2019-04-18 | 1 | -1/+1 | |
| | ||||||
* | mechanism to not double-update entities | Bryan Newbold | 2019-04-18 | 2 | -1/+9 | |
| | ||||||
* | minor arabesque tweaks | Bryan Newbold | 2019-04-18 | 1 | -0/+2 | |
| | ||||||
* | update URL rel list | Bryan Newbold | 2019-04-18 | 1 | -1/+10 | |
| | ||||||
* | arabesque importer does fewer updates | Bryan Newbold | 2019-04-18 | 1 | -1/+8 | |
| | ||||||
* | arabesque importer | Bryan Newbold | 2019-04-18 | 1 | -0/+165 | |
| | ||||||
* | early version of arabesque importer | Bryan Newbold | 2019-04-12 | 1 | -0/+1 | |
| | ||||||
* | add SqlitePusher importer option | Bryan Newbold | 2019-04-12 | 2 | -1/+21 | |
| | ||||||
* | fix cdl_dash_dat license_slug | Bryan Newbold | 2019-03-19 | 1 | -7/+3 | |
| | ||||||
* | importer for CDL/DASH dat pilot dweb datasets | Bryan Newbold | 2019-03-19 | 2 | -0/+200 | |
| | ||||||
* | new importer: wayback_static | Bryan Newbold | 2019-03-19 | 2 | -0/+237 | |
| | ||||||
* | bunch of lint/whitespace cleanups | Bryan Newbold | 2019-02-22 | 3 | -5/+3 | |
| | ||||||
* | better/additional crossref license lookups | Bryan Newbold | 2019-02-14 | 1 | -20/+58 | |
| | ||||||
* | crossref: import subtitle as str, not list[str] | Bryan Newbold | 2019-02-14 | 1 | -0/+2 | |
| | ||||||
* | don't print missing DOIs, just count | Bryan Newbold | 2019-02-05 | 1 | -1/+3 | |
| | ||||||
* | add some missing LICENSE_SLUG_MAP | Bryan Newbold | 2019-02-05 | 1 | -1/+4 | |
| | ||||||
* | yet another required field bug | Bryan Newbold | 2019-01-29 | 1 | -4/+5 | |
| | ||||||
* | fix null name for container (required) | Bryan Newbold | 2019-01-29 | 1 | -1/+5 | |
| | ||||||
* | tweaks to GROBID metadata import | Bryan Newbold | 2019-01-29 | 1 | -3/+2 | |
| | ||||||
* | crossref import tweaks/fixes | Bryan Newbold | 2019-01-29 | 1 | -7/+9 | |
| | refs: article-title not title; save unstructured; authors not author; save 'language' field (already an ISO code) | |||||
* | fix bug in clean() resulting in many consistency check fails | Bryan Newbold | 2019-01-29 | 2 | -12/+12 | |
| | ||||||
* | fix refs extra ordering bug | Bryan Newbold | 2019-01-29 | 1 | -6/+6 | |
| | ||||||
* | pass through kwargs (fixes bezerk imports) | Bryan Newbold | 2019-01-29 | 5 | -5/+10 | |
| | ||||||
* | ensure raw_name is not stub | Bryan Newbold | 2019-01-29 | 1 | -1/+4 | |
| | ||||||
* | ensure abstracts aren't stubs | Bryan Newbold | 2019-01-29 | 1 | -2/+3 | |
| | ||||||
* | add stub parse_record() to make pylint happy | Bryan Newbold | 2019-01-28 | 1 | -0/+4 | |
| | ||||||
* | fix title length checks in crossref | Bryan Newbold | 2019-01-28 | 1 | -2/+2 | |
| | ||||||
* | fix rel/url order swap | Bryan Newbold | 2019-01-28 | 1 | -1/+1 | |
| | ||||||
* | don't allow empty or single-character clean strings | Bryan Newbold | 2019-01-28 | 1 | -1/+1 | |
| | ||||||
* | filter short/stub original_title | Bryan Newbold | 2019-01-28 | 1 | -3/+7 | |
| | ||||||
* | many fixes in GROBID importer | Bryan Newbold | 2019-01-28 | 1 | -14/+10 | |
| | ||||||
* | fix GROBID null/short abstract additions | Bryan Newbold | 2019-01-28 | 1 | -1/+2 | |
| | ||||||
* | enforce title len>1 for release imports | Bryan Newbold | 2019-01-28 | 2 | -1/+8 | |
| | ||||||
* | drop creators with no display name at all | Bryan Newbold | 2019-01-28 | 1 | -3/+3 | |
| | ||||||
* | make ORCID importer skip no-names, not assert | Bryan Newbold | 2019-01-28 | 1 | -1/+2 | |
| | ||||||
* | transform and import fixes/tweaks | Bryan Newbold | 2019-01-25 | 2 | -4/+10 | |
| |