| | Commit message | Author | Age | Files | Lines |
---|---|---|---|---|---|
* | missing jstor import test (and fix typo) | Bryan Newbold | 2019-05-21 | 1 | -2/+1 |
| | |||||
* | initial arxivraw importer (from parser) | Bryan Newbold | 2019-05-21 | 2 | -0/+299 |
| | |||||
* | clean up JALC importer a tiny bit | Bryan Newbold | 2019-05-21 | 1 | -8/+3 |
| | |||||
* | initial JSTOR importer | Bryan Newbold | 2019-05-21 | 2 | -0/+271 |
| | |||||
* | initial flesh out of JALC parser | Bryan Newbold | 2019-05-21 | 3 | -1/+348 |
| | |||||
* | include structured contrib names in CDL/dash importer | Bryan Newbold | 2019-05-20 | 1 | -2/+2 |
| | |||||
* | fix default mimetype (impacted pre-1923 files) | Bryan Newbold | 2019-05-15 | 2 | -4/+9 |
| | |||||
* | python impl | Bryan Newbold | 2019-05-14 | 9 | -32/+38 |
| | |||||
* | python impl | Bryan Newbold | 2019-05-14 | 6 | -16/+16 |
| | |||||
* | python: impl size_bytes -> size | Bryan Newbold | 2019-05-13 | 1 | -1/+1 |
| | |||||
* | importer code updates | Bryan Newbold | 2019-05-13 | 4 | -3/+18 |
| | |||||
* | partial python impl of ext_id and release_stage refactors | Bryan Newbold | 2019-05-13 | 3 | -15/+20 |
| | |||||
* | add limits to match importers | Bryan Newbold | 2019-04-23 | 3 | -2/+27 |
| | |||||
* | archive.org isn't really a repository | Bryan Newbold | 2019-04-22 | 1 | -1/+3 |
| | |||||
* | editgroup description override | Bryan Newbold | 2019-04-22 | 1 | -2/+2 |
| | |||||
* | arabesque importer does require timestamp/wayback | Bryan Newbold | 2019-04-22 | 1 | -0/+3 |
| | |||||
* | matched importer shouldn't require wayback | Bryan Newbold | 2019-04-22 | 1 | -5/+7 |
| | |||||
* | handle API 400 in arabesque import (invalid extid) | Bryan Newbold | 2019-04-19 | 1 | -7/+14 |
| | |||||
* | fix arabesque importer crawl_id None bug | Bryan Newbold | 2019-04-18 | 1 | -1/+1 |
| | |||||
* | mechanism to not double-update entities | Bryan Newbold | 2019-04-18 | 2 | -1/+9 |
| | |||||
* | minor arabesque tweaks | Bryan Newbold | 2019-04-18 | 1 | -0/+2 |
| | |||||
* | update URL rel list | Bryan Newbold | 2019-04-18 | 1 | -1/+10 |
| | |||||
* | arabesque importer does fewer updates | Bryan Newbold | 2019-04-18 | 1 | -1/+8 |
| | |||||
* | arabesque importer | Bryan Newbold | 2019-04-18 | 1 | -0/+165 |
| | |||||
* | early version of arabesque importer | Bryan Newbold | 2019-04-12 | 1 | -0/+1 |
| | |||||
* | add SqlitePusher importer option | Bryan Newbold | 2019-04-12 | 2 | -1/+21 |
| | |||||
* | fix cdl_dash_dat license_slug | Bryan Newbold | 2019-03-19 | 1 | -7/+3 |
| | |||||
* | importer for CDL/DASH dat pilot dweb datasets | Bryan Newbold | 2019-03-19 | 2 | -0/+200 |
| | |||||
* | new importer: wayback_static | Bryan Newbold | 2019-03-19 | 2 | -0/+237 |
| | |||||
* | bunch of lint/whitespace cleanups | Bryan Newbold | 2019-02-22 | 3 | -5/+3 |
| | |||||
* | better/additional crossref license lookups | Bryan Newbold | 2019-02-14 | 1 | -20/+58 |
| | |||||
* | crossref: import subtitle as str, not list[str] | Bryan Newbold | 2019-02-14 | 1 | -0/+2 |
| | |||||
* | don't print missing DOIs, just count | Bryan Newbold | 2019-02-05 | 1 | -1/+3 |
| | |||||
* | add some missing LICENSE_SLUG_MAP | Bryan Newbold | 2019-02-05 | 1 | -1/+4 |
| | |||||
* | yet another required field bug | Bryan Newbold | 2019-01-29 | 1 | -4/+5 |
| | |||||
* | fix null name for container (required) | Bryan Newbold | 2019-01-29 | 1 | -1/+5 |
| | |||||
* | tweaks to GROBID metadata import | Bryan Newbold | 2019-01-29 | 1 | -3/+2 |
| | |||||
* | crossref import tweaks/fixes | Bryan Newbold | 2019-01-29 | 1 | -7/+9 |
| | - refs: article-title not title; save unstructured; authors not author - save 'language' field (already an ISO code) | ||||
* | fix bug in clean() resulting in many consistency check fails | Bryan Newbold | 2019-01-29 | 2 | -12/+12 |
| | |||||
* | fix refs extra ordering bug | Bryan Newbold | 2019-01-29 | 1 | -6/+6 |
| | |||||
* | pass through kwargs (fixes bezerk imports) | Bryan Newbold | 2019-01-29 | 5 | -5/+10 |
| | |||||
* | ensure raw_name is not stub | Bryan Newbold | 2019-01-29 | 1 | -1/+4 |
| | |||||
* | ensure abstracts aren't stubs | Bryan Newbold | 2019-01-29 | 1 | -2/+3 |
| | |||||
* | add stub parse_record() to make pylint happy | Bryan Newbold | 2019-01-28 | 1 | -0/+4 |
| | |||||
* | fix title length checks in crossref | Bryan Newbold | 2019-01-28 | 1 | -2/+2 |
| | |||||
* | fix rel/url order swap | Bryan Newbold | 2019-01-28 | 1 | -1/+1 |
| | |||||
* | don't allow empty or single-character clean strings | Bryan Newbold | 2019-01-28 | 1 | -1/+1 |
| | |||||
* | filter short/stub original_title | Bryan Newbold | 2019-01-28 | 1 | -3/+7 |
| | |||||
* | many fixes in GROBID importer | Bryan Newbold | 2019-01-28 | 1 | -14/+10 |
| | |||||
* | fix GROBID null/short abstract additions | Bryan Newbold | 2019-01-28 | 1 | -1/+2 |
| | |||||