path: root/python/fatcat_import.py
Commit message [Author, Age, Files, Lines]
* pubmed ftp harvest and KafkaBs4XmlPusher [Martin Czygan, 2020-02-19, 1, -16/+16]
* refactor fatcat_import kafka group names [Bryan Newbold, 2020-01-21, 1, -13/+54]
* fix trivial one-character typo in fatcat_import.py [Bryan Newbold, 2020-01-17, 1, -1/+1]
* actually control pubmed updates with a flag [Bryan Newbold, 2020-01-17, 1, -0/+4]
* add missing sentry/raven tags [Bryan Newbold, 2020-01-10, 1, -0/+6]
* Merge branch 'martin-datacite-import' [Martin Czygan, 2020-01-08, 1, -0/+43]
|\
| * datacite: fix typos [Martin Czygan, 2020-01-07, 1, -1/+1]
| * datacite: remove --lang-detect flag [Martin Czygan, 2020-01-03, 1, -4/+0]
| * datacite: use specific auth var [Martin Czygan, 2019-12-28, 1, -1/+1]
| * datacite: add missing --extid-map-file flag [Martin Czygan, 2019-12-28, 1, -0/+4]
| * improve datacite field mapping and import [Martin Czygan, 2019-12-28, 1, -1/+14]
| * datacite: importer skeleton [Martin Czygan, 2019-12-28, 1, -0/+30]
* | importers: control update behavior with more-standard flag [Bryan Newbold, 2020-01-06, 1, -1/+5]
|/
* savepapernow result importer [Bryan Newbold, 2019-12-12, 1, -0/+24]
* improve argparse usage [Bryan Newbold, 2019-12-11, 1, -18/+30]
* tweaks to file ingest importer [Bryan Newbold, 2019-12-03, 1, -0/+6]
* have ingest-file-results importer operate as crawl-bot [Bryan Newbold, 2019-11-15, 1, -1/+1]
* better ingest-file-results import name [Bryan Newbold, 2019-11-15, 1, -1/+1]
* ingest file result importer [Bryan Newbold, 2019-11-15, 1, -0/+34]
* small fixes to confluent-kafka importers/workers [Bryan Newbold, 2019-09-20, 1, -1/+1]
* convert importers to confluent-kafka library [Bryan Newbold, 2019-09-20, 1, -2/+3]
* start chocula importer [Bryan Newbold, 2019-09-03, 1, -0/+14]
* support extids in matched importer [Bryan Newbold, 2019-06-20, 1, -0/+4]
* faster LargeFile XML importer for PubMed [Bryan Newbold, 2019-05-29, 1, -1/+1]
* make pubmed ref lookups configurable [Bryan Newbold, 2019-05-22, 1, -1/+8]
* creative importer for bulk JSTOR imports [Bryan Newbold, 2019-05-22, 1, -0/+18]
* pubmed importer command and tweaks [Bryan Newbold, 2019-05-22, 1, -0/+25]
* arxiv importer robustification and CLI impl [Bryan Newbold, 2019-05-21, 1, -0/+21]
* JALC bulk file importer [Bryan Newbold, 2019-05-21, 1, -0/+21]
* fix default mimetype (impacted pre-1923 files) [Bryan Newbold, 2019-05-15, 1, -1/+5]
* editgroup description override [Bryan Newbold, 2019-04-22, 1, -1/+11]
* minor arabesque tweaks [Bryan Newbold, 2019-04-18, 1, -12/+22]
* arabesque importer using crawl-bot creds [Bryan Newbold, 2019-04-18, 1, -1/+1]
* arabesque import tweaks [Bryan Newbold, 2019-04-18, 1, -0/+4]
* early version of arabesque importer [Bryan Newbold, 2019-04-12, 1, -0/+28]
* importer for CDL/DASH dat pilot dweb datasets [Bryan Newbold, 2019-03-19, 1, -1/+29]
* new importer: wayback_static [Bryan Newbold, 2019-03-19, 1, -0/+48]
* reduce default import batch size to 50 [Bryan Newbold, 2019-01-29, 1, -1/+1]
* batch size as a general import param [Bryan Newbold, 2019-01-28, 1, -13/+4]
* add missing bezerk-mode flag to GROBID import [Bryan Newbold, 2019-01-28, 1, -3/+8]
* fix typo in crossref importer [Bryan Newbold, 2019-01-28, 1, -1/+1]
* update journal meta import/transform [Bryan Newbold, 2019-01-25, 1, -3/+3]
* more import script fixes [Bryan Newbold, 2019-01-23, 1, -1/+4]
* update importer script [Bryan Newbold, 2019-01-23, 1, -33/+24]
* pubmed+datacite tokens; no journal,grobid,matched tokens [Bryan Newbold, 2019-01-22, 1, -2/+2]
* issn => journal_metadata in several places [Bryan Newbold, 2019-01-17, 1, -9/+9]
* start refactoring API object passing [Bryan Newbold, 2019-01-08, 1, -13/+36]
* crossref importer checks for existing DOIs [Bryan Newbold, 2018-11-21, 1, -3/+7]
* correct kafka topic names [Bryan Newbold, 2018-11-20, 1, -1/+1]
* start supporting kafka importers [Bryan Newbold, 2018-11-19, 1, -3/+17]