| Commit message | Author | Age | Files | Lines |
* | single-file variant of fileset importer for dataset attempts | Bryan Newbold | 2022-03-23 | 1 | -0/+58 |
* | importer: hotfix for sentry config error | Bryan Newbold | 2022-02-25 | 1 | -1/+1 |
* | update sentry SDK configuration | Bryan Newbold | 2022-02-25 | 1 | -3/+1 |
* | move from raven to sentry_sdk | Martin Czygan | 2021-12-14 | 1 | -2/+2 |
* | remove cdl_dash_dat and wayback_static importers | Bryan Newbold | 2021-11-10 | 1 | -86/+0 |
* | remove deprecated extid sqlite3 lookup table feature from importers | Bryan Newbold | 2021-11-09 | 1 | -21/+1 |
* | facat_import.py: work around corner case in run_cdl_dash_dat() | Bryan Newbold | 2021-11-03 | 1 | -1/+1 |
* | typing: first batch of python bulk type annotations | Bryan Newbold | 2021-11-03 | 1 | -27/+27 |
* | fmt (black): *.py | Bryan Newbold | 2021-11-02 | 1 | -359/+591 |
* | lint/fmt: remove all 'import *' | Bryan Newbold | 2021-11-02 | 1 | -3/+40 |
* | initial implementation of fileset ingest importers | Bryan Newbold | 2021-10-14 | 1 | -0/+74 |
* | generic fileset importer class, with test coverage | Bryan Newbold | 2021-10-14 | 1 | -0/+21 |
* | kafka import: optional 'force-flush' mode for some importers | Bryan Newbold | 2021-10-01 | 1 | -0/+3 |
* | new SPN web (html) importer | Bryan Newbold | 2021-10-01 | 1 | -0/+30 |
* | very simple dblp container importer | Bryan Newbold | 2020-12-17 | 1 | -2/+34 |
* | dblp release importer: container_id lookup TSV, and dump JSON mode | Bryan Newbold | 2020-12-17 | 1 | -3/+7 |
* | initial implementation of dblp release importer (in progress) | Bryan Newbold | 2020-12-17 | 1 | -0/+29 |
* | add 'lxml' mode for large XML file import, and multi-tags | Bryan Newbold | 2020-12-17 | 1 | -2/+1 |
* | implement remainder of DOAJ article importer | Bryan Newbold | 2020-11-19 | 1 | -0/+37 |
* | ingest: initial 'web' worker implementation | Bryan Newbold | 2020-11-05 | 1 | -0/+42 |
* | ingest: whitelist -> allowlist | Bryan Newbold | 2020-11-05 | 1 | -3/+3 |
* | fixes and test coverage for file_meta importer | Bryan Newbold | 2020-08-21 | 1 | -1/+4 |
* | initial implementation of file_meta importer | Bryan Newbold | 2020-08-21 | 1 | -0/+15 |
* | lint (flake8) top-level python files | Bryan Newbold | 2020-07-01 | 1 | -1/+3 |
* | Merge pull request #53 from EdwardBetts/spelling | bnewbold | 2020-03-27 | 1 | -2/+2 |
|\ |
| * | Correct spelling mistakes | Edward Betts | 2020-03-27 | 1 | -2/+2 |
* | | Merge branch 'martin-kafka-bs4-import' into 'master' | Martin Czygan | 2020-03-10 | 1 | -16/+18 |
|\ \
| |/
|/| |
| * | fatcat_import: address potential hanging, if stdin is empty | Martin Czygan | 2020-03-09 | 1 | -0/+2 |
| * | more pubmed adjustments | Martin Czygan | 2020-02-22 | 1 | -1/+1 |
| * | pubmed ftp harvest and KafkaBs4XmlPusher | Martin Czygan | 2020-02-19 | 1 | -16/+16 |
* | | shadow import fixes from QA testing | Bryan Newbold | 2020-02-13 | 1 | -1/+1 |
* | | basic shadow importer | Bryan Newbold | 2020-02-13 | 1 | -0/+15 |
|/ |
* | refactor fatcat_import kafka group names | Bryan Newbold | 2020-01-21 | 1 | -13/+54 |
* | fix trivial one-character typo in fatcat_import.py | Bryan Newbold | 2020-01-17 | 1 | -1/+1 |
* | actually control pubmed updates with a flag | Bryan Newbold | 2020-01-17 | 1 | -0/+4 |
* | add missing sentry/raven tags | Bryan Newbold | 2020-01-10 | 1 | -0/+6 |
* | Merge branch 'martin-datacite-import' | Martin Czygan | 2020-01-08 | 1 | -0/+43 |
|\ |
| * | datacite: fix typos | Martin Czygan | 2020-01-07 | 1 | -1/+1 |
| * | datacite: remove --lang-detect flag | Martin Czygan | 2020-01-03 | 1 | -4/+0 |
| * | datacite: use specific auth var | Martin Czygan | 2019-12-28 | 1 | -1/+1 |
| * | datacite: add missing --extid-map-file flag | Martin Czygan | 2019-12-28 | 1 | -0/+4 |
| * | improve datacite field mapping and import | Martin Czygan | 2019-12-28 | 1 | -1/+14 |
| * | datacite: importer skeleton | Martin Czygan | 2019-12-28 | 1 | -0/+30 |
* | | importers: control update behavior with more-standard flag | Bryan Newbold | 2020-01-06 | 1 | -1/+5 |
|/ |
* | savepapernow result importer | Bryan Newbold | 2019-12-12 | 1 | -0/+24 |
* | improve argparse usage | Bryan Newbold | 2019-12-11 | 1 | -18/+30 |
* | tweaks to file ingest importer | Bryan Newbold | 2019-12-03 | 1 | -0/+6 |
* | have ingest-file-results importer operate as crawl-bot | Bryan Newbold | 2019-11-15 | 1 | -1/+1 |
* | better ingest-file-results import name | Bryan Newbold | 2019-11-15 | 1 | -1/+1 |
* | ingest file result importer | Bryan Newbold | 2019-11-15 | 1 | -0/+34 |