Commit message  Author  Age  Files  Lines
...
| * gitlab CI: cleanups  Bryan Newbold  2020-12-22  1  -4/+20
| * gitlab CI: explicitly use xenial tag of image  Bryan Newbold  2020-12-22  1  -1/+1
| * docker xenial: use get-pipenv.py to install pipenv et al  Bryan Newbold  2020-12-22  1  -5/+6
| * docker xenial: switch to rust 1.43.0  Bryan Newbold  2020-12-22  1  -1/+1
| * docker xenial: include python3.8  Bryan Newbold  2020-12-22  1  -1/+6
* | web ingest: terminal URL mismatch as skip, not assert  Bryan Newbold  2020-12-30  1  -1/+3
* | update stats (post DOAJ and dblp imports)  Bryan Newbold  2020-12-29  2  -0/+47
* | dblp import notes; bulk edit changelog update  Bryan Newbold  2020-12-29  2  -1/+63
* | finally update CHANGELOG for actual v0.3.3 tag/release (tag: v0.3.3)  Bryan Newbold  2020-12-24  1  -15/+16
* | rust openapi lib: bump version to v0.3.3  Bryan Newbold  2020-12-24  1  -1/+1
* | rust: update lazy_static dependency  Bryan Newbold  2020-12-24  3  -35/+26
* | dblp release import: skip arxiv_id releases  Bryan Newbold  2020-12-24  1  -0/+9
* | normalizer: test for un-versioned arxiv_id  Bryan Newbold  2020-12-24  1  -0/+4
* | dblp import: fix arxiv_id typo  Bryan Newbold  2020-12-23  1  -1/+1
* | ingest: allow dblp imports  Bryan Newbold  2020-12-23  1  -1/+1
* | fuzzy: set 120 second timeout on ES lookups  Bryan Newbold  2020-12-23  1  -1/+1
* | DOAJ import notes, and SQL/stats update  Bryan Newbold  2020-12-23  5  -0/+109
|/
* dblp: polish HTML scrape/extract pipeline  Bryan Newbold  2020-12-17  4  -3/+30
* dblp: flesh out update code path (especially to add container_id linkage)  Bryan Newbold  2020-12-17  1  -2/+6
* dblp: run fuzzy matching at try_update time (same as DOAJ)  Bryan Newbold  2020-12-17  1  -1/+8
* small dblp proposal updates  Bryan Newbold  2020-12-17  1  -5/+2
* dblp: script and notes on container metadata generation  Bryan Newbold  2020-12-17  4  -0/+134
* improve dblp release import  Bryan Newbold  2020-12-17  3  -4/+17
* very simple dblp container importer  Bryan Newbold  2020-12-17  7  -7/+256
* dblp release importer: container_id lookup TSV, and dump JSON mode  Bryan Newbold  2020-12-17  2  -13/+73
* commit DBLP proposal progress  Bryan Newbold  2020-12-17  1  -7/+10
* dblp import proposal  Bryan Newbold  2020-12-17  1  -0/+159
* basic test coverage of dblp release importer  Bryan Newbold  2020-12-17  4  -0/+503
* wikidata QID normalize helper  Bryan Newbold  2020-12-17  1  -2/+24
* initial implementation of dblp release importer (in progress)  Bryan Newbold  2020-12-17  3  -0/+474
* add 'lxml' mode for large XML file import, and multi-tags  Bryan Newbold  2020-12-17  3  -19/+31
* rust: fix malformed ext id error type  Bryan Newbold  2020-12-17  1  -2/+2
* rust: rename and improve dblp key (id) syntax check  Bryan Newbold  2020-12-17  2  -9/+17
* fix sloppy is_preserved ES transfom test failure  Bryan Newbold  2020-12-17  1  -1/+1
* DOAJ import notes  Bryan Newbold  2020-12-17  2  -2/+23
* add dblp as an ingest source and identifier  Bryan Newbold  2020-12-17  1  -1/+2
* ingest: allow doaj ingest responses  Bryan Newbold  2020-12-17  1  -1/+2
* bug fix: is_preserved should always be bool  Bryan Newbold  2020-12-17  1  -2/+2
* Merge branch 'bnewbold-doaj-fuzzy' into 'master'  bnewbold  2020-12-18  7  -267/+544
|\
| * update fuzzy helper to pass 'reason' through to import code  Bryan Newbold  2020-12-17  2  -5/+5
| * pipenv: bump fuzzycat to 0.1.9  Bryan Newbold  2020-12-17  2  -5/+5
| * add fuzzy match filtering to DOAJ importer  Bryan Newbold  2020-12-16  2  -4/+23
| * add fuzzy matching helper to importer base class  Bryan Newbold  2020-12-16  3  -2/+147
| * pipenv: add fuzzycat dependency  Bryan Newbold  2020-12-16  2  -261/+374
* | Merge pull request #65 from ibnesayeed/patch-1  bnewbold  2020-12-17  1  -1/+1
|\ \
| * | Improve status counting efficiency  Sawood Alam  2020-12-17  1  -1/+1
* | | Merge branch 'bnewbold-es-transform-html' into 'master'  Martin Czygan  2020-12-17  5  -146/+296
|\ \ \
| |_|/
|/| |
| * | entity update worker: treat fileset and webcapture updates like file updates  Bryan Newbold  2020-12-16  1  -3/+25
| * | fix indentation  Bryan Newbold  2020-12-16  1  -2/+2
| * | have release elasticsearch transform count webcaptures and filesets towards p...  Bryan Newbold  2020-12-16  1  -26/+57