Commit message  Author  Age  Files  Lines (-/+)
* bump python client to 0.5.0  Bryan Newbold  2021-11-17  10  -15/+15
* because of SQL change, this schema bump does warrent a minor version bump to ...  Bryan Newbold  2021-11-17  1  -1/+1
* content_scope: include in file ES schema and transform  Bryan Newbold  2021-11-17  2  -0/+2
* guide: document content_scope field  Bryan Newbold  2021-11-17  3  -1/+49
* minimal python test coverage of content_scope fields  Bryan Newbold  2021-11-17  3  -0/+6
* python code: update python_openapi_client in lockfile  Bryan Newbold  2021-11-17  1  -1/+1
* update python client library codegen for content_scope  Bryan Newbold  2021-11-17  9  -17/+95
* rust: bump crate version and lockfile  Bryan Newbold  2021-11-17  2  -3/+3
* rust: implement content_scope  Bryan Newbold  2021-11-17  5  -0/+22
* SQL implementation of content_scope  Bryan Newbold  2021-11-17  2  -0/+36
* codegen rust code for content_scope  Bryan Newbold  2021-11-17  3  -4/+19
* schema: add content_scope fields, and bump to 0.4.1  Bryan Newbold  2021-11-17  1  -1/+10
* proposal: content_scope field  Bryan Newbold  2021-11-17  1  -0/+84
* updated notes on possible cleanups  Bryan Newbold  2021-11-17  1  -4/+27
* ISSN-L dupes check: output all matches  Bryan Newbold  2021-11-17  1  -1/+1
* document cleanups run this week  Bryan Newbold  2021-11-12  5  -0/+244
* web: handle ES non-int error codes better  Bryan Newbold  2021-11-12  1  -9/+12
* Merge branch 'bnewbold-import-refactors' into 'master'  bnewbold  2021-11-11  27  -1599/+874
|\
| * update datacite tests for license slug changes  Bryan Newbold  2021-11-10  2  -8/+7
| * improve lookup_license_slug helper and lookup table  Bryan Newbold  2021-11-10  2  -56/+62
| * refactor importer metadata tables into separate file; move some helpers around  Bryan Newbold  2021-11-10  10  -702/+682
| * importers: refactor imports of clean() and other normalization helpers  Bryan Newbold  2021-11-10  12  -95/+104
| * remove cdl_dash_dat and wayback_static importers  Bryan Newbold  2021-11-10  4  -596/+0
| * datacite import: store less subject metadata  Bryan Newbold  2021-11-10  1  -1/+7
| * add notes about 'double slash in DOI' issue  Bryan Newbold  2021-11-09  1  -0/+46
| * importers: use clean_doi() in many more (all?) importers  Bryan Newbold  2021-11-09  6  -12/+29
| * clean_doi: stop mutating double-slash DOIs, except for 10.1037 prefix  Bryan Newbold  2021-11-09  1  -1/+2
| * remove deprecated extid sqlite3 lookup table feature from importers  Bryan Newbold  2021-11-09  10  -203/+10
* | Merge branch 'bnewbold-cleanups-nov2021' into 'master'  bnewbold  2021-11-11  9  -1/+1504
|\ \
| * | wayback ts cleanup: one more filter tweak  Bryan Newbold  2021-11-09  1  -1/+2
| * | update cleanups notes  Bryan Newbold  2021-11-09  2  -0/+72
| * | file/release bugfix: handle files with multiple edits  Bryan Newbold  2021-11-09  1  -6/+6
| * | cleanups: add more state=active checks  Bryan Newbold  2021-11-09  2  -0/+8
| * | update link source filters in file/release bugfix  Bryan Newbold  2021-11-09  1  -2/+8
| * | initial file/release bugfix cleanup worker and notes  Bryan Newbold  2021-11-09  2  -0/+375
| * | updates to lowercase DOI cleanup  Bryan Newbold  2021-11-09  2  -7/+86
| * | lowercase DOI lint and check entity status  Bryan Newbold  2021-11-09  1  -4/+5
| * | more iteration on short wayback timestamp cleanup  Bryan Newbold  2021-11-09  3  -4/+129
| * | lint: minor import tweak  Bryan Newbold  2021-11-09  1  -1/+1
| * | cleanups: tweaks to wayback CDX cleanup scripts  Bryan Newbold  2021-11-09  2  -6/+21
| * | cleanups: initial lowercase DOI cleanup script  Bryan Newbold  2021-11-09  1  -0/+145
| * | wayback short ts: another regression test, and some small fmt/tweaks  Bryan Newbold  2021-11-09  1  -3/+38
| * | wayback cleanup: actually update entity  Bryan Newbold  2021-11-09  1  -2/+4
| * | imports: generic file cleanup removes exact duplicate URLs  Bryan Newbold  2021-11-09  1  -0/+9
| * | wayback short ts: add regression test for dupe URLs  Bryan Newbold  2021-11-09  1  -0/+44
| * | short wayback ts: initial cleanup script implementation  Bryan Newbold  2021-11-09  1  -0/+251
| * | wayback timestamps: updates to handle 4-digit case  Bryan Newbold  2021-11-09  2  -11/+108
| * | start work on wayback short-timestamp cleanup  Bryan Newbold  2021-11-09  2  -0/+238
| |/
* | update crawlability docs  Bryan Newbold  2021-11-10  1  -1/+9
* | sitemap generation improvements  Bryan Newbold  2021-11-10  2  -1/+2