path: root/extra
Commit message | Author | Age | Files | Lines
* JSON typo in release mapping | Bryan Newbold | 2020-01-30 | 1 | -1/+0
* ES schemas: make keywords case-insensitive by default | Bryan Newbold | 2020-01-30 | 4 | -66/+115
    But not applying asciifolding; don't see any need to do so?
* tweak file ES archive.org domain tracking | Bryan Newbold | 2020-01-30 | 1 | -0/+1
* elastic schema fixes | Bryan Newbold | 2020-01-29 | 2 | -7/+7
* add country to v03b release schema | Bryan Newbold | 2020-01-29 | 1 | -0/+1
* update ES docs and proposal | Bryan Newbold | 2020-01-29 | 1 | -0/+2
* actually implement changelog transform | Bryan Newbold | 2020-01-29 | 1 | -1/+10
* ES release schema updates | Bryan Newbold | 2020-01-29 | 1 | -23/+46
* container ES schema changes | Bryan Newbold | 2020-01-29 | 1 | -13/+20
* first implementation of ES file schema | Bryan Newbold | 2020-01-29 | 1 | -0/+46
    Includes a trivial test and transform, but not any workers or doc updates.
* stats: remove internal PG table sizes from old dumps | Bryan Newbold | 2020-01-19 | 2 | -292/+0
    For ease of reading and comparison
* update stats and table sizes | Bryan Newbold | 2020-01-19 | 4 | -0/+96
* sql table size script: shorter output | Bryan Newbold | 2020-01-15 | 1 | -0/+1
    This skips postgres-internal tables in size output
* 2019-01-07 status update | Bryan Newbold | 2020-01-07 | 2 | -0/+36
* DB loads take a long time now | Bryan Newbold | 2019-12-21 | 1 | -1/+1
* add 2019-12-20 stats | Bryan Newbold | 2019-12-20 | 2 | -0/+148
* add kafka-pixy to docker-compose file | Bryan Newbold | 2019-12-10 | 1 | -0/+8
* tweaks to docker-compose image | Bryan Newbold | 2019-12-10 | 1 | -0/+5
    - don't start kafka image until zookeeper is running
    - set very liberal "watermarks" for elasticsearch disk monitoring
* increase max.message.bytes in container | Martin Czygan | 2019-12-05 | 1 | -0/+1
    While working on datacite, some messages were larger than the default of 1000012 bytes.
* export raw affiliation strings for analysis | Bryan Newbold | 2019-10-03 | 1 | -0/+17
* docker-compose: kafka 2.0, and -dev topic names | Bryan Newbold | 2019-09-20 | 1 | -3/+2
* document release publish process (tag: v0.3.1) | Bryan Newbold | 2019-09-18 | 1 | -0/+48
* create new collection just for fatcat exports | Bryan Newbold | 2019-09-09 | 1 | -1/+1
* update more rust library name refs | Bryan Newbold | 2019-09-05 | 1 | -4/+4
* update all other mentions of python client lib | Bryan Newbold | 2019-09-05 | 3 | -9/+9
* sql_dumps: typo | Bryan Newbold | 2019-07-14 | 1 | -1/+1
* more fixup notes (from QA server) | Bryan Newbold | 2019-06-27 | 1 | -5/+46
* finish fixup_longtail_issnl_unique; but not going to run it | Bryan Newbold | 2019-06-27 | 1 | -4/+3
* initial work on longtail_issnl_unique.py | Bryan Newbold | 2019-06-24 | 1 | -0/+192
* stats.json update after releases v03 cut-over | Bryan Newbold | 2019-06-06 | 1 | -0/+1
* elasticsearch index alias howto | Bryan Newbold | 2019-06-06 | 1 | -1/+16
* QA checks (for hash, extid duplication) | Bryan Newbold | 2019-06-04 | 4 | -0/+82
* recent prod table sizes; 380 GBytes or so total | Bryan Newbold | 2019-06-04 | 1 | -0/+233
* dump_release_extid.sql changes for new schema | Bryan Newbold | 2019-06-03 | 1 | -1/+1
* move export README info to sql_dumps doc | Bryan Newbold | 2019-06-03 | 1 | -1/+29
* fix parse_merge_metadata.py merge_spans() | Bryan Newbold | 2019-05-30 | 1 | -4/+8
* better KBART merging | Bryan Newbold | 2019-05-30 | 1 | -4/+5
* initial code to handle multiple KBART spans better | Bryan Newbold | 2019-05-30 | 1 | -2/+64
* add work-in-progress elastic index notes | Bryan Newbold | 2019-05-30 | 1 | -0/+11
* add 'superceded' release extra flag to elastic schema | Bryan Newbold | 2019-05-23 | 1 | -0/+1
* also track work_id in release elasticsearch table | Bryan Newbold | 2019-05-22 | 1 | -0/+1
* count linked refs (not just raw refs) in elasticsearch | Bryan Newbold | 2019-05-22 | 1 | -0/+1
* commit SQL table stats scripts | Bryan Newbold | 2019-05-21 | 2 | -0/+36
* include creator_ids in release elastic schema | Bryan Newbold | 2019-05-20 | 1 | -0/+1
    Intent is to allow fast creator search/lookup
* elastic release schema update | Bryan Newbold | 2019-05-20 | 1 | -1/+6
* start tracking stats | Bryan Newbold | 2019-05-07 | 2 | -0/+2
* IA collection page embed example description | Bryan Newbold | 2019-05-07 | 1 | -0/+45
    This code has some issues, but is worth committing
* old fileset and webcapture example entities | Bryan Newbold | 2019-04-30 | 2 | -0/+146
* no-derive metadata and SQL dump uploads (to petabox) | Bryan Newbold | 2019-04-30 | 1 | -0/+2
* faster elasticsearch imports | Bryan Newbold | 2019-04-30 | 1 | -1/+1