path: root/extra
Commit message  Author  Age  Files  Lines
...
* | update stats (post DOAJ and dblp imports)  Bryan Newbold  2020-12-29  2  -0/+47
* | DOAJ import notes, and SQL/stats update  Bryan Newbold  2020-12-23  4  -0/+94
|/
* dblp: polish HTML scrape/extract pipeline  Bryan Newbold  2020-12-17  3  -3/+16
* dblp: script and notes on container metadata generation  Bryan Newbold  2020-12-17  4  -0/+134
* Merge pull request #65 from ibnesayeed/patch-1  bnewbold  2020-12-17  1  -1/+1
|\
| * Improve status counting efficiency  Sawood Alam  2020-12-17  1  -1/+1
* | Revert "docker xenial base image: include python3.8"  Bryan Newbold  2020-12-11  1  -6/+1
* | docker xenial base image: include python3.8  Bryan Newbold  2020-12-11  1  -1/+6
* | docker: how to push to dockerhub  Bryan Newbold  2020-12-11  1  -0/+4
|/
* update database/table stats  Bryan Newbold  2020-10-12  2  -0/+48
* update stats snapshot  Bryan Newbold  2020-09-03  2  -0/+47
* sitemap fixes from testing  Bryan Newbold  2020-08-19  3  -4/+15
* iterate on sitemap generation  Bryan Newbold  2020-08-19  6  -7/+119
* initial sitemap.xml notes/template  Bryan Newbold  2020-08-19  2  -0/+29
* include releases_by_work in ident tarball  Bryan Newbold  2020-08-04  1  -1/+2
* update SQL dump docs with group-by-work command (by default)  Bryan Newbold  2020-08-04  1  -1/+1
* WIP: sorted release ident dumps  Bryan Newbold  2020-08-04  1  -0/+16
* update table/database size stats  Bryan Newbold  2020-07-22  2  -0/+48
* commit example of an elasticsearch SQL query  Bryan Newbold  2020-07-01  1  -0/+8
* commit old README about bulk downloads  Bryan Newbold  2020-07-01  1  -0/+40
* ES schema: add best_url to file schema  Bryan Newbold  2020-06-04  1  -0/+1
* sql: really don't double-dump requests  Bryan Newbold  2020-05-26  1  -1/+0
* 2020-05-26 prod database size and stats  Bryan Newbold  2020-05-26  2  -0/+48
* update prod stats  Bryan Newbold  2020-04-17  7  -0/+149
* Add missing packages to Dockerfile and CI file  Bryan Newbold  2020-04-16  1  -1/+1
* test-base Dockerfile  Bryan Newbold  2020-04-16  2  -0/+51
* update bulk export instructions  Bryan Newbold  2020-04-07  1  -4/+2
* sql_dumps: stop doing redundant release dumps  Bryan Newbold  2020-04-01  1  -1/+3
* bulk exports README different from SQL README  Bryan Newbold  2020-03-17  1  -1/+1
* ES README: really need to limit to 1k esbulk batches  Bryan Newbold  2020-02-26  1  -3/+3
* Merge branch 'bnewbold-elastic-v03b'  Bryan Newbold  2020-02-26  5  -61/+203
|\
| * update ES transform README  Bryan Newbold  2020-02-26  1  -2/+3
| * ES container last tweaks  Bryan Newbold  2020-02-26  1  -3/+4
| * ES release: last minor tweaks  Bryan Newbold  2020-02-26  1  -3/+5
| * release schema: do doc_value on DOIs  Bryan Newbold  2020-02-13  1  -1/+1
| * ES release: actually do want doc_values for work_id  Bryan Newbold  2020-02-05  1  -1/+1
| * fix axiv/arxiv typo in release schema  Bryan Newbold  2020-02-04  1  -1/+1
| * ES release schema: fix typo  Bryan Newbold  2020-01-31  1  -1/+1
| * fix json typos in changelog schema  Bryan Newbold  2020-01-30  1  -2/+2
| * add upper-case work-around from kibana map join  Bryan Newbold  2020-01-30  1  -0/+1
| * JSON typo in release mapping  Bryan Newbold  2020-01-30  1  -1/+0
| * ES schemas: make keywords case-insensitive by default  Bryan Newbold  2020-01-30  4  -66/+115
| * tweak file ES archive.org domain tracking  Bryan Newbold  2020-01-30  1  -0/+1
| * elastic schema fixes  Bryan Newbold  2020-01-29  2  -7/+7
| * add country to v03b release schema  Bryan Newbold  2020-01-29  1  -0/+1
| * update ES docs and proposal  Bryan Newbold  2020-01-29  1  -0/+2
| * actually implement changelog transform  Bryan Newbold  2020-01-29  1  -1/+10
| * ES release schema updates  Bryan Newbold  2020-01-29  1  -23/+46
| * container ES schema changes  Bryan Newbold  2020-01-29  1  -13/+20
| * first implementation of ES file schema  Bryan Newbold  2020-01-29  1  -0/+46