Commit message                                        Author         Date        Files  Lines
* reduce: log broken line only                        Martin Czygan  2021-07-10  1      -1/+1
|
* reduce: add key and indexed ts for exact matches    Martin Czygan  2021-07-10  1      -0/+2
|
* batch: drop logging                                 Martin Czygan  2021-07-10  1      -4/+0
|
* batch: log batch size                               Martin Czygan  2021-07-10  1      -1/+1
|
* reduce: short circuit large groups                  Martin Czygan  2021-07-10  1      -2/+12
|   we saw a jump in memory usage, and it may be related to groups with
|   thousands of elements; e.g. maybe some weird string that appears too
|   many times as a key, e.g. 123/test; as a first measure, we short-circuit
|   further batching; another mitigation may be to limit group size
|   completely
* schema: prefer isbn13                               Martin Czygan  2021-07-10  1      -1/+5
|
* schema: render isbn as well                         Martin Czygan  2021-07-10  1      -1/+7
|
* reduce: ol, fuzzy, w/ unstructured                  Martin Czygan  2021-07-10  1      -1/+1
|
* schema: add test                                    Martin Czygan  2021-07-10  2      -0/+20
|
* schema: flesh out unstructured rendering            Martin Czygan  2021-07-10  2      -0/+56
|
* release to unstructured stub                        Martin Czygan  2021-07-10  3      -2/+84
|
* update docs                                         Martin Czygan  2021-07-10  1      -0/+2
|
* reduce: open library id tweaks                      Martin Czygan  2021-07-10  1      -5/+27
|
* tasks: bref, add wikipedia                          Martin Czygan  2021-07-10  1      -2/+3
|
* reduce: tweak wiki bref                             Martin Czygan  2021-07-10  1      -4/+5
|
* reduce: filter out duplicate wiki links             Martin Czygan  2021-07-10  1      -0/+8
|
* wiki: use lowercase base32 of page title            Martin Czygan  2021-07-09  1      -2/+3
|   * mostly case insensitive, same case as ident
* reduce: use a base64 encoded title as key           Martin Czygan  2021-07-09  1      -1/+7
|
* tasks: amend wiki bref task                         Martin Czygan  2021-07-09  1      -0/+1
|
* tasks: fix typo                                     Martin Czygan  2021-07-09  1      -1/+1
|
* tasks: use uncompressed stream                      Martin Czygan  2021-07-09  1      -1/+2
|
* wiki: cleanup redundant check                       Martin Czygan  2021-07-09  1      -1/+1
|
* wiki: tweak whitespace handling                     Martin Czygan  2021-07-09  1      -1/+7
|
* wiki: more aggressive whitespace cleanup            Martin Czygan  2021-07-09  1      -1/+2
|
* wiki: try a bit more cleanup                        Martin Czygan  2021-07-09  1      -1/+5
|
* tasks: wiki, sort by doi in first column            Martin Czygan  2021-07-09  1      -1/+1
|
* wiki: verify doi                                    Martin Czygan  2021-07-09  1      -1/+1
|
* unstructured: cleanup obsolete regex                Martin Czygan  2021-07-09  1      -9/+3
|
* tasks: BrefZipWikiDOI                               Martin Czygan  2021-07-09  1      -1/+8
|
* reduce: wiki doc in column 3                        Martin Czygan  2021-07-09  1      -1/+1
|
* tests: sync verify test data                        Martin Czygan  2021-07-09  6      -0/+176
|
* tasks: wiki stub                                    Martin Czygan  2021-07-09  2      -0/+17
|
* wiki: flip doi and page title column                Martin Czygan  2021-07-09  1      -3/+3
|
* reduce: move batch size                             Martin Czygan  2021-07-09  2      -9/+9
|
* cli: try to always display shiv_root                Martin Czygan  2021-07-08  1      -1/+2
|
* update proposal status                              Martin Czygan  2021-07-08  1      -2/+2
|
* reduce: prepare command line help                   Martin Czygan  2021-07-08  1      -0/+12
|
* note on timings                                     Martin Czygan  2021-07-08  2      -1/+9
|
* update docs                                         Martin Czygan  2021-07-08  1      -3/+3
|
* reduce: set default batch size                      Martin Czygan  2021-07-08  1      -6/+8
|
* simplify imports                                    Martin Czygan  2021-07-08  9      -9/+9
|
* reduce: separate batch calls                        Martin Czygan  2021-07-08  2      -20/+25
|
* fix merge conflict                                  Martin Czygan  2021-07-07  8      -82/+136
|\
| * update docs                                       Martin Czygan  2021-07-07  1      -3/+8
| |
| * skate: no need for alias                          Martin Czygan  2021-07-07  1      -1/+1
| |
| * add WikipediaDOI                                  Martin Czygan  2021-07-07  1      -0/+43
| |
| * do not compress sort tmp files                    Martin Czygan  2021-07-06  1      -21/+21
| |
| * run a parity derivation                           Martin Czygan  2021-07-06  1      -2/+2
| |
| * util: cleanup encoder                             Martin Czygan  2021-07-06  1      -19/+0
| |
| * reduce: remove log line                           Martin Czygan  2021-07-06  1      -1/+0
| |