path: root/notes
  Commit message                             Author         Age         Files  Lines
* update notes on cluster, nb                Martin Czygan  2020-10-22  1      -1/+47
* update notes on clustering                 Martin Czygan  2020-10-22  1      -0/+18
* update cluster notes                       Martin Czygan  2020-10-22  1      -0/+27
* notes: clustering                          Martin Czygan  2020-10-22  1      -0/+11
* cluster variants                           Martin Czygan  2020-10-21  1      -0/+54
* update various docs; start data issue log  Martin Czygan  2020-09-03  2      -1/+1
* add notes on abbrevs                       Martin Czygan  2020-08-15  2      -0/+2260
* update plan                                Martin Czygan  2020-08-14  1      -0/+5
* note on optimization: marisa-trie          Martin Czygan  2020-08-12  1      -0/+1

    Currently, the JSON mapping is 172M; turning this into a dict takes a
    while and consumes GBs of memory. For exact lookups, we might want to
    use marisa-trie:

    > String data in a MARISA-trie may take up to 50x-100x less memory than
    > in a standard Python dict; the raw lookup speed is comparable; trie
    > also provides fast advanced methods like prefix search.

* add notes/todo                             Martin Czygan  2020-08-12  1      -0/+17
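
The marisa-trie commit above suggests replacing the in-memory dict with a trie for exact lookups. A minimal sketch of that idea follows. It is not taken from the repository: it assumes the JSON mapping is a flat object of string keys to string values, and the names build_lookup and lookup as well as the sample data are hypothetical. It uses marisa_trie.BytesTrie from the third-party marisa-trie package and falls back to a plain dict when the package is not installed, so the sketch stays runnable either way.

```python
import json

try:
    import marisa_trie  # third-party package: pip install marisa-trie
except ImportError:
    marisa_trie = None


def build_lookup(mapping):
    """Build an exact-match index from a str -> str mapping (assumed shape)."""
    if marisa_trie is None:
        # Fallback only: a dict has none of the memory savings of a trie.
        return dict(mapping)
    # BytesTrie stores bytes payloads, so values are encoded as UTF-8.
    return marisa_trie.BytesTrie(
        (key, value.encode("utf-8")) for key, value in mapping.items()
    )


def lookup(index, key, default=None):
    """Return the value stored for key, or default if the key is absent."""
    if key not in index:
        return default
    value = index[key]
    if isinstance(value, list):
        # BytesTrie returns a list, since one key may carry several payloads.
        return value[0].decode("utf-8")
    return value


if __name__ == "__main__":
    # Made-up sample standing in for the real 172M JSON mapping.
    sample = json.loads('{"key-a": "value-a", "key-b": "value-b"}')
    index = build_lookup(sample)
    print(lookup(index, "key-a"))
    print(lookup(index, "key-c", default="(not found)"))
```

If the real mapping has non-string values, they would need a different encoding (for example JSON-encoded bytes) before being stored in the trie. A built trie can also be written to disk with its save method and loaded later, which avoids re-parsing the JSON on every start.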