fuzzycat

	Commit message	Author	Age	Files	Lines
*	update notes on cluster, nb	Martin Czygan	2020-10-22	1	-1/+47
*	update notes on clustering	Martin Czygan	2020-10-22	1	-0/+18
*	update cluster notes	Martin Czygan	2020-10-22	1	-0/+27
*	notes: clustering	Martin Czygan	2020-10-22	1	-0/+11
*	cluster variants	Martin Czygan	2020-10-21	1	-0/+54
*	update various docs; start data issue log	Martin Czygan	2020-09-03	2	-1/+1
*	add notes on abbrevs	Martin Czygan	2020-08-15	2	-0/+2260
*	update plan	Martin Czygan	2020-08-14	1	-0/+5
*	note on optimization: marisa-trie	Martin Czygan	2020-08-12	1	-0/+1
	Currently, the JSON mapping is 172M; turning it into a dict takes a while and consumes GBs of memory. For exact lookups, we might want to use marisa-trie:
	> String data in a MARISA-trie may take up to 50x-100x less memory than in a standard Python dict; the raw lookup speed is comparable; trie also provides fast advanced methods like prefix search.
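	A minimal sketch of the idea in the note above, not code from this repository: it assumes the JSON mapping is a flat object of string keys to string values, and the sample keys, `build_index`, and `lookup` are hypothetical names chosen for illustration. It uses `marisa_trie.BytesTrie` when the third-party package is installed and falls back to a plain dict (with no memory savings) otherwise.

```python
import json

try:
    import marisa_trie  # third-party package: pip install marisa-trie
except ImportError:  # keep the sketch runnable without the dependency
    marisa_trie = None


def build_index(mapping):
    """Build an exact-lookup index from a dict of str keys to str values."""
    if marisa_trie is None:
        return dict(mapping)  # fallback: ordinary dict, no memory savings
    # BytesTrie stores unicode keys with bytes payloads in a compact trie.
    pairs = [(key, value.encode("utf-8")) for key, value in mapping.items()]
    return marisa_trie.BytesTrie(pairs)


def lookup(index, key, default=None):
    """Return the value stored under key, or default if the key is absent."""
    if key not in index:
        return default
    value = index[key]
    if isinstance(value, list):  # BytesTrie returns a list of bytes payloads
        return value[0].decode("utf-8")
    return value


# Hypothetical stand-in for the real 172M JSON mapping.
raw = '{"phys rev lett": "Physical Review Letters", "j chem phys": "The Journal of Chemical Physics"}'
mapping = json.loads(raw)
index = build_index(mapping)

print(lookup(index, "phys rev lett"))
print(lookup(index, "no such key"))
```

	With the real data one would presumably build the trie once, persist it with `trie.save(path)`, and reload it on later runs instead of parsing the JSON each time; whether that pays off for this mapping is untested here.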
*	add notes/todo	Martin Czygan	2020-08-12	1	-0/+17