author | Martin Czygan <martin.czygan@gmail.com> | 2020-08-12 15:05:51 +0200 |
---|---|---|
committer | Martin Czygan <martin.czygan@gmail.com> | 2020-08-12 15:05:51 +0200 |
commit | 0b4db31a797a582c25942e693d531ee37b618674 (patch) | |
tree | 3363c9ba42e711234a911931e65ac184520892a3 /notebooks/Journal_Names.html | |
parent | 703fdbebc53352036bfa9e9a13599421e38d949e (diff) | |
note on optimization: marisa-trie
Currently, the JSON mapping is 172M; turning it into a dict takes a
while and consumes GBs of memory. For exact lookups, we might want to
use marisa-trie:
> String data in a MARISA-trie may take up to 50x-100x less memory than
in a standard Python dict; the raw lookup speed is comparable; trie also
provides fast advanced methods like prefix search.
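
A minimal sketch of what such an exact lookup could look like, not the actual fuzzycat implementation. It assumes the mapping has the shape `{name: [value, ...]}` (the real structure of the JSON file is not shown here); `build_lookup` and `lookup` are hypothetical helper names, and the sample values are made up. `marisa_trie.BytesTrie` stores one or more byte strings per key, which fits a name that maps to several values. The import is guarded so the sketch still runs, with a plain dict, when marisa-trie is not installed.

```python
try:
    import marisa_trie  # third-party: pip install marisa-trie
except ImportError:
    marisa_trie = None  # fall back to a plain dict so the sketch still runs


def build_lookup(mapping):
    """Build an exact-match index from an assumed {name: [value, ...]} mapping."""
    if marisa_trie is None:
        # Fallback only; this has none of the memory savings of the trie.
        return {name: [v.encode("utf-8") for v in values]
                for name, values in mapping.items()}
    # BytesTrie takes (unicode key, bytes value) pairs; a key may repeat.
    pairs = [(name, value.encode("utf-8"))
             for name, values in mapping.items()
             for value in values]
    return marisa_trie.BytesTrie(pairs)


def lookup(index, name):
    """Return the sorted values stored for an exact name, or an empty list."""
    if name not in index:
        return []
    return sorted(v.decode("utf-8") for v in index[name])


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = {
        "journal of examples": ["0000-0001", "0000-0002"],
        "annals of placeholders": ["0000-0003"],
    }
    index = build_lookup(sample)
    print(lookup(index, "journal of examples"))
    print(lookup(index, "no such journal"))
```

With marisa-trie installed, the built trie can also be written to disk with `index.save(path)` and read back with `marisa_trie.BytesTrie().load(path)`, which would avoid re-parsing the JSON on every start.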
Diffstat (limited to 'notebooks/Journal_Names.html')
0 files changed, 0 insertions, 0 deletions