| author | Martin Czygan <martin.czygan@gmail.com> | 2021-04-13 20:45:10 +0200 |
|---|---|---|
| committer | Martin Czygan <martin.czygan@gmail.com> | 2021-04-13 20:45:10 +0200 |
| commit | 825c928ff9411b55f6bec5fa11f7771367ae3a24 (patch) | |
| tree | 76cd371ba60b125e83bb76f89c7fedb2e2cab144 | |
| parent | b69d62526d41ce0cc4454cde92514850ad83b035 (diff) | |
update README
| -rw-r--r-- | README.md | 8 |
1 files changed, 8 insertions, 0 deletions
@@ -226,3 +226,11 @@ split and clusters are searched within each batch). TODO(miku): sort out shardin
 [skate](https://github.com/miku/skate) - to go from refs and release dataset to a number of clusters, relating references to releases
 * need to verify, but not the references against each other, only refs againt the release
+
+# Notes on Performance
+
+While running bulk (1B+) clustering and verification, even with parallel,
+fuzzycat got slow. The citation graph project therefore contains a
+reimplementation of `fuzzycat.verify` and related functions in Go, which in
+this case is an order of magnitude faster. See:
+[skate](https://git.archive.org/martin/cgraph/-/tree/master/skate).
