author | Martin Czygan <martin.czygan@gmail.com> | 2020-08-27 17:26:33 +0200 |
---|---|---|
committer | Martin Czygan <martin.czygan@gmail.com> | 2020-08-27 17:26:33 +0200 |
commit | 4815943600cfeb7ad4a50f48a21b59df4c369b7c (patch) | |
tree | c0dc2878e48dfa0fba33fd07a464cd4b664444b9 | |
parent | 4ab53ddfeef8fa99f5cf507f582c224a32e4c8b9 (diff) | |
download | fuzzycat-4815943600cfeb7ad4a50f48a21b59df4c369b7c.tar.gz fuzzycat-4815943600cfeb7ad4a50f48a21b59df4c369b7c.zip |
README: add performance data point
-rw-r--r-- | README.md | 16 |
-rw-r--r-- | projects/grobid_refs/README.md | 6 |
2 files changed, 22 insertions, 0 deletions
@@ -26,3 +26,19 @@ specific code using the fatcat openapi client.
 ## Matching approaches
 
+
+## Performance data point
+
+Candidate generation via elasticsearch, 40 parallel queries, sustained speed at
+about 17857 queries per hour, that is around 5 queries/s.
+
+```
+$ time cat ~/data/researchgate/x04 | \
+    parallel -j40 --pipe -N 1 ./fatcatx_rg_unmatched.py - \
+    > ~/data/researchgate/x04_results.ndj
+...
+real 3409m16.442s
+user 29177m5.516s
+sys 4927m3.277s
+```
+
diff --git a/projects/grobid_refs/README.md b/projects/grobid_refs/README.md
new file mode 100644
index 0000000..13ca3fc
--- /dev/null
+++ b/projects/grobid_refs/README.md
@@ -0,0 +1,6 @@
+# Grobid refs
+
+References extracted from [grobid](https://grobid.readthedocs.io).
+
+Example grobid output: [grobid.tei.xml](grobid.tei.xml).
+
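
The arithmetic behind the performance data point in the README hunk can be checked with a short sketch. It is not part of the commit. It assumes the quoted rate of 17857 queries per hour held for the whole run, and the helper name `parse_time_field` is hypothetical. Under that assumption the rate works out to about 4.96 queries/s over roughly 56.8 hours of wall time, and the ratio of CPU time to wall time suggests about 10 cores were busy on average.

```python
import re


def parse_time_field(value: str) -> float:
    """Convert a bash `time` value such as '3409m16.442s' to seconds.

    Hypothetical helper for illustration; not part of fuzzycat.
    """
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unexpected time format: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Figures copied from the README hunk.
QUERIES_PER_HOUR = 17857
real_s = parse_time_field("3409m16.442s")
user_s = parse_time_field("29177m5.516s")
sys_s = parse_time_field("4927m3.277s")

# Rate conversion: queries per hour to queries per second.
queries_per_second = QUERIES_PER_HOUR / 3600

# Wall-clock duration of the run in hours.
wall_hours = real_s / 3600

# Assumption: the quoted rate held for the entire run, so this is an
# implied total, not a number reported in the commit.
implied_queries = QUERIES_PER_HOUR * wall_hours

# Average number of busy cores: CPU time (user + sys) over wall time.
cpu_parallelism = (user_s + sys_s) / real_s

print(f"{queries_per_second:.2f} queries/s")
print(f"{wall_hours:.1f} h wall time")
print(f"{implied_queries:.0f} queries implied")
print(f"{cpu_parallelism:.1f} cores busy on average")
```

With `-j40`, an average of about 10 busy cores would be consistent with workers spending much of their time waiting on elasticsearch responses, though the commit itself does not say so.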