-rw-r--r-- README.md 16
-rw-r--r-- projects/grobid_refs/README.md 6
2 files changed, 22 insertions, 0 deletions
diff --git a/README.md b/README.md
index 2a8721e..7c6468d 100644
--- a/README.md
+++ b/README.md
@@ -26,3 +26,19 @@ specific code using the fatcat openapi client.
## Matching approaches
![](static/approach.png)
+
+## Performance data point
+
+Candidate generation via Elasticsearch with 40 parallel queries sustained
+about 17857 queries per hour, or roughly 5 queries/s.
+
+```
+$ time cat ~/data/researchgate/x04 | \
+ parallel -j40 --pipe -N 1 ./fatcatx_rg_unmatched.py - \
+ > ~/data/researchgate/x04_results.ndj
+...
+real 3409m16.442s
+user 29177m5.516s
+sys 4927m3.277s
+```
+
diff --git a/projects/grobid_refs/README.md b/projects/grobid_refs/README.md
new file mode 100644
index 0000000..13ca3fc
--- /dev/null
+++ b/projects/grobid_refs/README.md
@@ -0,0 +1,6 @@
+# Grobid refs
+
+References extracted from [grobid](https://grobid.readthedocs.io).
+
+Example grobid output: [grobid.tei.xml](grobid.tei.xml).
+
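
The arithmetic behind the performance data point in `README.md` can be checked with a short script. This is only a sketch: it takes the quoted rate (17857 queries per hour) and the three `time` figures from the diff as given. The helper `parse_time` and the "average busy cores" figure are illustrative additions, not part of the repository.

```python
import re


def parse_time(value: str) -> float:
    """Convert a bash `time` duration such as '3409m16.442s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {value!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Figures copied from the README hunk above.
QUERIES_PER_HOUR = 17857
real = parse_time("3409m16.442s")
user = parse_time("29177m5.516s")
sys_time = parse_time("4927m3.277s")

# Hourly rate expressed per second.
queries_per_second = QUERIES_PER_HOUR / 3600

# Wall-clock duration of the whole run in hours.
wall_hours = real / 3600

# CPU time divided by wall time: a rough estimate (an assumption of this
# sketch) of how many cores were busy on average across the 40 workers.
avg_busy_cores = (user + sys_time) / real

print(f"{queries_per_second:.2f} queries/s")
print(f"{wall_hours:.1f} hours wall clock")
print(f"{avg_busy_cores:.1f} cores busy on average")
```

The per-second rate comes out just under 5, which matches the "around 5 queries/s" stated in the README.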