Diffstat (limited to 'skate')
-rw-r--r-- skate/README.md 15
1 files changed, 9 insertions, 6 deletions
diff --git a/skate/README.md b/skate/README.md
index d3a361c..8e2d7d1 100644
--- a/skate/README.md
+++ b/skate/README.md
@@ -1,15 +1,18 @@
# skate
A small library and suite of command line tools related to generating a
-citation graph.
+[citation graph](https://en.wikipedia.org/wiki/Citation_graph).
-## Why?
+> There is no standard format for the citations in bibliographies, and the
+> record linkage of citations can be a time-consuming and complicated process.
-Python was a bit too slow, even when parallelized, e.g. for generating clusters
-of similar documents or to do verification. An option for the future would be
-to resort to [Cython](https://cython.org/). Parts of
+## Background
+
+Python was a bit too slow, even when parallelized (with GNU parallel), e.g. for
+generating clusters of similar documents or to do verification. An option for
+the future would be to resort to [Cython](https://cython.org/). Parts of
[fuzzycat](https://git.archive.org/webgroup/fuzzycat) has been ported into this
-project for performance.
+project for performance (and we saw a 25x speedup for certain tasks).
![](static/zipkey.png)