author Martin Czygan <martin.czygan@gmail.com> 2020-12-18 03:12:05 +0100
committer Martin Czygan <martin.czygan@gmail.com> 2020-12-18 03:12:05 +0100
commit 5bd9eba35a9697e0cf2ac4b53d99a0112d038803
tree cc6bb7ae4f45709d04ed1db8c3d85322a9ef9f4f
parent e4b37ea5bf0e3b2294f6f996c42e844524e2c0f2
link to sort
-rw-r--r-- README.md 2
1 files changed, 1 insertions, 1 deletions
diff --git a/README.md b/README.md
index 47bf25e..8547d45 100644
--- a/README.md
+++ b/README.md
@@ -34,7 +34,7 @@ $ python -m fuzzycat cluster -t tsandcrawler < data/re.json > cluster.json.zst
Clustering works in a three step process:
1. key extraction for each document (choose algorithm)
-2. sorting by keys (via GNU sort)
+2. sorting by keys (via [GNU sort](https://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html))
3. group by key and write out ([itertools.groupby](https://docs.python.org/3/library/itertools.html#itertools.groupby))
### Verification
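
The three-step clustering process described in the README hunk above can be sketched roughly as follows. This is a minimal in-memory illustration, not fuzzycat's actual implementation: `title_key` and `cluster` are hypothetical names, the key function is an assumed stand-in for whichever algorithm is chosen, and Python's built-in sort replaces the external GNU sort step that the README mentions.

```python
import itertools
import json
import re


def title_key(doc):
    """Hypothetical key function: lowercased title, non-alphanumerics removed."""
    return re.sub(r"[^a-z0-9]", "", doc.get("title", "").lower())


def cluster(docs, key_func=title_key):
    """Yield one cluster per distinct key (assumed output shape, for illustration)."""
    # Step 1: key extraction for each document.
    keyed = [(key_func(doc), doc) for doc in docs]
    # Step 2: sorting by keys (in-memory stand-in for an external GNU sort).
    keyed.sort(key=lambda pair: pair[0])
    # Step 3: group adjacent entries that share a key and write them out.
    for key, group in itertools.groupby(keyed, key=lambda pair: pair[0]):
        yield {"k": key, "v": [doc for _, doc in group]}


docs = [
    {"id": 1, "title": "Fuzzy Matching"},
    {"id": 2, "title": "Another Paper"},
    {"id": 3, "title": "fuzzy matching!"},
]
clusters = list(cluster(docs))
for c in clusters:
    print(json.dumps(c))
```

Sorting before grouping matters because `itertools.groupby` only merges adjacent items with equal keys; an external sort serves the same purpose when the input is too large to hold in memory.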