author Martin Czygan <martin.czygan@gmail.com> 2021-09-13 18:30:30 +0200
committer Martin Czygan <martin.czygan@gmail.com> 2021-09-13 18:31:23 +0200
commit 14068f0c743fa558a0303b2c04775d8baedeba4c
tree da35dc95eb378f989a5060054a45e5d937f05bed
parent 9a7465c5c402a2ddad0abc15015c61a1a76d6485
update README
-rw-r--r-- README.md 11
1 file changed, 7 insertions, 4 deletions
diff --git a/README.md b/README.md
index c8afc68..67dfe15 100644
--- a/README.md
+++ b/README.md
@@ -19,7 +19,8 @@ records, and others are designed to work "online" making queries against hosted
web services and catalogs.
`fuzzycat` was originally developed by Martin Czygan at the Internet Archive,
-and is used in the construction of a citation graph and to identify duplicate
+and is used in the construction of a [citation
+graph](https://gitlab.com/internetarchive/refcat) and to identify duplicate
records in the [fatcat.wiki](https://fatcat.wiki) catalog and
[scholar.archive.org](https://scholar.archive.org) search index.
@@ -73,9 +74,11 @@ A CLI tool is included for processing records in UNIX stdin/stdout pipelines:
## Features and Use-Cases
-The **`refcat`** system builds on top of this library to build a citation graph
-by processing billions of structured and unstructured reference records
-extracted from scholarly papers.
+The [refcat project](https://gitlab.com/internetarchive/refcat) builds on top
+of this library to build a citation graph by processing billions of structured
+and unstructured reference records extracted from scholarly papers (note: for
+performance-critical parts, some code has been ported to Go, although the test
+suite is shared between the Python and Go implementations).
Automated imports of metadata records into the fatcat catalog use fuzzycat to
filter new metadata which look like duplicates of existing records from other