Diffstat (limited to 'README.md')
-rw-r--r-- README.md 20
1 files changed, 19 insertions, 1 deletions
diff --git a/README.md b/README.md
index 7d6e5cb..3a543a3 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,20 @@
# fcfuzzy
-Fuzzy matching publications for fatcat.
+
+Fuzzy matching publications for [fatcat](https://fatcat.wiki).
+
+## Motivation
+
+Most of the results on sites like [Google
+Scholar](https://scholar.google.com/scholar?q=fuzzy+matching) group
+publications into clusters. Each cluster represents one publication, abstracted
+from its concrete representation as a link to a PDF.
+
+We call the abstract publication a *work* and the concrete instance a *release*.
+The goal is to group releases under works and to implement a versions feature.
+
+This repository contains both generic matching code and fatcat-specific code
+that uses the fatcat OpenAPI client.
+
+## Dataset
+
+Release metadata is taken from [https://archive.org/details/fatcat_bulk_exports_2020-08-05](https://archive.org/details/fatcat_bulk_exports_2020-08-05).