author | Martin Czygan <martin.czygan@gmail.com> | 2020-09-04 18:30:55 +0200 |
---|---|---|
committer | Martin Czygan <martin.czygan@gmail.com> | 2020-09-04 18:30:55 +0200 |
commit | 87d0fd90205bc9e5ed7d849e801f6ef2ca5c077e (patch) | |
tree | 0751c03d8b7d31deb7e9e2f392316df7641425b3 | |
parent | e8fdd47282c987637ecb4a6f7fd7518cca12b8d9 (diff) | |
download | fuzzycat-87d0fd90205bc9e5ed7d849e801f6ef2ca5c077e.tar.gz fuzzycat-87d0fd90205bc9e5ed7d849e801f6ef2ca5c077e.zip |
note on approach
-rw-r--r-- | README.md | 10 |
1 files changed, 10 insertions, 0 deletions
@@ -17,6 +17,16 @@ The goal is to group releases under works and to implement a versions feature.
 This repository contains both generic code for matching as well as fatcat
 specific code using the fatcat openapi client.
 
+## Approach
+
+There are probably a few assumption we can make:
+
+* If two strings are given, an exact string match does not mean equality (at
+  all), e.g. "Acta geographica" has currently eight associated ISSN, and a
+title like "Buchbesprechungen" appears many hundreds of times.
+* ...
+* ...
+
 ## Datasets
 
 * release and container metadata from: [https://archive.org/details/fatcat_bulk_exports_2020-08-05](https://archive.org/details/fatcat_bulk_exports_2020-08-05).
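
The first assumption in the added "Approach" section (an exact title match does not imply the same entity) can be illustrated with a minimal sketch. It is not code from this repository: the function names, the `(title, issn)` record shape, and the sample identifiers are all hypothetical placeholders chosen for illustration, and the normalization (strip and lowercase) is an assumption as well.

```python
from collections import defaultdict


def group_issns_by_title(records):
    """Map each normalized title to the set of distinct identifiers seen with it.

    `records` is an iterable of (title, issn) pairs; this shape is an
    assumption for the sketch, not the fatcat export schema.
    """
    groups = defaultdict(set)
    for title, issn in records:
        # Assumed normalization: trim whitespace and lowercase.
        groups[title.strip().lower()].add(issn)
    return dict(groups)


def ambiguous_titles(records):
    """Return only the titles that are associated with more than one identifier."""
    grouped = group_issns_by_title(records)
    return {title: ids for title, ids in grouped.items() if len(ids) > 1}


# Hypothetical sample data; the identifiers are placeholders, not real ISSNs.
SAMPLE = [
    ("Acta geographica", "issn-placeholder-1"),
    ("Acta geographica", "issn-placeholder-2"),
    ("Acta Geographica ", "issn-placeholder-3"),
    ("Some Unique Journal", "issn-placeholder-4"),
]

if __name__ == "__main__":
    for title, ids in sorted(ambiguous_titles(SAMPLE).items()):
        print(title, len(ids))
```

A title that appears in the result of `ambiguous_titles` cannot be used on its own as a match key, which is the point the note makes about "Acta geographica" and "Buchbesprechungen".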