From 87d0fd90205bc9e5ed7d849e801f6ef2ca5c077e Mon Sep 17 00:00:00 2001
From: Martin Czygan
Date: Fri, 4 Sep 2020 18:30:55 +0200
Subject: note on approach

---
 README.md | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/README.md b/README.md
index 9e413af..d23d00f 100644
--- a/README.md
+++ b/README.md
@@ -17,6 +17,16 @@ The goal is to group releases under works and to implement a versions feature.
 This repository contains both generic code for matching as well as fatcat
 specific code using the fatcat openapi client.
 
+## Approach
+
+There are probably a few assumptions we can make:
+
+* If two strings are given, an exact string match does not mean equality (at
+  all), e.g. "Acta geographica" currently has eight associated ISSNs, and a
+  title like "Buchbesprechungen" appears many hundreds of times.
+* ...
+* ...
+
 ## Datasets
 
 * release and container metadata from: [https://archive.org/details/fatcat_bulk_exports_2020-08-05](https://archive.org/details/fatcat_bulk_exports_2020-08-05).
-- 
cgit v1.2.3
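
The first assumption in the added "Approach" section (an exact title match does not imply the same container) can be illustrated with a minimal sketch. Everything here is hypothetical: the record layout, the helper names `issns_by_title` and `is_ambiguous`, and the ISSN values, which are placeholders rather than real identifiers. It is not code from this repository.

```python
from collections import defaultdict

# Hypothetical container records as (title, ISSN) pairs; the ISSNs are
# placeholders, not real identifiers.
RECORDS = [
    ("Acta geographica", "0000-0001"),
    ("Acta geographica", "0000-0002"),
    ("Acta Geographica", "0000-0003"),
    ("Journal of Examples", "0000-0004"),
]


def normalize(title):
    """Collapse whitespace and case-fold a title for comparison."""
    return " ".join(title.split()).casefold()


def issns_by_title(records):
    """Group the ISSNs seen for each normalized title."""
    groups = defaultdict(set)
    for title, issn in records:
        groups[normalize(title)].add(issn)
    return groups


def is_ambiguous(title, groups):
    """Return True if the title is associated with more than one ISSN."""
    return len(groups.get(normalize(title), ())) > 1


groups = issns_by_title(RECORDS)
print(is_ambiguous("Acta geographica", groups))
print(is_ambiguous("Journal of Examples", groups))
```

A title flagged as ambiguous this way would need further evidence (for example publisher, year, or identifiers) before two records are treated as the same thing.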