path: root/TODO.md
author Martin Czygan <martin.czygan@gmail.com> 2021-11-17 14:51:50 +0100
committer Martin Czygan <martin.czygan@gmail.com> 2021-12-06 19:53:30 +0100
commit dd6149140542585f2b0bfc3b334ec2b0a88b790e
tree 6a11c228558cfbf73932bc828cda9be3735cfd78 /TODO.md
parent d104f8d0ba8eef5563555de82be66bbf17f961db
complete FuzzyReleaseMatcher refactoring

We keep the name, since the api - "matcher.match(release)" - is the same.

* simplified queries; at most one query is performed against elasticsearch
* parallel release retrieval from the API
* optional support for release year windows

Test cases are expressed in yaml and will be auto-loaded from the specified
directory. Tests run against the current search endpoint, which means the
actual output may change on index updates; for the moment, we think this
setup is relatively simple and not too unstable.

    about: title contrib, partial name
    input: >
      { "contribs": [ { "raw_name": "Adams" } ], "title": "digital libraries", "ext_ids": {} }
    release_year_padding: 1
    expected:
      - 7rmvqtrb2jdyhcxxodihzzcugy
      - a2u6ougtsjcbvczou6sazsulcm
      - dy45vilej5diros6zmax46nm4e
      - exuwhhayird4fdjmmsiqpponlq
      - gqrj7jikezgcfpjfazhpf4e7c4
      - mkmqt3453relbpuyktnmsg6hjq
      - t2g5sl3dgzchtnq7dynxyzje44
      - t4tvenhrvzamraxrvvxivxmvga
      - wd3oeoi3bffknfbg2ymleqc4ja
      - y63a6dhrfnb7bltlxfynydbojy

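
The "release year windows" mentioned in the commit message could work roughly as sketched below. This is only an illustration, not the fuzzycat implementation: the function name `in_year_window` is hypothetical, and it assumes that a `release_year_padding` of n accepts candidates whose year lies within n years of the query year, and that a missing year on either side disables the filter.

```python
from typing import Optional


def in_year_window(candidate_year: Optional[int],
                   query_year: Optional[int],
                   padding: int = 1) -> bool:
    """Return True if the candidate year falls inside the padded window.

    Hypothetical helper: assumes the window is
    [query_year - padding, query_year + padding], inclusive.
    """
    # Assumption: without a year on both sides there is nothing to filter on.
    if candidate_year is None or query_year is None:
        return True
    return abs(candidate_year - query_year) <= padding


if __name__ == "__main__":
    print(in_year_window(2001, 2000, padding=1))
    print(in_year_window(2003, 2000, padding=1))
```

Treating a missing year as a pass keeps the window optional, which matches the "optional support" wording; a stricter matcher might reject such candidates instead.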
Diffstat (limited to 'TODO.md')
-rw-r--r-- TODO.md | 5
1 files changed, 5 insertions, 0 deletions
diff --git a/TODO.md b/TODO.md
index d9d8b02..414c972 100644
--- a/TODO.md
+++ b/TODO.md
@@ -1,5 +1,10 @@
# TODO
+* [ ] match release with fewer requests (or do them in parallel)
+* [ ] de-clobber verify
+
+----
+
* [ ] clustering should be broken up, e.g. into "map" and "sort"
* [x] match release should be a class
* [x] match release fuzzy should work not just with title
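
The new TODO item on matching with fewer or parallel requests can be illustrated with a thread pool from the Python standard library. This is a minimal sketch under stated assumptions, not the code from this commit: `fetch_releases` and `fake_fetch` are hypothetical names, and the per-release fetch function is passed in as a parameter so the example does not depend on any real API client.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def fetch_releases(idents: List[str],
                   fetch_one: Callable[[str], Dict],
                   max_workers: int = 8) -> List[Dict]:
    """Fetch one release per identifier, running the requests in parallel.

    `fetch_one` is a caller-supplied function (hypothetical here) that
    retrieves a single release; results are returned in input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves the order of the input identifiers.
        return list(pool.map(fetch_one, idents))


def fake_fetch(ident: str) -> Dict:
    """Stand-in for a real API call, used only for this example."""
    return {"ident": ident}


if __name__ == "__main__":
    releases = fetch_releases(["7rmvqtrb2jdyhcxxodihzzcugy",
                               "a2u6ougtsjcbvczou6sazsulcm"], fake_fetch)
    print([r["ident"] for r in releases])
```

Threads are a reasonable fit when the work is dominated by waiting on HTTP responses; an exception raised by any single fetch propagates when its result is consumed, so a real matcher would likely want per-request error handling.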