author | Bryan Newbold <bnewbold@robocracy.org> | 2023-01-04 19:55:30 -0800 |
---|---|---|
committer | Bryan Newbold <bnewbold@robocracy.org> | 2023-01-04 20:18:25 -0800 |
commit | 276ac2aa24166660bc6ffe7601cee44b5d848dae (patch) | |
tree | 8a35ce06e7ab9e6755b24abc41dee1115cf62788 /proposals/20190911_search_query_parsing.md | |
parent | ee46c33544941a5104182a2e221e841a32cbbf78 (diff) | |
proposals: update status; add some old ones; consistent file names
Diffstat (limited to 'proposals/20190911_search_query_parsing.md')
-rw-r--r-- | proposals/20190911_search_query_parsing.md | 28 |
1 files changed, 0 insertions, 28 deletions
diff --git a/proposals/20190911_search_query_parsing.md b/proposals/20190911_search_query_parsing.md
deleted file mode 100644
index f1fb0128..00000000
--- a/proposals/20190911_search_query_parsing.md
+++ /dev/null
@@ -1,28 +0,0 @@
-
-Status: brainstorm
-
-## Search Query Parsing
-
-The default "release" search on fatcat.wiki currently uses the elasticsearch
-built-in `query_string` parser, which is explicitly not recommended for
-public/production use.
-
-The best way forward is likely a custom query parser (eg, PEG-generated parser)
-that generates a complete elasticsearch query JSON structure.
-
-A couple search issues this would help with:
-
-- better parsing of keywords (year, year-range, DOI, ISSN, etc) in complex
-  queries and turning these in to keyword term sub-queries
-- queries including terms from multiple fields which aren't explicitly tagged
-  (eg, "lovelace computer" vs. "author:lovelace title:computer")
-- avoiding unsustainably expensive queries (eg, prefix wildcard, regex)
-- handling single-character mispellings and synonyms
-- collapsing multiple releases under the same work in search results
-
-In the near future, we may also create a fulltext search index, which will have
-it's own issues.
-
-## Tech Changes
-
-If we haven't already, should also switch to using elasticsearch client library.
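
The deleted proposal suggests a custom parser that emits a complete Elasticsearch query JSON structure instead of passing user input to `query_string`. A minimal sketch of that idea in Python follows. It is not fatcat's implementation: it uses plain token splitting rather than a PEG grammar, and the names `parse_query`, `TAG_TO_FIELD`, `FREE_TEXT_FIELDS`, `contrib_names`, and `release_year` are assumptions made for illustration. ISSN detection is left out because an ISSN such as `1234-5678` has the same shape as a year range and would need extra disambiguation.

```python
import re

# Hypothetical mapping from user-facing tags to index field names;
# the real index schema may use different names.
TAG_TO_FIELD = {"author": "contrib_names", "title": "title"}
FREE_TEXT_FIELDS = ["title", "contrib_names"]  # assumed field names

YEAR_RE = re.compile(r"^(1[5-9]\d\d|20\d\d)$")
YEAR_RANGE_RE = re.compile(r"^(\d{4})-(\d{4})$")
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")


def parse_query(raw: str) -> dict:
    """Turn a raw user query into an Elasticsearch bool query body."""
    must = []       # scored full-text clauses
    filters = []    # exact keyword and range clauses (not scored)
    free_text = []  # untagged words, searched across several fields

    for token in raw.split():
        tag, sep, value = token.partition(":")
        range_match = YEAR_RANGE_RE.match(token)
        if sep and value and tag in TAG_TO_FIELD:
            # Explicitly tagged term, eg "author:lovelace"
            must.append({"match": {TAG_TO_FIELD[tag]: value}})
        elif YEAR_RE.match(token):
            filters.append({"term": {"release_year": int(token)}})
        elif range_match:
            low, high = int(range_match.group(1)), int(range_match.group(2))
            filters.append({"range": {"release_year": {"gte": low, "lte": high}}})
        elif DOI_RE.match(token):
            # Assumes DOIs are indexed lowercased in a keyword field
            filters.append({"term": {"doi": token.lower()}})
        else:
            free_text.append(token)

    if free_text:
        # Untagged words may come from different fields ("lovelace computer")
        must.append({
            "multi_match": {
                "query": " ".join(free_text),
                "fields": FREE_TEXT_FIELDS,
                "type": "cross_fields",
                "operator": "and",
            }
        })
    if not must:
        must.append({"match_all": {}})

    return {"query": {"bool": {"must": must, "filter": filters}}}
```

For example, `parse_query("lovelace computer 1843")` produces one `cross_fields` clause for the two untagged words and one `term` filter on the year. Because the input only ever reaches `match`, `multi_match`, `term`, and `range` clauses, wildcard or regex syntax typed by a user is never interpreted as a query operator in this sketch.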