initial notes on crude query parsing

author: Bryan Newbold <bnewbold@archive.org> 2021-01-18 19:52:37 -0800
committer: Bryan Newbold <bnewbold@archive.org> 2021-01-19 19:49:04 -0800
commit: 0adb490ae2ba8f961bac559a981f89d6d264af60 (patch)
tree: 59e916cbd380c0b6c4d9e17c9fbc6353dd744a91 /proposals/2021_crude_query_parse.md
parent: 78ad484db9d7deb09410e49407cd036cdc9363d2 (diff)
1 files changed, 18 insertions, 0 deletions
diff --git a/proposals/2021_crude_query_parse.md b/proposals/2021_crude_query_parse.md
new file mode 100644
index 0000000..2a7663b
--- /dev/null
+++ b/proposals/2021_crude_query_parse.md
@@ -0,0 +1,18 @@
+
+
+Thinking of simple ways to reduce query parse errors and handle more queries as
+expected. In particular:
+
+- handle slashes in query tokens (e.g., "N/A" without quotes)
+- handle colons in queries, when they are not intended as filters
+- if a query "looks like" a raw citation string, detect that and parse the
+  citation into a structured format, then do a query or fuzzy lookup from there
+
+
+## Questions/Thoughts
+
+Should we detect title lookups in addition to full citation lookups? Probably
+too complicated.
+
+Should we keep a static list of colon-prefixes, or load them from the schema
+mapping file itself?
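
A minimal sketch of the pre-processing these notes describe, assuming the static-list answer to the colon-prefix question. Nothing here is taken from the fatcat-scholar code: `KNOWN_PREFIXES`, `sanitize_query`, and `looks_like_citation` are hypothetical names, the prefix list is a placeholder, and the citation check is a rough heuristic rather than a real citation parser.

```python
import re

# Hypothetical static list of filter prefixes. The real set would come from
# the search schema; the notes leave open whether to hard-code or load it.
KNOWN_PREFIXES = {"title", "author", "year", "doi"}


def _escape(text: str) -> str:
    """Backslash-escape slashes and colons so they are treated as literals."""
    return text.replace("/", "\\/").replace(":", "\\:")


def sanitize_query(raw: str) -> str:
    """Escape slashes and non-filter colons, token by token."""
    out = []
    # Standalone quoted phrases are kept whole; everything else is split on
    # whitespace.
    for token in re.findall(r'"[^"]*"|\S+', raw):
        if token.startswith('"'):
            out.append(token)
            continue
        prefix, sep, rest = token.partition(":")
        if sep and rest and prefix in KNOWN_PREFIXES:
            # Looks like an intended filter: keep the first colon as-is.
            out.append(prefix + ":" + _escape(rest))
        else:
            out.append(_escape(token))
    return " ".join(out)


def looks_like_citation(raw: str) -> bool:
    """Crude guess: a plausible year, several separators, and enough words."""
    has_year = re.search(r"\b(1[5-9]\d\d|20\d\d)\b", raw) is not None
    separators = len(re.findall(r"[,.;]", raw))
    return has_year and separators >= 3 and len(raw.split()) >= 6
```

With this sketch, `sanitize_query("N/A")` escapes the slash, while `year:2020` passes through untouched because `year` is in the placeholder prefix list. A query flagged by `looks_like_citation` would then be handed to an actual citation parser, which is outside the scope of this example.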