author: Bryan Newbold <bnewbold@archive.org> 2021-01-18 19:52:37 -0800
committer: Bryan Newbold <bnewbold@archive.org> 2021-01-19 19:49:04 -0800
commit: 0adb490ae2ba8f961bac559a981f89d6d264af60
tree: 59e916cbd380c0b6c4d9e17c9fbc6353dd744a91 /proposals/2021_crude_query_parse.md
parent: 78ad484db9d7deb09410e49407cd036cdc9363d2
initial notes on crude query parsing
Diffstat (limited to 'proposals/2021_crude_query_parse.md')
-rw-r--r-- proposals/2021_crude_query_parse.md | 18
1 files changed, 18 insertions, 0 deletions
diff --git a/proposals/2021_crude_query_parse.md b/proposals/2021_crude_query_parse.md
new file mode 100644
index 0000000..2a7663b
--- /dev/null
+++ b/proposals/2021_crude_query_parse.md
@@ -0,0 +1,18 @@
+
+
+Thinking of simple ways to reduce query parse errors and handle more queries as
+expected. In particular:
+
+- handle slashes in query tokens (e.g., "N/A" without quotes)
+- handle colons in queries when they are not intended as filter prefixes
+- if the query "looks like" a raw citation string, detect that, parse the
+  citation into a structured format, then do a query or fuzzy lookup from there
+
+
+## Questions/Thoughts
+
+Should we detect title lookups in addition to full citation lookups? Probably
+too complicated.
+
+Do we keep a static list of colon-prefixes, or load them from the schema
+mapping file itself?
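
A rough sketch of how the ideas in this proposal might fit together, assuming a static prefix list rather than one loaded from the schema mapping file. It is not part of the commit. Every name here (`KNOWN_FILTER_PREFIXES`, `clean_query`, `looks_like_citation`) is hypothetical, the prefix list is illustrative only, and the citation check is a guessed heuristic, not a real citation parser.

```python
import re

# Hypothetical static list of filter prefixes (illustrative names only).
# The proposal leaves open whether this should come from the schema mapping.
KNOWN_FILTER_PREFIXES = {"title", "author", "year", "doi"}

# A token is a run of non-space characters; a double-quoted phrase (which may
# contain spaces) counts as part of the token. Stray unbalanced quotes are
# silently dropped by this pattern.
TOKEN_RE = re.compile(r'(?:[^\s"]|"[^"]*")+')

# Assumed heuristics for "looks like a citation": a plausible year plus some
# citation-style marker (et al, pp., vol., volume(issue), or a page range).
YEAR_RE = re.compile(r"\b(?:1[5-9]|20)\d{2}\b")
CITATION_MARKER_RE = re.compile(
    r"\bet al\b|\bpp?\.\s*\d|\bvol\.|\d+\s*\(\d+\)|\d+\s*[-\u2013]\s*\d+",
    re.IGNORECASE,
)


def quote_if_needed(text: str) -> str:
    """Wrap text in double quotes if it contains a slash or a colon."""
    if "/" in text or ":" in text:
        return f'"{text}"'
    return text


def clean_token(token: str) -> str:
    """Quote a token that would otherwise be likely to cause a parse error."""
    if '"' in token:
        # The user already quoted something here; leave it alone.
        return token
    prefix, sep, value = token.partition(":")
    if sep and value and prefix.lower() in KNOWN_FILTER_PREFIXES:
        # Intended filter: keep the prefix, quote only the value if needed.
        return f"{prefix}:{quote_if_needed(value)}"
    return quote_if_needed(token)


def clean_query(raw: str) -> str:
    """Tokenize a raw query and quote any problematic tokens."""
    return " ".join(clean_token(t) for t in TOKEN_RE.findall(raw))


def looks_like_citation(raw: str) -> bool:
    """Crude guess at whether a query is a pasted citation string."""
    if len(raw.split()) < 6:
        return False
    return bool(YEAR_RE.search(raw) and CITATION_MARKER_RE.search(raw))
```

With these assumptions, `clean_query('N/A')` returns `"N/A"` (quoted), `clean_query('foo: bar')` returns `"foo:" bar`, and `clean_query('year:2020 cats')` is left unchanged because `year` is in the prefix list. A query that passes `looks_like_citation` would then be handed to whatever citation parser is chosen; that step is left out here.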