author: Bryan Newbold <bnewbold@archive.org> 2020-08-17 23:22:52 -0700
committer: Bryan Newbold <bnewbold@archive.org> 2020-08-17 23:22:52 -0700
commit: f0aa8010401e3872f8f1dcc85c409e77c6b5a1d8 (patch)
tree: 70c5153f23bb23bbcdd11bfe54c14133a2d1b09c
init repo with README, gitignore, etc
-rw-r--r-- .gitignore 22
-rw-r--r-- README.md 43
-rw-r--r-- plan.txt 105
3 files changed, 170 insertions, 0 deletions
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..2ead7e1
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,22 @@
+target/
+*.o
+*.a
+*.pyc
+#*#
+*~
+*.swp
+.*
+*.tmp
+*.old
+*.profile
+*.bkp
+*.bak
+[Tt]humbs.db
+*.DS_Store
+build/
+_build/
+src/build/
+*.log
+
+# Don't ignore this file itself
+!.gitignore
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..5920b9c
--- /dev/null
+++ b/README.md
@@ -0,0 +1,43 @@
+
+**es-public-proxy**: Elasticsearch API proxy intended to be exposed to the
+public internet (or any non-localhost clients) for safe read-only queries
+
+This is intended as a simple alternative to other "read-only" plugins or
+authentication solutions for elasticsearch. A benefit of keeping the
+elasticsearch API itself, instead of building an application-layer wrapper, is
+that there already exist client libraries, tools, and integrations in many
+languages.
+
+Plan:
+
+- single Rust executable
+- fast and simple enough to never impact performance or latency
+- TOML configuration
+- some modern async/await framework
+- use official elasticsearch crate? or just reqwest?
+- small subset of total public API: get, search, scroll
+- per-index permissions
+- return response bodies untouched
+- parse queries with serde JSON, then re-serialize
+
+Stretch or future goals:
+
+- parsing Lucene `query_string`
+- provide an alternate simpler API
+- query caching
+- index aliases and routing
+- version mapping (eg, expose 7.x API for 6.x index)
+
+Non-features:
+
+- TLS (use a general purpose reverse proxy)
+
+## Deployment
+
+The imagined use case is that you have elasticsearch proper listening only to
+localhost connections with plain HTTP. This makes administration easy from
+authenticated local UNIX users. No non-localhost connections to elasticsearch
+are allowed, even from trusted clients. This daemon runs as a small sidecar
+proxy on localhost, listening on a public port. All non-localhost clients
+direct queries through the proxy, which parses the query, ensures it is "safe",
+then passes it through to the backend.
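
Note on the README above: the "small subset of total public API" and "per-index permissions" items could be combined into a single allowlist check on method and path. The following is a minimal sketch under stated assumptions, not the project's implementation. The names `ProxyConfig`, `Decision`, and `check_request` are hypothetical, the path shapes follow the usual Elasticsearch REST layout, and the rule that rejects index names starting with `_` is an assumption added for illustration. It uses only the standard library.

```rust
use std::collections::HashSet;

/// Outcome of checking one incoming request against the allowlist.
#[derive(Debug, PartialEq)]
pub enum Decision {
    Allow,
    Deny(&'static str),
}

/// Hypothetical proxy configuration: the set of indices exposed to clients.
pub struct ProxyConfig {
    pub allowed_indices: HashSet<String>,
}

impl ProxyConfig {
    pub fn new(indices: &[&str]) -> Self {
        ProxyConfig {
            allowed_indices: indices.iter().map(|s| s.to_string()).collect(),
        }
    }
}

/// Decide whether a request may be forwarded to the backend.
/// `path` is the URL path only, without the query string.
pub fn check_request(cfg: &ProxyConfig, method: &str, path: &str) -> Decision {
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    match (method, segments.as_slice()) {
        // Scroll continuation is not tied to a single index.
        ("POST", ["_search", "scroll"]) => Decision::Allow,
        // Search within one index.
        ("GET" | "POST", [index, "_search"]) => check_index(cfg, index),
        // Fetch a single document by id.
        ("GET" | "HEAD", [index, "_doc", _id]) => check_index(cfg, index),
        // Everything else is rejected by default.
        _ => Decision::Deny("method or endpoint not allowed"),
    }
}

fn check_index(cfg: &ProxyConfig, index: &str) -> Decision {
    // Assumption: names starting with '_' are API endpoints, not user indices.
    if index.starts_with('_') {
        return Decision::Deny("index not allowed");
    }
    if cfg.allowed_indices.contains(index) {
        Decision::Allow
    } else {
        Decision::Deny("index not allowed")
    }
}
```

With `ProxyConfig::new(&["papers"])`, a `GET /papers/_search` is allowed, while any request to an index outside the set or to an unlisted endpoint is denied. Denying by default keeps the exposed surface limited to what is explicitly listed.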
diff --git a/plan.txt b/plan.txt
new file mode 100644
index 0000000..9ab837a
--- /dev/null
+++ b/plan.txt
@@ -0,0 +1,105 @@
+
+TODO: see what other requests the default python and javascript client libraries use
+
+## basics
+
+- config: TOML, env, args
+- filter requests by method and endpoint
+- filter query parameters
+- parse request bodies (queries)
+- method/body for denied requests
+- async streaming responses
+- minimize tokio feature flags
+
+factoring:
+- validate query method (method, path, query, body)
+
+## general endpoints
+
+- ping
+ (?)
+- basic info
+ GET /
+ (?)
+- scroll
+ POST /_search/scroll
+- clear scroll
+ DELETE /_search/scroll
+
+## per-index endpoints
+
+- basic info; mapping
+ (?)
+- count
+ GET /<index>/_count
+- get document
+ GET /<index>/_doc/<_id>
+ HEAD /<index>/_doc/<_id>
+ GET /<index>/_source/<_id>
+ HEAD /<index>/_source/<_id>
+- search
+ GET /<index>/_search
+ POST /<index>/_search
+
+later:
+
+- multi-get (`_mget`)
+- multi-search (`_msearch`)
+
+## query types
+
+compound:
+- bool
+- boosting
+- constant_score
+ filter (query)
+ boost (float, optional)
+
+fulltext:
+- match
+ <field>
+ (bare str allowed)
+ query (str)
+- match_phrase
+ <field>
+ (bare str allowed)
+ query (str)
+- multi_match
+- query_string
+- simple_query_string
+
+term-level:
+- range
+ <field>
+ gt, gte, lt, lte: str or number
+- term
+ <field>
+ value: str or number
+- terms
+ <field>
+ (array of str or number)
+- wildcard
+ <field>
+ value (str)
+ boost (float, optional)
+ rewrite (str, optional)
+- exists
+ field (str)
+- ids
+ values (array of str)
+- match_all
+ boost (float, optional)
+- match_none
+ boost (float, optional)
+
+
+TODO:
+- terms_set
+- span queries
+- fuzzy (configurable)
+
+## additional stuff
+
+- HTTP content-encoding: gzip
+- content-type header; always JSON?
+- https://www.elastic.co/guide/en/elasticsearch/reference/current/common-options.html
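
Note on the "query types" section of plan.txt: the listed types and their permitted keys amount to a recursive allowlist over the parsed query body. The sketch below covers only a subset of the list (`match_all`, `match_none`, `exists`, `ids`, `constant_score`) and is an illustration under assumptions, not the project's code. `Json` is a hand-rolled stand-in for a parsed JSON value so the example needs only the standard library; a real implementation would more likely use serde, as the README plans. All function names are hypothetical.

```rust
/// Minimal stand-in for a parsed JSON value (hypothetical; std only).
#[derive(Debug, Clone, PartialEq)]
pub enum Json {
    Null,
    Bool(bool),
    Num(f64),
    Str(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

/// Convenience constructor for JSON objects.
pub fn obj(pairs: Vec<(&str, Json)>) -> Json {
    Json::Object(pairs.into_iter().map(|(k, v)| (k.to_string(), v)).collect())
}

fn object_fields(v: &Json) -> Result<&[(String, Json)], String> {
    match v {
        Json::Object(fields) => Ok(fields.as_slice()),
        _ => Err("expected a JSON object".to_string()),
    }
}

/// Return the object's fields, rejecting any key outside `allowed`.
fn only_keys<'a>(v: &'a Json, allowed: &[&str]) -> Result<&'a [(String, Json)], String> {
    let fields = object_fields(v)?;
    for (key, _) in fields {
        if !allowed.contains(&key.as_str()) {
            return Err(format!("unexpected key: {}", key));
        }
    }
    Ok(fields)
}

fn find<'a>(fields: &'a [(String, Json)], key: &str) -> Option<&'a Json> {
    fields.iter().find(|(k, _)| k.as_str() == key).map(|(_, v)| v)
}

fn check_boost(fields: &[(String, Json)]) -> Result<(), String> {
    match find(fields, "boost") {
        None | Some(Json::Num(_)) => Ok(()),
        Some(_) => Err("boost must be a number".to_string()),
    }
}

/// Validate one query clause: an object whose single key names the query type.
pub fn validate_query(q: &Json) -> Result<(), String> {
    let fields = object_fields(q)?;
    if fields.len() != 1 {
        return Err("a query clause must have exactly one key".to_string());
    }
    let (kind, body) = &fields[0];
    match kind.as_str() {
        "match_all" | "match_none" => check_boost(only_keys(body, &["boost"])?),
        "exists" => match find(only_keys(body, &["field"])?, "field") {
            Some(Json::Str(_)) => Ok(()),
            _ => Err("exists.field must be a string".to_string()),
        },
        "ids" => match find(only_keys(body, &["values"])?, "values") {
            Some(Json::Array(items)) if items.iter().all(|v| matches!(v, Json::Str(_))) => Ok(()),
            _ => Err("ids.values must be an array of strings".to_string()),
        },
        "constant_score" => {
            let inner = only_keys(body, &["filter", "boost"])?;
            check_boost(inner)?;
            match find(inner, "filter") {
                // The nested filter is itself a query clause, so recurse.
                Some(filter) => validate_query(filter),
                None => Err("constant_score requires a filter".to_string()),
            }
        }
        other => Err(format!("query type not allowed: {}", other)),
    }
}
```

Anything not named in the `match` is rejected, so a query type added in a later Elasticsearch version stays blocked until it is reviewed and listed. The remaining types in the plan (`bool`, `range`, `term`, and so on) would follow the same pattern, each with its own set of permitted keys.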