**es-public-proxy**: simple read-only HTTP reverse-proxy for exposing an Elasticsearch node to the public internet

* type-safe de-serialization and re-serialization of all user data
* single-binary, easy to install
* simple configuration with sane defaults
* low overhead in network latency and compute resources
* optional CORS headers for direct browser requests
* SSL, transport compression, load-balancing, observability, and rate-limiting are left to other tools like nginx, caddy, or HAProxy
* free software forever: AGPLv3+ license

The Elasticsearch REST API is powerful, well documented, and has client library implementations for many programming languages. For datasets and services which contain only public information, it would be convenient to provide direct access to at least a subset of the API for anybody to take advantage of.

The Elasticsearch maintainers warn against this behavior, on the basis that the API is not designed for public use. Recent versions of Elasticsearch have an authentication/authorization subsystem, and there are third-party plugins for read-only access (such as [ReadonlyREST](https://readonlyrest.com/)), but these solutions require careful configuration and knowledge of which endpoints are "safe" for users. Elasticsearch accepts request bodies on `GET` requests, and one proposed solution is to allow only `GET` requests through a reverse proxy like nginx. However, some safe endpoints (such as deleting scroll objects) require other HTTP verbs, and most browsers do not support `GET` bodies, so this is only a partial hack.

`es-public-proxy` is intended to be a simple and reliable alternative for the use case of exposing popular search queries on specific indices to the public web. HTTP requests are parsed and filtered in a safe, compiled language (Rust), then only safe queries are re-serialized and forwarded to the backend search instance listening on a different port.
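
The parse-then-filter idea described above can be illustrated with a small, standard-library-only sketch. This is not the actual `es-public-proxy` code: the `Decision` type, the `filter_request` function, the `papers` index name, and the set of allowed endpoints are all hypothetical, and the sketch covers only the routing decision (method, index, endpoint), not the parsing and re-serialization of request bodies.

```rust
/// Outcome of the allow-list check (hypothetical type, for illustration only).
#[derive(Debug, PartialEq)]
enum Decision {
    Forward,
    Reject(&'static str),
}

/// Decide whether a request may be forwarded to the backend.
/// Assumption: only `/<index>/<endpoint>` paths are considered, and only
/// search and count endpoints on explicitly listed indices are allowed.
fn filter_request(method: &str, path: &str, allowed_indices: &[&str]) -> Decision {
    // Split "/index/_endpoint" into its non-empty segments.
    let segments: Vec<&str> = path.split('/').filter(|s| !s.is_empty()).collect();
    let (index, endpoint) = match segments.as_slice() {
        [index, endpoint] => (*index, *endpoint),
        _ => return Decision::Reject("unsupported path"),
    };
    // Every public index must be enumerated explicitly.
    if !allowed_indices.contains(&index) {
        return Decision::Reject("index not public");
    }
    // Default-deny: anything not matched here is rejected.
    match (method, endpoint) {
        ("GET", "_search") | ("POST", "_search") => Decision::Forward,
        ("GET", "_count") | ("POST", "_count") => Decision::Forward,
        _ => Decision::Reject("endpoint not allowed"),
    }
}

fn main() {
    let public = ["papers"];
    println!("{:?}", filter_request("POST", "/papers/_search", &public));
    println!("{:?}", filter_request("DELETE", "/papers", &public));
    println!("{:?}", filter_request("GET", "/secrets/_search", &public));
}
```

The default-deny `match` is the point of the sketch: a request is forwarded only when it matches a known-safe combination, rather than being blocked only when it matches a known-dangerous one.
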

Note that clients can still submit "expensive" queries of various kinds which will slow down the host. Some of these can be disabled in the Elasticsearch configuration (this would disable those queries for all connections, not just those via the proxy). Some query types are simply not supported by this proxy. In the future the proxy could gain configuration parameters and smarter parsing of some query types (like `query_string`) to try to prevent even more expensive queries.

## Installation

On Debian/Ubuntu Linux systems, the easiest way to get started is to download and install an unsigned `.deb` from . This will include a manpage, configuration file, and systemd unit file. After installing, edit the configuration file (`/etc/es-public-proxy.toml`) and start the service like:

    sudo systemctl start es-public-proxy
    sudo systemctl enable es-public-proxy

On other platforms you can install and run on a per-user basis using the Rust toolchain with:

    cargo install es-public-proxy
    es-public-proxy --example-config > example.toml
    # edit the configuration file
    es-public-proxy --config example.toml

There is also a Dockerfile, but it isn't actively used and hasn't been pushed to any image repository. For example, it is unclear how best to inject configuration into a Docker image. You can build the image with:

    docker build -f extra/Dockerfile .

## Configuration

In all cases you will want to explicitly enumerate all of the indices that should have public access. There is an `unsafe_all_indices` option intended for prototyping, but this may allow access to additional non-index API endpoints.

One simple deployment pattern is to put `nginx`, `es-public-proxy`, and `elasticsearch` all on the same server. In this configuration, `nginx` would listen on all network interfaces on ports 80 and 443, and handle SSL upgrade redirects from 80 to 443, as well as add transport compression, restrict client body payload limits, etc.
`es-public-proxy` would listen on localhost port 9292, and connect back to Elasticsearch on localhost port 9200.

## Limitations

Not all of the Elasticsearch API has been implemented yet. In general, this service is likely to be stricter than Elasticsearch itself in parsing and in corner cases. For example:

* URL query parameters like `?human` must be expanded into a boolean like `?human=true`
* in some cases where Elasticsearch allows short-cutting a full object into a string, this proxy requires the full object format
* index patterns in configuration are not supported

## Development

To build this package you need the Rust toolchain installed. We target stable Rust, 2018 edition. Re-compiling the manpage requires [scdoc](https://git.sr.ht/~sircmpwn/scdoc).

Building a Debian package (`.deb`) requires the `cargo-deb` plugin, which you can install with: `cargo install cargo-deb`

A Makefile is included to wrap common development commands, for example:

    make test
    make lint
    make deb

Contributions are welcome! We would prefer to keep the number of dependent crates low (e.g., the project does not currently use a CLI argument parsing library), but are open to discussion. When sending patches or merge requests, it is helpful (but not required) if you can include test coverage, re-run `cargo fmt`, and acknowledge the license terms ahead of time.

The Minimum Supported Rust Version (MSRV) is 1.49.