author Bryan Newbold <bnewbold@robocracy.org> 2021-08-06 11:58:16 -0700
committer Bryan Newbold <bnewbold@robocracy.org> 2021-08-06 11:58:16 -0700
commit 99885b458ad505ebb63b3e7cf5b1bae3dd2a459e (patch)
tree de3fbb3e42b0bb7f6e447d2e13ac3f92a8bb90b2 /guide/src/search_api.md
parent 950d3f08bd439aed92d01dbc3cca9747570aa82c (diff)
parent 56e4ce2d8347cdfedd492d54fde080772f3d8725 (diff)
Merge branch 'bnewbold-refs-apis'
Diffstat (limited to 'guide/src/search_api.md')
-rw-r--r-- guide/src/search_api.md 29
1 files changed, 29 insertions, 0 deletions
diff --git a/guide/src/search_api.md b/guide/src/search_api.md
new file mode 100644
index 00000000..91b7c8e9
--- /dev/null
+++ b/guide/src/search_api.md
@@ -0,0 +1,29 @@
+
+# Search API
+
+The Elasticsearch indices used to power metadata search, statistics, and graphs
+on the fatcat web interface are exposed publicly at
+`https://search.fatcat.wiki`. Third parties can make queries using the
+Elasticsearch API, which is well documented online and has client libraries in
+many programming languages.
+
+A thin proxy (`es-public-proxy`) filters requests to block expensive queries
+that could degrade search on the web interface, but most of the Elasticsearch
+API is supported, including powerful aggregation queries.
+
+There is a short delay between updates to the fatcat catalog (via the main API)
+and updates to the search index.
+
+Notable indices include:
+
+- `fatcat_release`: release entity metadata
+- `fatcat_container`: container entity metadata
+- `fatcat_ref`: reference graph
+
+Schemas for these indices can be fetched directly from the index (e.g.,
+`https://search.fatcat.wiki/fatcat_release/_mapping`), and are versioned in the
+fatcat git repository under `fatcat:extra/elasticsearch/`. They are a
+simplification and transform of the regular entity schemas, and include some
+synthesized fields (such as "preservation status" for releases). Note that the
+search schemas are likely to change over time, with less notice and fewer
+stability guarantees than the primary catalog API schema.
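
As a rough illustration of how a third party might use the endpoint described in the added page, here is a minimal Python sketch using only the standard library. The base URL, the index names, and the `_mapping` path come from the guide text. The helper names (`mapping_url`, `build_search_request`, `fetch_json`) are hypothetical, and the use of a `query_string` query is an assumption: the guide does not say which query types `es-public-proxy` permits.

```python
import json
import urllib.parse
import urllib.request

# Public endpoint named in the guide.
SEARCH_BASE = "https://search.fatcat.wiki"


def mapping_url(index: str) -> str:
    """Return the URL that serves the schema (mapping) of one index."""
    return f"{SEARCH_BASE}/{urllib.parse.quote(index, safe='')}/_mapping"


def build_search_request(index: str, query: str, size: int = 5) -> urllib.request.Request:
    """Build, but do not send, an Elasticsearch `_search` request.

    Assumption: the proxy accepts a `query_string` query; this is not
    confirmed by the guide.
    """
    body = {
        "query": {"query_string": {"query": query}},
        "size": size,  # keep result pages small to stay inexpensive
    }
    return urllib.request.Request(
        f"{SEARCH_BASE}/{urllib.parse.quote(index, safe='')}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(target, timeout: float = 10.0):
    """Send a URL string or a prepared Request and decode the JSON reply."""
    with urllib.request.urlopen(target, timeout=timeout) as response:
        return json.load(response)
```

With network access, `fetch_json(mapping_url("fatcat_release"))` would retrieve the release schema, and `fetch_json(build_search_request("fatcat_container", "some query"))` would run a search, with results under the standard Elasticsearch `hits.hits` key. Since the guide warns that search schemas can change with little notice, it is probably safer to read field names from the fetched mapping than to hard-code them.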