author: Bryan Newbold <bnewbold@robocracy.org> 2019-05-30 12:21:05 -0700
committer: Bryan Newbold <bnewbold@robocracy.org> 2019-05-30 12:21:05 -0700
commit: eee39965eee92b5005df0d967be779c2f2bb15f8
tree: 4323413580243f756d62446ce92065e068e23db4
parent: a4b84445a96c0fbe6133331b02f96cf570c59149
add work-in-progress elastic index notes
-rw-r--r-- extra/elasticsearch/README.md 11
1 files changed, 11 insertions, 0 deletions
diff --git a/extra/elasticsearch/README.md b/extra/elasticsearch/README.md
index 15c00b4c..60469250 100644
--- a/extra/elasticsearch/README.md
+++ b/extra/elasticsearch/README.md
@@ -60,6 +60,17 @@ Or, in a bulk production live-stream conversion:
time zcat /srv/fatcat/snapshots/release_export_expanded.json.gz | pv -l | parallel -j20 --linebuffer --round-robin --pipe ./fatcat_transform.py elasticsearch-releases - - | esbulk -verbose -size 20000 -id ident -w 8 -index fatcat_release -type release
time zcat /srv/fatcat/snapshots/container_export.json.gz | pv -l | ./fatcat_transform.py elasticsearch-containers - - | esbulk -verbose -size 20000 -id ident -w 8 -index fatcat_container -type container
+## Index Aliases
+
+To make re-indexing and schema changes easier, we can create versioned (or
+time-stamped) Elasticsearch indexes, and then point to them using index
+aliases. Alias updates are fast and atomic, so we can build up a new index in
+the background and then cut over with no downtime.
+
+    http put :9200/fatcat_release_v03 < release_schema.json
+
+TODO: more docs for actual cut-over
+
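Editor's sketch, not part of this commit: the cut-over left as a TODO above could be done as a single request to Elasticsearch's `_aliases` endpoint, which applies a list of `remove` and `add` actions together. The Python below only builds that request body. The alias name `fatcat_release` and the old index name `fatcat_release_v02` are assumptions for illustration; the README does not say which names are actually used, and an alias cannot share a name with an existing index.

```python
import json


def alias_cutover_actions(alias, old_index, new_index):
    """Build the body for an atomic alias swap (POST /_aliases)."""
    return {
        "actions": [
            # Both actions travel in one request, so the alias never
            # points at zero indexes during the swap.
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    # Hypothetical names: the alias and the v02 index are assumed here.
    body = alias_cutover_actions(
        "fatcat_release", "fatcat_release_v02", "fatcat_release_v03"
    )
    print(json.dumps(body, indent=2))
```

If the printed JSON is saved to a file, it could presumably be sent in the same style as the commands above, for example `http post :9200/_aliases < cutover.json` (the file name is likewise hypothetical).
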
## Full-Text Querying
A generic full-text "query string" query looks like this (replace "blood" with