author | Bryan Newbold <bnewbold@robocracy.org> | 2019-05-30 12:21:05 -0700 |
---|---|---|
committer | Bryan Newbold <bnewbold@robocracy.org> | 2019-05-30 12:21:05 -0700 |
commit | eee39965eee92b5005df0d967be779c2f2bb15f8 (patch) | |
tree | 4323413580243f756d62446ce92065e068e23db4 /extra | |
parent | a4b84445a96c0fbe6133331b02f96cf570c59149 (diff) | |
add work-in-progress elastic index notes
Diffstat (limited to 'extra')
-rw-r--r-- | extra/elasticsearch/README.md | 11 |
1 files changed, 11 insertions, 0 deletions
diff --git a/extra/elasticsearch/README.md b/extra/elasticsearch/README.md
index 15c00b4c..60469250 100644
--- a/extra/elasticsearch/README.md
+++ b/extra/elasticsearch/README.md
@@ -60,6 +60,17 @@ Or, in a bulk production live-stream conversion:
     time zcat /srv/fatcat/snapshots/release_export_expanded.json.gz | pv -l | parallel -j20 --linebuffer --round-robin --pipe ./fatcat_transform.py elasticsearch-releases - - | esbulk -verbose -size 20000 -id ident -w 8 -index fatcat_release -type release
     time zcat /srv/fatcat/snapshots/container_export.json.gz | pv -l | ./fatcat_transform.py elasticsearch-containers - - | esbulk -verbose -size 20000 -id ident -w 8 -index fatcat_container -type container
 
+## Index Aliases
+
+To make re-indexing and schema changes easier, we can create versioned (or
+time-stamped) elasticsearch indexes, and then point to them using index
+aliases. The index alias updates are fast and atomic, so we can slowly build up
+a new index and then cut over with no downtime.
+
+    http put :9200/fatcat_release_v03 < release_schema.json
+
+TODO: more docs for actual cut-over
+
 ## Full-Text Querying
 
 A generic full-text "query string" query look like this (replace "blood" with
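
The "Index Aliases" section added by this commit leaves the actual cut-over as a TODO. As a rough sketch only, and not the project's documented procedure, the swap could be expressed as a single request to Elasticsearch's `_aliases` endpoint, which applies all listed actions atomically. The helper names `build_alias_swap` and `apply_alias_swap` are hypothetical, as are the alias name `fatcat_release`, the previous index name `fatcat_release_v02`, and the `http://localhost:9200` host; only `fatcat_release_v03` appears in the commit.

```python
import json
import urllib.request


def build_alias_swap(alias, new_index, old_index=None):
    """Build a request body for Elasticsearch's `POST /_aliases` endpoint.

    All actions in one request are applied atomically, so readers of the
    alias never see a moment where it points at no index.
    """
    actions = []
    if old_index is not None:
        # Detach the alias from the index being retired.
        actions.append({"remove": {"index": old_index, "alias": alias}})
    # Attach the alias to the freshly built index.
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}


def apply_alias_swap(body, host="http://localhost:9200"):
    """Send the alias update to a cluster (host is an assumed default)."""
    req = urllib.request.Request(
        host + "/_aliases",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Hypothetical names: the alias and the old index are assumptions.
    body = build_alias_swap(
        "fatcat_release", "fatcat_release_v03", old_index="fatcat_release_v02"
    )
    # Print the payload for review; call apply_alias_swap(body) to send it.
    print(json.dumps(body, indent=2))
```

Running the script only prints the payload, so it can be reviewed before anything is sent. One caveat under these assumptions: Elasticsearch does not allow an alias and a concrete index to share a name, so an existing index called `fatcat_release` (as targeted by the `esbulk` commands above) would have to be deleted or rebuilt under a versioned name before that alias could be created.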