author | bnewbold <bnewbold@archive.org> | 2021-04-07 05:47:06 +0000 |
---|---|---|
committer | bnewbold <bnewbold@archive.org> | 2021-04-07 05:47:06 +0000 |
commit | 0b9fc884dad8e3147d10c273725157ba60f48069 (patch) | |
tree | 8090fcf43dfef8b2f46fc6a2161c46257e22ff2b /extra/elasticsearch/README.md | |
parent | c0b145978280d53207aa714aab67cb582d9399ad (diff) | |
parent | c23f050426c1422e84019fe60d4d67865b962f31 (diff) | |
Merge branch 'bnewbold-es7' into 'master'
elasticsearch 7.x support
See merge request webgroup/fatcat!100
Diffstat (limited to 'extra/elasticsearch/README.md')
-rw-r--r-- | extra/elasticsearch/README.md | 22 |
1 files changed, 11 insertions, 11 deletions
diff --git a/extra/elasticsearch/README.md b/extra/elasticsearch/README.md
index 17865bc0..196ac588 100644
--- a/extra/elasticsearch/README.md
+++ b/extra/elasticsearch/README.md
@@ -42,26 +42,26 @@ Drop and rebuild the schema:
     http delete :9200/fatcat_container
     http delete :9200/fatcat_file
     http delete :9200/fatcat_changelog
-    http put :9200/fatcat_release < release_schema.json
-    http put :9200/fatcat_container < container_schema.json
-    http put :9200/fatcat_file < file_schema.json
-    http put :9200/fatcat_changelog < changelog_schema.json
+    http put :9200/fatcat_release?include_type_name=true < release_schema.json
+    http put :9200/fatcat_container?include_type_name=true < container_schema.json
+    http put :9200/fatcat_file?include_type_name=true < file_schema.json
+    http put :9200/fatcat_changelog?include_type_name=true < changelog_schema.json
 
 Put a single object (good for debugging):
 
-    head -n1 examples.json | http post :9200/fatcat_release/release/0
-    http get :9200/fatcat_release/release/0
+    head -n1 examples.json | http post :9200/fatcat_release/_doc/0
+    http get :9200/fatcat_release/_doc/0
 
 Bulk insert from a file on disk:
 
-    esbulk -verbose -id ident -index fatcat_release -type release examples.json
+    esbulk -verbose -id ident -index fatcat_release -type _doc examples.json
 
 Or, in a bulk production live-stream conversion:
 
     export LC_ALL=C.UTF-8
-    time zcat /srv/fatcat/snapshots/release_export_expanded.json.gz | pv -l | parallel -j20 --linebuffer --round-robin --pipe ./fatcat_transform.py elasticsearch-releases - - | esbulk -verbose -size 1000 -id ident -w 8 -index fatcat_release -type release
-    time zcat /srv/fatcat/snapshots/container_export.json.gz | pv -l | ./fatcat_transform.py elasticsearch-containers - - | esbulk -verbose -size 1000 -id ident -w 8 -index fatcat_container -type container
-    time zcat /srv/fatcat/snapshots/file_export.json.gz | pv -l | parallel -j20 --linebuffer --round-robin --pipe ./fatcat_transform.py elasticsearch-files - - | esbulk -verbose -size 1000 -id ident -w 8 -index fatcat_file -type file
+    time zcat /srv/fatcat/snapshots/release_export_expanded.json.gz | pv -l | parallel -j20 --linebuffer --round-robin --pipe ./fatcat_transform.py elasticsearch-releases - - | esbulk -verbose -size 1000 -id ident -w 8 -index fatcat_release -type _doc
+    time zcat /srv/fatcat/snapshots/container_export.json.gz | pv -l | ./fatcat_transform.py elasticsearch-containers - - | esbulk -verbose -size 1000 -id ident -w 8 -index fatcat_container -type _doc
+    time zcat /srv/fatcat/snapshots/file_export.json.gz | pv -l | parallel -j20 --linebuffer --round-robin --pipe ./fatcat_transform.py elasticsearch-files - - | esbulk -verbose -size 1000 -id ident -w 8 -index fatcat_file -type _doc
 
 ## Index Aliases
 
@@ -94,7 +94,7 @@ To do an atomic swap from one alias to a new one ("zero downtime"):
 A generic full-text "query string" query look like this (replace "blood" with
 actual query string, and "size" field with the max results to return):
 
-    GET /fatcat_release/release/_search
+    GET /fatcat_release/_search
     {
       "query": {
         "query_string": {
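
The path changes in this diff follow one pattern: the per-entity type segment (`release`, `container`, `file`) disappears from URLs, document endpoints use `_doc`, and schema uploads pass `include_type_name=true`, presumably because the schema files still nest their mappings under a type name. The following is a minimal Python sketch of that pattern, not code from the fatcat repository. The helper names, the `ES_BASE` value (taken to be what the `:9200` httpie shorthand expands to), and the exact shape of the search body beyond the `query_string` and `size` fields mentioned in the README are all assumptions made for illustration.

```python
from urllib.parse import urlencode

# Assumption: the ":9200" shorthand in the README means a local node.
ES_BASE = "http://localhost:9200"


def create_index_url(index, include_type_name=True):
    """URL for PUT-ing an index schema (hypothetical helper).

    With include_type_name=True the request asks Elasticsearch 7.x to
    accept a mapping that is still nested under a type name.
    """
    url = f"{ES_BASE}/{index}"
    if include_type_name:
        url += "?" + urlencode({"include_type_name": "true"})
    return url


def doc_url(index, doc_id):
    """URL for a single document; the type segment is always `_doc`."""
    return f"{ES_BASE}/{index}/_doc/{doc_id}"


def search_url(index):
    """URL for searching an index; no type segment at all."""
    return f"{ES_BASE}/{index}/_search"


def query_string_body(query, size=10):
    """Request body for a generic full-text query (assumed shape)."""
    return {"query": {"query_string": {"query": query}}, "size": size}


if __name__ == "__main__":
    print(create_index_url("fatcat_release"))
    print(doc_url("fatcat_release", 0))
    print(search_url("fatcat_release"))
    print(query_string_body("blood", size=3))
```

Keeping the URL construction in one place like this means a later move away from `include_type_name` would only touch `create_index_url`.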