aboutsummaryrefslogtreecommitdiffstats
path: root/hbase/howto.md
diff options
context:
space:
mode:
Diffstat (limited to 'hbase/howto.md')
-rw-r--r--hbase/howto.md42
1 files changed, 0 insertions, 42 deletions
diff --git a/hbase/howto.md b/hbase/howto.md
deleted file mode 100644
index 26d33f4..0000000
--- a/hbase/howto.md
+++ /dev/null
@@ -1,42 +0,0 @@
-
-Commands can be run from any cluster machine with hadoop environment config
-set up. Most of these commands are run from the shell (start with `hbase
-shell`). There is only one AIT/Webgroup HBase instance/namespace; there may be
-QA/prod tables, but there are not QA/prod clusters.
-
-## Create Table
-
-Create column families (note: not all individual columns) with something like:
-
- create 'wbgrp-journal-extract-0-qa', 'f', 'file', {NAME => 'grobid0', COMPRESSION => 'snappy'}
-
-## Run Thrift Server Informally
-
-The Thrift server can technically be run from any old cluster machine that has
-Hadoop client stuff set up, using:
-
- hbase thrift start -nonblocking -c
-
-Note that this will run version 0.96, while the actual HBase service seems to
-be running 0.98.
-
-To interact with this config, use happybase (python) config:
-
- conn = happybase.Connection("bnewbold-dev.us.archive.org", transport="framed", protocol="compact")
- # Test connection
- conn.tables()
-
-## Queries From Shell
-
-Fetch all columns for a single row:
-
- hbase> get 'wbgrp-journal-extract-0-qa', 'sha1:3I42H3S6NNFQ2MSVX7XZKYAYSCX5QBYJ'
-
-Fetch multiple columns for a single row, using column families:
-
- hbase> get 'wbgrp-journal-extract-0-qa', 'sha1:3I42H3S6NNFQ2MSVX7XZKYAYSCX5QBYJ', 'f', 'file'
-
-Scan a fixed number of rows (here 5) starting at a specific key prefix, all
-columns:
-
- hbase> scan 'wbgrp-journal-extract-0-qa',{LIMIT=>5,STARTROW=>'sha1:A'}