author: Bryan Newbold <bnewbold@robocracy.org> 2018-09-13 00:23:09 -0700
committer: Bryan Newbold <bnewbold@robocracy.org> 2018-09-13 00:23:09 -0700
commit: 9f25d84accb5a3657cb4c7dd87014d9f13ccf2ef
tree: 4854b13512c65fd0bffddfb791c4c014a41088dc /extra/sql_dumps/README.md
parent: aa9abbbab67c6344d382a964b3c451e0bf212efe
improve dump scripts
Diffstat (limited to 'extra/sql_dumps/README.md')
-rw-r--r-- extra/sql_dumps/README.md | 26
1 files changed, 26 insertions, 0 deletions
diff --git a/extra/sql_dumps/README.md b/extra/sql_dumps/README.md
new file mode 100644
index 00000000..6f24207d
--- /dev/null
+++ b/extra/sql_dumps/README.md
@@ -0,0 +1,26 @@
+
+## HOWTO: Ident Table Snapshots
+
+How to take a consistent (single transaction) snapshot of the ident tables.
+
+This will take somewhere around 15-25 GB of disk space on the database server
+(under /tmp). It would probably be better to stream this transaction over a
+network connection (saving database disk I/O), but I can't figure out how to do
+that with plain SQL (multiple table dumps in a single session), so it would
+need a custom client.
+
+ ./ident_table_snapshot.sh
+
+## HOWTO: Dump abstracts, release identifiers, file hashes, etc
+
+These are run as regular old commands, and can run across the network in a
+couple of different ways. We might not want database ports open to the network
+(even cluster/VPN); on the other hand we could probably do SSH port
+forwarding anyway.
+
+ # Locally, or client running on a remote machine
+ psql fatcat < dump_abstracts.sql | egrep -v ^BEGIN$ | egrep -v ^ROLLBACK$ | pv -l | gzip > abstracts.json.gz
+
+ # Run on database server, write to file on remote host
+ psql fatcat < dump_abstracts.sql | egrep -v ^BEGIN$ | egrep -v ^ROLLBACK$ | pv -l | gzip | ssh user@host 'cat > abstracts.json.gz'
+
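
Note on the dump pipelines in this README: both chain two `egrep -v` calls to drop the `BEGIN` and `ROLLBACK` lines that psql prints around the dump output. A minimal sketch of the same filter as one whole-line match is below. The helper name `filter_txn_markers` is hypothetical and not part of this commit; it assumes a POSIX `grep` and assumes those two marker lines are the only non-data lines to drop.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): drop the transaction marker
# lines that psql prints around the dump rows.
filter_txn_markers() {
    # -v inverts the match, -x requires the pattern to match the whole line,
    # so a data row that merely contains "BEGIN" or "ROLLBACK" is kept.
    grep -v -x -E 'BEGIN|ROLLBACK'
}

# Intended use, mirroring the README pipeline (needs a database, so it is
# left commented out here):
#   psql fatcat < dump_abstracts.sql | filter_txn_markers | pv -l | gzip > abstracts.json.gz
```

One caveat: `grep` exits with status 1 when it emits no lines, which only matters if the surrounding script checks pipeline exit codes.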