Diffstat (limited to 'notes')
-rw-r--r-- notes/database_dumps_backups.txt | 53
-rw-r--r-- notes/rust_libraries.txt | 19
2 files changed, 72 insertions, 0 deletions
diff --git a/notes/database_dumps_backups.txt b/notes/database_dumps_backups.txt
new file mode 100644
index 00000000..60d4bba0
--- /dev/null
+++ b/notes/database_dumps_backups.txt
@@ -0,0 +1,53 @@
+
+## Dumps and Backups
+
+There are a few different database dump formats folks might want:
+
+- raw native database backups, for disaster recovery (would include
+ volatile/unsupported schema details, user API credentials, full history,
+ in-process edits, comments, etc)
+- a sanitized version of the above: roughly per-table dumps of the full state
+ of the database. Could use per-table SQL expressions with sub-queries to pull
+ in small tables ("partial transform") and export JSON for each table; would
+ be extra work to maintain, so not pursuing for now.
+- full history, full public schema exports, in a form that might be used to
+ mirror or entirely fork the project. Propose supplying the full "changelog"
+ in API schema format, in a single file to capture all entity history, without
+ "hydrating" any inter-entity references. Rely on separate dumps of
+ non-entity, non-versioned tables (editors, abstracts, etc). Note that a
+ variant of this could use the public interface, in particular to do
+ incremental updates (though that wouldn't capture schema changes).
+- transformed exports of the current state of the database (i.e., without
+ history). Useful for data analysis, search engines, etc. Propose supplying
+ just the Release table in a fully "hydrated" state to start. Unclear whether
+ this should be on a work or release basis; going with release for now. Harder
+ to do using the public interface because of the need for transaction locking.
+
+## Full Postgres Backup
+
+To back up the entire database using `pg_dump` with parallelism 1 (use more on
+a larger machine with fast disks; try 4 or 8?), assuming the database name is
+'fatcat' and the current user has access:
+
+    pg_dump -j1 -Fd -f test-dump fatcat
+
+## Identifier Dumps
+
+The `extras/quick_dump.sql` script will dump abstracts and identifiers as TSV
+files to `/tmp/`. Pretty quick; takes about 15 GB of disk space (uncompressed).
+
+## Releases Export
+
+    # simple command
+    ./fatcat_export.py releases /tmp/fatcat_ident_releases.tsv /tmp/releases-dump.json
+
+    # usual command
+    time ./fatcat_export.py releases /tmp/fatcat_ident_releases.tsv - | pv -l | wc
+
+## Changelog Export
+
+    # simple command
+    ./fatcat_export.py changelog /tmp/changelog-dump.json
+
+    # usual command
+    time ./fatcat_export.py changelog - | pv -l | wc
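Note on the export commands above: piping through `pv -l | wc` suggests the exports are line-oriented, with one record per line. Assuming each line is a standalone JSON document (an inference from the line counting, not something these notes confirm), a dump file could be sanity-checked with a short Python sketch like the one below. The `summarize_dump` helper and the `index` field in the sample are hypothetical and used only for illustration.

```python
import io
import json


def summarize_dump(lines):
    """Count non-blank records and collect line numbers that fail to parse as JSON.

    Assumes a line-oriented dump with one JSON document per line.
    """
    total = 0
    bad = []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # blank lines are not counted as records
        total += 1
        try:
            json.loads(line)
        except json.JSONDecodeError:
            bad.append(lineno)
    return total, bad


# Hypothetical in-memory sample standing in for a real dump file
sample = io.StringIO('{"index": 1}\n{"index": 2}\nnot json\n')
print(summarize_dump(sample))  # (3, [3])
```

Against a real export, the same function could be handed an open file object, for example the result of `open("/tmp/changelog-dump.json")`, since it only iterates over lines.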
diff --git a/notes/rust_libraries.txt b/notes/rust_libraries.txt
new file mode 100644
index 00000000..7e6f33eb
--- /dev/null
+++ b/notes/rust_libraries.txt
@@ -0,0 +1,19 @@
+
+libs:
+- iron_slog
+- testing: keep it simple: iron-test
+ => if that is annoying, shiny? mockers if needed.
+- sentry
+- start with dotenv+clap, then config-rs?
+- cadence (emits statsd)
+- frank_jwt and JWT for (simple?) auth
+
+similar:
+- https://github.com/DavidBM/templic-backend
+- https://github.com/alexanderbanks/rust-api
+- https://mgattozzi.com/diesel-powered-rocket
+- https://www.reddit.com/r/rust/comments/8j1xbs/new_to_rust_and_gitlab_ci/
+- https://crate-ci.github.io/
+
+"cool tools":
+- cargo-watch