author    Bryan Newbold <bnewbold@robocracy.org>    2018-09-10 19:33:28 -0700
committer Bryan Newbold <bnewbold@robocracy.org>    2018-09-10 19:33:28 -0700
commit    1aa49b2c613178f17a2cb6e85a22f183a1c81947 (patch)
tree      2ff01dc89e2795c92d15f614a7ee46f534d793c0 /notes
parent    e386022ba051f46ff70c28b90d9a2fe1f856c57f (diff)
add some notes from cockroach tuning
Diffstat (limited to 'notes')
-rw-r--r--  notes/cockroach_tuning.md  |  96
1 file changed, 96 insertions(+), 0 deletions(-)
diff --git a/notes/cockroach_tuning.md b/notes/cockroach_tuning.md
new file mode 100644
index 00000000..dc6acb4f
--- /dev/null
+++ b/notes/cockroach_tuning.md
@@ -0,0 +1,96 @@
+
+When doing a bulk import, the tweaks below could help.
+
+Probably want to set cache and SQL memory limits on startup:
+
+ --cache=.25 --max-sql-memory=.25
+
+Check FD limits:
+
+ cat /proc/sys/fs/file-max
+ ulimit -a
+
+Set with:
+
+ echo 150000 > /proc/sys/fs/file-max
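+
+To persist the limit across reboots, something like this should work on a
+sysctl.d-based distro (a sketch; the conf file name is arbitrary):
+
+    echo "fs.file-max = 150000" | sudo tee /etc/sysctl.d/99-cockroach.conf
+    sudo sysctl --system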
+
+TODO:
+- set `num_replicas=1` when testing
+- close browser to free up RAM
+- disable raft fsync (see the sketch after this list)
+ https://github.com/cockroachdb/cockroach/issues/19784
+- increase import batch size (10x current)
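+
+For the raft fsync item, the setting can be flipped and checked back from a
+SQL shell (a sketch; `kv.raft_log.synchronize` is the setting discussed in
+the issue linked above, and availability varies by version):
+
+    SET CLUSTER SETTING kv.raft_log.synchronize = false;
+    SHOW CLUSTER SETTING kv.raft_log.synchronize;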
+
+## Log of Tests
+
+build: CCL v2.0.5 @ 2018/08/13 17:59:42 (go1.10)
+
+COCKROACH branch e386022ba051f46ff70c28b90d9a2fe1f856c57f
+
+    cockroach start --insecure --store=fatcat-dev --host=localhost
+
+ time ./fatcat_import.py import-issn /home/bnewbold/code/oa-journals-analysis/upload-2018-04-05/journal_extra_metadata.csv
+
+ real 3m20.781s
+ user 0m6.900s
+ sys 0m0.336s
+
+ cat /data/crossref/crossref-works.2018-01-21.badsample_100k.json | time parallel -j4 --round-robin --pipe ./fatcat_import.py import-crossref - /data/issn/20180216.ISSN-to-ISSN-L.txt
+
+ => gave up after ~30 minutes with only ~40% done
+
+With cockroach, container lookups seem to be particularly slow (1000ms+),
+though this is not reflected in cockroach's self-reported stats. CPU usage by
+cockroach is very high.
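+
+One way to measure that end-to-end latency independent of the importer (a
+sketch; the port and lookup path are assumptions about the local fatcatd
+setup):
+
+    time curl -s 'http://localhost:9411/v0/container/lookup?issnl=0010-7824'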
+
+Cockroach is self-reporting P50 latency 0.3 ms, P99 latency 104.9 ms.
+
+Seeing lots of Queue Processing Failures of the "replication" type, probably
+because there are no other nodes in the cluster.
+
+CHANGES:
+- closed browser, restarted postgres, restarted fatcatd to free up RAM
+- cockroach start --insecure --store=fatcat-dev --host=localhost --cache=.25 --max-sql-memory=.25
+- cockroach zone set --insecure .default -f -
+ num_replicas: 1
+ ^D
+- SET CLUSTER SETTING kv.raft_log.synchronize=false;
+- batch size from default (50) to 500
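+
+To confirm the zone change above stuck, the CLI can echo the config back (a
+sketch; the `zone get` subcommand existed alongside `zone set` in this CLI
+generation):
+
+    cockroach zone get .default --insecure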
+
+ time ./fatcat_import.py import-issn --batch-size 500 /home/bnewbold/code/oa-journals-analysis/upload-2018-04-05/journal_extra_metadata.csv
+
+ real 1m10.531s
+ user 0m5.204s
+ sys 0m0.128s
+
+Have 12 SQL connections open at this point.
+
+ cat /data/crossref/crossref-works.2018-01-21.badsample_100k.json | time parallel -j4 --round-robin --pipe ./fatcat_import.py import-crossref --batch-size 500 - /data/issn/20180216.ISSN-to-ISSN-L.txt
+
+    GETs are still very slow end-to-end (~1 sec);
+    during this time, though, cockroach is reporting very low latency
+    => halted
+
+It's the join performance (on lookups) that is killing things for release rev import:
+
+ select * from container_rev where issnl = '0010-7824';
+ Time: 2.16272ms
+
+ select * from container_rev inner join container_ident on container_rev.id = container_ident.rev_id where issnl = '0010-7824';
+ Time: 147.44647ms
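+
+The plan behind the slow join can be inspected with EXPLAIN, to check whether
+the join is scanning rather than using an index (a sketch; output format
+varies by version):
+
+    EXPLAIN select * from container_rev
+        inner join container_ident on container_rev.id = container_ident.rev_id
+        where issnl = '0010-7824';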
+
+CHANGED: upgraded cockroach to CCL v2.1.0-beta.20180904 @ 2018/09/04 03:51:03 (go1.10.3)
+
+ time ./fatcat_import.py import-issn --batch-size 500 /home/bnewbold/code/oa-journals-analysis/upload-2018-04-05/journal_extra_metadata.csv
+
+ real 0m59.903s
+ user 0m4.716s
+ sys 0m0.076s
+
+ cat /data/crossref/crossref-works.2018-01-21.badsample_100k.json | time parallel -j4 --round-robin --pipe ./fatcat_import.py import-crossref --batch-size 500 - /data/issn/20180216.ISSN-to-ISSN-L.txt
+
+ => still extremely slow API latency, though cockroach is reporting very
+ fast (P50 latency 1.0 ms, P99 latency 4.7 ms)
+ => maybe a timeout or something going on?
+ => single-level joins from CLI are ~200 ms
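+
+To chase the timeout theory, per-statement execution logging can be enabled
+so slow statements show up in the cockroach logs (a sketch; the setting is
+verbose, version-dependent, and suited only to short debugging sessions):
+
+    SET CLUSTER SETTING sql.trace.log_statement_execute = true;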
+