Things to try when doing a bulk import:

Probably want to set:

    --cache=.25 --max-sql-memory=.25

Check FD limits:

    cat /proc/sys/fs/file-max
    ulimit -a

Set (as root) with:

    echo 150000 > /proc/sys/fs/file-max

TODO:
- set `num_replicas=1` when testing
- close browser to free up RAM
- disable raft fsync
  https://github.com/cockroachdb/cockroach/issues/19784
- increase import batch size (10x current)

## Log of Tests

build: CCL v2.0.5 @ 2018/08/13 17:59:42 (go1.10)

COCKROACH commit e386022ba051f46ff70c28b90d9a2fe1f856c57f

    cockroach start --insecure --store=fatcat-dev --host=localhost

    time ./fatcat_import.py import-issn /home/bnewbold/code/oa-journals-analysis/upload-2018-04-05/journal_extra_metadata.csv

    real    3m20.781s
    user    0m6.900s
    sys     0m0.336s

    cat /data/crossref/crossref-works.2018-01-21.badsample_100k.json | time parallel -j4 --round-robin --pipe ./fatcat_import.py import-crossref - /data/issn/20180216.ISSN-to-ISSN-L.txt

    => gave up after ~30 minutes with only ~40% done

With CockroachDB, container lookups seem particularly slow (1000 ms+ end-to-end), though this isn't reflected in Cockroach's self-reported stats. Very high CPU usage by the cockroach process.

Cockroach is self-reporting P50 latency 0.3 ms, P99 latency 104.9 ms.

Seeing lots of Queue Processing Failures of the "replication" type; probably because there are no other nodes.

CHANGES:
- closed browser, restarted postgres, restarted fatcatd to free up RAM
- cockroach start --insecure --store=fatcat-dev --host=localhost --cache=.25 --max-sql-memory=.25
- cockroach zone set --insecure .default -f -
      num_replicas: 1
      ^D
- SET CLUSTER SETTING kv.raft_log.synchronize=false; (disables raft fsync; unsafe outside testing)
- increased import batch size from the default (50) to 500

    time ./fatcat_import.py import-issn --batch-size 500 /home/bnewbold/code/oa-journals-analysis/upload-2018-04-05/journal_extra_metadata.csv

    real    1m10.531s
    user    0m5.204s
    sys     0m0.128s

Now have 12 SQL connections.

    cat /data/crossref/crossref-works.2018-01-21.badsample_100k.json | time parallel -j4 --round-robin --pipe ./fatcat_import.py import-crossref --batch-size 500 - /data/issn/20180216.ISSN-to-ISSN-L.txt

    => GETs are still very slow end-to-end (~1 sec); during this time cockroach is reporting very low latency
    => halted

It's the join performance (on lookups) that is killing things for release rev import:

    select * from container_rev where issnl = '0010-7824';
    Time: 2.16272ms

    select * from container_rev inner join container_ident on container_rev.id = container_ident.rev_id where issnl = '0010-7824';
    Time: 147.44647ms

CHANGED: upgraded to CCL v2.1.0-beta.20180904 @ 2018/09/04 03:51:03 (go1.10.3)

    time ./fatcat_import.py import-issn --batch-size 500 /home/bnewbold/code/oa-journals-analysis/upload-2018-04-05/journal_extra_metadata.csv

    real    0m59.903s
    user    0m4.716s
    sys     0m0.076s

    cat /data/crossref/crossref-works.2018-01-21.badsample_100k.json | time parallel -j4 --round-robin --pipe ./fatcat_import.py import-crossref --batch-size 500 - /data/issn/20180216.ISSN-to-ISSN-L.txt

    => still extremely slow API latency, though cockroach is reporting very
       fast (P50 latency 1.0 ms, P99 latency 4.7 ms)
    => maybe a timeout or something going on?
    => single-level joins from CLI are ~200 ms
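
Follow-up idea (not tested in the runs above): the join timing difference suggests the lookup scans `container_ident` rather than using an index on the join column. A minimal SQL sketch for diagnosing and maybe fixing this; the table/column names come from the timings above, the missing index is an assumption:

    -- check whether the join plan does a full scan of container_ident
    EXPLAIN SELECT *
      FROM container_rev
      JOIN container_ident ON container_rev.id = container_ident.rev_id
      WHERE container_rev.issnl = '0010-7824';

    -- if it does, an index on the join column should turn this into an
    -- index join (assumed missing here; not verified)
    CREATE INDEX ON container_ident (rev_id);

If that lands, re-run the single-level join timing from the CLI to see whether the ~200 ms drops.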
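For the API-vs-database latency gap, session tracing could show whether any of the ~1 sec per GET is spent inside Cockroach at all. A hedged sketch using the session-tracing syntax of the v2.1 beta above; the query is the same container lookup as before:

    SET tracing = on;
    SELECT * FROM container_rev
      JOIN container_ident ON container_rev.id = container_ident.rev_id
      WHERE container_rev.issnl = '0010-7824';
    SET tracing = off;
    SHOW TRACE FOR SESSION;

If the trace only accounts for a few milliseconds, the remaining latency is in fatcatd or connection handling rather than SQL execution, which would match the fast self-reported P50/P99 numbers.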