author | Bryan Newbold <bnewbold@robocracy.org> | 2018-11-14 18:36:48 -0800 |
---|---|---|
committer | Bryan Newbold <bnewbold@robocracy.org> | 2018-11-14 18:36:48 -0800 |
commit | cb8067ca6e4abe3bea499fc45ece05a454c794d6 (patch) | |
tree | 8e005e4edb7a58f60547976dab2017be2e4fb1ef /notes | |
parent | 6bcd62005dd7eab94744f5f368d4724732bcfbd9 (diff) | |
more kafka performance notes
Diffstat (limited to 'notes')
-rw-r--r-- | notes/performance/kafka_pipeline.txt | 16 |
1 files changed, 15 insertions, 1 deletions
diff --git a/notes/performance/kafka_pipeline.txt b/notes/performance/kafka_pipeline.txt
index f0862d89..0a503a18 100644
--- a/notes/performance/kafka_pipeline.txt
+++ b/notes/performance/kafka_pipeline.txt
@@ -11,7 +11,21 @@ messages/second.
 Because this worker consumes from 8x partitions, I have a feeling it might be
 consumer group related. kafka-manager shows "0% coverage" for this topic. Note
 that this is a single worker process.
-_consumer_offsets is seeing about 36 messages/sec.
+`_consumer_offsets` is seeing about 36 messages/sec.
 
 Oh, looks like I just needed to enable auto_commit and tune parameters in
 pykafka!
+
+That helped reduce `_consumer_offsets` churn, significantly, but didn't
+increase throughput (or not much). Might want to switch to kafka connect
+(presuming it somehow does faster/bulk inserts/indexing), with a simple worker
+doing the transforms. Probably worth doing a `> /dev/null` version of the
+worker first (with a different consumer group) to make sure the bottlneck isn't
+somewhere else.
+
+Another thing to try is more kafka fetch threads.
+
+elastic-release python processing is at 66% (of one core) CPU! and elastic at
+~30%. Huh.
+
+But, in general, "seems to be working".
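A rough sketch of the `> /dev/null` worker idea from the notes above: consume with pykafka under a separate consumer group, throw every message away, and report messages/sec, so the number can be compared against the real worker. This is not the fatcat worker code. The broker address, topic name, group name, and all tuning values are placeholders made up for illustration, and `managed=True` assumes a broker new enough to handle group membership itself (otherwise pykafka's balanced consumer needs ZooKeeper).

```python
import time


def consumer_settings(group):
    """Keyword arguments for pykafka's Topic.get_balanced_consumer().

    The values are illustrative guesses, not the tuning used in these notes.
    """
    return {
        "consumer_group": group,
        "managed": True,                   # assumption: Kafka-managed group membership
        "auto_commit_enable": True,        # commit offsets periodically in the background
        "auto_commit_interval_ms": 30000,  # rather than committing very frequently
        "num_consumer_fetchers": 2,        # "more kafka fetch threads"
        "consumer_timeout_ms": 5000,       # stop iterating after 5 s without messages
    }


def drain(messages, limit=None, clock=time.monotonic):
    """Consume and discard messages; return (count, messages_per_second)."""
    start = clock()
    count = 0
    for _msg in messages:
        count += 1
        if limit is not None and count >= limit:
            break
    elapsed = clock() - start
    rate = count / elapsed if elapsed > 0 else float("inf")
    return count, rate


def main():
    # Imported here so the helpers above can be used without pykafka installed.
    from pykafka import KafkaClient

    client = KafkaClient(hosts="localhost:9092")   # assumed broker address
    topic = client.topics[b"example-topic"]        # hypothetical topic name
    # A different consumer group, so the real worker's offsets are untouched.
    consumer = topic.get_balanced_consumer(**consumer_settings(b"null-sink-benchmark"))
    try:
        count, rate = drain(consumer, limit=100000)
        print("%d messages, %.1f msg/sec" % (count, rate))
    finally:
        consumer.stop()
```

Calling `main()` runs the benchmark against a live broker. If the null-sink rate is not much higher than the real worker's, the bottleneck is probably on the Kafka/consumer side and not in the transform or the elastic inserts.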