pram
Jun 10, 2001
my company uses splunk for logs and datadog for tracing/monitoring/alerts. these both cost millions of dollars a year. thanks for listening!

pram
Jun 10, 2001
nagios is extremely poo poo

pram
Jun 10, 2001
just lol that you still have to restart the whole loving thing to update anything

pram
Jun 10, 2001
also the latest version of opsview is shiiiiiitttttt

pram
Jun 10, 2001
kafka is about 1000x shittier so count your blessings

pram
Jun 10, 2001

Share Bear posted:

oh no! promotheus.

pram
Jun 10, 2001

Progressive JPEG posted:

kafka is extremely good, maybe you're just holding it wrong

lol no. it isnt. youve never used it for anything serious stfu. for example


1) kafka doesnt rebalance topics, ever. if a node is down thats it. the replica is just gone. it doesnt 'migrate' because this is 1998
2) kafka doesnt rebalance storage, ever. if you use JBOD it will just randomly put segments wherever it feels like. if a disk is full it just breaks
3) topic compaction impacts the entire cluster performance if its big enough. nothing you can do about it
4) will randomly break and require a full restart if it lags on the zookeeper state
https://issues.apache.org/jira/browse/KAFKA-2729
5) will effortlessly end up with two cluster controllers if one has degraded performance
6) will spend literal hours 'recovering' on a hard restart (kill) if you have compacted segments
7) replicating data to a replaced node will impact the entire cluster performance, hammering the socket server. and this cant be prevented BECAUSE
8) if you throttle performance it impacts the replica manager AND producers
9) leader rebalancing can still temporarily break producers


and more!
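
To make point 1 concrete, here is a rough Python sketch of what "the replica is just gone" looks like from the outside. The topic name, broker ids, and the `assignment` dict are all made up for illustration; on a real cluster you would pull the replica assignment from the cluster metadata instead. It only reports the problem, since nothing in the list above gets fixed automatically.

```python
def under_replicated(assignment, live_brokers):
    """Map each (topic, partition) to the replica broker ids that are not alive.

    assignment:   {(topic, partition): [broker ids holding a replica]}
    live_brokers: iterable of broker ids currently up
    Partitions whose replicas are all on live brokers are left out.
    """
    live = set(live_brokers)
    result = {}
    for topic_partition, replicas in assignment.items():
        missing = [broker for broker in replicas if broker not in live]
        if missing:
            result[topic_partition] = missing
    return result


# Hypothetical four-broker cluster where broker 4 has died.
assignment = {
    ("events", 0): [1, 2, 3],
    ("events", 1): [2, 3, 4],
    ("events", 2): [3, 4, 1],
}
print(under_replicated(assignment, live_brokers=[1, 2, 3]))
```

Every partition it reports stays one replica short until somebody reassigns it by hand.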

pram fucked around with this message at 05:54 on Mar 9, 2019

pram
Jun 10, 2001
people dont believe that kafka doesnt migrate anything or rebalance anything. because elasticsearch does so people assume something like kafka (which is pure magic ftw) does

but it literally doesnt. its all manual. if you want to reassign a partition replica, you have to do it yourself with the cli tools or some 3rd party thing. and the operation itself isnt transparent, it actually impacts all the consumers and producers while its doing it (tbf es does this too). its loving garbage
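
For anyone who hasnt had to do it, "do it yourself" roughly means writing out a reassignment plan by hand. A minimal Python sketch, assuming a broker was replaced under a new id: the topic name, broker ids, and the `reassignment_plan` helper are hypothetical, and the output follows the JSON layout that `kafka-reassign-partitions` takes via `--reassignment-json-file`.

```python
import json


def reassignment_plan(assignment, dead_broker, new_broker):
    """Build a reassignment plan that swaps dead_broker for new_broker.

    assignment: {(topic, partition): [broker ids holding a replica]}
    Partitions that never lived on dead_broker are skipped, as are ones
    where new_broker already holds a replica (those need another target).
    """
    moves = []
    for (topic, partition), replicas in sorted(assignment.items()):
        if dead_broker not in replicas or new_broker in replicas:
            continue
        swapped = [new_broker if b == dead_broker else b for b in replicas]
        moves.append({"topic": topic, "partition": partition, "replicas": swapped})
    return {"version": 1, "partitions": moves}


# Hypothetical cluster: broker 3 died and was replaced by broker 5.
assignment = {
    ("events", 0): [1, 2, 3],
    ("events", 1): [2, 3, 4],
    ("events", 2): [1, 2, 4],
}
print(json.dumps(reassignment_plan(assignment, dead_broker=3, new_broker=5), indent=2))
```

You would save that output to a file and hand it to the cli tool with `--execute`, then poll with `--verify` until it finishes. The copy still runs through the live brokers, which is where the impact on producers and consumers comes from.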

pram
Jun 10, 2001

ADINSX posted:

Huh, this is interesting. We were playing with kafka at old job because so many things support it and it has per-partition ordering. I knew it was a pain in the rear end to run one, but never knew the reasons why... so this is a lot of reasons.

We were working with Confluent to provide us with a managed instance... I guess they just do all this poo poo behind the scenes? I wonder how they'll do poo poo that actually affects cluster performance? Send the team a notification that it's gonna happen? Just never do it?

yes we use confluent (the platform, not their cloud) and they said they basically made a bunch of proprietary additions for their managed service. in that sense its like redislabs cloud vs 'redis' in that you cant replicate it with off the shelf stuff (or even their own provided tools like replicator lol)

amazon msk is straight up vanilla kafka and i think its a big joke right now. same with the azure one, i think its literally just the hortonworks ambari kafka

if you have a single topic you wont have many issues. if you run multi-tenant clusters where people are doing compaction and exactly-once and theres 10000 different consumer groups then its a total shitshow

pram
Jun 10, 2001
'durrr kafka works fine on my laptop in docker'

pram
Jun 10, 2001
software just wants to be free - jeff bezos, free software advocate

pram
Jun 10, 2001
kafka is so very bad i cant even

pram
Jun 10, 2001
im reminded of this beauty of an error, for example. kafka partitions would literally just get corrupted and the broken log file would prevent the broker from STARTING

https://issues.apache.org/jira/browse/KAFKA-3919

the only way you could fix it was to go delete the actual file sitting on the disk. if it affected multiple brokers (because of unclean leader election) hope you like data loss

and of course during all this you're totally offline so its a huge outage. epic and ftw

pram
Jun 10, 2001
datadog is a metrics/alerting service you pay $$$ for. its nice

pram
Jun 10, 2001
nothing more money wont fix

pram
Jun 10, 2001
yes kafka is loving garbage software

pram
Jun 10, 2001

pointsofdata posted:

how much do you have to spend on datadog before they will negotiate on price, they're so expensive it's like candlesmeme.tweet

they only offer "discounts" on a bulk (static) amount of metrics from what ive seen

pram
Jun 10, 2001
ive recently freed myself from janitoring kafka. never again

pram
Jun 10, 2001
you use it to stream monitoring events genius

pram
Jun 10, 2001

Michaellaneous posted:

why not just use prometheus


i answered your question
