what the fuck is prometheus anyway? a thread about monitoring

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > YOSPOS > what the fuck is prometheus anyway? a thread about monitoring

ADINSX: Sep 9, 2003; Wanna run with my crew huh? Rule cyberspace and crunch numbers like I do?

thanks op didn't read

# ¿ Mar 8, 2019 01:28


ADINSX: Sep 9, 2003; Wanna run with my crew huh? Rule cyberspace and crunch numbers like I do?

pram posted:

lol no. it isnt. youve never used it for anything serious stfu. for example

1) kafka doesnt rebalance topics, ever. if a node is down thats it. the replica is just gone. it doesnt 'migrate' because this is 1998
2) kafka doesnt rebalance storage, ever. if you use JBOD it will just randomly put segments wherever it feels like. if a disk is full it just breaks
3) topic compaction impacts the entire cluster performance if its big enough. nothing you can do about it
4) will randomly break and require a full restart if it lags on the zookeeper state
https://issues.apache.org/jira/browse/KAFKA-2729
5) will effortlessly end up with two cluster controllers if one has degraded performance
6) will spend literal hours 'recovering' on a hard restart (kill) if you have compacted segments
7) replicating data to a replaced node will impact the entire cluster performance, hammering the socket server. and this cant be prevented BECAUSE
8) if you throttle performance it impacts the replica manager AND producers
9) leader rebalancing can still temporarily break producers

and more!

Huh, this is interesting. We were playing with kafka at old job because so many things support it and it has per-partition ordering. I knew it was a pain in the rear end to run one, but never knew the reasons why... so this is a lot of reasons.

We were working with Confluent to provide us with a managed instance... I guess they just do all this poo poo behind the scenes? I wonder how they'll do poo poo that actually affects cluster performance? Send the team a notification that it's gonna happen? Just never do it?

# ¿ Mar 9, 2019 06:54

ADINSX: Sep 9, 2003; Wanna run with my crew huh? Rule cyberspace and crunch numbers like I do?

lancemantis posted:

like spark had a super broken memory model for quite a while, lots of the Hadoop stack is brittle and needs a lot of babysitting

like the noteworthy parts of this stuff is it helps make some stuff feasible but it isn't "good"

When you refer to a broken memory model, is that for spark streaming stuff where the application might leak memory over time? Or does the problem come up in batch execution? I haven't done much spark stuff so I'm curious.

We were able to come up with a pretty solid BIG DATA pipeline using a lot of managed google stuff... but... it was managed by someone else, for all the reasons listed in the thread.

During my interview with the Kinesis team I got the distinct impression that a lot of their job is fighting fires; I realized it's probably a lot more fun to USE these managed systems than it is to work on them

# ¿ Mar 9, 2019 08:47
