what the fuck is prometheus anyway? a thread about monitoring

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > YOSPOS > what the fuck is prometheus anyway? a thread about monitoring

Guy Axlerod: Dec 29, 2008

CRIP EATIN BREAD posted:

we use the opentracing api for everything and use jaeger as our backend which feeds it into elasticsearch

its cool when someone used a constant sampler (samples every trace) and we were producing 10gb of traces each day in our test environment.

pro-tip: use a probabilistic sampler or something that says "sample 5% of traces" instead.

Yeah, I set up tracing in our app, and the dev team was like "Can we have 100% sample rate in staging?" and also the dev team "We want to do some load tests in staging."
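
For anyone wondering what "sample 5% of traces" boils down to, here is a minimal sketch. It is not the Jaeger client's actual code; `should_sample`, `sampled_fraction`, and the 64-bit trace id space are assumptions made up for illustration. The idea is to keep a trace only when its id lands in the lowest `rate` fraction of the id space, so every service that sees the same trace id makes the same decision.

```python
import random

TRACE_ID_SPACE = 2**64  # assumption: 64-bit trace ids


def should_sample(trace_id: int, rate: float) -> bool:
    """Keep a trace if its id falls in the lowest `rate` fraction of the id space."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0.0 and 1.0")
    return trace_id < int(rate * TRACE_ID_SPACE)


def sampled_fraction(rate: float, n: int, seed: int = 0) -> float:
    """Feed n random trace ids through the sampler and report the fraction kept."""
    rng = random.Random(seed)
    kept = sum(should_sample(rng.getrandbits(64), rate) for _ in range(n))
    return kept / n


if __name__ == "__main__":
    # A constant sampler is just rate=1.0: everything is kept.
    print("rate 1.00 keeps", sampled_fraction(1.0, 100_000))
    # A probabilistic sampler at 5% keeps roughly one trace in twenty.
    print("rate 0.05 keeps", sampled_fraction(0.05, 100_000))
```

At a 100% rate a load test in staging writes every single trace, which is how you end up with 10gb a day.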

Jul 10, 2019 20:31

Guy Axlerod: Dec 29, 2008

Yeah, it can be set by env var, but I don't trust them to not gently caress up.

We do have some stuff set to 100%, and some stuff set to 1/1000000, while most is at 10% or so.
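
One way to let people set the rate by env var without trusting them is to clamp whatever they give you. This is only a sketch: the variable name `TRACE_SAMPLE_RATE`, the `sample_rate_from_env` helper, and the ceiling parameter are all hypothetical and not part of any tracing client. The 10% default is taken from "most is at 10% or so".

```python
import math
import os

DEFAULT_RATE = 0.10  # fallback when the variable is missing or unparseable


def sample_rate_from_env(env=None, var="TRACE_SAMPLE_RATE",
                         default=DEFAULT_RATE, ceiling=1.0):
    """Read a sampling rate from the environment and clamp it to [0, ceiling]."""
    env = os.environ if env is None else env
    raw = env.get(var)
    if raw is None:
        return default
    try:
        rate = float(raw)
    except ValueError:
        # Someone typed something that is not a number; ignore it.
        return default
    if math.isnan(rate):
        return default
    return min(max(rate, 0.0), ceiling)


if __name__ == "__main__":
    # Staging asks for 100%, but the service caps it at 25%.
    print(sample_rate_from_env({"TRACE_SAMPLE_RATE": "1.0"}, ceiling=0.25))
    # A very chatty service can still be turned down to one in a million.
    print(sample_rate_from_env({"TRACE_SAMPLE_RATE": "1e-6"}))
```

The ceiling lives in code review rather than in the deploy config, so a bad env var can only make a service quieter than intended, never louder.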

Jul 10, 2019 22:39

Guy Axlerod: Dec 29, 2008

I used Papertrail at one point; they had an option to save a copy of everything you sent them to S3. Maybe Datadog has something similar?

If you're using this for billing, it seems like you need something more transactional. What if the log never makes it to dd?

Jun 9, 2020 18:43

Guy Axlerod: Dec 29, 2008

Scud Hansen posted:

When I worked at [ALLIED MASTERCOMPUTER] we had really insane metrics and monitoring, but making any changes to it involved editing a 5000 line xml file, and when you parsed it, the thing that checked its validity couldn't tell you which line errors were on, only that there was one.

Hell.

Prometheus is pretty awesome especially when you stick a fancy dash with grafana on it. Jurassic park poo poo.

I prometheus operator. If there's an error in some PrometheusRules object some other gently caress deployed last week, it just never updates the prometheus config again until you find the object causing the error. At least there's a log somewhere that might give you a clue about which part is broken.

Jul 28, 2020 23:27

Guy Axlerod: Dec 29, 2008

Why yes I love having to take a delta and then sum, instead of a sum then a delta. I like having metrics get stuck whenever I redeploy, it's great.
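
For anyone who hasn't been bitten yet, here is a toy illustration of why the order matters, assuming the complaint is about cumulative counters that reset to zero when a process restarts. The pod names and sample values are made up, and `increase` here is a simplified stand-in for per-series reset handling, not Prometheus's actual implementation (which also extrapolates over the window).

```python
def increase(samples):
    """Total increase of one counter series, treating any drop as a reset to zero."""
    total = 0
    for prev, cur in zip(samples, samples[1:]):
        # After a reset the counter restarts from zero, so `cur` is the whole increase.
        total += cur - prev if cur >= prev else cur
    return total


def delta_then_sum(series):
    """Take the increase of each series on its own, then add the results."""
    return sum(increase(s) for s in series)


def sum_then_delta(series):
    """Add the raw counters together first, then diff the combined series."""
    summed = [sum(values) for values in zip(*series)]
    return summed[-1] - summed[0]


if __name__ == "__main__":
    pod_a = [100, 110, 120, 130]  # never restarted
    pod_b = [500, 510, 5, 15]     # redeployed between the 2nd and 3rd scrape
    print("delta then sum:", delta_then_sum([pod_a, pod_b]))  # 55
    print("sum then delta:", sum_then_delta([pod_a, pod_b]))  # -455
```

Once the counters are summed, the reset in one series can no longer be told apart from a real drop, so the per-series delta has to come first.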

Aug 19, 2020 02:48
