what the fuck is prometheus anyway? a thread about monitoring

in a well actually: Jan 26, 2011; dude, you gotta end it on the rhyme

i really enjoyed when honeycomb did a demo with their tool on the real world example of a service outage on their platform to discover that one of their servers ran out of disk space, something that a loving nagios check would have picked up 20 years ago

# ¿ Feb 23, 2019 20:27

in a well actually: Jan 26, 2011; dude, you gotta end it on the rhyme

prometheus is pretty cool but shoehorning everything into a pull model annoys me

# ¿ Feb 23, 2019 20:29

in a well actually: Jan 26, 2011; dude, you gotta end it on the rhyme

r u ready to WALK posted:

https://www.cacti.net is extremely underrated for generic snmp poo poo

the server i set up 10 years ago at work still works pretty much maintenance free

it's ok if what you need to do is turn snmp (or snmplikes) into browser-viewable rrds on one host

also make sure you stay on top of the security updates or keep it inaccessible from the web

# ¿ Feb 23, 2019 20:42

in a well actually: Jan 26, 2011; dude, you gotta end it on the rhyme

tracing seems to be something you write into your web app

anybody doing non-http tracing, or collecting trace data from apps you don't write yourself that don't have native tracing support?

# ¿ Feb 23, 2019 20:56

in a well actually: Jan 26, 2011; dude, you gotta end it on the rhyme

uncurable mlady posted:

i wouldn't say it's just in your web app - our entire application is instrumented from webapp all the way down to the db. that said, it's kind of a massive pain in the rear end right now to get trace data from resources you don't directly manage because not everyone uses opentracing and even if they did, wire formats are very tracer dependent. that said, w3c is working on a tracecontext/tracedata specification that's intended to address this problem by standardizing headers and wire formats for context so you could have a situation where you're using some sort of managed ingress proxy or w/e and it'd be able to create spans as part of a trace that started on a client, etc. could also see the same thing at a managed db where the database service on the provider side is able to pick up traces incoming from the application and emit spans that you'd collect.

are you using tracing now? something home-brewed, or opentracing/opencensus?

not really tracing today. I have event data in ES that looks like spans, I think? (request A started routine P on node N, and another event when it completes), and time series data from node N and resources R, S, T that are slightly to tightly correlated (R can tag all requests from P, S can only show traffic from N, and T can only show high-level perf indicators.)

What I want is something to take this structured Elastic data, look at what resources are directly or indirectly used by that request, and show relevant TS data from Prometheus and log data from ES. If T crashes I want to be able to look at what requests are active in the system. Given 5 crashes, I want to bisect that down to see that requests like A were the only common requests in all five crashes; I'd also like to see that A are taking longer than normal because resource T is reporting high utilization, etc.
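a rough sketch of the "bisect the crashes" part, assuming the start/complete events have already been paired into spans. the `Span` class, the field names, and the sample data are all made up for illustration; nothing here talks to ES or Prometheus:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical span built from a start event and a completion event."""
    request_type: str  # e.g. "A"
    start: float       # epoch seconds
    end: float


def active_at(spans, t):
    """Return the request types with at least one span open at time t."""
    return {s.request_type for s in spans if s.start <= t <= s.end}


def common_to_all_crashes(spans, crash_times):
    """Intersect the sets of request types that were active at each crash."""
    active_sets = [active_at(spans, t) for t in crash_times]
    return set.intersection(*active_sets) if active_sets else set()


# Made-up data: five crashes of resource T, with A active during every one.
spans = [
    Span("A", 0, 10), Span("B", 2, 4),
    Span("A", 20, 30), Span("C", 25, 28),
    Span("A", 40, 50), Span("B", 45, 47),
    Span("A", 60, 70), Span("C", 61, 63),
    Span("A", 80, 90),
]
crash_times = [3, 26, 46, 62, 85]

print(common_to_all_crashes(spans, crash_times))  # {'A'}
```

this only answers "what was in flight at every crash"; it says nothing about whether A caused the crash, and it still needs the TS and log data pulled in alongside it.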

# ¿ Feb 24, 2019 17:20

in a well actually: Jan 26, 2011; dude, you gotta end it on the rhyme

my stepdads beer posted:

my prometheus keeps corrupting its data because it's on an nfs share. that's fine because i only use it for some pretty graphs sometimes. prom really suffers from not having good examples for what I think are common scenarios.

anyway we use cacti for all our network stuff because of inertia. prom+snmp-exporter+grafana was tedious as hell. nagios for our alerting and it's OK but kind of a pain re: config files.

for "tracing" we use sentry for catching our dumb php apps' various issues and fatals

don't use nfs for databases

# ¿ Mar 2, 2019 10:30
