Elos
Jan 8, 2009

current monitoring job status: i have 15 different unsee dashboards open and there's anything from 10 to 700 warnings/alerts bouncing around in each of them. i'm somehow supposed to keep an eye on all of these with only the two monitors i have

these are monitoring a bunch of datacenters that are a hellish mix of mesos-marathon microservice container stuff and poo poo running straight on the metal, located around the world. i'm connected to them through a collection of ssh-tunnels over connections and vpns that sometimes just decide to stop working.

there's a lot of gently caress A CONTAINER IS DOWN!!! alerts and then i go check it and everything is fine. bunch of alerts that once fired will hang around in the dashboard for 6 hours because ??? i'm never confident i'll catch the real problems with all this useless noise. there's grafana too but a lot of the stuff isn't configured right so you have to go massage some prometheus queries by hand to get the graphs you need

documentation is of course nonexistent and/or poo poo. configuring the monitoring is some other teams' job and a lot of the time all i can do is open a ticket and hope. when poo poo hits the fan i have only a vague idea who's responsible for what and who the hell i'm supposed to call so i get to wake up my team leader at four in the morning so he can figure it out

welp that's my story, back to lurking
