|
you need an append-only table in a database that keeps track of the transaction event stream, and you need to write to it in a more robust way than just using stdout or ddog's possibly inconsistent log sending
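a rough sketch of what that append-only table could look like, assuming sqlite just so the example runs anywhere. the table name `transaction_events`, the columns, and the helpers `open_event_log` / `record_event` are all made up for illustration, and a real setup would more likely lean on the permissions of whatever database you already run:

```python
import json
import sqlite3
import time


def open_event_log(path=":memory:"):
    """Open (or create) a database holding a hypothetical append-only event table."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS transaction_events (
            id             INTEGER PRIMARY KEY AUTOINCREMENT,
            transaction_id TEXT NOT NULL,
            event_type     TEXT NOT NULL,
            payload        TEXT NOT NULL,
            recorded_at    REAL NOT NULL
        );
        -- Triggers reject UPDATE and DELETE, so rows can only ever be added.
        CREATE TRIGGER IF NOT EXISTS transaction_events_no_update
        BEFORE UPDATE ON transaction_events
        BEGIN
            SELECT RAISE(ABORT, 'transaction_events is append-only');
        END;
        CREATE TRIGGER IF NOT EXISTS transaction_events_no_delete
        BEFORE DELETE ON transaction_events
        BEGIN
            SELECT RAISE(ABORT, 'transaction_events is append-only');
        END;
        """
    )
    return conn


def record_event(conn, transaction_id, event_type, payload):
    """Insert one event; the 'with' block commits on success and rolls back on error."""
    with conn:
        conn.execute(
            "INSERT INTO transaction_events "
            "(transaction_id, event_type, payload, recorded_at) "
            "VALUES (?, ?, ?, ?)",
            (transaction_id, event_type, json.dumps(payload), time.time()),
        )
```

the point is that the write either commits or raises, which is a stronger guarantee than hoping a log line made it to the aggregator.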
|
# ? Jun 11, 2020 23:32 |
|
I'm picturing someone calling a customer service rep with billing questions and the person on the line saying "hold on for one second, sir" and then SSHing into a server and grepping logs
|
# ? Jun 12, 2020 16:12 |
|
my CSR would write a perl script for that
|
# ? Jun 12, 2020 18:53 |
|
how much do you have to spend on datadog before they will negotiate on price, they're so expensive it's like candlesmeme.tweet
|
# ? Jul 22, 2020 09:49 |
|
CRIP EATIN BREAD posted:I'm picturing someone calling a customer service rep with billing questions and the person on the line saying "hold on for one second, sir" and then SSHing into a server and grepping logs
lol my first tech support job was this, but it was a shell command that rsh'd into the relevant router and grepped the output of some commands
nothing like letting your phone people touch production equipment 👩‍🍳👌
|
# ? Jul 22, 2020 10:05 |
|
dropped some points from OP into my phone screen, thx OP
|
# ? Jul 22, 2020 17:04 |
|
the code that's writing the log should also be writing to a database efb
|
# ? Jul 22, 2020 19:10 |
|
pointsofdata posted:how much do you have to spend on datadog before they will negotiate on price, they're so expensive it's like candlesmeme.tweet
they only offer "discounts" on a bulk (static) amount of metrics from what ive seen
|
# ? Jul 22, 2020 19:50 |
|
When im on call and we get a nagios alert (version probably 2014), i have to ssh into our production machines after getting an error code, search for a line with that error code in our log, get a second number that i can then use to trace a kibana log
|
# ? Jul 28, 2020 16:50 |
|
hell yeah
|
# ? Jul 28, 2020 16:53 |
|
When I worked at [ALLIED MASTERCOMPUTER] we had really insane metrics and monitoring, but making any changes to it involved editing a 5000 line xml file, and when you parsed it, the thing that checked its validity couldn't tell you which line errors were on, only that there was one. Hell. Prometheus is pretty awesome especially when you stick a fancy dash with grafana on it. Jurassic park poo poo.
|
# ? Jul 28, 2020 18:45 |
|
I'm fairly certain our datadog spend this month was higher than our (covid impacted) revenue
|
# ? Jul 28, 2020 19:35 |
|
a recruiter told swim datadog is creating a special task force to reduce cloud spend lmbo
|
# ? Jul 28, 2020 20:22 |
|
Scud Hansen posted:When I worked at [ALLIED MASTERCOMPUTER] we had really insane metrics and monitoring, but making any changes to it involved editing a 5000 line xml file, and when you parsed it, the thing that checked its validity couldn't tell you which line errors were on, only that there was one.
I prometheus operator. If there's an error in some PrometheusRules object some other gently caress deployed last week, it just never updates the prometheus config again until you find the object causing the error. At least there's a log somewhere that might give you a clue about which part is broken.
|
# ? Jul 28, 2020 23:27 |
|
motedek posted:a recruiter told swim datadog is creating a special task force to reduce cloud spend lmbo
poo poo guys what are we gonna do with all this data
|
# ? Jul 28, 2020 23:49 |
|
I’m beginning to regret entrusting my data to a dog
|
# ? Jul 29, 2020 00:35 |
|
what do dogs know about cost accounting anyway
|
# ? Jul 29, 2020 05:06 |
|
my understanding is that they basically have just a massive loving Kafka cluster and boy I’d love to hear some stories from their SREs
|
# ? Jul 29, 2020 05:07 |
|
ive recently freed myself from janitoring kafka. never again
|
# ? Jul 29, 2020 05:35 |
|
why would you use kafka for monitoring
|
# ? Jul 29, 2020 06:13 |
|
you use it to stream monitoring events genius
|
# ? Jul 29, 2020 07:02 |
|
i want to send structured logs/events at some service that I wont have to janitor. the events will mostly be an entity with one or more uuids, and then some info like state changes, maybe some timestamps or durations, idk. Nothing automated needs to consume this, just sometimes a person will do a search on the uuids to figure out what happened. I only need like 7 maybe 14 days of retention and the volume of incoming events wont be so high. Probably like 1gb/day, for sure less than 100. Only like 5 people will need access. what service should i use?
|
# ? Jul 29, 2020 07:40 |
|
pram posted:you use it to stream monitoring events genius
why not just use prometheus
Pardot posted:i want to send structured logs/events at some service that I wont have to janitor. the events will mostly be an entity with one or more uuids, and then some info like state changes, maybe some timestamps or durations, idk. Nothing automated needs to consume this, just sometimes a person will do a search on the uuids to figure out what happened. I only need like 7 maybe 14 days of retention and the volume of incoming events wont be so high. Probably like 1gb/day, for sure less than 100. Only like 5 people will need access.
an ELK stack
|
# ? Jul 29, 2020 07:45 |
|
Michaellaneous posted:why not just use prometheus
i answered your question
|
# ? Jul 29, 2020 07:51 |
|
Pardot posted:i want to send structured logs/events at some service that I wont have to janitor. the events will mostly be an entity with one or more uuids, and then some info like state changes, maybe some timestamps or durations, idk. Nothing automated needs to consume this, just sometimes a person will do a search on the uuids to figure out what happened. I only need like 7 maybe 14 days of retention and the volume of incoming events wont be so high. Probably like 1gb/day, for sure less than 100. Only like 5 people will need access.
given just a gigabyte a day and only occasional use, it could honestly just be a set of hourly text files with regular backups that they would grep
if you're feeling fancy then write as a gzip stream and tell them to use zgrep
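the hourly gzip files idea could be sketched roughly like this, assuming one JSON object per line. the file naming scheme and the helpers `append_event` / `find_events` are invented for the example, and backups and retention cleanup are left out:

```python
import gzip
import json
from datetime import datetime, timezone
from pathlib import Path


def append_event(log_dir, event, now=None):
    """Append one event as a JSON line to the gzip file for the current UTC hour."""
    now = now or datetime.now(timezone.utc)
    path = Path(log_dir) / now.strftime("events-%Y%m%d-%H.jsonl.gz")
    path.parent.mkdir(parents=True, exist_ok=True)
    # Each append adds a new gzip member to the file; Python's gzip module
    # and zgrep both read concatenated members as one continuous stream.
    with gzip.open(path, "at", encoding="utf-8") as f:
        f.write(json.dumps(event, sort_keys=True) + "\n")
    return path


def find_events(log_dir, needle):
    """Return every event whose raw line contains the needle, e.g. a uuid."""
    matches = []
    for path in sorted(Path(log_dir).glob("events-*.jsonl.gz")):
        with gzip.open(path, "rt", encoding="utf-8") as f:
            for line in f:
                if needle in line:
                    matches.append(json.loads(line))
    return matches
```

`find_events` is only there for people who don't want a shell; `zgrep some-uuid events-*.jsonl.gz` does the same search on the same files.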
|
# ? Jul 29, 2020 08:43 |
|
Pardot posted:i want to send structured logs/events at some service that I wont have to janitor. the events will mostly be an entity with one or more uuids, and then some info like state changes, maybe some timestamps or durations, idk. Nothing automated needs to consume this, just sometimes a person will do a search on the uuids to figure out what happened. I only need like 7 maybe 14 days of retention and the volume of incoming events wont be so high. Probably like 1gb/day, for sure less than 100. Only like 5 people will need access.
graylog or ELK probably
|
# ? Jul 29, 2020 10:45 |
|
Pardot posted:i want to send structured logs/events at some service that I wont have to janitor. the events will mostly be an entity with one or more uuids, and then some info like state changes, maybe some timestamps or durations, idk. Nothing automated needs to consume this, just sometimes a person will do a search on the uuids to figure out what happened. I only need like 7 maybe 14 days of retention and the volume of incoming events wont be so high. Probably like 1gb/day, for sure less than 100. Only like 5 people will need access.
Azure Monitor/Application Insights
|
# ? Jul 29, 2020 15:32 |
|
Ploft-shell crab posted:I’m beginning to regret entrusting my data to a dog
|
# ? Jul 29, 2020 18:45 |
|
Pardot posted:i want to send structured logs/events at some service that I wont have to janitor. the events will mostly be an entity with one or more uuids, and then some info like state changes, maybe some timestamps or durations, idk. Nothing automated needs to consume this, just sometimes a person will do a search on the uuids to figure out what happened. I only need like 7 maybe 14 days of retention and the volume of incoming events wont be so high. Probably like 1gb/day, for sure less than 100. Only like 5 people will need access.
honestly? honeycomb's free tier might work out well for you
edit - assuming you want to share a password, I think they have a user limit on the free account
|
# ? Aug 3, 2020 12:53 |
|
uncurable mlady posted:honestly? honeycomb's free tier might work out well for you
I’ll take a closer look at that. I know a couple of people that work there but figured it was all more about “observability” whatever that is than just looking at a few logs. I’m trying out timber.io right now and it seems exactly right.
|
# ? Aug 18, 2020 07:31 |
|
promql sucks so bad. wishing I had gone with influxdb for its sql like language
|
# ? Aug 19, 2020 02:43 |
|
Why yes I love having to take a delta and then sum, instead of a sum then a delta. I like having metrics get stuck whenever I redeploy, it's great.
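for anyone wondering why the order matters, here's a toy illustration with made-up numbers. `increase` below is a simplified stand-in for reset-aware counter handling, not PromQL's actual implementation (no extrapolation, no time windows). one instance gets redeployed and its counter restarts near zero:

```python
def increase(samples):
    """Reset-aware increase across consecutive counter samples (simplified)."""
    total = 0.0
    for prev, cur in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero, so the new value
        # itself is counted as the increase for that step.
        total += cur - prev if cur >= prev else cur
    return total


# Hypothetical per-instance request counters; instance_b is redeployed
# between its second and third sample.
instance_a = [100, 110, 120, 130]
instance_b = [500, 510, 5, 15]

# Delta per series first, then sum: each reset is handled per instance.
delta_then_sum = increase(instance_a) + increase(instance_b)

# Sum first, then delta: the reset shows up as a drop in the combined
# series and the whole summed value gets mistaken for new traffic.
summed = [a + b for a, b in zip(instance_a, instance_b)]
sum_then_delta = increase(summed)

print(delta_then_sum)  # 55.0
print(sum_then_delta)  # 165.0
```

the real increase was 55 requests; summing first reports 165 because instance_a's entire running total gets counted as if it had just happened.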
|
# ? Aug 19, 2020 02:48 |
|
my stepdads beer posted:promql sucks so bad. wishing I had gone with influxdb for its sql like language
don't worry the influxdb one is also bad
|
# ? Aug 19, 2020 07:50 |
|
we've been using Xymon for monitoring our systems for years and it's pretty dated and it's got rather out of hand as we've grown. i may have convinced people that we should migrate to something more modern and less garbage... but i have no idea what. can anyone recommend a monitoring solution that would be suitable for stuff like checking connectivity to a couple of hundred servers, databases and http endpoints? we've got some complicated poo poo that alerts based on log file parsing but i'd be happy to just get the important "server is down" alerts onto something more modern as a starting point.
|
# ? Sep 4, 2020 12:30 |
|
if you're up for a bit of learning, set up prometheus with alertmanager + blackbox exporter. prometheus and promql are annoying and it isn't a turn-key solution, it requires a bit of setup, but it works and it scales well.
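to make the "server is down" part concrete, this is roughly the kind of check a TCP connectivity probe performs. it is not the blackbox exporter's code or config, just a standalone sketch; `tcp_probe` and `down_targets` are hypothetical names, and scheduling, retries and alert routing are what prometheus and alertmanager would add on top:

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and DNS resolution failures.
        return False


def down_targets(targets, timeout=2.0):
    """Return the (host, port) pairs that did not accept a connection."""
    return [(host, port) for host, port in targets if not tcp_probe(host, port, timeout)]
```

a single failed connect is noisy on its own, which is a big part of why you'd want alerting rules that wait for a target to stay down for a while before paging anyone.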
|
# ? Sep 4, 2020 13:20 |
|
tfw you are seeing degraded hardware issues in aws
|
# ? Apr 29, 2022 17:24 |
|
who up observing they apps
|
# ? Feb 10, 2024 15:58 |
|
kitten emergency posted:who up observing they apps
who needs they prometheussy metriced
|
# ? Feb 10, 2024 17:32 |
|
datadog is so expensive
|
# ? Feb 10, 2024 21:09 |
|
I just ping poo poo and if it doesn't answer then I call the vendor.
|
# ? Feb 10, 2024 21:12 |