Continuous Integration/build engineering/devops thread


Hadlock: Nov 9, 2004

I've been using podman symlinked as docker on my personal workstation and personal devices since like 2020, or maybe '21. I usually forget until I run docker --help and it tells me in the documentation. Maybe 2019 on personal devices, before it formally went GA

I don't know what the competition looks like because I haven't run into any situation to need to look elsewhere, other stuff might be better

# ? Aug 26, 2023 21:00

my homie dhall: Dec 9, 2010; honey, oh please, it's just a machine

what’s wrong with docker?

# ? Aug 27, 2023 04:00

Trapick: Apr 17, 2006

my homie dhall posted:

what’s wrong with docker?

"Commercial use of Docker Desktop at a company of more than 250 employees OR more than $10 million in annual revenue requires a paid subscription"

# ? Aug 27, 2023 04:42

The Iron Rose: May 12, 2012; Cat Army

Trapick posted:

"Commercial use of Docker Desktop at a company of more than 250 employees OR more than $10 million in annual revenue requires a paid subscription"

Eminently reasonable if unfortunately unenforceable. I don’t have a problem with making enterprises pay for revolutionary new technologies.

# ? Aug 27, 2023 05:49

Gucci Loafers: May 20, 2006; Ask yourself, do you really want to talk to pair of really nice gaudy shoes?

necrobobsledder posted:

His idea is that innovation centers like Xerox PARC and similar are a result of wildly profitable companies having the luxury and lack of competition to invest in long term R&D instead of trying hard to make quarterly numbers constantly. It’s not 100% wrong but there’s a lot of horrible, dire implications for humanity with every variable that supports the argument.

Gotcha, this kind of makes sense. Microsoft Research is kind of an interesting example of this too.

# ? Aug 27, 2023 06:01

Trapick: Apr 17, 2006

The Iron Rose posted:

Eminently reasonable if unfortunately unenforceable. I don’t have a problem with making enterprises pay for revolutionary new technologies.

I don't have a problem with it either, just saying that's why a lot of folks switched off it. My work didn't want to pay for docker licenses, so now we use podman.

# ? Aug 27, 2023 07:00

NihilCredo: Jun 6, 2011; iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

my homie dhall posted:

what’s wrong with docker?

Docker is fine, but podman is very much an incremental improvement - rootless by default, does not need a daemon, integrates well with systemd.

# ? Aug 27, 2023 07:49

Hadlock: Nov 9, 2004

Yeah podman uses a more traditional execution model, so it behaves the way you'd expect a traditional unix process to. You just run the binary as local user, no usermod -aG bullshit to access it

Run a container via docker and then run a container via podman then run 'tree' and observe the difference

And also no semi predatory licensing bullshit

# ? Aug 27, 2023 08:16

xzzy: Mar 5, 2009

Hadlock posted:

And also no semi predatory licensing bullshit

For now. Keep in mind who owns it! :v:

# ? Aug 27, 2023 10:06

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Sunday Career Chat

I know this is an overly broad question but is it possible to work in DevOps without touching much application code? I'm self-teaching Docker and K8s at the moment and I'm enjoying it but my eyes start to glaze over when I start looking at application code. I am able to read and write basic apps but I don't particularly enjoy it. I really enjoy scripting/automating and have more fun putting the lego blocks together.

For those of you working professionally in this field, are you writing lots of .NET/Node/Java etc? Is it just inherent to the DevOps job? Maybe something SRE is more my speed?

(I ask here because I'm a data analyst and don't personally know anyone doing this work)

# ? Aug 28, 2023 00:12

Docjowles: Apr 9, 2009

Hughmoris posted:

Sunday Career Chat

I know this is an overly broad question but is it possible to work in DevOps without touching much application code? I'm self-teaching Docker and K8s at the moment and I'm enjoying it but my eyes start to glaze over when I start looking at application code. I am able to read and write basic apps but I don't particularly enjoy it. I really enjoy scripting/automating and have more fun putting the lego blocks together.

For those of you working professionally in this field, are you writing lots of .NET/Node/Java etc? Is it just inherent to the DevOps job? Maybe something SRE is more my speed?

(I ask here because I'm a data analyst and don't personally know anyone doing this work)

It depends what you mean by "work in DevOps" but yes there are plenty of companies (hell I would wager it is still most companies?) that distinguish between devs writing product code and ops building the infrastructure to deploy, run, monitor, and secure it. Titles in this field are a poo poo show you are not going to find a ton of consistency between companies as to what an SRE or "devops engineer" or a platform engineer does. If anything, the original version of SRE sounds like the opposite of what you want. It's meant to be a software engineer who happens to have some operational chops. They would be more likely to dive into the application code to add instrumentation and improve reliability and be working in the same languages and toolchains as the feature teams. A title like DevOps Engineer* is more likely to be focused on building and automating infrastructure.

I've spent my entire career in ops type roles and no, I do not ever really write .NET or Java or Node. Python is by far what I've written the most, followed by Go (and Bash of course if you count that).

It's annoying but you'll need to cast a wide net across all relevant job titles and then dive into the description to see if it's actually what you want to do. And even then it may not be clear until you start down the road of interviewing.

* People sometimes get hung up on the term DevOps Engineer or DevOps Team since if a company is "doing DevOps" engineers span both roles and the name is nonsensical. But back in reality there are thousands of job postings for DevOps Engineers that are a modern spin on a sysadmin who understands automation tools and can code to some degree

# ? Aug 28, 2023 00:54

The Iron Rose: May 12, 2012; Cat Army

You absolutely need to know how to code *a little bit* to be a good or great software operations engineer. You don’t need to know to code at all to be hired and be mediocre. Caveat that I’ve only worked at <2000 person firms as an SRE/platform/devops/cloud/bullshit engineer.

You won’t be effective as an operations engineer if you don’t understand what the applications you support do. You don’t need to get very deep into style or composition best practices or whatever, but being able to look at code and see where an app connects to a database or consumes from a queue, APM instrumentation, understanding what makes an application healthy, where a service interacts with other services or experiences backpressure, anything to do with authentication... these are all very critical components. I’ve written some code touching all those areas at least once in the past year, ranging from bash startup/readiness probes, to MRs to shared libraries, to terraform changes to vault or secret layers, and so on.

Ultimately, the more you understand the architecture of your services from the perspective of the service and your customers, the better. This applies whether you’re in devops, security, reliability, whatever platform means… there is no role within an engineering organization where knowing the above will not make you more effective at working with your developers. If you’re an SRE, some basic coding knowledge is as close to a hard requirement as you can get.

If you’re just a gatekeeper to the cloud/terraform you don’t need to know much, but there’s an upper limit there too, and that role isn’t as valuable to an organization anyways. It hasn’t happened often, but the three years I’ve been supporting k8s I’ve needed to look at the kube source code once a year or thereabouts. Similarly with many of the third party applications I work with. You should be meeting with your dev team and be aware of how their work interacts with the shared services you support and the other services your devs write. When someone says “<service>” is down, you should know how that impacts internal teams and your customers.

Learning to code is not mandatory to be hired (maybe 35% of interviews have coding tests), and maybe half the devops people I’ve worked with came from non-dev backgrounds. The ones who could code at least a little bit were all fine. The ones who came from dev backgrounds were universally fantastic. The ones who couldn’t code, or couldn’t code well and otherwise didn’t understand that their job was about making developers productive, were to a man mediocre or actively poor at their jobs, with only a handful of exceptions.

I really want to reiterate you don’t need to code very well, but you really do need to at least know how to read code and write enough to support automation in your language of choice. your devs won’t like you very much if you can’t speak their language, and you need to understand what their services do to support and monitor shared services you own effectively.

There is very little difference between what you enjoy about putting Lego blocks together and the code your developers write when you get down to it. Be not intimidated! It will get easier with time and familiarity. Coding is less scary than you think.

The Iron Rose fucked around with this message at 01:15 on Aug 28, 2023

# ? Aug 28, 2023 01:02

Hadlock: Nov 9, 2004

If you have absolutely no coding experience, struggle through writing a bash script that can parse a CSV file into three different txt files as lists, then go read "how to python good in 21 days"* cover to cover and maybe casually attempt the first half of the assignments

*Not a real book, probably but if you plug that into Amazon something in your price/skill range should appear

That ought to be enough to allow you to lie your way through a Jr DevOps interview

# ? Aug 28, 2023 04:50

Docjowles: Apr 9, 2009

I guess I read "I really enjoy scripting/automating" more charitably than you all. It doesn't sound like they hate coding or are hopeless at it. But there's a big difference between writing hundreds to low thousands of lines of code to glue some poo poo together and working in a huge rear end 500k+ LOC enterprise Java codebase. I expect this is extremely relatable to a lot of us ITT? I got deep into a CS degree before deciding being a full time SWE sounded lame and boring and I would much rather build and run servers and networks in support of the folks who actually wanted to write apps (SRE did not exist at this time or I might have felt differently...).

I fully agree with the point that getting better and more comfortable with coding and what your developers do is always a great idea. This will only improve your salary and effectiveness. You should have some familiarity with the app you are supporting even if you couldn't drop into a team and immediately start contributing code.

# ? Aug 28, 2023 05:28

Falcon2001: Oct 10, 2004; Eat your hamburgers, Apollo.; Pillbug

Hughmoris posted:

Sunday Career Chat

I know this is an overly broad question but is it possible to work in DevOps without touching much application code? I'm self-teaching Docker and K8s at the moment and I'm enjoying it but my eyes start to glaze over when I start looking at application code. I am able to read and write basic apps but I don't particularly enjoy it. I really enjoy scripting/automating and have more fun putting the lego blocks together.

For those of you working professionally in this field, are you writing lots of .NET/Node/Java etc? Is it just inherent to the DevOps job? Maybe something SRE is more my speed?

(I ask here because I'm a data analyst and don't personally know anyone doing this work)

Highly agree with what others said. I'm working at a FAANG company and as far as I can tell our term for this is not shared in the industry, so job titles are a mess, but my job is split between managing a bunch of infrastructure and working on some dev projects to build tools to manage it better, but I suspect a lot of teams like this wouldn't even have a project as big and semi-structured as we do. Most of the coding my team does is much smaller script stuff though, so my project is kind of an outlier.

Python is probably a major language for this sort of thing, once you get past Bash/Powershell (depending on your infrastructure) and it's what we use primarily for all of our internal tooling, although we have a few Java projects floating around too. Python is a great middle ground for languages for this sort of thing because it's robust enough to do all sorts of things and patterns, while remaining easy enough to read and write that you can slap out code very quickly in it. (You can also write some terrible inscrutable code in it, but coding standards can help stave this off.) It doesn't stand up in performance to Java or C# by any means but as it turns out if what you need to do is 'Modify this AWS account thing to do X instead of Y' it doesn't fuckin' matter if it takes 2 seconds instead of .02 since some human's running it by hand and typing in the commands will take longer than executing them anyway.

https://books.google.com/books?id=81UrjwEACAAJ is the book that as far as I can tell, helped popularize the idea of SRE, and as Docjowles said it was more of a 'hey what if a dev learned literally anything about operations' and less of 'what if an ops guy could write...code??!??'. Different companies will have different approaches to that spectrum, and honestly some teams even have people on any end of that; my team has a good scattering of folks at all points, I think I'm probably somewhere solidly in the middle.

Our coding level for (what do you call the level between junior and senior? That level) interviews tends to be along the lines of 'can you do a reasonably straightforward LeetCode question (something medium difficulty)' on the coding side, but otherwise we want to know that you understand infrastructure. From my perspective, that means you can talk about load balancers and stuff like that at a high level, and maybe dig into some stuff, but our size means that you don't have to be an expert in any single thing necessarily.

That being said, my company's hiring requirement does require coding knowledge, so I would recommend playing around on leetcode or codewars or whatever and getting used to working through algorithm questions. It can trip you up if you're not familiar, but in my experience we don't tend to ask the sort of bizarre ones that are like 'invert a binary tree', and generally give out questions that are more along the lines of 'Here's an imaginary situation you could conceivably run into, solve it in a shared online editor and talk through your steps'

# ? Aug 28, 2023 05:50

Extremely Penetrated: Aug 8, 2004; Hail Spwwttag.

Hughmoris posted:

my eyes start to glaze over when I start looking at application code. I am able to read and write basic apps but I don't particularly enjoy it. I really enjoy scripting/automating and have more fun putting the lego blocks together.
...
For those of you working professionally in this field, are you writing lots of .NET/Node/Java etc?

I'm a <stupid Cloud title> at a small (~100 devs) SaaS shop and came from an infra/SOE background. The platform's all .NET for the mid/backend components. The traditional programmer types massively outnumber anyone with operational knowledge, so we're spread pretty thin and are encouraged to push whatever work we can back onto the scrum teams. I need to know where to find a component's init/config code, and how to read it, but I don't write any C#. Like if they want to implement OpenTelemetry for APM or something that's cool and we'll help, but it's their story. What I do write a lot of is poo poo to automate and glue things together, mostly in powershell. The rest is terraform, yaml/json configs for deployment tooling, some python and bash. I absolutely never need to work on some app's internal business logic or API contracts or data schemas or any of that boring rear end poo poo.

# ? Aug 28, 2023 07:18

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Thank you all for the insights. It can be hard to match the toy projects you're learning on against the reality of the job, so it's great to hear from people actually doing the work.

Docjowles posted:

I guess I read "I really enjoy scripting/automating" more charitably than you all. It doesn't sound like they hate coding or are hopeless at it. But there's a big difference between writing hundreds to low thousands of lines of code to glue some poo poo together and working in a huge rear end 500k+ LOC enterprise Java codebase. I expect this is extremely relatable to a lot of us ITT? I got deep into a CS degree before deciding being a full time SWE sounded lame and boring and I would much rather build and run servers and networks in support of the folks who actually wanted to write apps (SRE did not exist at this time or I might have felt differently...).

This is about right. I've written a lot of Python, and a little bit of PowerShell, for data analysis and scripting.

Hadlock posted:

That ought to be enough to allow you to lie your way through a Jr DevOps interview

This may be my next goal, to start angling for some DevOps interviews to get a better feel for expectations.

# ? Aug 28, 2023 13:35

Gucci Loafers: May 20, 2006; Ask yourself, do you really want to talk to pair of really nice gaudy shoes?

As an average DevOps person, I'd say that knowing how to code a few hundred lines with PowerShell, Bash, Python, etc. is pretty much a requirement at this point.

I do wonder what the future holds for this kind of role.

# ? Aug 28, 2023 15:13

Hadlock: Nov 9, 2004

Yeah 70% of the time they're going to ask you how to do some kind of log parsing riddle involving a hash table, dictionary and lightly quiz you about time complexity O(n)

Most recently it was an ipban-lite thing: flag an IP address that hits N times in Y seconds, iterating in complexity
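
For anyone who hasn't seen this style of question, here is a minimal sketch of the "N hits in Y seconds" version in Python. The function name, the (timestamp, ip) tuple input, and the assumption that events arrive sorted by time are all made up for illustration; a real interview would hand you raw log lines to parse first.

```python
from collections import defaultdict, deque


def find_offenders(events, max_hits, window_seconds):
    """Return the set of IPs seen at least max_hits times within window_seconds.

    events: iterable of (timestamp_seconds, ip) tuples, sorted by timestamp
    (an assumption of this sketch).
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    offenders = set()
    for ts, ip in events:
        hits = recent[ip]
        hits.append(ts)
        # Drop timestamps that fell out of the window ending at ts.
        while hits and ts - hits[0] >= window_seconds:
            hits.popleft()
        if len(hits) >= max_hits:
            offenders.add(ip)
    return offenders
```

Each timestamp is appended once and popped at most once, so the whole pass is O(n) amortized, which is usually the time complexity answer they are fishing for.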

Hadlock fucked around with this message at 23:28 on Aug 28, 2023

# ? Aug 28, 2023 23:25

i am a moron: Nov 12, 2020; "I think if there’s one thing we can all agree on it’s that Penn State and Michigan both suck and are garbage and it’s hilarious Michigan fans are freaking out thinking this is their natty window when they can’t even beat a B12 team in the playoffs lmao"

OpenTF wars started, they have

(We have partners with products that rely on it that are dedicating engineers to it, I'm getting my popcorn ready)

# ? Aug 31, 2023 14:30

vanity slug: Jul 20, 2010

HashiCorp Terraform Registry is now for "HashiCorp Terraform" only. That's a nice gently caress you to OpenTF, I guess they'll just have to spin up their own registry.

# ? Aug 31, 2023 14:37

Vulture Culture: Jul 14, 2003; I was never enjoying it. I only eat it for the nutrients.

Dave McJannet really imagining himself as a Larry Ellison these days

# ? Aug 31, 2023 16:20

NihilCredo: Jun 6, 2011; iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

Stupid question, but: having never used it, my impression was that Terraform was a lot like Pulumi (which I did use) - a declarative language to define a set of cloud resources.

You write what you want, then when your CLI or your CI server runs "terraform apply" it connects to AWS / other cloud provider with an api key and creates/destroys/changes resources as needed.

If that description is correct, then what *is* a "hosted Terraform" service? Shouldn't your "Terraforms" just be text files that live in a git repository, and shouldn't you only need an ephemeral stateless VM/container to apply them?

# ? Aug 31, 2023 16:27

Hadlock: Nov 9, 2004

vanity slug posted:

HashiCorp Terraform Registry is now for "HashiCorp Terraform" only. That's a nice gently caress you to OpenTF, I guess they'll just have to spin up their own registry.

Wow already? That's probably not good for user retention for them, and pretty poor show of face. I'd love to know which exec made that decision, so I know which products to avoid when they get fired and move on to the next place

If this were a less important product I think they might be able to get away with this but most devops people are going to be highly salty about this, like taking the knife out of a butcher's hand, they're not going to forget this

It's HashiCorp's product and all, and they can do what they want but the community has been unwavering in their opinion on this change

Hadlock fucked around with this message at 16:32 on Aug 31, 2023

# ? Aug 31, 2023 16:29

The Fool: Oct 16, 2003

NihilCredo posted:

Stupid question, but: having never used it, my impression was that Terraform was a lot like Pulumi (which I did use) - a declarative language to define a set of cloud resources.

You write what you want, then when your CLI or your CI server runs "terraform apply" it connects to AWS / other cloud provider with an api key and creates/destroys/changes resources as needed.

If that description is correct, then what *is* a "hosted Terraform" service? Shouldn't your "Terraforms" just be text files that live in a git repository, and shouldn't you only need an ephemeral stateless VM/container to apply them?

The hosted terraform services provide additional automation, rbac, state management and a few other bells and whistles.

SpaceLift is one that competes directly with Terraform Cloud/Enterprise.

Pulumi has a cloud service that offers similar features.

# ? Aug 31, 2023 16:35

vanity slug: Jul 20, 2010

Hadlock posted:

Wow already? That's probably not good for user retention for them, and pretty poor show of face. I'd love to know which exec made that decision, so I know which products to avoid when they get fired and move on to the next place

If this were a less important product I think they might be able to get away with this but most devops people are going to be highly salty about this, like taking the knife out of a butcher's hand, they're not going to forget this

It's HashiCorp's product and all, and they can do what they want but the community has been unwavering in their opinion on this change

"You may download providers, modules, policy libraries and/or other Services or Content from this website solely for use with, or in support of, HashiCorp Terraform."

This was not in the previous version of the terms of use (link to archive). Updated last week.

# ? Aug 31, 2023 16:47

i am a moron: Nov 12, 2020; "I think if there’s one thing we can all agree on it’s that Penn State and Michigan both suck and are garbage and it’s hilarious Michigan fans are freaking out thinking this is their natty window when they can’t even beat a B12 team in the playoffs lmao"

I was gonna say isn’t that just where people put their lovely modules no one uses, then I saw providers. Oof

# ? Aug 31, 2023 16:54

Vulture Culture: Jul 14, 2003; I was never enjoying it. I only eat it for the nutrients.

NihilCredo posted:

If that description is correct, then what *is* a "hosted Terraform" service? Shouldn't your "Terraforms" just be text files that live in a git repository, and shouldn't you only need an ephemeral stateless VM/container to apply them?

Yeah, so basically if Terraform is bash, Terraform Cloud is GitHub Actions

vanity slug posted:

"You may download providers, modules, policy libraries and/or other Services or Content from this website solely for use with, or in support of, HashiCorp Terraform."

This was not in the previous version of the terms of use (link to archive). Updated last week.

This clearly went so fast that it bypassed the lawyers; "in support of" is a big enough loophole to drive a truck through and there's no possible way to put teeth on this clause

e: limiting use of HashiCorp-operated services to HashiCorp products is probably the most reasonable part of this whole thing; even System Initiative uses similar language:
https://www.systeminit.com/open-source/

Vulture Culture fucked around with this message at 16:14 on Sep 1, 2023

# ? Sep 1, 2023 05:12

necrobobsledder: Mar 21, 2005; Lay down your soul to the gods rock 'n roll; Nap Ghost

Most companies I've seen that were planning on this move have been planning years ahead of time and have been preparing with a lot of general counsel helping out given the massive impact likely to the company - this seems like a sort of surprise to technical folks but this is kind of a "when, not if" reaction if you've got any inkling of business sense in your body. Which is why that language being so vague is kind of a glaring screw-up that I thought would be easy to have reviewed by an AI and marked as a problem. Contract review is kind of like auto-correct and should have been caught by automated systems for fancy lawyers these days.

# ? Sep 1, 2023 16:56

The Fool: Oct 16, 2003

any of you masochistic enough to have set up a local provider mirror for terraform?

# ? Sep 6, 2023 19:33

The Iron Rose: May 12, 2012; Cat Army

The Fool posted:

any of you masochistic enough to have set up a local provider mirror for terraform?

No is a complete sentence you can share with your security team, which is what we did when they started mandating CMKs for everything.

In other news, I have a Python service that runs in a kubernetes pod, consumes from a rabbitmq queue, interprets the results, and sends them to one of two possible dependencies by making HTTP calls. The code initializes, checks if it can connect to its dependencies, and exits if it doesn’t. There are no startup/readiness/liveness probes or init containers.

The problem here is that this logic is entirely in code, and yesterday they hosed up how they invoked a dependency so the pod was crashlooping. However, because there weren’t any startup hooks or startup/readiness probes defined in the deployment, helm applied the release just fine and nobody noticed till alerts fired like 15 minutes later. I want to make the CI pipeline fail if this happens, ideally by failing the helm release.

I’m trying to think of a good way to do a health check on this service without running an HTTP webserver, which my devs don’t want to do since it never receives inbound traffic. The easiest is to just run through the existing health check, write to a file on success, and have my readiness probe check for the existence of that file… except the only problem there is that my app exits immediately after failing to connect to the dependency, which means my readiness probe will never fire.

The stupid approach is to just sleep on failing the dependency and let the readiness probe kill the pod, but that seems sufficiently dumb I’m sure I’m missing something.

The liveness probe is easy enough - write to a file when work is successfully done and have the liveness probe check if the timestamp is < X seconds old. It’s defining startup health without exposing a health endpoint that I’m having trouble with.
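
A rough sketch of that file-timestamp liveness check in Python, to make it concrete. The heartbeat path, the 60 second threshold, and the function names are placeholders I picked for illustration, not anything from the actual service.

```python
import time
from pathlib import Path

HEARTBEAT = Path("/tmp/worker-heartbeat")  # hypothetical location


def touch_heartbeat(path=HEARTBEAT):
    """Worker side: call after each unit of work completes."""
    path.touch()  # creates the file or bumps its mtime


def is_alive(path=HEARTBEAT, max_age_seconds=60.0, now=None):
    """Probe side: True if the heartbeat exists and is fresh enough."""
    try:
        mtime = path.stat().st_mtime
    except FileNotFoundError:
        return False
    now = time.time() if now is None else now
    return (now - mtime) < max_age_seconds


def probe_exit_code(path=HEARTBEAT, max_age_seconds=60.0):
    """Exit code for an exec probe: 0 means healthy, non-zero means failed."""
    return 0 if is_alive(path, max_age_seconds) else 1
```

An exec liveness probe would run a tiny script that calls `sys.exit(probe_exit_code())`. One caveat with this design: if the queue is legitimately empty the heartbeat goes stale, so the idle loop probably needs to touch the file too.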

We do instrument with opentelemetry, but that still doesn’t solve my “deploy succeeds but pod doesn’t start correctly” problem.

The Iron Rose fucked around with this message at 00:18 on Sep 7, 2023

# ? Sep 7, 2023 00:12

acksplode: May 17, 2004

Some assorted thoughts:

* Readiness probes are intended for letting loadbalancers know which pods can serve traffic, so using them with a service that doesn't receive traffic is a smell.
* Exposing application liveness to the kubelet is sufficient reason to add a simple webserver with a /healthz endpoint IMO. All it has to do is serve 200s. But liveness probes are for identifying running pods that can't do work and need a restart. Since your application is politely dying on its own when it can't do work, a liveness probe wouldn't buy you much.
* Helm isn't really intended for determining application health, so trying to make it responsible for that is a smell. It's a convenience that a helm release may fail due to liveness or readiness probes, but IMO that's not reason enough to use them where they don't otherwise make sense. I'm actually not sure what conditions will cause helm to report a release as failed except for failing to render and apply manifests, since that's all I really trust it to do.
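
To show how small the "simple webserver with a /healthz endpoint" option can be, here is a stdlib-only Python sketch. The port, the handler name, and running it on a daemon thread next to the queue consumer are my assumptions, not a description of the service being discussed.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"ok\n")
        else:
            self.send_response(404)
            self.end_headers()

    def log_message(self, format, *args):
        pass  # keep probe traffic out of the application logs


def start_health_server(port=8080, host="0.0.0.0"):
    """Serve /healthz on a background daemon thread; returns the server."""
    server = HTTPServer((host, port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_health_server()` only after the dependency checks pass means the endpoint does not exist until the app is actually usable, which is what an httpGet probe would want to see.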

Basically from my perspective your application is behaving reasonably, but CI and observability are lacking. A couple leading questions:

* Why did it take 15 minutes to alert on a deployment that's obviously bad? Pods in crashloopbackoff would be a fine condition to alert on, in addition to whatever metrics you've implemented in the service. It seems like a problem if "poo poo completely don't work" takes 15 minutes to surface to a human.
* Why is CI relying on helm to know whether the application is healthy? A good thing to do would be to directly assert that all pods are running. A better thing to do would be to drop a job into the queue and assert its result arrived at the expected destination.
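
The "directly assert that all pods are running" idea could look roughly like this in a CI step, sketched in Python. The function names and the zero-restart threshold are my own choices; the field names are the ones `kubectl get pods -o json` returns. It assumes kubectl is already configured in the CI job and that you wait a bit after the deploy before checking, so a crashloop has time to show up as restarts.

```python
import json
import subprocess


def unhealthy_pods(pod_list, max_restarts=0):
    """Return names of pods that are not Running, not ready, or restarting."""
    bad = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        ok = (
            status.get("phase") == "Running"
            and bool(containers)
            and all(
                c.get("ready") and c.get("restartCount", 0) <= max_restarts
                for c in containers
            )
        )
        if not ok:
            bad.append(pod["metadata"]["name"])
    return bad


def check_deployment(selector, namespace):
    """Fetch pods matching a label selector and return the unhealthy ones."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-l", selector, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return unhealthy_pods(json.loads(out))
```

The pipeline would fail the job if `check_deployment(...)` returns anything. Note that an empty pod list also returns an empty result, so a real check should assert that at least one pod matched. The stronger end-to-end version (push a job, assert the result lands) depends too much on the app to sketch generically.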

# ? Sep 7, 2023 01:39

The Iron Rose: May 12, 2012; Cat Army

All excellent thoughts, thanks!

With regards to helm, we use helmfile for deployments which has an atomic condition that will roll back a release if it's not healthy. I sometimes forget what's default helm and what's in the wrapper. It's a really convenient automatic rollback mechanism, but you're right that we're getting to the point that we should probably write our own logic in there somewhere.

With regards to readiness probes, the idea would be to keep the pods from ever reporting Ready, which causes a failed deployment, which helmfile will roll back on. The liveness probes are for avoiding any unexpected deadlocks during normal operations.

I think adding the simple webserver with /healthz is probably the right way to go here, but I'm still curious about alternatives.

side note, also this failed release was the one where we were turning on apm in prod lol

The Iron Rose fucked around with this message at 01:55 on Sep 7, 2023

# ? Sep 7, 2023 01:48

George Wright: Nov 20, 2005

We use Prometheus internally so if teams are very reluctant to add a health check endpoint then using the metrics endpoint for health checks is better than nothing, as long as it initializes last.

But yes, adding a health check endpoint is probably the most reasonable way to go.

# ? Sep 7, 2023 01:56

The Fool: Oct 16, 2003

The Iron Rose posted:

No is a complete sentence you can share with your security team, which is what we did when they started mandating CMKs for everything.

I wish it was that easy

the c-suite is big on "quality" and "reliability" right now and the recent issues we had with the azurerm provider affected a high profile deployment which means we are being tasked with coming up with ways to keep it from happening again

# ? Sep 7, 2023 02:24

Hadlock: Nov 9, 2004

Someone did a "benchmark" of various LLMs (like ChatGPT); one of the prompts was "Argue for and against the use of kubernetes in the style of a haiku."

https://benchmarks.llmonitor.com/k8s

There's at least ~~100~~ 30 examples there, here are two

Alpaca7B posted:

Arguments for kubernetes:
Flexible, scalable, and reliable
Helps to manage complex applications
Supports multiple cloud providers

Arguments against kubernetes:
Costly to implement and maintain
Requires skilled personnel to operate
Relatively new technology, unproven in the long-term

GPT4 posted:

For:
Scalable and swift,
Kubernetes leads the fleet,
Cloud's masterful gift.

Against:
Complex to manage,
Errors often enrage,
Not for every stage.

# ? Sep 9, 2023 21:41

The Iron Rose: May 12, 2012; Cat Army

Should I include trace or span IDs in structured log output?

On the one hand it feels redundant but on the other hand for getting devs to look to traces rather than just logs it feels like a nice value add.

# ? Sep 10, 2023 21:34

luminalflux: May 27, 2005

Unless it's going to gently caress with cardinality somewhere, I'd definitely add them if they're available.

Where you really want them is as a comment in the SQL queries so you can go find traces from the slow query log / process list.

# ? Sep 11, 2023 01:01

The Iron Rose: May 12, 2012; Cat Army

luminalflux posted:

Unless it's going to gently caress with cardinality somewhere, I'd definitely add them if they're available.

Where you really want them is as a comment in the SQL queries so you can go find traces from the slow query log / process list.

I mean it’ll be a high cardinality field for sure, but that’s why you don’t index it as text since it’s not like you’d need to do substring searches.

Sometimes I deeply regret using self managed elasticsearch but at least it makes conversations like this easier.

Follow up to your SQL point, which is brilliant and i hadn’t thought of this. We use Python and sqlalchemy… with structlog I can just add a processor to auto-insert the trace/span ids, but using comments here I’m pretty sure I’d just have to modify every query, right? Sure I can use the otel API to get the IDs on a recording span, but it doesn’t seem as scalable to require that every time.
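
For reference, the structlog side could be as small as the sketch below. A structlog processor is just a callable taking (logger, method_name, event_dict). To keep this runnable without any dependencies, the ID lookup is passed in as a hypothetical `get_ids` callable; in practice it would presumably wrap `opentelemetry.trace.get_current_span().get_span_context()` and return None when the context is not valid.

```python
def make_trace_processor(get_ids):
    """Build a structlog-style processor that adds trace/span IDs.

    get_ids: hypothetical callable returning (trace_id, span_id) as ints,
    or None when there is no active span.
    """
    def add_trace_ids(logger, method_name, event_dict):
        ids = get_ids()
        if ids is not None:
            trace_id, span_id = ids
            # OpenTelemetry IDs are a 128-bit trace ID and a 64-bit span ID,
            # conventionally rendered as 32 and 16 lowercase hex characters.
            event_dict["trace_id"] = format(trace_id, "032x")
            event_dict["span_id"] = format(span_id, "016x")
        return event_dict

    return add_trace_ids
```

The returned function goes into the `processors` list passed to `structlog.configure`, ahead of the renderer, so every log line emitted inside a span carries the IDs without anyone touching call sites.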

# ? Sep 11, 2023 17:00

luminalflux: May 27, 2005

We use SQLAlchemy too, it's fairly easy with sqlalchemy.events.ConnectionEvents.before_cursor_execute to hook the cursor event and modify the query. We've done this with Datadog, where we actually kick open a new span inside this hook, add some tags we glean from various context handlers, and write the span and trace ID inside the comment as well as the callsite. This makes it easy to look at the process list, find the trace and span, but also see which goddamn file it's coming from.
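
A stripped-down sketch of what such a hook might look like, without the Datadog span creation or callsite lookup. `current_ids` is a hypothetical stand-in for however you fetch the active trace/span IDs as hex strings, and appending the comment rather than prepending it is just one choice. The registration is left as a comment so the block runs without SQLAlchemy installed.

```python
import re

_HEX = re.compile(r"[0-9a-f]+")


def add_trace_comment(statement, trace_id, span_id):
    """Append a SQL comment carrying the IDs; skip anything that isn't hex."""
    if not (_HEX.fullmatch(trace_id) and _HEX.fullmatch(span_id)):
        return statement  # never interpolate unexpected text into SQL
    return f"{statement} /* trace_id={trace_id} span_id={span_id} */"


def current_ids():
    """Hypothetical: return (trace_id_hex, span_id_hex) or None."""
    return None


def before_cursor_execute(conn, cursor, statement, parameters, context, executemany):
    """Listener with the before_cursor_execute argument order."""
    ids = current_ids()
    if ids is not None:
        statement = add_trace_comment(statement, *ids)
    return statement, parameters


# Registration (requires SQLAlchemy); retval=True tells SQLAlchemy to use
# the returned (statement, parameters) pair:
#   from sqlalchemy import event
#   event.listen(engine, "before_cursor_execute", before_cursor_execute, retval=True)
```

Because the listener is attached to the engine, every query gets the comment and nobody has to modify individual call sites.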

# ? Sep 12, 2023 01:04
