Continuous Integration/build engineering/devops thread

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Continuous Integration/build engineering/devops thread

Methanar: Sep 26, 2013; by the sex ghost

It took 30 minutes into the first change window of 2023 for somebody to cause an outage.

Some random security person set some very dumb sysctl settings in a very dumb way that managed to circumvent guard rails that I personally put up over a year ago to protect against exactly, EXACTLY this threat model.

Methanar fucked around with this message at 05:20 on Jan 4, 2023

# ? Jan 4, 2023 05:18

Twerk from Home: Jan 17, 2009; This avatar brought to you by the 'save our dead gay forums' foundation.

What's everyone's general sentiment about using Cloud Native Buildpacks to build Docker images vs hand-writing Dockerfiles? I'm just starting to kick the tires on https://paketo.io/, and immediately discovering that installing just one targeted system library that native code will call into seems more difficult than it should be.

# ? Jan 4, 2023 22:28

necrobobsledder: Mar 21, 2005; Lay down your soul to the gods rock 'n roll; Nap Ghost

We're using a wrapper for Buildah to build our images because none of the buildpack-style systems seem to lead anywhere other than locking you into a particular flavor of cloud. And because we're trying to build distributable containers for the public, this worked out better for our needs as well.

# ? Jan 5, 2023 01:24

Vulture Culture: Jul 14, 2003; I was never enjoying it. I only eat it for the nutrients.

Methanar posted:

It took 30 minutes into the first change window of 2023 for somebody to cause an outage.

Some random security person set some very dumb sysctl settings in a very dumb way that managed to circumvent guard rails that I personally put up over a year ago to protect against exactly, EXACTLY this threat model.

This is how I ended up with an E2E acceptance test suite with modifications under lock and key

# ? Jan 5, 2023 01:25

Hadlock: Nov 9, 2004

Vulture Culture posted:

This is how I ended up with an E2E acceptance test suite with modifications under lock and key

This is the way

# ? Jan 5, 2023 01:36

EoRaptor: Sep 13, 2003; by Fluffdaddy

So it appears that Azure Shared Image Galleries don't understand that 01/01/2023 is more recent than 12/31/2022, and aren't correctly flagging the 'latest' image.
:suicide101:

# ? Jan 7, 2023 03:14

Wizard of the Deep: Sep 25, 2005; Another productive workday

EoRaptor posted:

So it appears that Azure Shared Image Galleries don't understand that 01/01/2023 is more recent than 12/31/2022, and aren't correctly flagging the 'latest' image.

I mean, 12 is a lot higher than 1 or 11.
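To make that concrete, here is a minimal sketch, assuming the image versions are dotted labels in month.day.year order that get compared field by field. That ordering and the labels are assumptions for illustration, not a statement of how Azure actually resolves 'latest'.

```python
from datetime import date

# Hypothetical version labels built from dates in month.day.year order.
old = "12.31.2022"
new = "01.01.2023"


def as_version(label: str) -> tuple[int, ...]:
    """Split a dotted label into integers, compared left to right."""
    return tuple(int(part) for part in label.split("."))


def as_date(label: str) -> date:
    """Parse the same label as a calendar date (month.day.year)."""
    month, day, year = (int(part) for part in label.split("."))
    return date(year, month, day)


# Compared as a version, the December image "wins" because 12 > 1.
latest_by_version = max([old, new], key=as_version)

# Compared as a date, the January image is the newest.
latest_by_date = max([old, new], key=as_date)
```

Putting the year first (year.month.day) makes the field-by-field order agree with the calendar order, which is the usual way around this.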

# ? Jan 7, 2023 10:32

StumblyWumbly: Sep 12, 2007; Batmanticore!

As always, the software is fine, it is the millions of people doing things the same way they have for decades who are wrong.

# ? Jan 7, 2023 15:56

Methanar: Sep 26, 2013; by the sex ghost

Another day another insane 4 hour prod fire network troubleshoot session.

gently caress me. I'm 2 hours late in taking an advil.

# ? Jan 11, 2023 05:49

George Wright: Nov 20, 2005

What Linux and K8s distros are folks using for K8s on metal these days?

Looks like on the Linux distro side there are the usual suspects, but also distros like Flatcar, Bottlerocket, and Talos. Any experience here with those? Any horror stories?

As for K8s, EKS-D looks appealing so we could have the same distro on metal as we do AWS. Any experience with EKS-D or any other K8s distros? Any horror stories?

# ? Jan 15, 2023 19:39

freeasinbeer: Mar 26, 2015; by Fluffdaddy

If on EKS I like bottlerocket, as between that and karpenter it can get nodes online in as fast as 60s, but the standard aws linux image also is 100% fine.

On the other clouds I'd default to whatever their flavor of hosted K8s uses.

For my home lab I use Ubuntu. Flatcar is nifty, but if you're doing anything weird like GPUs or playing with containerd plugins like stargz, it's easier to not use one of the "hermetically" sealed OSes.

TalosOS is the only other one I've really looked at, but it is very opinionated and wants you to use their K8s tooling, so if you're bringing anything else it'd be a headache.

The only thing I'd really avoid is any of the fedora OS or using centos/red hat. They killed coreos in favor of fedora OS for what externally appears to be not-invented-here reasons, and with the mainline OS having really old kernels, that sometimes makes it a PITA to use stuff that uses ebpf.

Edit: I'd avoid EKS-D, it's not really like EKS at all, and it's too new to recommend it over existing stuff.

I really like k3s, in particular if you have experience running some flavor of database already and have backups figured out, but it's different.

freeasinbeer fucked around with this message at 00:41 on Jan 16, 2023

# ? Jan 16, 2023 00:36

Methanar: Sep 26, 2013; by the sex ghost

I checked my email for the first time in a few days.
I got a thank you email from a random director ccing my own management chain and a 160 dollar gift card in exchange for giving myself PTSD over the past 3 weeks after responding to 5-6 prod incidents.

im a team player

# ? Jan 19, 2023 04:17

Docjowles: Apr 9, 2009

Optimistically, ammo for that promotion packet? :yaycloud:

Alternatively, I've lost the thread of the Methanar saga over the years but haven't you been at current job for a long time? Maybe it's time to look around for something that does not trigger PTSD, especially if promotion/raise is poo poo.

I know there are lots of layoffs going on and general gloom about Economic Slowdown. But someone who has expert level k8s, AWS, BGP, Linux, etc knowledge has options.

# ? Jan 19, 2023 04:28

Methanar: Sep 26, 2013; by the sex ghost

Docjowles posted:

Optimistically, ammo for that promotion packet?

Alternatively, I've lost the thread of the Methanar saga over the years but haven't you been at current job for a long time? Maybe it's time to look around for something that does not trigger PTSD, especially if promotion/raise is poo poo.

I know there are lots of layoffs going on and general gloom about Economic Slowdown. But someone who has expert level k8s, AWS, BGP, Linux, etc knowledge has options.

I was no joke half way through writing something positive when pagerduty paged me for the 4th time today.
(For the third time this month. Somebody on the security team pushed out broad changes and walked away without testing or validating poo poo and broke everything leaving me to get called in for it because Kubernetes is the most visible thing to fail. Wasn't even Kubernetes-specific this time - just broke everything that depends on running the base Chef role)

I produce millions and millions of value to this org every year. My original pre-IPO equity grant has two vests left. My stress and responsibility is through the roof and has been for a long time. And my stock value is 1/3 of what it was 12 months ago.
If I don't get senior II, and another fat equity grant, for my 4 year anniversary during annual reviews this summer I'm absolutely ragequitting.

Really, though, I should just become management here because it seems like a way easier job. And I might actually be able to root cause fix some of the underlying problem patterns here that also burned out the previous 2 tech leads of my group.

Methanar fucked around with this message at 06:36 on Jan 19, 2023

# ? Jan 19, 2023 04:57

Hadlock: Nov 9, 2004

Methanar posted:

Really, though, I should just become management here because it seems like a way easier job. And I might actually be able to root cause fix some of the underlying problem patterns here that also burned out the previous 2 tech leads of my group.

As manager you'll still need rapport and consensus with the management team to get your changes greenlit and scheduled. If two principals left and you're about to rage quit I suspect the problem goes all the way to the CTO and back down again

Good luck

# ? Jan 19, 2023 20:56

jaegerx: Sep 10, 2012; Maybe this post will get me on your ignore list!

Anyone done the switch from istio to cilium yet? What am I looking at?

# ? Jan 20, 2023 02:24

madmatt112: Jul 11, 2016; Is that a cat in your pants, or are you just a lonely excuse for an adult?

Hey I have a good idea, let's make a Thursday deadline to migrate every single loving thing in the entire platform to new subnets, and then at 3pm on Friday we'll turn off the old subnets.

What? not everybody managed to move every little noodly bit and bob into the new subnets, and make sure that their codebases and systems are set up to work with the new proxy systems?

Too fuckin' bad, kill it and take a weekend, fuckers!

WHAT THE CHRIST

# ? Jan 20, 2023 21:32

Docjowles: Apr 9, 2009

madmatt112 posted:

Hey I have a good idea, let's make a Thursday deadline to migrate every single loving thing in the entire platform to new subnets, and then at 3pm on Friday we'll turn off the old subnets.

What? not everybody managed to move every little noodly bit and bob into the new subnets, and make sure that their codebases and systems are set up to work with the new proxy systems?

Too fuckin' bad, kill it and take a weekend, fuckers!

WHAT THE CHRIST

# ? Jan 20, 2023 21:46

madmatt112: Jul 11, 2016; Is that a cat in your pants, or are you just a lonely excuse for an adult?

Docjowles posted:

Like, what's a grace period? Do these idiots realize how much they've broken across the entire platform? Setting us all up for a lovely weekend too.

# ? Jan 20, 2023 22:11

Wizard of the Deep: Sep 25, 2005; Another productive workday

madmatt112 posted:

Like, what's a grace period? Do these idiots realize how much they've broken across the entire platform? Setting us all up for a lovely weekend too.

It's a real shame your phone broke Friday at 4:30 and the earliest time the phone store can get you in is Monday at 9 am.

# ? Jan 21, 2023 07:40

freeasinbeer: Mar 26, 2015; by Fluffdaddy

jaegerx posted:

Anyone done the switch from istio to cilium yet? What am I looking at?

Cilium is very alpha quality at the moment unless you are just replacing your existing CNI, I'd wait, but it's still the right direction

# ? Jan 23, 2023 21:43

George Wright: Nov 20, 2005

freeasinbeer posted:

Cilium is very alpha quality at the moment unless you are just replacing your existing CNI, I'd wait, but it's still the right direction

From a CNI perspective, a service mesh perspective, or both?

# ? Jan 24, 2023 01:21

jaegerx: Sep 10, 2012; Maybe this post will get me on your ignore list!

freeasinbeer posted:

Cilium is very alpha quality at the moment unless you are just replacing your existing CNI, I'd wait, but it's still the right direction

It's the default for gke and eks now I think. I'm on prem though.

# ? Jan 24, 2023 01:28

Methanar: Sep 26, 2013; by the sex ghost

Cilium is mostly fine these days. Just stay n-1 off current major release and you'll be okay.

# ? Jan 24, 2023 01:37

freeasinbeer: Mar 26, 2015; by Fluffdaddy

I was specifically playing with the bgp peering side last weekend and using the newer bgp setup. It was way more frustrating than I wanted it to be, but it is nifty to be able to directly hit pod IPs over the network.

LoadBalancers in bgp mode don't support local target mode, and it's very much alpha. To be fair I even think it's tagged that way, but if you're looking for bgp peering, they've only added it at all very recently. That entire implementation around metallb is being ripped out, it seems, and replaced with the new stuff, which seems like a big deal for on prem.

Other than that, DSR was super fiddly, options are not explained well, and when you install, the defaults are kinda opaque.

I was only playing with it in my homelab, but the service mesh is not as well evolved as istio is. That's ok for now.

So while all the features are nifty, it took me way longer than I wanted to get it all running, and on the whole it feels a bit like it's trying to be all things for all people. So many features don't work in one mode or another or have severe caveats.

# ? Jan 24, 2023 01:56

my homie dhall: Dec 9, 2010; honey, oh please, it's just a machine

just use calico

# ? Jan 24, 2023 12:53

Lucid Nonsense: Aug 6, 2009; Welcome to the jungle, it gets worse here every day

I asked this in the Infosec thread, and thought you guys might have some feedback on this:

We're in the process of rewriting our storage engine (log management software) and are adding data silos. I've been tasked with the architecture for this, including rbac. What are everyone's requirements for this in a SIEM? Do you handle it by host access, or on a more granular level?

# ? Jan 26, 2023 19:29

Vulture Culture: Jul 14, 2003; I was never enjoying it. I only eat it for the nutrients.

Lucid Nonsense posted:

I asked this in the Infosec thread, and thought you guys might have some feedback on this:

We're in the process of rewriting our storage engine (log management software) and are adding data silos. I've been tasked with the architecture for this, including rbac. What are everyone's requirements for this in a SIEM? Do you handle it by host access, or on a more granular level?

For us, a host is often a container that might live for as short as several seconds, so organizing things by host is frequently not useful

# ? Jan 26, 2023 20:26

Lucid Nonsense: Aug 6, 2009; Welcome to the jungle, it gets worse here every day

Vulture Culture posted:

For us, a host is often a container that might live for as short as several seconds, so organizing things by host is frequently not useful

Would you typically configure logging on that? I think the host logging in that situation would be the one running the container.

# ? Jan 26, 2023 21:20

Docjowles: Apr 9, 2009

Lucid Nonsense posted:

Would you typically configure logging on that? I think the host logging in that situation would be the one running the container.

That's one style. It's also common to launch a pod that has the main app container, which writes to stdout/stderr, and a sidecar container that reads from those and ships to one or more logging destinations. Hell if you are running in something like AWS Fargate you don't even have access to the underlying host to install log management tools.
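A minimal sketch of the sidecar idea, assuming the sidecar already has the app's output available as a line stream. How it gets that stream (shared volume, runtime log files, and so on) is not covered here, and `ship_logs`, the sink callables, and the record fields are all hypothetical names for illustration.

```python
import io
import json
from typing import Callable, Iterable, TextIO

# A sink is anything that accepts one serialized log record.
Sink = Callable[[str], None]


def ship_logs(stream: TextIO, sinks: Iterable[Sink], source: str) -> int:
    """Read lines from the app's log stream and forward each to every sink."""
    sinks = list(sinks)
    count = 0
    for line in stream:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines rather than shipping empty records
        record = json.dumps({"source": source, "message": line})
        for sink in sinks:
            sink(record)
        count += 1
    return count


# Stand-in for the main container's stdout, plus two in-memory destinations.
app_output = io.StringIO("started\n\nlistening on :8080\n")
primary: list[str] = []
archive: list[str] = []
shipped = ship_logs(app_output, [primary.append, archive.append], source="main-app")
```

The point of the shape is that the app only writes lines and knows nothing about destinations; adding or swapping a logging backend only touches the sidecar's list of sinks.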

Personally I would enjoy machines authenticating with some sort of token or certificate, and humans authenticating with the usual SSO suspects (AzureAD, Okta, etc).

# ? Jan 26, 2023 21:43

12 rats tied together: Sep 7, 2006

for rbac permissions in the app, let me set: principal plus action plus resource scope. provide documentation on every action for every resource scope

it's fine if the actions are generic (e.g. read, write, execute) across every resource scope

i can put usernames and passwords on my crap to authenticate. i would not make a ton of assumptions about how or why people do this and what their deployment looks like. i would try to fully decouple authentication from authorization.
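A minimal sketch of that principal plus action plus resource scope model, with authorization kept separate from authentication. The class names, the generic verbs, and the scope strings like "silo/aws" are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    principal: str  # who: a user, group, or machine identity
    action: str     # what: a generic verb such as "read" or "write"
    scope: str      # where: a resource scope such as "silo/aws"


class Authorizer:
    """Answers 'may this principal do this action in this scope?' and nothing else."""

    def __init__(self, grants):
        self._grants = set(grants)

    def is_allowed(self, principal: str, action: str, scope: str) -> bool:
        # Deny by default: only an exact matching grant allows the action.
        return Grant(principal, action, scope) in self._grants


authz = Authorizer([
    Grant("team-devops", "read", "silo/aws"),
    Grant("team-devops", "write", "silo/aws"),
    Grant("team-security", "read", "silo/firewall"),
])
```

The authorizer only ever sees a principal name. Whatever established that name (password, token, certificate, SSO) lives elsewhere, so the authentication method can change without touching any grants.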

# ? Jan 26, 2023 21:47

Lucid Nonsense: Aug 6, 2009; Welcome to the jungle, it gets worse here every day

I guess I should have said source type rather than host. Would all of your AWS logs go into one bucket/silo with the same rbac rules and retention? Here's how I have the flow now.

Or would it be better to decide which silo data goes to after rules processing?

# ? Jan 26, 2023 21:58

Vulture Culture: Jul 14, 2003; I was never enjoying it. I only eat it for the nutrients.

Lucid Nonsense posted:

Would all of your AWS logs go into one bucket/silo with the same rbac rules and retention?

For SIEM? We'd probably treat them the same. The second that the log search is opened up to anyone outside the security org, though, all bets are off.

# ? Jan 26, 2023 23:03

luminalflux: May 27, 2005

Lucid Nonsense posted:

I asked this in the Infosec thread, and thought you guys might have some feedback on this:

We're in the process of rewriting our storage engine (log management software) and are adding data silos. I've been tasked with the architecture for this, including rbac. What are everyone's requirements for this in a SIEM? Do you handle it by host access, or on a more granular level?

That it better be loving integrated with Okta

also no per-host licensing because we cycle hosts faster than George Santos invents new lies

# ? Jan 26, 2023 23:47

Lucid Nonsense: Aug 6, 2009; Welcome to the jungle, it gets worse here every day

Vulture Culture posted:

For SIEM? We'd probably treat them the same. The second that the log search is opened up to anyone outside the security org, though, all bets are off.

Exactly why we're doing this. If you send your logs to the centralized server, the data in the aws bucket would only be accessible by your team. Others would see whatever they have rights to, but they'd be in different silos. I'm trying to figure out if there is any reason for aws logs to be split up, or if they could be grouped together with the same retention and access.

luminalflux posted:

That it better be loving integrated with Okta

also no per-host licensing because we cycle hosts faster than George Santos invents new lies

We have ldap, which I think Okta has an agent for, not sure if that would satisfy that need without digging into it. We don't license per host, just per server. But I'm not trying to push a product here, just figure out how our data silos would work. Sounds like an aws/azure/gcp bucket would work for the devops guys, then have a route/switch/server bucket for the sysadmins, and firewall/traffic bucket for the security guys, for an example. It will be user configurable, so I just need to find out if there are any common reasons to parse cloud logs and sort them into different buckets.
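A rough sketch of that routing, assuming each event carries a `source_type` field. The field names, silo names, and the single override rule are hypothetical; the override shows one possible reason to split cloud logs (sending cloud audit events to the security silo), not something the product does today.

```python
from typing import Callable

Event = dict
# A rule is a predicate plus the silo to use when the predicate matches.
Rule = tuple[Callable[[Event], bool], str]

# Default silo per source type, following the grouping described above.
SILO_BY_SOURCE = {
    "aws": "devops", "azure": "devops", "gcp": "devops",
    "router": "sysadmin", "switch": "sysadmin", "server": "sysadmin",
    "firewall": "security", "traffic": "security",
}
DEFAULT_SILO = "unsorted"

# User-configurable overrides, checked in order before the default mapping.
OVERRIDES: list[Rule] = [
    (lambda e: e.get("source_type") == "aws" and e.get("kind") == "audit", "security"),
]


def route(event: Event) -> str:
    """Return the silo for an event: first matching override, else the default."""
    for matches, silo in OVERRIDES:
        if matches(event):
            return silo
    return SILO_BY_SOURCE.get(event.get("source_type"), DEFAULT_SILO)
```

With no overrides configured, every cloud log lands in one silo with one retention and access policy; splitting only happens where someone has written a rule for it.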

# ? Jan 27, 2023 00:15

luminalflux: May 27, 2005

Lucid Nonsense posted:

We don't license per host, just per server.

What's the distinction here between "host" and "server"?

# ? Jan 27, 2023 17:36

Lucid Nonsense: Aug 6, 2009; Welcome to the jungle, it gets worse here every day

The server you install our software on needs a license. Sending devices don't affect licensing, but licensing is based on the volume ingested. So if all devices are sending a total of 50 million events per day, you would need a license for that volume.

# ? Jan 27, 2023 17:54

luminalflux: May 27, 2005

Ah ok, makes sense.

# ? Jan 27, 2023 18:23

Erwin: Feb 17, 2006

Lucid Nonsense posted:

The server you install our software on needs a license. Sending devices don't affect licensing, but licensing is based on the volume ingested.

Dear Datadog...

# ? Jan 27, 2023 22:37

luminalflux: May 27, 2005

Erwin posted:

Dear Datadog...

Oh don't worry, they have both kinds of pricing

# ? Jan 27, 2023 22:44
