Continuous Integration/build engineering/devops thread


minato: Jun 7, 2004; cutty cain't hang, say 7-up.; Taco Defender

Yeah, that's ripe for an application of Goodhart's Law. If you get paid a bonus to find and destroy useless stuff, people will go out of their way to create useless stuff to later "find".

# ? Jan 9, 2022 22:02


Hadlock: Nov 9, 2004

And/or wait until the right time to try and make the case for a promotion/raise like my coworker did.

# ? Jan 9, 2022 22:14

Warbird: May 23, 2012; America's Favorite Dumbass

I'm thinking about using k3s to spread out some local services to some RasPis and comparatively lower power machines on my network and get them off my NAS. Is there any real reason to look at K8s or am I good to go with just sticking to k3s given the smaller scope of work?

# ? Jan 9, 2022 23:46

Hadlock: Nov 9, 2004

For home or production use

K3S is, I think, certified k8s compliant for some very large subset of k8s APIs, but it often misses the newest features and/or lags behind them

If you just want to run CRUD apps, LAMP stacks etc you shouldn't have any issue doing that with k3s. At all. I'm doing it

Suggesting using k3s for production use when managed k8s is available may raise some eyebrows come architectural review time though

# ? Jan 10, 2022 06:50

Bruegels Fuckbooks: Sep 14, 2004; Now, listen - I know the two of you are very different from each other in a lot of ways, but you have to understand that as far as Grandpa's concerned, you're both pieces of shit! Yeah. I can prove it mathematically.

minato posted:

Yeah, it's completely reasonable to think "I just saved the company a bunch of money! I will be showered with praise and bonuses!" but in my experience it's just not true.

Don't assume upper management wants to save money. It's like those dumb situations where managers take everyone out for expensive training near the end of the year because if they don't spend their budget, they'll lose it next year. It makes perfect sense to think "but if I save $1MM here, that's money that could be used to hire more people, upgrade that old system, etc etc" but that's not how budgets work, the money doesn't slosh around between buckets like that. As someone whose whole job is about technical efficiency, seeing these gigantic holes in financial efficiency feels incredibly frustrating.

A relatively new IT infrastructure manager at my company came in 100k under budget. Guess how much the budget was cut the next year?

# ? Jan 10, 2022 18:00

Warbird: May 23, 2012; America's Favorite Dumbass

Does K8s/k3s not play nice on Proxmox VMs or something? I've been trying to get this to work for nearly 3 days straight and it's been nothing but frustration on my end.

edit - I think I finally have it sorted. It seems that my initiative of "sure, let's let Ubuntu install docker and k8s since I'll be using them anyway" was misplaced, which seems pretty self evident in retrospect. That's what chronic lack of sleep'll get ya. Love that baby, but christ.

Warbird fucked around with this message at 20:38 on Jan 11, 2022

# ? Jan 11, 2022 18:49

Gyshall: Feb 24, 2009; Had a couple of drinks.
Saw a couple of things.

KOPS or kubespray my dude

# ? Jan 13, 2022 14:31

Dukes Mayo Clinic: Aug 31, 2009

Hadlock posted:

Suggesting using k3s for production use when managed k8s is available may raise some eyebrows come architectural review time though

"k8s without all the poo poo we don't need" was all the pitch we needed to go hard on k3s in production. Time will tell.

# ? Jan 13, 2022 14:50

ephex: Nov 4, 2007; PHWOAR CRIMINAL

Bruegels Fuckbooks posted:

A relatively new IT infrastructure manager at my company came in 100k under budget. Guess how much the budget was cut the next year?

The budget wasn't cut! He got a significant proportion of the saved money as a bonus to incentivize proactive and business sensible decision making. The rest was invested in the team.

*wakes up*

Oh well

# ? Jan 13, 2022 15:14

BaseballPCHiker: Jan 16, 2006

Looking for some advice from the more knowledgeable folks here.

I'm a full time security engineer, focused exclusively on AWS security. I come from a networking background. I am a copy/paste coder in just Python. I can read poo poo and know what it's trying to do but am generally awful at writing code.

I am trying to learn more about DevSecOps and pivot that direction. Both out of personal interest, and to further my career, and help move this clown car of a company I work for into the future.

To that end I wasted my time getting a Jenkins cert. Yes they exist, no I don't list it on my resume. It did give me a good introduction into the build pipeline, and working with Git which I had never done before.

Now I am trying to learn more to continue my development. I want to make sure I'm at least working in the right direction. My plan was to purchase a course on Ansible and Terraform on Udemy and start working towards learning those tools next. I picked Ansible because I've seen it the most often in my career, and Terraform because it's platform agnostic.

Are these reasonable areas to focus on? My plan was to write some Terraform config files and some Lambdas that interact with GuardDuty, Config, and other AWS services I work with and throw them in a personal Git repository as examples of my work. Seem reasonable?

# ? Jan 13, 2022 15:33

Warbird: May 23, 2012; America's Favorite Dumbass

Gyshall posted:

KOPS or kubespray my dude

In retrospect, sure. However part of the onus for this exercise in annoyance is getting more familiar with how this works at a base level. In practice for sure use the thing that's going to provide you OTS functionality but a client isn't always just going to take you at your word that a thing is the way to go.

# ? Jan 13, 2022 16:15

my homie dhall: Dec 9, 2010; honey, oh please, it's just a machine

Dukes Mayo Clinic posted:

"k8s without all the poo poo we don't need" was all the pitch we needed to go hard on k3s in production. Time will tell.

isn't it the same API, but the binary is just smaller and it supports SQL backends? lol c'mon man

# ? Jan 14, 2022 00:52

Hadlock: Nov 9, 2004

K3S is a greenfield project from scratch, it's certified API compatible (and kubectl uses the API so it feels the same) but it's a completely different product, v0.0.1 was written by one guy over a couple of months from scratch, he didn't download the Kubernetes source code and just start stripping out the fat, they are similar but not equal

I'm super glad we don't run k3s at my shop, I wouldn't advise doing so in production, but to each their own

One major advantage of k3s is it has a tiny memory and cpu footprint, and is a lot better for use on low end systems like a raspberry pi or a developer laptop than minikube

# ? Jan 14, 2022 04:13

NihilCredo: Jun 6, 2011; iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

K3s was API compatible for a long time, but it didn't officially support high availability until it hit v1.0, and it didn't have a stable embedded HA store until v1.19 (you had to rely on an external RDBMS for that or experimental options).

Until then I thought - sure, it's useful if you already had K8s projects lying around and needed to run them on cheaper infra. But otherwise, it seemed reasonable to ask why the gently caress am I bothering with all this stuff if I'm not even getting HA out of the deal?

# ? Jan 14, 2022 09:44

Warbird: May 23, 2012; America's Favorite Dumbass

Because I want to learn how this works, have extremely limited hardware to work with, and if I'm going to put up with Jenkins I may as well do it "right".

# ? Jan 14, 2022 14:57

Warbird: May 23, 2012; America's Favorite Dumbass

I don't know if this is common knowledge, but the CloudBees YouTube channel is a goddamn goldmine of "How the hell do I do X in Jenkins". They're very well produced, walk through the process, encounter errors you'd reasonably run into, and explain why they're showing up and what to do about it. They're a refreshing breath of fresh air after the nightmare of "lol copy paste this" blog posts I've been trawling through trying to get my head wrapped around using k8s/3s/whatever as build agents.

https://www.youtube.com/watch?v=c?CloudBeesTV?videos

Specifically the k8s integration video has been extremely helpful:
https://www.youtube.com/watch?v=ZXaorni-icg

# ? Jan 15, 2022 04:10

LochNessMonster: Feb 3, 2005; I need about three fitty

Has anyone ever tried to get aws efs-utils working on an Alpine docker image? It looks like I can only build rpm/deb packages from it, and while you can install dpkg and install debs with apk these days, it also requires all dependencies (python3, stunnel, openssl) to be installed through dpkg.

At that point I'm better off just rebuilding the image based on ubuntu/debian or an rpm based distro right?

# ? Jan 19, 2022 19:52

vanity slug: Jul 20, 2010

Yeah, gently caress Alpine.

# ? Jan 20, 2022 18:49

Hadlock: Nov 9, 2004

Hiring for a full time remote Very Senior/lead position, I think this link works

https://forums.somethingawful.com/showthread.php?threadid=3075135&pagenumber=115&perpage=40&userid=0#post520860623

# ? Jan 20, 2022 21:15

Methanar: Sep 26, 2013; by the sex ghost

Today I was requested to help with an architecture decision thing. As part of the conversation I was introduced to a snippet of the application's logic of a case statement of some very questionable hardcoded string transformations of environment variables that get passed in; transformations which are very tightly coupled to infrastructure implementation details. Which I guess is fine as long as literally nothing ever changes ever.

This is one of the most textbook problems that I've ever seen of problems that 'devops' is meant to fix. Obviously if you give a dev a problem they'll write some code and dev their way through it, but jesus christ there had to be a better way.

How does anything ever work anywhere lol

---
Bonus spinnaker gore.

Spinnaker docker registry triggers work by repeatedly polling the registry for its available tags and images and then diffing the poll results. When you deploy a new spinnaker for the first time, well there's nothing to diff against so the entire tag collection is 'new' to spinnaker's state. The max cache size of the poll result diff is only 1000. So if you have more than 1000 tags on your docker registry, trigger executions fail because the diff will be too large for spinnaker to handle.

You need to explicitly fast forward the poller so it doesn't try to consider the entire registry catalogue as a 'new' diff.
https://github.com/spinnaker/igor/blob/version-1.18.0/igor-web/src/main/groovy/com/netflix/spinnaker/igor/admin/AdminController.java#L56

code:

 curl -X POST -vvv 'http://localhost:8088/admin/pollers/fastforward/dockerMonitor'
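
The failure mode described above can be sketched roughly as follows. This is a toy model in Python, not Igor's actual code: `TagPoller`, `poll_once`, `fast_forward`, and `MAX_DIFF` are hypothetical names, and the value 1000 is simply the limit quoted in this post.

```python
MAX_DIFF = 1000  # limit quoted in the post; the real setting name is not shown here


class DiffTooLarge(Exception):
    """Raised when a single poll finds more new tags than the poller accepts."""


class TagPoller:
    """Toy model of a poller that diffs registry tags against cached state."""

    def __init__(self):
        self.seen = set()  # empty on a fresh deployment

    def poll_once(self, registry_tags):
        new_tags = set(registry_tags) - self.seen
        if len(new_tags) > MAX_DIFF:
            # Nothing gets cached, so the next poll hits the same wall.
            raise DiffTooLarge(f"{len(new_tags)} new tags exceeds {MAX_DIFF}")
        self.seen |= new_tags
        return sorted(new_tags)  # these are the tags that would fire triggers

    def fast_forward(self, registry_tags):
        # Treat everything currently in the registry as already seen.
        self.seen = set(registry_tags)
```

In this model a fresh poller pointed at a 1500-tag registry raises on every poll until `fast_forward` is called once, after which only genuinely new tags show up in the diff.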

Methanar fucked around with this message at 00:57 on Jan 22, 2022

# ? Jan 22, 2022 00:35

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Methanar posted:

How does anything ever work anywhere lol

In spite of itself and on a wing and a prayer

# ? Jan 22, 2022 00:56

luminalflux: May 27, 2005

Methanar posted:

---
Bonus spinnaker gore.

Spinnaker docker registry triggers work by repeatedly polling the registry for its available tags and images and then diffing the poll results. When you deploy a new spinnaker for the first time, well there's nothing to diff against so the entire tag collection is 'new' to spinnaker's state. The max cache size of the poll result diff is only 1000. So if you have more than 1000 tags on your docker registry, trigger executions fail because the diff will be too large for spinnaker to handle.

You need to explicitly fast forward the poller so it doesn't try to consider the entire registry catalogue as a 'new' diff.
https://github.com/spinnaker/igor/blob/version-1.18.0/igor-web/src/main/groovy/com/netflix/spinnaker/igor/admin/AdminController.java#L56
code:
 curl -X POST -vvv 'http://localhost:8088/admin/pollers/fastforward/dockerMonitor'

Spinnaker is such a great project, where it works great if you're Netflix and have zero rate limiting, but running it somewhere else you definitely run into fun issues like that. I've never used triggers off docker containers, we just trigger everything off webhooks that CircleCI calls.

I keep finding fun stuff in their ECS support. Like even though you tell it to replace the IAM role in a task definition it's pulling from an artifact, it totally won't. Instead have to do some fun stuff with SpEL expressions inside the task definition, and in the deploy stage use the environment variables or tags to store data to use for SpEL processing.

# ? Jan 22, 2022 17:29

The Fool: Oct 16, 2003

I'm doing a presentation about IaC in general and Terraform specifically to an audience of junior devs and interns.

I have a general plan and idea on what points I want to cover, but was wondering if any of you guys could recommend anything specific that I should touch on that would be good for that kind of audience.

It's my first time doing this sort of thing

# ? Jan 25, 2022 20:49

New Yorp New Yorp: Jul 18, 2003; Only in Kenya.; Pillbug

The Fool posted:

I'm doing a presentation about IaC in general and Terraform specifically to an audience of junior devs and interns.

I have a general plan and idea on what points I want to cover, but was wondering if any of you guys could recommend anything specific that I should touch on that would be good for that kind of audience.

It's my first time doing this sort of thing

Modularization and more importantly overmodularization. i.e. don't make modules that just wrap a single resource to enforce usage/naming patterns -- enforcing usage restrictions and naming should be the domain of your cloud platform's policy engine. Also ensure that modules are properly versioned using sane semver practices so teams can opt in to changes at their own pace on their own schedule.

In general allow teams to own their own IaC so that they can rapidly iterate -- trying to centralize and overabstract IaC results in teams being trapped.

Don't fall into the trap of "all of our applications are the same!" because they use the same cloud resources -- they are similar, but almost always end up varying in strange and interesting ways. It's like dog breeds. Yes, a pitbull and a golden retriever are both canines, but you can't expect them to behave identically or have the same exact requirements around feeding, training, and vet care.

New Yorp New Yorp fucked around with this message at 21:01 on Jan 25, 2022

# ? Jan 25, 2022 20:53

xzzy: Mar 5, 2009

Our legacy single Jenkins instance has gotten too big for people to be managing, so I've been given the job to split into multiple Jenkins instances with each group of developers getting their own Jenkins. This is the trivial part.

The part that's giving me pause however is making the executors available to all our instances and preventing an executor from getting overloaded because the instances have no idea there's other Jenkins' firing off builds. For our linux builds it's easy, it's all gonna be done in k8s but we have a bank of OSX systems that I gotta deal with. The best I've found so far is the "node sharing executor" plugin that looks like it does exactly what I need, but the implementation seems kinda kludgy and making Jenkins do something it's not designed to.

So I'm curious if anyone in here has had to do something like this and knows of a better plan.

# ? Jan 25, 2022 21:19

necrobobsledder: Mar 21, 2005; Lay down your soul to the gods rock 'n roll; Nap Ghost

Underprovision the executors for each Jenkins master with maybe 1 or 2 executors each and it won't be too bad at the cost of some compute efficiency and some contention perhaps. If it's just two separate Jenkins masters it will probably work for a while. I'd go with one dedicated Mac for each Jenkins master and if there's a need for more capacity let it spillover to a shared pool of Macs hoping the different Jenkins instances don't step on each other enough to matter in a CI system vs a real-time required processing setup. It's possible to have Jenkins setup a semaphore by using fixed ports for the agent (and tear it down following a build) or similar with the connection commands so Jenkins masters will flail around trying to provision new nodes if they're occupied. Also, maybe the rather dated dynaslave plugin is worth a try?
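
One way to read the fixed-port idea above: whoever manages to bind the agreed port on the Mac holds the machine, and everyone else fails fast. A minimal sketch in Python, assuming that interpretation; `try_acquire` and `release` are hypothetical helpers and nothing here is Jenkins API.

```python
import socket


def try_acquire(port, host="127.0.0.1"):
    """Return a bound listening socket if the port is free, else None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((host, port))
        sock.listen(1)
    except OSError:
        # Someone else already holds the port, so the machine counts as busy.
        sock.close()
        return None
    return sock


def release(sock):
    """Free the port again once the build is done."""
    sock.close()
```

A wrapper around the agent launch would call `try_acquire` with the agreed port before starting a build and `release` when it finishes, so a second Jenkins instance hitting the same machine just gets `None` and has to retry elsewhere.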

# ? Jan 26, 2022 00:25

my homie dhall: Dec 9, 2010; honey, oh please, it's just a machine

does faang interview SREs like they interview SWEs? like, do I need to start reviewing stuff like red black trees?

# ? Jan 26, 2022 13:52

minato: Jun 7, 2004; cutty cain't hang, say 7-up.; Taco Defender

I'm way out of date, but no they do not. Had a basic coding exercise (read data from 2 files and conjoin them) as the screen, then on the day it was this (literally copied from what the recruiter sent me):

quote:

Systems: Systems administration (core concepts like DNS, web servers, databases, security etc), storage (RAID types), and troubleshooting. The discussions will start off fairly high level and progressively get more in depth. You may also be asked a scenario type question where something has failed and you need to figure out what's wrong and how to fix it. We will be looking for your ability to ask the right questions to gather more information and how you approach the problem. All of these questions will be more than likely based in FAANG scale so please do your best in trying to think of solutions that would apply and be effective in our environment.

Coding: The questions we ask require skills used every day by SREs, including text manipulation, handling input / output, automating tasks, interfacing with external systems / processes, etc. The questions can be a real problem, or something contrived to use these skills. Be prepared to show your work on the white board and be as concise and efficient as possible with your answers. Take hints from the interviewer and be open to other solutions as you go. It may also help to study your strongest coding language, algorithms, design patterns, core CS concepts and topics related to the scale of our environment before the interview. Efficiency, structure, syntax, bugs and working code will be the criteria as to how the interviewer will assess your abilities.

Networking: Study up on the tcp/ip stack and make sure you know it inside and out. Be prepared to have discussions around load balancers, protocols, networking tools, troubleshooting etc. An example question is "what is your favorite protocol and why?"

Design/Architecture: Will test your abilities to design systems/processes that scale in a FAANG environment. The interviewer may give you a system we currently have in place and have you make suggestions on how it can be improved. He/she may also go down the other route and have you design a system completely from scratch on the whiteboard (i.e. design an infrastructure that will handle 10M users for our new XXXXXX system). Questions can range from "How would you scale a service from 1 million to 100 million users" to something general like "Design Google Search."

Culture: You will be meeting with one of the SRE managers and he/she will be asking you questions about any significant projects you have worked on to get a feel of your capabilities and how you work on teams and with other groups across an organization. There may be questions involving how you've dealt with conflicts in the past, or what a project you are most proud of is. You might also get questions like "why <FAANG>" or "what do you understand SRE role to be".

# ? Jan 26, 2022 15:21

minato: Jun 7, 2004; cutty cain't hang, say 7-up.; Taco Defender

By the way, I made a huge mistake. I consider myself a developer, but I didn't think I was smart enough to pass the SWE L33tcode tests so when I got headhunted to be a SRE I figured it would be a lower bar to entry into a FAANG. While that turned out to be true, the job is very different from development and has negatively pigeonholed my career; recruiters only see me as a SRE now, despite me not wanting those kind of roles.

# ? Jan 26, 2022 15:26

Hadlock: Nov 9, 2004

Just change your resume to say swe, do a find and replace for sre, same with your LinkedIn and give it about 30 days and a different set of recruiters will come pounding on your door

# ? Jan 26, 2022 16:53

Warbird: May 23, 2012; America's Favorite Dumbass

Anyone have a write up of best practices for Nexus/artifact repo structure? Got to get one stood up and I'm musing about how to best lay that all out. Scoped to a single org each, but with multiple teams accessing/using.

# ? Jan 26, 2022 18:13

Docjowles: Apr 9, 2009

For what it's worth (probably nothing at this point lol) I interviewed for a position at google that was explicitly ops, not even SRE. I still had to go through a panel of coding questions like "write a class to implement a Stack, on a white board", which I failed because it's just not a thing I had had to think about once in the 10 years since college.

This was also in like 2015 or something though so absolutely ancient history in tech terms. But my understanding was that's still not unusual there. I dunno about the other FAANG companies. If I knew I wouldn't have to spend a couple months practicing algorithms and data structures crap I would be more inclined to try my hand at getting in again.
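
For anyone else who hasn't thought about it since college, the whiteboard Stack question usually comes down to something like the sketch below. It is one plausible answer in Python backed by a list, not whatever the interviewers actually wanted; the method names are the conventional ones rather than anything from the interview.

```python
class Stack:
    """Last-in, first-out stack backed by a Python list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def peek(self):
        if not self._items:
            raise IndexError("peek at empty stack")
        return self._items[-1]

    def __len__(self):
        return len(self._items)
```

Appending to and popping from the end of a list are both amortized constant time, which is typically the follow-up question.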

# ? Jan 27, 2022 00:36

minato: Jun 7, 2004; cutty cain't hang, say 7-up.; Taco Defender

SREs are expected to write the automation that manages the monitoring & deployment of thousands of machines, so it's totally conceivable that they'll be processing lots of data which might call for "advanced" data structures / algorithms. That said, I think 99% of the time they're writing scripting so their coding ability is not held to the same level as SWEs.

At a Google SRE interview I got asked the following coding question. Your code gets passed lines of input that look like:

code:

ABCDEFG
BCDEFGH
CDEFGHI
DEFGHIJ
EFGHIJK
...

(where the values of A...Z are numbers)

How can you detect when the input stream doesn't match this format?

Many quickly notice the pattern: each line is shifted left one position from the line above. So to validate the pattern, it's only necessary to compare 2 adjacent lines at a time. Ignore the first char of the upper line and the last char of the line beneath it, and compare the two lines. If they differ, the input doesn't match the format.

The follow up questions were a little trickier: "These lines might be so long that 2 rows don't fit in memory. How do you adapt your algorithm to compensate?", and "It's expensive to read each value. How can you adapt your algorithm so that you only read each value at most once?" Answer for both: Hash the values

I ended up flubbing this one, not because I didn't pick up on the key to the solution, but because I brain-froze and assumed I'd maintain a single hash per line. In fact you need to maintain 2: (one for the chars [0:end-2], and the other for [1:end-1])
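
A rough sketch of the two-hash approach in Python, under a few assumptions of my own: each line arrives as an iterable of numbers, SHA-256 stands in for whatever hash you'd pick, and equal digests are treated as equal content. `line_digests` and `stream_matches` are made-up names.

```python
import hashlib


def line_digests(values):
    """Read a line's values once; return (digest of all but last, digest of all but first)."""
    head = hashlib.sha256()  # covers every value except the last
    tail = hashlib.sha256()  # covers every value except the first
    previous = None
    for index, value in enumerate(values):
        token = f"{value},".encode()  # separator keeps 1,23 distinct from 12,3
        if previous is not None:
            head.update(previous)  # fed one step late, so the last value never lands here
        if index > 0:
            tail.update(token)
        previous = token
    return head.digest(), tail.digest()


def stream_matches(lines):
    """True if every line equals the line above it shifted left by one value."""
    prev_tail = None
    for values in lines:
        head, tail = line_digests(values)
        if prev_tail is not None and head != prev_tail:
            return False
        prev_tail = tail
    return True
```

Only two digests per line are kept, so memory stays constant no matter how long the rows are, and each value is read exactly once because both hashes are updated in the same pass.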

# ? Jan 27, 2022 02:21

12 rats tied together: Sep 7, 2006

minato posted:

How can you adapt your algorithm so that you only read each value at most once?

Just wanted to mention briefly that this is kind of the inverse of a stupid interview trick I learned a while back, where someone might ask you to detect if a word is a "palindrome or rotated palindrome".

For example, kayak should return True, but so should yakka. You can take the word and duplicate it -> yakka -> yakkayakka, the palindromic substring appears in the duplicated word.
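
The duplication trick works because every rotation of a word shows up as a window of the same length inside the word written twice. A small Python sketch, with function names of my own choosing:

```python
def is_palindrome(text):
    return text == text[::-1]


def is_rotated_palindrome(word):
    """True if the word, or any rotation of it, reads the same both ways."""
    doubled = word + word
    n = len(word)
    # Each rotation of word is a length-n window of word + word.
    return any(is_palindrome(doubled[i:i + n]) for i in range(max(n, 1)))
```

This checks each window directly, so it is quadratic in the word length, which is fine for a whiteboard and would need something smarter for very long inputs.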

e: For me, I refuse to participate in any SRE interview that contains "tricks" or even array indices. I'll ask for some glue code, or a basic infrastructure-as-code example in the candidate's preferred toolset, and then I'll introduce a new dependency in the followup evaluation assuming that they don't submit some insanely awful thing.

12 rats tied together fucked around with this message at 03:06 on Jan 27, 2022

# ? Jan 27, 2022 03:03

Zorak of Michigan: Jun 10, 2006

That sort of interview question worries me, because I can talk conceptually, but I can't white board or write pristine code worth a drat. I usually get the algorithm straight in my head and then trial-and-error the hell out of it.

# ? Jan 27, 2022 04:18

Vulture Culture: Jul 14, 2003; I was never enjoying it. I only eat it for the nutrients.

Zorak of Michigan posted:

That sort of interview question worries me, because I can talk conceptually, but I can't white board or write pristine code worth a drat. I usually get the algorithm straight in my head and then trial-and-error the hell out of it.

My dumbest SRE interview involved me forgetting how to do basic probability and statistics and I spent several minutes working out how to express Pascal's triangle as a recurrence relation. It was fine

# ? Jan 29, 2022 06:29

Methanar: Sep 26, 2013; by the sex ghost

quote:

I'm afraid you will mess up my DNS records!
ExternalDNS since v0.3 implements the concept of owning DNS records. This means that ExternalDNS will keep track of which records it has control over, and will never modify any records over which it doesn't have control. This is a fundamental requirement to operate ExternalDNS safely when there might be other actors creating DNS records in the same target space.

For now ExternalDNS uses TXT records to label owned records, and there might be other alternatives coming in the future releases.

lol external-dns lied to me and happily nuked a bunch of DNS without txt records at all.

# ? Feb 1, 2022 17:24

Vulture Culture: Jul 14, 2003; I was never enjoying it. I only eat it for the nutrients.

Methanar posted:

lol external-dns lied to me and happily nuked a bunch of DNS without txt records at all.

Wow, what a lovely FAQ entry. This has featured prominently in the README since v0.3, and it makes much clearer via words like "can" that this is not a default option:

quote:

From this release, ExternalDNS can become aware of the records it is managing (enabled via --registry=txt), therefore ExternalDNS can safely manage non-empty hosted zones. We strongly encourage you to use v0.3 with --registry=txt enabled and --txt-owner-id set to a unique value that doesn't change for the lifetime of your cluster. You might also want to run ExternalDNS in a dry run mode (--dry-run flag) to see the changes to be submitted to your DNS Provider API.

I've had a number of issues with the defaults on external-dns in some other regards too. For example, if you're using Route 53 and letting it auto-discover your zones, and you provide it the verbatim name of a hosted zone to limit its management scope, it will also match any Route 53 hosted zones that are subdomains of the one you provided it. This makes the zone names not extremely useful for limiting its scope; you're much safer with zone IDs.
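
To illustrate why a name filter is looser than it looks, here is a toy model in Python of the behavior described above. It is not external-dns's code: `matches_domain_filter` and `select_zones` are hypothetical, and the zone IDs in the usage note are made up.

```python
def matches_domain_filter(zone_name, domain_filter):
    """Suffix-style match: the zone itself or anything beneath it."""
    zone = zone_name.rstrip(".").lower()
    wanted = domain_filter.rstrip(".").lower()
    return zone == wanted or zone.endswith("." + wanted)


def select_zones(zones, domain_filter=None, zone_ids=None):
    """zones maps a zone ID to its name; returns the IDs that would be managed."""
    selected = []
    for zone_id, name in zones.items():
        if zone_ids is not None and zone_id not in zone_ids:
            continue
        if domain_filter is not None and not matches_domain_filter(name, domain_filter):
            continue
        selected.append(zone_id)
    return selected
```

With zones `{"Z1": "example.com.", "Z2": "dev.example.com.", "Z3": "other.org."}`, filtering by the name `example.com` picks up both `Z1` and `Z2`, while filtering by the ID set `{"Z1"}` picks up only `Z1`.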

Vulture Culture fucked around with this message at 20:16 on Feb 5, 2022

# ? Feb 5, 2022 20:10

Methanar: Sep 26, 2013; by the sex ghost

pop quiz:

PDNS supports wildcard records.
* can match anything as a fallback if no specific record exists.

example01.example.com A does not actually exist, but leverages * to point to 1.1.1.1
example01.example.com TXT is created with content "hello"

What happens?
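
Under standard DNS wildcard rules (RFC 4592), a wildcard only synthesizes answers for names that do not otherwise exist in the zone, so the likely answer is that creating the TXT record makes the A lookup for that name come back empty. The sketch below models that rule in Python with a deliberately simplified one-level lookup; `resolve` is a hypothetical helper, and whether PDNS behaves this way in every backend and configuration is not something this thread confirms.

```python
def resolve(zone, name, rtype):
    """Simplified wildcard lookup: zone maps owner name -> {record type: value}."""
    if name in zone:
        # The name exists, so the wildcard is not consulted, even when the
        # requested type is missing (the answer is then empty: NODATA).
        return zone[name].get(rtype)
    labels = name.split(".", 1)
    if len(labels) < 2:
        return None
    # Only the immediate parent's wildcard is checked in this toy model.
    wildcard = zone.get("*." + labels[1])
    return wildcard.get(rtype) if wildcard else None
```

Starting from a zone with only `*.example.com A 1.1.1.1`, the A lookup for `example01.example.com` resolves; once a TXT record is added at that exact name, the same A lookup returns nothing in this model.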

# ? Feb 9, 2022 01:34


Warbird: May 23, 2012; America's Favorite Dumbass

Probably a lot of emails and my evening being ruined.

# ? Feb 9, 2022 02:43
