minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
Yeah, that's ripe for an application of Goodhart's Law. If you get paid a bonus to find and destroy useless stuff, people will go out of their way to create useless stuff to later "find".

Hadlock
Nov 9, 2004

And/or wait until the right time to try and make the case for a promotion/raise like my coworker did.

Warbird
May 23, 2012

America's Favorite Dumbass

I'm thinking about using k3s to spread out some local services to some RasPis and comparatively lower power machines on my network and get them off my NAS. Is there any real reason to look at K8s or am I good to go with just sticking to k3s given the smaller scope of work?

Hadlock
Nov 9, 2004

For home or production use?

K3S is, I think, certified k8s compliant for some very large subset of k8s APIs, but it often misses the newest features and/or lags behind them

If you just want to run CRUD apps, LAMP stacks etc you shouldn't have any issue doing that with k3s. At all. I'm doing it

Suggesting using k3s for production use when managed k8s is available may raise some eyebrows come architectural review time though

Bruegels Fuckbooks
Sep 14, 2004

Now, listen - I know the two of you are very different from each other in a lot of ways, but you have to understand that as far as Grandpa's concerned, you're both pieces of shit! Yeah. I can prove it mathematically.

minato posted:

Yeah, it's completely reasonable to think "I just saved the company a bunch of money! I will be showered with praise and bonuses!" but in my experience it's just not true.

Don't assume upper management wants to save money. It's like those dumb situations where managers take everyone out for expensive training near the end of the year because if they don't spend their budget, they'll lose it next year. It makes perfect sense to think "but if I save $1MM here, that's money that could be used to hire more people, upgrade that old system, etc etc" but that's not how budgets work, the money doesn't slosh around between buckets like that. As someone whose whole job is about technical efficiency, seeing these gigantic holes in financial efficiency feels incredibly frustrating.

A relatively new IT infrastructure manager at my company came in 100k under budget. Guess how much the budget was cut the next year?

Warbird
May 23, 2012

America's Favorite Dumbass

Does K8s/k3s not play nice on Proxmox VMs or something? I've been trying to get this to work for nearly 3 days straight and it's been nothing but frustration on my end.


edit - I think I finally have it sorted. It seems that my initiative of "sure, let's let Ubuntu install docker and k8s since I'll be using them anyway" was misplaced, which seems pretty self evident in retrospect. That's what chronic lack of sleep'll get ya. Love that baby, but christ.

Warbird fucked around with this message at 20:38 on Jan 11, 2022

Gyshall
Feb 24, 2009

Had a couple of drinks.
Saw a couple of things.
KOPS or kubespray my dude

Dukes Mayo Clinic
Aug 31, 2009

Hadlock posted:

Suggesting using k3s for production use when managed k8s is available may raise some eyebrows come architectural review time though

“k8s without all the poo poo we don’t need” was all the pitch we needed to go hard on k3s in production. Time will tell.

ephex
Nov 4, 2007





PHWOAR CRIMINAL

Bruegels Fuckbooks posted:

A relatively new IT infrastructure manager at my company came in 100k under budget. Guess how much the budget was cut the next year?

The budget wasn't cut! He got a significant proportion of the saved money as a bonus to incentivize proactive and business sensible decision making. The rest was invested in the team.

*wakes up*

Oh well

BaseballPCHiker
Jan 16, 2006

Looking for some advice from the more knowledgeable folks here.

I'm a full time security engineer, focused exclusively on AWS security. I come from a networking background. I am a copy/paste coder in just python. I can read poo poo and know what it's trying to do but am generally awful at writing code.

I am trying to learn more about DevSecOps and pivot that direction. Both out of personal interest, and to further my career, and help move this clown car of a company I work for into the future.

To that end I wasted my time getting a Jenkins cert. Yes they exist, no I don't list it on my resume. It did give me a good introduction into the build pipeline, and working with Git which I had never done before.

Now I am trying to learn more to continue my development. I want to make sure I'm at least working in the right direction. My plan was to purchase a course on Ansible and Terraform on Udemy and start working towards learning those tools next. I picked Ansible because I've seen it the most often in my career, and Terraform because it's platform agnostic.

Are these reasonable areas to focus on? My plan was to write some Terraform config files and some Lambdas that interact with GuardDuty, Config, and other AWS services I work with and throw them in a personal Git repository as examples of my work. Seem reasonable?

Warbird
May 23, 2012

America's Favorite Dumbass

Gyshall posted:

KOPS or kubespray my dude

In retrospect, sure. However part of the onus for this exercise in annoyance is getting more familiar with how this works at a base level. In practice for sure use the thing that’s going to provide you OTS functionality but a client isn’t always just going to take you at your word that a thing is the way to go.

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine

Dukes Mayo Clinic posted:

“k8s without all the poo poo we don’t need” was all the pitch we needed to go hard on k3s in production. Time will tell.

isn't it the same API, but the binary is just smaller and it supports SQL backends? lol c'mon man

Hadlock
Nov 9, 2004

K3S is a greenfield project written from scratch. It's certified API compatible (and kubectl uses the API, so it feels the same), but it's a completely different product: v0.0.1 was written by one guy over a couple of months from scratch. He didn't download the Kubernetes source code and just start stripping out the fat. They are similar but not equal

I'm super glad we don't run k3s at my shop, I wouldn't advise doing so in production, but to each their own

One major advantage of k3s is it has a tiny memory and cpu footprint, and is a lot better for use on low end systems like a raspberry pi or a developer laptop than minikube

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

K3s was API compatible for a long time, but it didn't officially support high availability until it hit v1.0, and it didn't have a stable embedded HA store until v1.19 (you had to rely on an external RDBMS or experimental options for that).

Until then I thought - sure, it's useful if you already had K8s projects lying around and needed to run them on cheaper infra. But otherwise, it seemed reasonable to ask why the gently caress am I bothering with all this stuff if I'm not even getting HA out of the deal?

Warbird
May 23, 2012

America's Favorite Dumbass

Because I want to learn how this works, have extremely limited hardware to work with, and if I’m going to put up with Jenkins I may as well do it “right”.

Warbird
May 23, 2012

America's Favorite Dumbass

I don't know if this is common knowledge, but the CloudBees YouTube channel is a goddamn goldmine of "How the hell do I do X in Jenkins". They're very well produced, walk through the process, encounter errors you'd reasonably run into, and explain why they're showing up and what to do about it. They're a refreshing breath of fresh air after the nightmare of "lol copy paste this" blog posts I've been trawling through trying to get my head wrapped around using k8s/3s/whatever as build agents.

https://www.youtube.com/c/CloudBeesTV/videos

Specifically the k8s integration video has been extremely helpful:
https://www.youtube.com/watch?v=ZXaorni-icg

LochNessMonster
Feb 3, 2005

I need about three fitty


Has anyone ever tried to get aws efs-utils working on an Alpine docker image? It looks like I can only build rpm/deb packages from it, and while you can install dpkg via apk and use it to install debs these days, it also requires all dependencies (python3, stunnel, openssl) to be installed through dpkg.

At that point I'm better off just rebuilding the image based on ubuntu/debian or an rpm based distro right?

vanity slug
Jul 20, 2010

Yeah, gently caress Alpine.

Hadlock
Nov 9, 2004

Hiring for a full time remote Very Senior/lead position, I think this link works

https://forums.somethingawful.com/showthread.php?threadid=3075135&pagenumber=115&perpage=40&userid=0#post520860623

Methanar
Sep 26, 2013

by the sex ghost
Today I was requested to help with an architecture decision thing. As part of the conversation I was introduced to a snippet of the application's logic of a case statement of some very questionable hardcoded string transformations of environment variables that get passed in; transformations which are very tightly coupled to infrastructure implementation details. Which I guess is fine as long as literally nothing ever changes ever.

This is one of the most textbook examples I've ever seen of the problems that 'devops' is meant to fix. Obviously if you give a dev a problem they'll write some code and dev their way through it, but jesus christ there had to be a better way.

How does anything ever work anywhere lol

---
Bonus spinnaker gore.

Spinnaker docker registry triggers work by repeatedly polling the registry for its available tags and images and then diffing the poll results. When you deploy a new spinnaker for the first time, well there's nothing to diff against so the entire tag collection is 'new' to spinnaker's state. The max cache size of the poll result diff is only 1000. So if you have more than 1000 tags on your docker registry, trigger executions fail because the diff will be too large for spinnaker to handle.

You need to explicitly fast forward the poller so it doesn't try to consider the entire registry catalogue as a 'new' diff.
https://github.com/spinnaker/igor/blob/version-1.18.0/igor-web/src/main/groovy/com/netflix/spinnaker/igor/admin/AdminController.java#L56
code:
 curl -X POST -vvv 'http://localhost:8088/admin/pollers/fastforward/dockerMonitor'
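
For anyone trying to picture the failure mode, here is a minimal Python sketch of a poll-and-diff trigger. It is not Igor's actual code: the function name, the `fast_forward` flag, and the error handling are all hypothetical, and the limit of 1000 is simply the number quoted above.

```python
# Hypothetical sketch of a poll-and-diff trigger; NOT Igor's real implementation.
MAX_NEW_ITEMS = 1000  # assumed threshold, taken from the description above


def diff_poll(cached_tags, polled_tags, fast_forward=False, limit=MAX_NEW_ITEMS):
    """Return (tags_to_trigger, new_cache) for one polling cycle."""
    polled = set(polled_tags)
    new_tags = polled - set(cached_tags)
    if fast_forward:
        # Accept the registry's current state as the baseline, fire nothing.
        return set(), polled
    if len(new_tags) > limit:
        # The diff is too large to process, so the cache never advances
        # and every later poll fails the same way.
        raise RuntimeError(f"{len(new_tags)} new tags exceeds limit of {limit}")
    return new_tags, polled
```

With an empty cache and 1500 tags in the registry, the first poll raises; a single fast-forward poll seeds the cache, and after that only genuinely new tags show up in the diff.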

Methanar fucked around with this message at 00:57 on Jan 22, 2022

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Methanar posted:

How does anything ever work anywhere lol

In spite of itself and on a wing and a prayer

luminalflux
May 27, 2005



Methanar posted:

---
Bonus spinnaker gore.

Spinnaker docker registry triggers work by repeatedly polling the registry for its available tags and images and then diffing the poll results. When you deploy a new spinnaker for the first time, well there's nothing to diff against so the entire tag collection is 'new' to spinnaker's state. The max cache size of the poll result diff is only 1000. So if you have more than 1000 tags on your docker registry, trigger executions fail because the diff will be too large for spinnaker to handle.

You need to explicitly fast forward the poller so it doesn't try to consider the entire registry catalogue as a 'new' diff.
https://github.com/spinnaker/igor/blob/version-1.18.0/igor-web/src/main/groovy/com/netflix/spinnaker/igor/admin/AdminController.java#L56
code:
 curl -X POST -vvv 'http://localhost:8088/admin/pollers/fastforward/dockerMonitor'

Spinnaker is such a great project, where it works great if you're Netflix and have zero rate limiting, but running it somewhere else you definitely run into fun issues like that. I've never used triggers off docker containers, we just trigger everything off webhooks that CircleCI calls.

I keep finding fun stuff in their ECS support. Like even though you tell it to replace the IAM role in a task definition it's pulling from an artifact, it totally won't. Instead you have to do some fun stuff with SpEL expressions inside the task definition, and in the deploy stage use the environment variables or tags to store data to use for SpEL processing.

The Fool
Oct 16, 2003


I’m doing a presentation about IaC in general and Terraform specifically to an audience of junior devs and interns.

I have a general plan and idea on what points I want to cover, but was wondering if any of you guys could recommend anything specific that I should touch on that would be good for that kind of audience.

It’s my first time doing this sort of thing

New Yorp New Yorp
Jul 18, 2003

Only in Kenya.
Pillbug

The Fool posted:

I’m doing a presentation about IaC in general and Terraform specifically to an audience of junior devs and interns.

I have a general plan and idea on what points I want to cover, but was wondering if any of you guys could recommend anything specific that I should touch on that would be good for that kind of audience.

It’s my first time doing this sort of thing

Modularization and more importantly overmodularization. i.e. don't make modules that just wrap a single resource to enforce usage/naming patterns -- enforcing usage restrictions and naming should be the domain of your cloud platform's policy engine. Also ensure that modules are properly versioned using sane semver practices so teams can opt in to changes at their own pace on their own schedule.

In general allow teams to own their own IaC so that they can rapidly iterate -- trying to centralize and overabstract IaC results in teams being trapped.

Don't fall into the trap of "all of our applications are the same!" because they use the same cloud resources -- they are similar, but almost always end up varying in strange and interesting ways. It's like dog breeds. Yes, a pitbull and a golden retriever are both canines, but you can't expect them to behave identically or have the same exact requirements around feeding, training, and vet care.
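
To make the "opt in at their own pace" point concrete, here is a small Python sketch of how a pessimistic version constraint (the idea behind Terraform's `~>` operator) decides which module releases a team picks up automatically. The helper names are made up for illustration, and the sketch assumes plain numeric versions and constraints with at least two components.

```python
def parse(version):
    """Turn '1.4.2' into (1, 4, 2). Assumes purely numeric components."""
    return tuple(int(part) for part in version.split("."))


def satisfies_pessimistic(version, constraint):
    """True if `version` satisfies `~> constraint`.

    Only the rightmost component named in the constraint may increase:
    '~> 1.2' accepts 1.2 up to (not including) 2.0,
    '~> 1.2.3' accepts 1.2.3 up to (not including) 1.3.0.
    """
    v = parse(version)
    c = parse(constraint)
    fixed = c[:-1]  # components that must match exactly
    return v[:len(fixed)] == fixed and v >= c
```

A team pinned to `~> 1.2` gets 1.4.0 for free but has to edit its own config to move to 2.0.0, which is where the breaking changes are supposed to live.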

New Yorp New Yorp fucked around with this message at 21:01 on Jan 25, 2022

xzzy
Mar 5, 2009

Our legacy single Jenkins instance has gotten too big for people to be managing, so I've been given the job to split into multiple Jenkins instances with each group of developers getting their own Jenkins. This is the trivial part.

The part that's giving me pause, however, is making the executors available to all our instances while preventing an executor from getting overloaded because the instances have no idea there are other Jenkins instances firing off builds. For our Linux builds it's easy, it's all gonna be done in k8s, but we have a bank of OSX systems that I gotta deal with. The best I've found so far is the "node sharing executor" plugin that looks like it does exactly what I need, but the implementation seems kinda kludgy and makes Jenkins do something it's not designed to do.

So I'm curious if anyone in here has had to do something like this and knows of a better plan.

necrobobsledder
Mar 21, 2005
Lay down your soul to the gods rock 'n roll
Nap Ghost
Underprovision the executors for each Jenkins master with maybe 1 or 2 executors each and it won't be too bad, at the cost of some compute efficiency and perhaps some contention. If it's just two separate Jenkins masters it will probably work for a while. I'd go with one dedicated Mac for each Jenkins master, and if there's a need for more capacity let it spill over to a shared pool of Macs, hoping the different Jenkins instances don't step on each other enough to matter in a CI system vs a real-time required processing setup. It's possible to have Jenkins set up a semaphore by using fixed ports for the agent (and tear it down following a build) or similar with the connection commands so Jenkins masters will flail around trying to provision new nodes if they're occupied. Also, maybe the rather dated dynaslave plugin is worth a try?

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine
does faang interview SREs like they interview SWEs? like, do I need to start reviewing stuff like red black trees?

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
I'm way out of date, but no they do not. Had a basic coding exercise (read data from 2 files and conjoin them) as the screen, then on the day it was this (literally copied from what the recruiter sent me):

quote:

Systems: Systems administration (core concepts like DNS, web servers, databases, security etc), storage (RAID types), and troubleshooting. The discussions will start off fairly high level and progressively get more in depth. You may also be asked a scenario type question where something has failed and you need to figure out what’s wrong and how to fix it. We will be looking for your ability to ask the right questions to gather more information and how you approach the problem. All of these questions will be more than likely based in FAANG scale so please do your best in trying to think of solutions that would apply and be effective in our environment.

Coding: The questions we ask require skills used every day by SREs, including text manipulation, handling input / output, automating tasks, interfacing with external systems / processes, etc. The questions can be a real problem, or something contrived to use these skills. Be prepared to show your work on the white board and be as concise and efficient as possible with your answers. Take hints from the interviewer and be open to other solutions as you go. It may also help to study your strongest coding language, algorithms, design patterns, core CS concepts and topics related to the scale of our environment before the interview. Efficiency, structure, syntax, bugs and working code will be the criteria as to how the interviewer will assess your abilities.

Networking: Study up on the tcp/ip stack and make sure you know it inside and out. Be prepared to have discussions around load balancers, protocols, networking tools, troubleshooting etc. An example question is “what is your favorite protocol and why?”

Design/Architecture: Will test your abilities to design systems/processes that scale in a FAANG environment. The interviewer may give you a system we currently have in place and have you make suggestions on how it can be improved. He/she may also go down the other route and have you design a system completely from scratch on the whiteboard (i.e. design an infrastructure that will handle 10M users for our new XXXXXX system). Questions can range from "How would you scale a service from 1 million to 100 million users" to something general like "Design Google Search."

Culture: You will be meeting with one of the SRE managers and he/she will be asking you questions about any significant projects you have worked on to get a feel of your capabilities and how you work on teams and with other groups across an organization. There may be questions involving how you’ve dealt with conflicts in the past, or what a project you are most proud of is. You might also get questions like “why <FAANG>” or “what do you understand SRE role to be”.

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
By the way, I made a huge mistake. I consider myself a developer, but I didn't think I was smart enough to pass the SWE L33tcode tests so when I got headhunted to be a SRE I figured it would be a lower bar to entry into a FAANG. While that turned out to be true, the job is very different from development and has negatively pigeonholed my career; recruiters only see me as a SRE now, despite me not wanting those kind of roles.

Hadlock
Nov 9, 2004

Just change your resume to say swe, do a find and replace for sre, same with your LinkedIn and give it about 30 days and a different set of recruiters will come pounding on your door

Warbird
May 23, 2012

America's Favorite Dumbass

Anyone have a write up of best practices for Nexus/artifact repo structure? Got to get one stood up and I’m musing about how to best lay that all out. Scoped to a single org each, but with multiple teams accessing/using.

Docjowles
Apr 9, 2009

For what it’s worth (probably nothing at this point lol) I interviewed for a position at google that was explicitly ops, not even SRE. I still had to go through a panel of coding questions like “write a class to implement a Stack, on a white board”, which I failed because it’s just not a thing I had had to think about once in the 10 years since college.

This was also in like 2015 or something though so absolutely ancient history in tech terms. But my understanding was that’s still not unusual there. I dunno about the other FAANG companies. If I knew I wouldn’t have to spend a couple months practicing algorithms and data structures crap I would be more inclined to try my hand at getting in again.
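
For reference, the whiteboard answer to "implement a Stack" is usually only a few lines. This is a minimal Python sketch backed by a list; the method names are a common convention, not anything the interviewer specified.

```python
class Stack:
    """Last-in, first-out container backed by a Python list."""

    def __init__(self):
        self._items = []

    def push(self, item):
        # Appending to the end of a list is amortized O(1).
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def peek(self):
        if not self._items:
            raise IndexError("peek at empty stack")
        return self._items[-1]

    def __len__(self):
        return len(self._items)
```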

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
SREs are expected to write the automation that manages the monitoring & deployment of thousands of machines, so it's totally conceivable that they'll be processing lots of data which might call for "advanced" data structures / algorithms. That said, I think 99% of the time they're writing scripting so their coding ability is not held to the same level as SWEs.

At a Google SRE interview I got asked the following coding question. Your code gets passed lines of input that look like:
code:
ABCDEFG
BCDEFGH
CDEFGHI
DEFGHIJ
EFGHIJK
...
(where the values of A...Z are numbers)

How can you detect when the input stream doesn't match this format?

Many quickly notice the pattern: each line is shifted left one position from the line above. So to validate the pattern, it's only necessary to compare 2 adjacent lines at a time. Ignore the first char of the upper line and the last char of the line beneath it, and compare the two lines. If they differ, the input doesn't match the format.

The follow up questions were a little trickier: "These lines might be so long that 2 rows don't fit in memory. How do you adapt your algorithm to compensate?", and "It's expensive to read each value. How can you adapt your algorithm so that you only read each value at most once?" Answer for both: Hash the values

I ended up flubbing this one, not because I didn't pick up on the key to the solution, but because I brain-froze and assumed I'd maintain a single hash per line. In fact you need to maintain 2: (one for the chars [0:end-2], and the other for [1:end-1])
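
Here is one way both versions could look in Python: a sketch under my own assumptions, not the interviewer's reference answer. Each line is modelled as an iterable of integers, the hash is a simple polynomial rolling hash with an arbitrary base and modulus, and a hash match can in principle be a collision, so the hashed check is probabilistic.

```python
MOD = (1 << 61) - 1   # large prime modulus for the rolling hash
BASE = 1_000_003      # arbitrary multiplier (an assumption, not from the question)


def matches_simple(lines):
    """Compare each line minus its first value to the next line minus its last."""
    prev = None
    for cur in lines:
        cur = list(cur)
        if prev is not None and prev[1:] != cur[:-1]:
            return False
        prev = cur
    return True


def edge_hashes(values):
    """Read each value once; return (hash of all but last, hash of all but first)."""
    head = tail = 1   # non-zero seed so leading zeros still change the hash
    pending = None    # most recent value, not yet folded into `head`
    for index, value in enumerate(values):
        if index > 0:
            head = (head * BASE + pending) % MOD
            tail = (tail * BASE + value) % MOD
        pending = value
    return head, tail


def matches_hashed(lines):
    """Same check, but only one hash is carried from line to line."""
    prev_tail = None
    for values in lines:
        head, tail = edge_hashes(values)
        if prev_tail is not None and head != prev_tail:
            return False
        prev_tail = tail
    return True
```

The two hashes per line are exactly the pair mentioned above: the "all but last" hash is compared against the previous line, and the "all but first" hash is kept for the next one.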

12 rats tied together
Sep 7, 2006

minato posted:

How can you adapt your algorithm so that you only read each value at most once?

Just wanted to mention briefly that this is kind of the inverse of a stupid interview trick I learned a while back, where someone might ask you to detect if a word is a "palindrome or rotated palindrome".

For example, kayak should return True, but so should yakka. You can take the word and duplicate it -> yakka -> yakkayakka, the palindromic substring appears in the duplicated word.
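
A short Python sketch of that trick, with a hypothetical function name: every rotation of a word is a window of the doubled word, so it is enough to look for a palindromic window of the original length.

```python
def is_rotated_palindrome(word):
    """True if some rotation of `word` (including the word itself) is a palindrome."""
    if not word:
        return True  # treating the empty string as a palindrome (a choice, not a rule)
    doubled = word + word
    size = len(word)
    for start in range(size):
        window = doubled[start:start + size]  # one rotation of the word
        if window == window[::-1]:
            return True
    return False
```

This is quadratic in the length of the word, which is fine for an interview answer.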

e: For me, I refuse to participate in any SRE interview that contains "tricks" or even array indices. I'll ask for some glue code, or a basic infrastructure-as-code example in the candidate's preferred toolset, and then I'll introduce a new dependency in the followup evaluation assuming that they don't submit some insanely awful thing.

12 rats tied together fucked around with this message at 03:06 on Jan 27, 2022

Zorak of Michigan
Jun 10, 2006


That sort of interview question worries me, because I can talk conceptually, but I can't white board or write pristine code worth a drat. I usually get the algorithm straight in my head and then trial-and-error the hell out of it.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Zorak of Michigan posted:

That sort of interview question worries me, because I can talk conceptually, but I can't white board or write pristine code worth a drat. I usually get the algorithm straight in my head and then trial-and-error the hell out of it.
My dumbest SRE interview involved me forgetting how to do basic probability and statistics and I spent several minutes working out how to express Pascal's triangle as a recurrence relation. It was fine
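
For the curious, the recurrence in question is C(n, k) = C(n-1, k-1) + C(n-1, k) with C(n, 0) = C(n, n) = 1. A minimal Python sketch (the function name is just for illustration):

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def binomial(n, k):
    """Entry k of row n of Pascal's triangle, via the recurrence."""
    if k < 0 or k > n:
        return 0
    if k == 0 or k == n:
        return 1
    # Each entry is the sum of the two entries above it.
    return binomial(n - 1, k - 1) + binomial(n - 1, k)


def probability_of_heads(flips, heads):
    """Chance of exactly `heads` heads in `flips` fair coin flips."""
    return binomial(flips, heads) / 2 ** flips
```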

Methanar
Sep 26, 2013

by the sex ghost

quote:

I'm afraid you will mess up my DNS records!
ExternalDNS since v0.3 implements the concept of owning DNS records. This means that ExternalDNS will keep track of which records it has control over, and will never modify any records over which it doesn't have control. This is a fundamental requirement to operate ExternalDNS safely when there might be other actors creating DNS records in the same target space.

For now ExternalDNS uses TXT records to label owned records, and there might be other alternatives coming in the future releases.

lol external-dns lied to me and happily nuked a bunch of DNS without txt records at all.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Methanar posted:

lol external-dns lied to me and happily nuked a bunch of DNS without txt records at all.
Wow, what a lovely FAQ entry. This features prominently in the README since v0.3, and makes much clearer via words like "can" that this is not a default option:

quote:

From this release, ExternalDNS can become aware of the records it is managing (enabled via --registry=txt), therefore ExternalDNS can safely manage non-empty hosted zones. We strongly encourage you to use v0.3 with --registry=txt enabled and --txt-owner-id set to a unique value that doesn't change for the lifetime of your cluster. You might also want to run ExternalDNS in a dry run mode (--dry-run flag) to see the changes to be submitted to your DNS Provider API.

I've had a number of issues with the defaults on external-dns in some other regards too. For example, if you're using Route 53 and letting it auto-discover your zones, and you provide it the verbatim name of a hosted zone to limit its management scope, it will also match any Route 53 hosted zones that are subdomains of the one you provided it. This makes the zone names not extremely useful for limiting its scope; you're much safer with zone IDs.
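
A toy Python illustration of why that matters. The zone IDs, zone names, and the suffix-matching rule here are all hypothetical, chosen to mimic the behavior described above rather than copied from external-dns:

```python
# Hypothetical zone inventory: hosted zone ID -> zone name.
ZONES = {
    "Z111": "example.com",
    "Z222": "dev.example.com",
    "Z333": "example.org",
}


def zones_by_name(zones, name):
    """Suffix-style filter: also selects zones that are subdomains of `name`."""
    return sorted(
        zone_id
        for zone_id, zone_name in zones.items()
        if zone_name == name or zone_name.endswith("." + name)
    )


def zones_by_id(zones, zone_id):
    """Exact filter: selects at most the one zone with this ID."""
    return sorted(zid for zid in zones if zid == zone_id)
```

Filtering on the name `example.com` pulls in `dev.example.com` as well, while filtering on the ID `Z111` cannot.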

Vulture Culture fucked around with this message at 20:16 on Feb 5, 2022

Methanar
Sep 26, 2013

by the sex ghost
pop quiz:

PDNS supports wildcard records.
* can match anything as a fallback if no specific record exists.

example01.example.com A does not actually exist, but leverages * to point to 1.1.1.1
example01.example.com TXT is created with content "hello"

What happens?
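
A rough Python model of the lookup, assuming PDNS follows the standard wildcard rules (RFC 4592), where a wildcard only applies to names that do not exist at all. The zone layout and `resolve` helper are made up for illustration, and only the single-label wildcard case is handled.

```python
def resolve(zone, name, rtype):
    """zone maps (owner name, record type) -> list of values."""
    owners = {owner for owner, _ in zone}
    if name in owners:
        # The name exists, so the wildcard is out of the picture.
        # A missing type at an existing name is an empty answer (NODATA).
        return zone.get((name, rtype), [])
    # The name does not exist: fall back to the wildcard one label up.
    labels = name.split(".")
    wildcard = ".".join(["*"] + labels[1:])
    return zone.get((wildcard, rtype), [])


zone = {("*.example.com", "A"): ["1.1.1.1"]}
before = resolve(zone, "example01.example.com", "A")

# Creating the TXT record makes example01.example.com an existing name.
zone[("example01.example.com", "TXT")] = ["hello"]
after = resolve(zone, "example01.example.com", "A")
```

Under those rules `before` holds the wildcard answer and `after` is empty: the TXT record shadows the wildcard for every record type at that name.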

Warbird
May 23, 2012

America's Favorite Dumbass

Probably a lot of emails and my evening being ruined.
