Docjowles
Apr 9, 2009

Extremely Penetrated posted:

We're starting to implement this after our TAM pushed hard for it for 2 years. I feel like it's going to be a lot of effort for a sidegrade, but I don't care enough to fight it. Any advice on how to minimize the pain?

Username / post combo :chanpop:

This is probably obvious, but you HAVE to automate everything about the account lifecycle and standardize as much as possible: creating an account, installing your guard rails and defaults, maintaining standards, closing the account. This should all be easily done by clicking a button in a UI or making a merge request or a Jira ticket or whatever your team’s preferred workflow looks like. It should not involve an engineer spending hours doing things interactively.

When we first started out, Control Tower was embarrassingly bad, but it seems like Amazon has invested in it a lot, so it might be useful now? We more or less created our own bespoke Control Tower while we were waiting for AWS to figure it out. I have not looked at it again seriously because every time they announce a feature, it’s something we already did. Which is not to say we are doing anything impressive so much as “our 5-person team is keeping pace with Amazon on top of our other duties, so clearly they care a lot about this product” :lmao:

Use Identity Center / SSO. Non-negotiable.

What do you hope to gain from the change? We were fortunate to be mostly building greenfield. If you have an account structure that works for you what will be the return on the investment of rebuilding workloads in new accounts? Your TAM should have a good answer for this if they are pushing you to do it. How much do you agree with and trust your TAM? The level of talent and investment in your account you will get from a given rep, uh, varies.

Docjowles fucked around with this message at 05:33 on Feb 22, 2024


Extremely Penetrated
Aug 8, 2004
Hail Spwwttag.
A team member's stood up Account Factory / Control Tower and has just started switching us over to Identity Center (the only part of this I actually like), so hopefully we can get it as automated as you described.

Nothing will be truly greenfield. All of our containerized microservices will need migrations. The main thing we'll get out of this IMO is having small blast radii, so we can run terraform from dozens of pipeline-specific roles (presently 100% manual) without a nightmare spiderweb of bespoke IAM policies given the wildly inconsistent naming/structure of the legacy account. We also have to redo most of the networking to support the shared VPC model, so it's a good time to clean house and ditch our awful security group design.

But neither of these things require micro accounts. I've had a bit of an adversarial relationship with our TAM and tend to think that their proposals are a bad fit for our org given our small headcount. I definitely have the impression that they push what's best for AWS rather than what's best for us.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.
The main security benefit of micro-segmentation is that you consistently get to use managed IAM policies, rather than rewriting them all into tag-based policies that fail open if you leave out a condition (see the sketch below). It's up to you what kind of benefit that confers. For any migration to go smoothly you're probably going to find yourself adopting zero-trust (VPC Lattice or some other service mesh, OIDC federation out of all your managed platforms, etc.), at which point there's less benefit left on the table for the remaining parts of the migration.
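To make the fail-open point concrete, here's a minimal sketch in CloudFormation-style YAML (the tag key and team value are invented). The Condition block is the only thing scoping the statement down, and nothing complains if you forget it:

```yaml
PolicyDocument:
  Version: "2012-10-17"
  Statement:
    - Effect: Allow
      Action: "ec2:*"
      Resource: "*"        # tag-based policies can't enumerate ARNs up front
      Condition:           # delete this block and nothing errors -- the
        StringEquals:      # statement just quietly becomes account-wide
          aws:ResourceTag/team: payments
```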

Put another way: micro-segmentation is a killer feature for pure-play AWS, but the more of this functionality you have rolled into high-level abstractions through an internal developer platform or portal, the less useful it's going to be to you.

I view it as pretty similar to containerization: there are some apps that adapt really well into a native container orchestration world, and some legacy/enterprise apps that are best left for now as fat containers that mix too many concerns. The goal is to minimize the number of changes moving each piece of your larger system into a new account/VPC environment, because doing the opposite usually goes poorly unless you have central ops doing all the lift-and-shift work.

Vulture Culture fucked around with this message at 17:48 on Feb 23, 2024

The Iron Rose
May 12, 2012

:minnie: Cat Army :minnie:
We recently acquired a company that has a FedRAMP IL5 environment that I need to support and develop deployment tooling for. We cannot make a connection from gitlab to the IL5 environment, but can publish artifacts (helm charts, container images, etc) to an ACR registry, which our IL5 environment can then pull from.

I have never used CI tooling other than code I’ve written myself in gitlab runners. The company we were acquiring was looking at flux for this. ArgoCD was also mentioned. Whatever solution we end up going with we will want to use for our commercial deployments also, which are in azure/GCP/aws (so no azure DevOps).

Any advice on tooling I should consider? The fundamental restriction is no network connectivity to/from gitlab, but gitlab can publish to an artifact registry and the CI tooling can fetch from that registry without problem.

The Iron Rose fucked around with this message at 22:05 on Feb 23, 2024

FISHMANPET
Mar 3, 2007

Sweet 'N Sour
Can't
Melt
Steel Beams
I don't understand why Azure DevOps is out of the question; you can still do stuff to AWS and GCP with it.

The Fool
Oct 16, 2003


azure devops is going to have the same agent connectivity problem as gitlab though

LochNessMonster
Feb 3, 2005

I need about three fitty


The Iron Rose posted:

We recently acquired a company that has a FedRAMP IL5 environment that I need to support and develop deployment tooling for. We cannot make a connection from gitlab to the IL5 environment, but can publish artifacts (helm charts, container images, etc) to an ACR registry, which our IL5 environment can then pull from.

I have never used CI tooling other than code I’ve written myself in gitlab runners. The company we were acquiring was looking at flux for this. ArgoCD was also mentioned. Whatever solution we end up going with we will want to use for our commercial deployments also, which are in azure/GCP/aws (so no azure DevOps).

Any advice on tooling I should consider? The fundamental restriction is no network connectivity to/from gitlab, but gitlab can publish to an artifact registry and the CI tooling can fetch from that registry without problem.

Can you access the environment where your product will be deployed? Because this sounds like a recipe for disaster without clear boundaries on who’s responsible for what.

If you don’t have access or can’t observe the result of the artefact being deployed, you’ll get your clients/users shitting on you every time a deployment breaks. If you can’t see or troubleshoot the failing deployment it’s going to be an absolute nightmare.

GitOps sounds like a decent solution for this though. If you can make sure you’re only responsible for the delivery of a (proven) working artefact, I’d say you’re good.

Personally I found Flux really lacking in showing why things were failing. Could be a me problem though.
Haven’t used Argo myself, but if I got to pick I’d probably try Argo over Flux.

Hadlock
Nov 9, 2004

The Iron Rose posted:

We recently acquired a company that has a FedRAMP IL5 environment that I need to support and develop deployment tooling for. We cannot make a connection from gitlab to the IL5 environment, but can publish artifacts (helm charts, container images, etc) to an ACR registry, which our IL5 environment can then pull from.

I have never used CI tooling other than code I’ve written myself in gitlab runners. The company we were acquiring was looking at flux for this. ArgoCD was also mentioned. Whatever solution we end up going with we will want to use for our commercial deployments also, which are in azure/GCP/aws (so no azure DevOps).

Any advice on tooling I should consider? The fundamental restriction is no network connectivity to/from gitlab, but gitlab can publish to an artifact registry and the CI tooling can fetch from that registry without problem.

Sounds like you have some kind of air-gapped system.

I would stand up an internal git server/host that consumes the blob data. Nearly every GitOps system is dependent on git SHAs, etc.

Read back through my posts from the last three months; I've been struggle-bussing with finding the best state-of-the-art system.

IMO:

Flux: better overall product, batteries-included solution. Cons: poor/shitty observability; troubleshooting requires deep knowledge of the system.

ArgoCD: critical parts of the system are missing; you need to use an unsupported plugin or build your own to get what Flux has out of the box. Pros: observability is very, very high, the average developer can wrap their head around it in half an hour, and the UI rocks.

Flux at its core is a dead, dead simple 200-line bash script, but it's bloated with correct modular software architecture and 90,000 bells and whistles. I forget the details, but v1 definitely had its own controller CLI (probably inspired by kubectl).

The Iron Rose
May 12, 2012

:minnie: Cat Army :minnie:
Yes, humans will be able to access the IL5 environment (via virtual workstations, bleh, but I get kubectl and sudo at least). My team is going to be responsible for the shared services, devs for the health of their services, but it’s the classic DevOps setup where the platform team gets the blame whenever anything goes wrong. We have moderately robust o11y with opentelemetry and elastic, so it’s largely just deployment failures of kube resources and OPA violations that our CI platform would need to surface and alert on.

Still, I don’t want multiple deployment patterns, so whatever solution we adopt for the feds is what we want to adopt for the commercial side too.

Right now, the acquired company has pipelines that publish artifacts to an OCI repo in ACR, which is then configured as a source in flux. Values are encoded into the helm chart, which will have to change, but that’s a solvable problem.

If nothing else it’s an interesting problem set at least! We’re likely going to do something similar for IaC as well. Git -> OCI registry -> flux deploys a CR defining the infrastructure -> X service deploys it from an IL5 cluster.
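For anyone curious, a hedged sketch of that ACR-as-source pattern on the Flux side (registry URL, names, and paths are all invented): an OCIRepository source plus a Kustomization that applies whatever manifests GitLab pushed into the registry, with no network path back to GitLab required.

```yaml
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: OCIRepository
metadata:
  name: myapp
  namespace: flux-system
spec:
  interval: 5m
  url: oci://myregistry.azurecr.io/manifests/myapp  # artifact pushed by GitLab CI
  ref:
    tag: v1.0.0          # or a semver range, so the cluster picks up releases itself
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: myapp
  namespace: flux-system
spec:
  interval: 5m
  sourceRef:
    kind: OCIRepository
    name: myapp
  path: ./
  prune: true            # remove cluster objects that disappear from the artifact
```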

Hadlock
Nov 9, 2004

So I ran across a blog post the other day that had an interesting term, "reference architecture", specific to platform architecture/DevOps, and that's sent me deep down a philosophical rabbit hole. I've really been struggling to find/define "best practices" or "state of the art"; I think it's loosely defined as "containers using GitOps and IaC".

Should reference architectures be opinionated, like Ruby on Rails? Or left wide open?
Should reference architectures be cross-cloud (AWS/GCP/Azure, etc.)?
Should reference architectures support all deployment types? CRUD, stream processing, LLM, etc.?
Where does the definition of a reference architecture start and stop? Is it a helm chart, or terraform that deploys the cluster + bootstrappy charts to provide XYZ base functionality? Plus Flux/Argo?
Are IAM/secrets management/password rotation part of the reference architecture?
How do you encode/validate best practices across all "layers" of the reference architecture?
Which DNS provider(s) would you support?
Are GHA/Jenkins/Spinnaker part of this? It's turtles all the way down; where do you draw the line?

I'm pretty close, I think, to publishing a generic "reference architecture" similar to what I've built at work that uses terraform, k8s, ArgoCD, and GitHub Actions, but it lacks ownership of IAM and doesn't have any automation of secrets management beyond basic KMS access for one user per environment.

Most of the "blogs" or medium.com articles I've seen are written by guys who are trying to build a reputation and seem like they just barely know what the hell they're doing, or the toy demo they're deploying only works in a vacuum, is not extensible, and is in general garbage, so you spend an inordinate amount of time splicing a working answer into your existing IaC.

Nobody here has pointed me at a reference architecture but I guess I'll take a stab at it one more time: does anyone have a favorite reference architecture they like, or have seen that's moderately kept up to date?

Open to any and all commentary on the subject up to and including "this is a stupid idea, nobody is only creating a crud app, a simplified reference architecture is a stupid idea, it's barely better than the medium.com articles"

12 rats tied together
Sep 7, 2006

The Iron Rose posted:

We recently acquired a company that has a FedRAMP IL5 environment that I need to support and develop deployment tooling for. We cannot make a connection from gitlab to the IL5 environment, but can publish artifacts (helm charts, container images, etc) to an ACR registry, which our IL5 environment can then pull from.

Any advice on tooling I should consider? The fundamental restriction is no network connectivity to/from gitlab, but gitlab can publish to an artifact registry and the CI tooling can fetch from that registry without problem.
:smugmrgw:

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Alright this was pretty good lmao

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug

Hadlock posted:

So I ran across a blog post the other day that had an interesting term, "reference architecture", specific to platform architecture/DevOps, and that's sent me deep down a philosophical rabbit hole. I've really been struggling to find/define "best practices" or "state of the art"; I think it's loosely defined as "containers using GitOps and IaC".

Should reference architectures be opinionated, like Ruby on Rails? Or left wide open?
Should reference architectures be cross-cloud (AWS/GCP/Azure, etc.)?
Should reference architectures support all deployment types? CRUD, stream processing, LLM, etc.?
Where does the definition of a reference architecture start and stop? Is it a helm chart, or terraform that deploys the cluster + bootstrappy charts to provide XYZ base functionality? Plus Flux/Argo?
Are IAM/secrets management/password rotation part of the reference architecture?
How do you encode/validate best practices across all "layers" of the reference architecture?
Which DNS provider(s) would you support?
Are GHA/Jenkins/Spinnaker part of this? It's turtles all the way down; where do you draw the line?

I'm pretty close, I think, to publishing a generic "reference architecture" similar to what I've built at work that uses terraform, k8s, ArgoCD, and GitHub Actions, but it lacks ownership of IAM and doesn't have any automation of secrets management beyond basic KMS access for one user per environment.

Most of the "blogs" or medium.com articles I've seen are written by guys who are trying to build a reputation and seem like they just barely know what the hell they're doing, or the toy demo they're deploying only works in a vacuum, is not extensible, and is in general garbage, so you spend an inordinate amount of time splicing a working answer into your existing IaC.

Nobody here has pointed me at a reference architecture but I guess I'll take a stab at it one more time: does anyone have a favorite reference architecture they like, or have seen that's moderately kept up to date?

Open to any and all commentary on the subject up to and including "this is a stupid idea, nobody is only creating a crud app, a simplified reference architecture is a stupid idea, it's barely better than the medium.com articles"

I suspect the answer is "If it's simple enough to make into a medium.com article it's useless, but if it's complicated enough to be real-world usable, it's too complex to be summarized that quickly."

A lot of business processes are highly dependent on your company; we recently defined our fleet standards and it was like...thousands of various bits and bobs at the end of the day. Many companies might never care about any of that, but the scale we operate at changes a lot of what we care about - including that some things a smaller company would care about, we simply ignore because it's not meaningful at scale. That doesn't make our approach right and I wouldn't recommend it.

Another way to think of this is how drug companies do manufacturing - they tend to have a 'platform', or sort of a default state; you can go off-platform, but it means that you're on the hook to define the deviations from that platform, and anyone you work with inside the company isn't guaranteed to know how you operate. The term 'platform' is pretty overloaded in IT, but I think it's a pretty reasonable way of looking at things, so what you're basically trying to do is set up a default 'platform', if I'm understanding you correctly. I think that even a naive or short-sighted approach to that problem will teach you and your company a lot, so I think you're heading in a good direction.

(That's a lot of words to fail to give you what you're asking for, unfortunately.)

Resdfru
Jun 4, 2004

I'm a freak on a leash.

Hadlock posted:

Flux: better overall product, batteries-included solution. Cons: poor/shitty observability; troubleshooting requires deep knowledge of the system.

ArgoCD: critical parts of the system are missing; you need to use an unsupported plugin or build your own to get what Flux has out of the box. Pros: observability is very, very high, the average developer can wrap their head around it in half an hour, and the UI rocks.

I just started messing around with both of these. What does flux do better and what critical parts of Argo are missing? I tried to look for a write up or something before asking but it seems everything is just marketing for one or the other

Hadlock
Nov 9, 2004

Resdfru posted:

I just started messing around with both of these. What does flux do better and what critical parts of Argo are missing? I tried to look for a write up or something before asking but it seems everything is just marketing for one or the other

I wrote like a short novel about the shortcomings of Argo; if you tap the "?" below my avatar, go to the last page, and do a Ctrl+F for "Argo", it ought to show up.

Basically Argo is great, it's exactly what you think you're looking for, it deploys the helm chart, and updates the helm release on the cluster.

Ok, great. I've now deployed v1.0.0 of the resdfru app as resd:v1.0.0, or maybe you use the git SHA so it's resd:0183fadc; doesn't matter.

Bob the senior engineer realized there's a typo on the front page and has issued a hotfix, release v1.0.1.

How do you tell Argo to redeploy the helm chart with the updated container image? Does Joe the part-time release manager/QA dude just manually update the helm chart, in perpetuity? What if you have dev and staging environments? Are you now maintaining three nearly identical helm charts? Do you have three different values files you update? What's going to update them? Will it live near ArgoCD, or maybe some other group will own the custom tooling/business process? Who the fuck knows? It's not ArgoCD's problem. :airquote: we don't like to force choices on the end user :airquote: Fuck you, ArgoCD. Fuck you.

Flux has an opinionated model: it watches the container registry, updates a file, and then reads that file back. Bing bong, so simple.
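Roughly what that opinionated model looks like in current Flux, as a hedged sketch (image name, semver range, and git details are placeholders): an ImageRepository scans the registry, an ImagePolicy picks the newest matching tag, and an ImageUpdateAutomation commits the new tag back to git.

```yaml
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: resd
  namespace: flux-system
spec:
  image: registry.example.com/resd   # placeholder registry/image
  interval: 5m                       # how often to scan for new tags
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: resd
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: resd
  policy:
    semver:
      range: ">=1.0.0"               # Bob's v1.0.1 hotfix wins automatically
---
apiVersion: image.toolkit.fluxcd.io/v1beta1
kind: ImageUpdateAutomation
metadata:
  name: flux-system
  namespace: flux-system
spec:
  interval: 5m
  sourceRef:
    kind: GitRepository
    name: flux-system
  git:
    checkout:
      ref:
        branch: main
    commit:
      author:
        name: fluxcdbot
        email: fluxcdbot@example.com   # placeholder bot identity
    push:
      branch: main
  update:
    path: ./apps                       # rewrites tags in files under this path
    strategy: Setters
```

The values file then carries a setter marker on the tag line, something like: tag: v1.0.0 # {"$imagepolicy": "flux-system:resd:tag"} -- that's the file Flux "updates and then reads back".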

That said, I'm currently using Argo on my greenfield platform. Mostly because I'm dumb and impatient and didn't read the documentation more than a paragraph ahead at a time during initial setup

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Are any of you devoppers(?) heavy on the database side? If so, could you tell me a little bit about how databases fit into your CI/CD process?

I'm comfortable with the hobbyist basics of setting up a new postgresql cluster, loading data, and writing queries. Now, I'm trying to better understand how databases fit into ci/cd workflows that one might realistically see on the job.

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
Tests need known state in order to not be flakey, so it's useful to completely reset the DB between test runs. We have a containerized Postgres (vanilla docker.io/library/postgres) which auto-initializes and keeps its state in the container, so after a test run we just toss the whole container and make a new one.

But a fresh DB every test run also means you need to recreate the schema (at a minimum) and possibly inject some core data, and that can slow down your test runs when you have thousands of tests. So another way to do it is to use Postgres's ability to clone a DB from a template DB that's already set up with the right schema. Or, depending on the app, run the entire test in a transaction and roll it back at the end of the test.

Hadlock
Nov 9, 2004

Containerized DB is ok, but the pro move is to make the reference DB a template DB on Postgres so you can just createdb from template0 or whatever. Boom, instant database. Database-from-template takes ~0.5-1.0 seconds to provision, vs probably minutes for a containerized DB. It accomplishes this through copy-on-write, or whatever.
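A minimal sketch of the template trick in SQL (database names are invented; note that Postgres requires no other sessions be connected to the template while cloning):

```sql
-- One-time setup: build a reference DB with schema + seed data, then leave it alone.
CREATE DATABASE app_template;
-- ... run migrations / seed scripts against app_template ...

-- Per test run: cloning is a cheap file-level copy, not a dump/restore.
CREATE DATABASE app_test_42 TEMPLATE app_template;
-- ... point the test worker at app_test_42 ...
DROP DATABASE app_test_42;
```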

FISHMANPET
Mar 3, 2007

Sweet 'N Sour
Can't
Melt
Steel Beams
A fun fact I learned about the "official" MySQL images: when the container starts up, if no database is present, it will execute any SQL scripts in a particular directory. We use this for the local dev environment that devs run on their machines. Just stick a database dump in there and you got yourself a stew database, baby.

NihilCredo
Jun 6, 2011

Suppress anger in every possible way: one outburst of it will defame you more than many virtues will commend you.

FISHMANPET posted:

A fun fact I learned about the "official" MySQL images: when the container starts up, if no database is present, it will execute any SQL scripts in a particular directory. We use this for the local dev environment that devs run on their machines. Just stick a database dump in there and you got yourself a stew database, baby.

The official postgres image does the same; it can also run a mix of SQL and shell scripts.
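A minimal docker-compose sketch of that init hook (image tag, password, and host path are placeholders): anything matching *.sql, *.sql.gz, or *.sh in the mounted directory runs once, in lexical order, the first time the container starts with an empty data dir.

```yaml
services:
  db:
    image: postgres:16            # the mysql image honors the same directory
    environment:
      POSTGRES_PASSWORD: devonly  # local-dev only, obviously
    ports:
      - "5432:5432"
    volumes:
      - ./initdb:/docker-entrypoint-initdb.d:ro   # 01-schema.sql, 02-dump.sql, ...
```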

Hadlock
Nov 9, 2004

My other secret trick for containerized DBs is to have the final init script do a "CREATE TABLE CONTAINER_DB_READY", and then have the init container for the CRUD app run "SELECT * FROM CONTAINER_DB_READY" in a loop until the DB is initialized.
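In pod-spec terms the trick looks something like this (image, connection string, and table name are placeholders; a sketch of the idea, not a tested manifest):

```yaml
initContainers:
  - name: wait-for-db
    image: postgres:16   # only needed for the psql binary
    command:
      - sh
      - -c
      - |
        # The last initdb script ran: CREATE TABLE container_db_ready ();
        # so this loop only exits once every earlier init script has finished.
        until psql "$DB_URL" -c 'SELECT 1 FROM container_db_ready' >/dev/null 2>&1; do
          echo 'db still initializing...'; sleep 2
        done
    env:
      - name: DB_URL
        value: postgres://app:devonly@db:5432/app   # placeholder
```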

Extremely Penetrated
Aug 8, 2004
Hail Spwwttag.

Hadlock posted:

How do you tell Argo to redeploy the helm chart with the updated container image? Does Joe the part-time release manager/QA dude just manually update the helm chart, in perpetuity? What if you have dev and staging environments? Are you now maintaining three nearly identical helm charts? Do you have three different values files you update? What's going to update them? Will it live near ArgoCD, or maybe some other group will own the custom tooling/business process? Who the fuck knows? It's not ArgoCD's problem. :airquote: we don't like to force choices on the end user :airquote: Fuck you, ArgoCD. Fuck you.

I'll chime in with how I handled the above points, because yeah they're definitely things that need dealing with that are kinda outside Argo's scope. You're essentially asking, "how do you handle updated container images across different environments?"

I approach this in two ways, both from the CICD pipeline (I use GitHub Actions). All my helm charts are in their own monorepo. I update charts on the main branch when prod images are published, and on feature/hotfix branches when their images are published (usually by a pull request in the microservice's own repo). So given that, when the pipeline for a microservice runs it can do things like:

1) Update the image tag in values.yml whenever it publishes a new container image, on the appropriate monorepo branch.

2) Patch specific ArgoCD Application / ApplicationSet CRs for a given environment. They have optional values.yml overrides for the image tag and other environment-specific things. In general this is where I differentiate the deployment variables for different environments, though there can be overlap/duplication with ones stored in GHA.

Secrets are always messy. I typically have ones required for builds & automation testing stored in GHA, and ones required by containers at runtime stored in AWS Secrets Manager (sync'd via externalsecrets).
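As a hedged sketch of point 1 above (the org/repo names, chart path, and secret are all invented, and it assumes the mikefarah yq available on GitHub-hosted runners): a job in the microservice's pipeline checks out the helm monorepo and commits the new tag into values.yml.

```yaml
jobs:
  bump-image-tag:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          repository: example-org/helm-charts     # the helm chart monorepo
          ref: main                               # or the feature/hotfix branch
          token: ${{ secrets.HELM_REPO_TOKEN }}   # PAT with push access (assumption)
      - name: Commit new image tag
        env:
          TAG: ${{ github.sha }}
        run: |
          yq -i '.image.tag = strenv(TAG)' charts/myservice/values.yml
          git config user.name "ci-bot"
          git config user.email "ci-bot@example.com"
          git commit -am "myservice: image ${TAG}"
          git push
```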

Extremely Penetrated
Aug 8, 2004
Hail Spwwttag.

Hughmoris posted:

Are any of you devoppers(?) heavy on the database side? If so, could you tell me a little bit on how databases fit into your ci/cd process?

I'm comfortable with the hobbyist basics of setting up a new postgresql cluster, loading data, and writing queries. Now, I'm trying to better understand how databases fit into ci/cd workflows that one might realistically see on the job.

For schema changes we use tooling that runs the app's SQL scripts idempotently. It's not too smart; it just has a hierarchy for running things in the right sequence, and keeps track of whether any given script has run before or not (or if it should run every time).

Where it gets tricky is marrying up schema changes with rolling deployments. You could easily introduce a breaking schema change that requires a new app version, which means all your blue/green or staggered rolling deployments are going to shit a brick once the database updates.

We handle it through policy -- breaking schema changes MUST be implemented gradually across multiple deployments and a distinct release phase. For example, if you wanted to rename a column (sketched in SQL after this list):
- 1st deployment adds a new column with the new name. Don't fucking touch the old one. App version n+1 will initially use the old column, but will be feature-flagged to use the new one. You deploy new code, and app versions n and n+1 are happily running side by side until the rollout completes and they're all n+1 using the old column.
- 1st release is when you sync the rows from the old column to the new one and toggle that feature flag. n+1 is now using the new column and you didn't break anything.
- 2nd deployment, for n+2, deletes the old column to clean up and reduce DBA grumpiness. Also de-feature-flag it in the app.
- After the 2nd deployment is complete you can delete the feature flag itself.
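The column-rename policy above, sketched as plain SQL (table/column names invented; each statement ships in the phase indicated):

```sql
-- Deployment 1 (expand): runs before app n+1 rolls out. Old column untouched.
ALTER TABLE customers ADD COLUMN full_name text;

-- Release phase: backfill, then flip the app's feature flag to use full_name.
UPDATE customers SET full_name = name WHERE full_name IS NULL;

-- Deployment 2 (contract): only after every instance is n+1 with the flag on.
ALTER TABLE customers DROP COLUMN name;
```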

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Hadlock posted:

So I ran across a blog post the other day that had an interesting term, "reference architecture" specific to platform architecture/DevOps and that's sent me deep down a philosophical rabbit hole. I've really been struggling to find/define "best practices" or "state of the art" I think it's loosely defined as "containers using git ops and iac"

Should reference architectures be opinionated, like Ruby on Rails? Or left wide open
Should reference architectures be cross-cloud (AWS/gcp/azure etc)
Should reference architectures support all deployment types? CRUD, stream processing, LLM etc
Where does the definition of a reference architecture start and stop? Is it a helm chart, or terraform that deploys the cluster + bootstrappy charts to provide XYZ base functionality? Plus flux/Argo?
Are IAM/Secrets management/password rotation part of the reference architecture?
How do you encode/validate best practices across all "layers" of the reference architecture
Which DNS provider(s) would you support
Is GHA/Jenkins/spinnaker part of this? It's turtles all the way down where do you draw the line

I'm pretty close, I think, to publishing a generic "reference architecture" similar to what I've built at work that uses terraform, k8s, ArgoCD, GitHub actions, but it lacks ownership of IAM and doesn't have any automation of secrets management beyond basic kms access for one user per environment

Most of the "blogs" or medium.com articles I've seen are written by guys who are trying to build a reputation and seem like they just barely know what the hell they're doing, or the toy demo they're deploying only works in a vacuum and is not extensible and in general garbage and you spend an inordinate amount of time splicing a working answer into your existing IaC

Nobody here has pointed me at a reference architecture but I guess I'll take a stab at it one more time: does anyone have a favorite reference architecture they like, or have seen that's moderately kept up to date?

Open to any and all commentary on the subject up to and including "this is a stupid idea, nobody is only creating a crud app, a simplified reference architecture is a stupid idea it's barely better than the medium.com articles"
The key problem is that most cloud-based businesses are also bought into verbal recitation of Conway's Law solving all of their problems, and there is no cloud architecture in existence resilient enough to withstand a vanity reorg.

12 rats tied together
Sep 7, 2006

Vulture Culture posted:

[...] there is no cloud architecture in existence resilient enough to withstand a vanity reorg

lolled at this but also, too real and a little too close to home for me.

I think all that stuff (reference architecture, SLO, which I'm lumping in as a similar type of thing) is dumb. I think the best thing you can do as a cloud org is obsess over your ticket queue. Anything that lets you get tickets out of the queue faster is good; anything that slows down mean-time-to-done-column is bad. If you don't have a way to deny a ticket and shunt it to a project manager who will work with the requester to redefine the ticket in such a way that it meets your standards, that's the first thing you should change.

The tools especially don't matter, except to the degree to which they help you clear out the ticket queue. I think we're far enough along in "devops" to have all internalized that the ecosystem-esque benefits of any particular toolset are fake bullshit. The only value is handing a completed cloud thing back to a developer so they can realize they forgot something.

Hadlock
Nov 9, 2004

Extremely Penetrated posted:

We use Cloudflare instead of Cloudfront, but the idea is externaldns provisions Cloudflare DNS records for both static assets and APIs.

Can you explain just how the hell you're doing this?

I think I want to create a cloudfront distribution using an ACK* CRD and have external dns read a value (arn?) out of the ACK CRD

I'm guessing I need to program external dns to point at a different resource than an ingress

*AWS Controllers for Kubernetes, a set of controllers (installed via helm charts) for AWS services, among them S3 and CloudFront

Also open to pointing external-dns at cloudflare CDN, since clearly you have that working, and we're also cloudflare customers

Extremely Penetrated
Aug 8, 2004
Hail Spwwttag.

Hadlock posted:

Can you explain just how the hell you're doing this?

I think I want to create a cloudfront distribution using an ACK* CRD and have external dns read a value (arn?) out of the ACK CRD

I'm guessing I need to program external dns to point at a different resource than an ingress

*Aws controller for Kubernetes, a helm chart thing for AWS services, among them S3 and cloudfront

Also open to pointing external-dns at cloudflare CDN, since clearly you have that working, and we're also cloudflare customers

I have 3 copies of externaldns running, one each for Cloudflare, public route53, and private route53. They watch for their own custom annotations on both ingresses & services. I can get a static CNAME in Cloudflare just by making a service of type ExternalName and giving it the same annotation I configured my externaldns-cloudflare provider with. You can use the other types (ClusterIP/NodePort) to get dynamic CNAMEs pointing to load balancers -- externaldns will figure it out from the related ingress.
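A hedged sketch of that static-CNAME trick (hostname, target, and annotation value are placeholders; the real setup depends on how each externaldns instance's annotation filtering is configured):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: assets
  annotations:
    # the record the Cloudflare externaldns instance should publish
    external-dns.alpha.kubernetes.io/hostname: assets.example.com
spec:
  type: ExternalName
  # externaldns emits a CNAME pointing at this target
  externalName: my-bucket.s3.us-east-1.amazonaws.com
```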

I think your problem is the need to look up that S3 bucket URI and patch it into the service. Do you really need to create/destroy S3 buckets on the fly? If you can have static bucket URIs for your CNAMEs then your problem goes away. Otherwise, this is where ACK falls on its face and you have to use a Crossplane XRD (which supports arbitrary patching of attributes from one resource to another). This is something that's dead simple in terraform/cloudformation but is pretty shit in kubernetes. Like, I wanted to create EC2 instances and then use their new private IPs for target group registrations as well as private route53 records, and a Crossplane XRD was the only decent option I found.
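For flavor, the cross-resource patching being described looks roughly like this (a very hedged sketch using Crossplane's Resources-mode Composition and the Upbound AWS provider; all names are invented, and the composite's status field has to be declared in the XRD):

```yaml
apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: instance-with-dns
spec:
  compositeTypeRef:
    apiVersion: example.org/v1alpha1
    kind: XInstanceWithDNS
  resources:
    - name: instance
      base:
        apiVersion: ec2.aws.upbound.io/v1beta1
        kind: Instance
        spec:
          forProvider:
            region: us-east-1
            instanceType: t3.micro
            ami: ami-0123456789abcdef0    # placeholder
      patches:
        - type: ToCompositeFieldPath      # surface the new instance's IP
          fromFieldPath: status.atProvider.privateIp
          toFieldPath: status.privateIp
    - name: dns-record
      base:
        apiVersion: route53.aws.upbound.io/v1beta1
        kind: Record
        spec:
          forProvider:
            zoneId: Z0123456789EXAMPLE    # placeholder private zone
            name: app.internal.example.com
            type: A
            ttl: 60
      patches:
        - type: FromCompositeFieldPath    # feed that IP into the A record
          fromFieldPath: status.privateIp
          toFieldPath: spec.forProvider.records[0]
```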

I think ACK support from AWS got yanked hard. Once my TAM got wind that I was even considering using EKS they started having Come to Jesus interventions with me, and even put together a call with 3 other AWS specialist support engineers to try to dissuade me. (They didn't offer a better solution, just a nebulous "automate Cloudformation!") It's cynical of me but I suspect they have orders from on high to discourage kubernetes use however they can, to make sure you're locked in to the AWS-specific services.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Extremely Penetrated posted:

For schema changes we use tooling that runs the app's SQL scripts idempotently. It's not too smart; it just has a hierarchy for running things in the right sequence, and keeps track of whether any given script has run before or not (or if it should run every time).

Where it gets tricky is marrying up schema changes with rolling deployments. You could easily introduce a breaking schema change that requires a new app version, which means all your blue/green or staggered rolling deployments are going to shit a brick once the database updates.

We handle it through policy -- breaking schema changes MUST be implemented gradually across multiple deployments and a distinct release phase. For example, if you wanted to rename a column:
- 1st deployment adds a new column with the new name. Don't fucking touch the old one. App version n+1 will initially use the old column, but will be feature-flagged to use the new one. You deploy new code, and app versions n and n+1 are happily running side by side until the rollout completes and they're all n+1 using the old column.
- 1st release is when you sync the rows from the old column to the new one and toggle that feature flag. n+1 is now using the new column and you didn't break anything.
- 2nd deployment, for n+2, deletes the old column to clean up and reduce DBA grumpiness. Also de-feature-flag it in the app.
- After the 2nd deployment is complete you can delete the feature flag itself.

Thanks for the detailed insight!

You mention DBA grumpiness, so I'm guessing you have a stand-alone DBA team that you work with? How involved are they with your team in day-to-day work?

Hadlock
Nov 9, 2004

I am going to read your post about external dns when things are mildly less on fire

I saw this today:

https://techcrunch.com/2024/02/27/githubs-copilot-enterprise-hits-general-availability/

This weekend I was writing a lot of README.md documentation by splatting out thoughts, selecting them, and telling Copilot to "make this sound better, and easier to understand", and it made them significantly better and much easier to read. That's why I find this exciting:

quote:

The highlight here is the ability to reference an organization’s internal code and knowledge base. Copilot is now also integrated with Microsoft’s Bing search engine (currently in beta) and soon, ...

With that, new developers on a team can, for example, ask Copilot how to deploy a container image to the cloud and get an answer that is specific to the process in their organization.

It's especially great news because I'm gearing up to train my org/team on how to do this exact process

The Fool
Oct 16, 2003


prepare for disappointment

Extremely Penetrated
Aug 8, 2004
Hail Spwwttag.

Hughmoris posted:

Thanks for the detailed insight!

You mention DBA grumpiness, so I'm guessing you have a stand-alone DBA team that you work with? How involved are they with your team in day-to-day work?

We have a pair of dedicated DBAs / data architects, who work alongside a BI team. My team (infra/ops) gets fairly regular requests to do work for them, but DBA involvement in any CICD-related stuff is rare. We might work together when migrating a component, or when standing up a new ETL process, or for general troubleshooting when shit's blowing up. I'm a bit of an outlier in that I'm the only one on my team who jumps into our component repos and submits PRs and deeply understands the entire deployment process(es). I typically work with the software architects, developer leads, and platform folks.

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug

The Fool posted:

prepare for disappointment

The chance that any LLM manages to figure this out from the shitty documentation every company has is essentially somehow a negative number. I assume at this point that the marketing team for Copilot at MSFT is basically wholly disconnected from any actual engineering team and is just trying to get as much cocaine as possible before the hype dies.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.
It sounds like it's just NLP-enhanced search results, nothing more or less

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Hadlock posted:

I am going to read your post about external dns when things are mildly less on fire

I saw this today:

https://techcrunch.com/2024/02/27/githubs-copilot-enterprise-hits-general-availability/

This weekend I was writing a lot of README.md documentation by splatting out thoughts, selecting them, and telling Copilot to "make this sound better, and easier to understand", and it made them significantly better and much easier to read. That's why I find this exciting:

It's especially great news because I'm gearing up to train my org/team on how to do this exact process

Confluence already has a "summarize this article" feature hth

Hadlock
Nov 9, 2004

Falcon2001 posted:

The chance that any LLM manages to figure this out from the lovely documentation every company has is essentially somehow a negative number. I assume at this point that the marketing team for copilot at MSFT is basically wholly disconnected from any actual engineering team and is just trying to get as much cocaine as possible before the hype dies.

Have you used copilot in vscode yet

The Fool
Oct 16, 2003


Hadlock posted:

Have you used copilot in vscode yet

yes, which is why my expectations are appropriately low

FISHMANPET
Mar 3, 2007

Sweet 'N Sour
Can't
Melt
Steel Beams
I have a new coworker that got hired on as a "standard" DevOps engineer (not a Junior as I don't think we use that title, but not a Senior like I got hired as). He uses ChatGPT a lot for answering questions that I would be Google searching for. I genuinely don't know if this is related or not, or just because he's more junior, but he seems to not always have a ton of understanding of what exactly he's doing. As in, he can copy a command out of ChatGPT or some online tutorial, and it will work, but he won't really understand what that command does or why it solved his problem. And this lack of deeper understanding seems to be kneecapping him a lot as he's trying to build something new or learn something new.

Like I said, I don't know if this is just the inexperience of a junior, or if the type of thinker that isn't so innately curious about what they're doing lends itself to going to ChatGPT, but I do not like it.

Maybe I've just been incredibly lucky, but I've so very rarely encountered someone that isn't by default figuring out the "why" of whatever they're doing that I'm kind of shocked to encounter that behavior in the wild. I'm hopeful I and our team lead can mentor that out of him, and get him digging into the "why" a little more.

I guess my previous employer was pretty unstructured, so to do much of anything you had to figure out the why, whereas he came from a much more structured environment where it sounds like his job was just following instructions to execute the tasks that were given to him, and now we're in a more unstructured environment where we have to figure out how to do stuff vs just following some instructions.

Docjowles
Apr 9, 2009

FISHMANPET posted:

I have a new coworker that got hired on as a "standard" DevOps engineer (not a Junior as I don't think we use that title, but not a Senior like I got hired as). He uses ChatGPT a lot for answering questions that I would be Google searching for. I genuinely don't know if this is related or not, or just because he's more junior, but he seems to not always have a ton of understanding of what exactly he's doing. As in, he can copy a command out of ChatGPT or some online tutorial, and it will work, but he won't really understand what that command does or why it solved his problem. And this lack of deeper understanding seems to be kneecapping him a lot as he's trying to build something new or learn something new.

Like I said, I don't know if this is just the inexperience of a junior, or if the type of thinker that isn't so innately curious about what they're doing lends itself to going to ChatGPT, but I do not like it.

Maybe I've just been incredibly lucky, but I've so very rarely encountered someone that isn't by default figuring out the "why" of whatever they're doing that I'm kind of shocked to encounter that behavior in the wild. I'm hopeful I and our team lead can mentor that out of him, and get him digging into the "why" a little more.

I guess my previous employer was pretty unstructured, so to do much of anything you had to figure out the why, whereas he came from a much more structured environment where it sounds like his job was just following instructions to execute the tasks that were given to him, and now we're in a more unstructured environment where we have to figure out how to do stuff vs just following some instructions.

I can't really throw stones at this since a huge percentage of my own work over the years has been assisted by typing the problem into Google. Typing it into ChatGPT just feels like the next iteration of that. But yes, your manager, as well as you as a senior on the team, should be encouraging him to actually learn something from the experience and think for himself. We've all seen LLMs confidently spit out absolute bullshit, so it's only a matter of time before this bites him in the ass. It's also not going to scale beyond solving simple junior problems, and he'll hit a performance/career wall when he encounters something he can't solve without the crutch. Or something that requires experience and creativity he doesn't have just to craft a prompt for ChatGPT.

I would also say that yeah, if you have never worked with someone like this, you've been lucky or sheltered. Not everyone finds tech fascinating and wants to deeply understand it, or cares about climbing the career ladder. For a lot of people it's just a job like any other, and they just want to get their paycheck and log off to do whatever is more important to them in life.

Resdfru
Jun 4, 2004

I'm a freak on a leash.
I ask ChatGPT stuff I know but can never remember, so I know when it's wrong.

Earlier today I had 200 numbers I wanted to add up, and I tried ChatGPT, Gemini, and Copilot. I asked each of them to add the numbers twice, and I got 6 different answers. I got the right number from SUM in Google Sheets, but lol

Docjowles posted:

I would also say that yeah, if you have never worked with someone like this, you've been lucky or sheltered. Not everyone finds tech fascinating and wants to deeply understand it, or cares about climbing the career ladder. For a lot of people it's just a job like any other, and they just want to get their paycheck and log off to do whatever is more important to them in life.

More of these people are popping up, I'd say, with all the noise made about how much money you can make by touching the right computers.

I've worked with plenty of people who are somewhere in the middle between caring about all this crap and not caring. They're awesome at their job, they know all the things, but they don't look at computers or game consoles or anything more than a phone in their free time.

Resdfru fucked around with this message at 20:12 on Feb 28, 2024


FISHMANPET
Mar 3, 2007

Sweet 'N Sour
Can't
Melt
Steel Beams
Yeah, I've had plenty of coworkers for whom computing isn't a hobby, but they were usually the type of people who could sort of figure stuff out, or at the very least knew when they needed to figure something out (full disclosure: the first 14 years of my career were at the same employer; this is only my second real job). Like I said, maybe that was just a factor of the environment, because there was so little firm procedure that to do anything you had to have some understanding of what you were copy/pasting from Google.

Like right now we're working through something in terraform in AWS; this is his first experience with terraform or AWS. He can find some stuff about the various settings that he needs for a terraform resource. But does he know what they mean? Not really. He's just finding a value that makes terraform plan not complain, and then he's more often than not confused about why terraform apply fails.

At my last place we hired someone who came from an MSP, and he was great, because his whole career had been "no documented process, figure it out"

Anyway, even as I type this I can see the new guy figuring new things out for himself, so this was all mostly cathartic at this point.
