Continuous Integration/build engineering/devops thread

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Continuous Integration/build engineering/devops thread

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Cancelbot posted:

RabbitMQ surely is consistent though? even if it misses out the whole "available" and "partition tolerant" parts of the ~~fire~~ CAP triangle.

Ask me how we run a 3 node RabbitMQ cluster in loving Windows and watch it burn because Windows cluster aware updating will fail to work or occasionally drop the RabbitMQ disks.

Holy poo poo that sounds awful.

I used to operate multiple clusters running 3.6.5 which was before they distributed the stats db, so under heavy load the node where the stats db landed would continue to eat up memory until it OOMed and the cluster partitioned. At one point we had a runbook for memory usage on rabbit nodes that basically linked to this page: https://www.rabbitmq.com/management.html#stats-db and told the person on-call to keep trying to reset the stats db until it actually took.

That was the worst on-call experience of my entire career. Of course even though upgrading to a later version would have fixed it, the engineering team decided to migrate all of their queues to SQS which meant we finally got to murder rabbit. That was the best on-call experience of my entire career.

# ¿ May 14, 2019 00:50

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

uncurable mlady posted:

I'm curious if anyone itt has looked at Honeycomb/LightStep/Omnition.

my team is currently having an absolutely awful time with our logging solution and are considering moving to honeycomb. i wasn't with the team when they did the demo but from what i heard it was great tech that was just a little too pricey to consider switching to

# ¿ Sep 28, 2019 00:19

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

For those folks who have Jenkins or something else apply their Terraform changes, how do you handle cases where the apply fails because Terraform can't work out the correct order to create things, or you get rate limited, or your IAM role's (you're using roles, right?) temporary creds expire, or you hit a resource limit during the apply, or really any number of other things that might cause a plan to succeed but an apply to fail?

Our platform team has been a bit shy about the idea of automating applies but I'd love to be able to do it if someone has a good answer for how to recover TF from a bad state that a machine put it in.

# ¿ Nov 17, 2019 01:37

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

CMYK BLYAT! posted:

Kubernetes good. Kubernetes users bad. Organizations bad.

You, a software vendor, release an application, and the application runs reasonably well within Kubernetes despite needing to work around some pre-Kubernetes design decisions. If you are familiar with the building blocks used to create Kubernetes deployments, you can likely figure these out.

Your users, however, don't know Kubernetes, and don't have the slightest idea how to start writing a manifest. They fly into a panic the instant anything goes wrong, and are paralyzed with fear of the new and confusing Kubernetes landscape. Checking application logs or making the most basic effort to interpret error messages (what does "could not find Secret foobar" mean? AN IMPOSSIBLE MYSTERY) is an insurmountable, herculean task--although they're what sysadmindevopswordsalad people have done for eons, those same tasks are now treated as inscrutable dark magic once kubectl is involved. Unfortunately, your users' management cares little about whether their reports are able to effectively manage applications in Kubernetes and has no interest in providing them with training: a big important CTO type has issued a mandate to move onto Kubernetes, and by god, management is going to demonstrate its ability to deliver on that mandate through traditional poo poo management practice of "yell at your employees to do poo poo faster". Challenges and poor planning be damned, the deadline WILL be met.

Enter Helm. Helm is a fantastic, if flawed tool for expediting many things an experienced Kubernetes admin learns they need to do when managing deployments manually. Know that you'll change basically three values and leave the rest of the manifest the same? Great! You can stick those in a brief file and have them inserted at the correct location! Need to run something before starting an upgrade rollout? We got lifecycle hooks! Need to create the same three mostly identical pieces of configuration for each configmap you add? Template loops!

Sadly, the aforementioned users treat Helm as an easy button. Someone already wrote most of the manifest and provided a concise set of things they probably will need to change, so it's no longer necessary, from their point of view, to understand what the rest of the manifest is doing, why it's doing it, or how to fix it when it goes wrong. Hardly any will try to modify the templates to add features their particular environment needs; everything is submitted back to the vendor to implement. Users treat Helm as another layer of abstraction, where it's actually augmenting an existing abstraction layer and doesn't excuse you from understanding that layer. Those augmentations add further complexity alongside the existing abstraction layer, so you need to know more about what you're doing to use them effectively, not less.

If you as an operations engineer can't provide easy to use tools for product engineers to release and run their code then you've failed in your job.

Kubernetes is a layer of abstraction that makes deployment and operations by users more complex than it needs to be and provides so many moving parts that touching it in the wrong way can cause cascading issues in places a user couldn't possibly imagine.

Golden images worked perfectly: extremely simple, easy to reason about in failure cases, and they scaled nicely. I have no idea why people needed to completely reinvent the wheel.

# ¿ Dec 14, 2019 20:19

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

CMYK BLYAT! posted:

I'd love for things to be simpler, I really would. Sadly, we don't have infinite time and resources to try and abstract away every possible decision that's necessary when deploying complex, rapidly-evolving software into every conceivable (and often badly-designed) network architecture.

This is sort of the crux of my argument against Kubernetes. 99% of applications deployed in k8s don't require the complexity that k8s brings. It's resume-driven development for ops teams and it loving sucks to be on the product eng side when dealing with it.

# ¿ Dec 14, 2019 21:16

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

12 rats tied together posted:

Cloudformation is a fantastic service that everyone who works in AWS should know, even if you don't use it actively, simply because the Cloudformation resource reference doubles as the best API documentation available for the platform.

Cloudformation is awful and no one will ever convince me otherwise :colbert:

# ¿ Feb 5, 2020 23:05

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Osmosisch posted:

AWS interfaces

There's your first problem. Don't use the AWS console to build or set up anything. It's just not worth the pain and frustration.

# ¿ Feb 12, 2020 18:56

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Osmosisch posted:

Oh no i meant their overloaded web interface.

I'd love any pointers because trying to pick this stuff up on my own has been extremely frustrating. Just as an example, mounting ssl key/cert pairs into the file system from a secrets volume is apparently rocket surgery or I'm just missing something obvious (or both).

Yeah, their web interface is called "The Console".

Methanar posted:

My company built and maintains their own internal version of the aws web console with flask.

This is how all the devs set up LTs/ASGs/iam/SGs/dns/environment variables.

lol

lmao this is awful

# ¿ Feb 13, 2020 01:26

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

New Yorp New Yorp posted:

As awful as having all cloud infrastructure maintained by a separate team time-shifted by about 10 hours, driven by service now tickets?

Nope that's way worse!

# ¿ Feb 13, 2020 01:43

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Hadlock posted:

I put up an ops manager job listing on the jobs thread. If you do Linux + containers, and have not terrible opinions, hit me up. Full time remote.

Hadlock posted:

Pay: low to mid 100s

Looking for someone with actual management experience

Lmao, try $180k+ if you want a decent candidate

# ¿ Feb 22, 2020 16:27

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Necronomicon posted:

So I've got an AWS/boto3/Rancher conundrum I'm hoping y'all can help out with. I've been trying to put together a python script to do the following:

1. List all IAM access keys that are marked "active" and are more than 90 days old
2. Create a new key for that IAM user, then invalidate the old one
3. For each key being invalidated, check to see if it's being used as an environment variable in a pod in Rancher.
4. If it *is* being used, replace said environment variable with the newly-created key (we have an internal tool to do this that I'll just invoke)

The long and short of it is I need to automate the rotation of API keys in AWS to prepare us for SOC2 and other audits coming down the pipe. I can query the keys and create new ones / invalidate old ones just fine, but my code at this point is approaching Frankenstein status and I'm hoping somebody has found a more elegant solution. The main issues I'm trying to solve are automating this tedious nonsense and mitigating the issue where we have no idea where an API key is actually active in our system, which makes blowing things up way too easy.

EDIT: I can post this in the Python thread if that would get more results. I'm a little rusty because honestly I've only been asked to write Terraform for most of the time at this new job.

Use IAM roles instead of users and keys

# ¿ Feb 25, 2020 23:57

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Nomnom Cookie posted:

kiam is crap and kube2iam is worse

Use EKS and the OIDC provider

https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html
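
For anyone who hasn't set this up: the piece that replaces static keys is the role's trust policy. A rough Terraform sketch, assuming the cluster's OIDC provider is already registered in IAM. The variable names, namespace, service account, and role name are all made up for illustration:

```hcl
# Hypothetical inputs: the ARN and issuer host of the cluster's IAM OIDC provider.
variable "oidc_provider_arn" {
  type = string
}

variable "oidc_issuer_host" {
  # Issuer URL without the https:// prefix.
  type = string
}

data "aws_iam_policy_document" "irsa_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [var.oidc_provider_arn]
    }

    # Only the named service account (namespace:name) may assume the role.
    condition {
      test     = "StringEquals"
      variable = "${var.oidc_issuer_host}:sub"
      values   = ["system:serviceaccount:my-namespace:my-app"]
    }
  }
}

resource "aws_iam_role" "my_app" {
  name               = "my-app-irsa"
  assume_role_policy = data.aws_iam_policy_document.irsa_trust.json
}
```

The service account then gets an `eks.amazonaws.com/role-arn` annotation pointing at that role, and pods using it pick up temporary creds with no keys stored anywhere.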

# ¿ Feb 26, 2020 00:08

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Nomnom Cookie posted:

Migrating to EKS is our perpetual good to do but other stuff is higher priority project, so we use api keys.

In that case, while kube2iam isn't great, it's heaps better than static creds.

# ¿ Feb 26, 2020 00:25

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Nomnom Cookie posted:

It's not though! It's racy and abandoned. A thing that works is always better than a thing that doesn't work.

Static creds are probably one of the most dangerous things to have floating in your environment. At least tell me you're using policy conditions to bind them to specific ec2 instances rather than usable by anyone from anywhere.
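
If it helps, here's roughly what that kind of condition could look like in Terraform. As far as I know you can't key a user policy on an instance ID, so this sketch pins the keys to known source addresses instead (e.g. the NAT or elastic IPs your instances egress from). `allowed_cidrs` and `iam_user_name` are hypothetical variables:

```hcl
variable "allowed_cidrs" {
  # Hypothetical: public CIDRs your instances egress from.
  type = list(string)
}

variable "iam_user_name" {
  # Hypothetical: the IAM user that owns the static keys.
  type = string
}

data "aws_iam_policy_document" "restrict_source" {
  statement {
    sid       = "DenyOutsideKnownAddresses"
    effect    = "Deny"
    actions   = ["*"]
    resources = ["*"]

    # Deny anything that does not come from the listed addresses...
    condition {
      test     = "NotIpAddress"
      variable = "aws:SourceIp"
      values   = var.allowed_cidrs
    }

    # ...unless an AWS service is making the call on the user's behalf.
    condition {
      test     = "Bool"
      variable = "aws:ViaAWSService"
      values   = ["false"]
    }
  }
}

resource "aws_iam_user_policy" "restrict_source" {
  name   = "restrict-source-ip"
  user   = var.iam_user_name
  policy = data.aws_iam_policy_document.restrict_source.json
}
```

One caveat: traffic that goes through a VPC endpoint shows up with a private address, so `aws:SourceIp` won't match it; you'd need an `aws:SourceVpce` or `aws:SourceVpc` condition for that path.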

# ¿ Feb 26, 2020 12:46

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Nomnom Cookie posted:

You're not wrong, but if forced to choose between spraying api keys everywhere or deploying kube2iam I'd take the spraying. Also, and this is key, I don't give a poo poo about security. We have a security guy. If he spends all his time on compliance and doesn't have any left to spend on actually securing things, that's not my problem. I got actual useful work to do.

Yikes

# ¿ Feb 26, 2020 14:40

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

I've heard bad things about kube2iam but we ran it in prod for a year and a half in 5 regions with only one, transient issue I can remember that went away when we did a rolling restart.

Not to say it's not a dumpster fire but when faced with that or key spray I'll gladly take the dumpster fire any day of the week.

# ¿ Feb 26, 2020 23:37

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

What's the current hotness for managing Jenkins job definitions in code? Is it still pipelines with Jenkinsfiles in the project repo or something else?

# ¿ Mar 2, 2020 14:57

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

CyberPingu posted:

I've got that in already, it's just how to pass that into the below so that it reads like
code:
"UnauthorizedAPICalls" = {
    pattern = "{($.errorCode= \"*UnauthorizedOperation\") || ($.errorCode= \"AccessDenied*\")}"
    description = "A user has made an unauthorized API call"
  }
description = "A user in account <foo> has made an unauthorized API call"

code:

data "aws_caller_identity" "current" {}

...

description = "A user in account ${data.aws_caller_identity.current.account_id} has made an unauthorized API call"

That'll give you the account number. If you maintain a map of number to friendly name you'll be in good shape.

# ¿ Mar 5, 2020 14:14

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

CyberPingu posted:

You can't add a variable into a variable though.

code:


  on terraform.tfvars line 4:
   4:     description = "A user has in account ${data.aws_caller_identity.current} made an unauthorized API call"

Variables may not be used here.

Can you use join to compose the whole string?
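
Something like this is what I had in mind. A .tfvars file only takes literal values, so the idea is to keep a placeholder there and build the final string in a `locals` block, where data sources are allowed. This uses `format` rather than `join`, but either should work. The variable name `metric_filters` and the file split are assumptions, not your actual layout:

```hcl
# --- terraform.tfvars: literals only, so leave a %s placeholder for the account ---
metric_filters = {
  "UnauthorizedAPICalls" = {
    pattern     = "{($.errorCode= \"*UnauthorizedOperation\") || ($.errorCode= \"AccessDenied*\")}"
    description = "A user in account %s has made an unauthorized API call"
  }
}

# --- main.tf: fill the placeholder in where expressions are allowed ---
data "aws_caller_identity" "current" {}

variable "metric_filters" {
  type = map(object({
    pattern     = string
    description = string
  }))
}

locals {
  # Same map as the variable, with the account ID substituted into each description.
  metric_filters = {
    for name, f in var.metric_filters : name => {
      pattern     = f.pattern
      description = format(f.description, data.aws_caller_identity.current.account_id)
    }
  }
}
```

Then point your resources at `local.metric_filters` instead of `var.metric_filters`. Any literal `%` in a description would need to be written as `%%`.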

# ¿ Mar 5, 2020 14:48

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

The troll answer is to make you nervous so you vendor them yourself.

Real answer is probably no good reason other than some internal CNCF agreement

# ¿ Apr 16, 2020 22:55

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

We use job DSL for the precise reason Gyshall mentioned: it lets our teams manage things.

Of course it really means that product teams complain to the platform team every time something goes wrong because why bother reading logs or trying to understand how systems work?

# ¿ Jun 5, 2020 17:08

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Our aws misconfiguration situation is wild.

We sell a product that does it well, recently acquired a company that does a great job at it, and I wrote a tool that does it decently before the first 2 things existed.

Please don't dox me, ok?

# ¿ Aug 15, 2020 15:42

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Datagrip or intellij with the DB plugin

# ¿ Aug 28, 2020 15:41

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Use okta to federate AWS account access and then use AWS auth for wks

Ez pz

# ¿ Sep 14, 2020 01:58

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

we're using the alb ingress controller and it's quite needs suiting

might not apply for you tho

# ¿ Oct 16, 2020 00:25

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Not a comprehensive list but I follow Seth Vargo (@sethvargo), Charity Majors (@mipsytipsy), Corey Quinn (@QuinnyPig), Mitchell Hashimoto (@mitchellh), and @SimpsonsOps and find them to be pretty good

# ¿ Oct 29, 2020 22:46

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

PCjr sidecar posted:

Kelsey Hightower, Liz Fong Jones, Erowid Recruiter

Oooh yeah forgot about Liz

# ¿ Oct 30, 2020 01:45

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Methanar posted:

the only acceptable package management system is git clone

Ok Rob Pike

# ¿ Nov 7, 2020 13:42

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

We use Datadog for metrics and monitoring and it's fine I guess. Lot fewer things to configure and keep running versus most other comparable metrics platforms. If I had to solve logging I'd go with a managed elastic setup 'cause right now we dogfood our siem product's log management tool and it's not great for application logging.

# ¿ Nov 14, 2020 16:56

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

12 rats tied together posted:

The best scaffolding tool for terraform is ansible. It doesn't come up very often during searches because if you're going to use terraform you've likely already encountered and chosen not to use ansible, and if you were already using ansible you didn't need terraform in the first place. The integration is there for you and available though, and I've enjoyed using it a lot at past roles when dealing with people who are ideologically biased against ansible for some reason (usually a misunderstanding of YAML).

If you ask terraform people what the best scaffolding tool is, last I heard it was terragrunt. It seems pretty good, but I've never had to use it because I just use ansible instead.

e: To answer the previous question, in the past we used management tools for terraform because the existing language was really bad and lacked support for extremely basic things like "using a variable in a module path", so instead of updating modules with sed we wrapped terraform deployments in ansible so we could use jinja2 to write terraform files with an actual templating language.

I understand this was supposed to get better for terraform in 0.14 but I remember looking at it after the release and finding it to still be extremely limited.

Do you work at red hat or something? I can't think of a single post you've made that hasn't pushed ansible as a catch-all solution for automation. It's not a bad tool but if someone asks how to scaffold a terraform project the answer isn't some entirely unrelated tool, jfc

# ¿ Dec 12, 2020 15:12

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

12 rats tied together posted:

I do not and would not work for red hat but I have been using ansible at work for the past 7 years. As a couple other posters have mentioned, it's not an entirely unrelated tool, it being pitched as config management is purely a post-acquisition piece of marketing that you shouldn't take too seriously. Red hat wants to sell Tower licenses and support and to do that they're pitching it the best way they know how.

It's an orchestration tool, not a config management tool. You can absolutely use it to orchestrate your terraform just like you can use it to orchestrate everything else.

It's absolutely a config management tool that was written as an alternative to chef and puppet and saying anything else is just revisionist history.

quote:

I think you'd be surprised, there is no better way to orchestrate cloudformation that I've come across since like 2015 or so. Cloudformation by itself is definitely an awful tool but using ansible to drive it is a top tier workflow in the "declarative cloud provider api" space.

CDK is far better than templating cloudformation yaml or json via ansible and jinja

# ¿ Dec 12, 2020 19:04

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

12 rats tied together posted:

You can learn and then string together 5 or 6 different tools for this or you can just pip install ansible and get to work. I hope this helps illustrate why I bring it up in every single IT post I make.

You learned how to use a pocket knife and now, to you, every problem can be solved with one regardless of whether there's a more appropriate tool.

# ¿ Dec 12, 2020 19:30

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

The Fool posted:

Ok, my question was pretty vague and I think some of you were searching for an xy problem.

The team that I'm on is building an infrastructure pipeline workflow so that our app teams can just say "I want resource1, resource2, and it needs to be load balanced" and our tools take that information and build out all the required infrastructure to make it work: the nsg's, the storage accounts, making sure asp's are in the right ase's, and a bunch of other stuff. It also enables easy promotion from dev to load testing to prod.

Right now the app teams interact with this by writing terraform using modules that we built, which when checked in trigger azure devops pipelines and tfe.

This is having a heavier support burden for teams that are less familiar with tf and we are having to troubleshoot and help them deploy their environments.

My idea was to explore the possibility of having the app teams use a scaffolding/code generation tool that asks them a couple of questions, then generates a base folder structure and tf files that would deploy what they need based on some common design patterns.

Mostly inspired by web dev tools like create-react-app and django.

I got what you were asking. Our infra team generally delegates ownership to dev teams so most of the team TF repos are kinda choose your own adventure wrt organization. However, at the account management level (i.e. for organization config or idp config), they use cookiecutter to generate a well-defined structure. I bet you could do something similar as a way to let teams bootstrap their setups.

# ¿ Dec 12, 2020 21:12

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

i haven't done much complex stuff in ansible but couldn't you curl the es health endpoint as a blocking operation until it reports ready and then continue to the next task?

we did something similar with our chef cookbooks around ensuring consul availability. i think we had to ultimately write a little bit of ruby to do it but it wasn't particularly painful.

if percentage waits until playbook completion for a given node you should be fine as long as your percentage groups are small enough that you're not negatively impacting the cluster.

# ¿ Dec 13, 2020 15:54

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

LochNessMonster posted:

The problem is that task 1 restarts the service and task 2 is the health check. No matter which solution I'm trying, ansible keeps running task 1 on all nodes before doing the health check (which it should do after each node).

The restart service task is currently a handler and it didn't work if I let handler 1 notify handler 2.

this feels like something that might require a semaphore somewhere outside of the process.

another possible option, if this is in the cloud, would be to set it up as a proper asg with a status check on the es health endpoint and write some quick bash to terminate each old node, wait until the asg capacity is back to full, then terminate the next one, etc.

# ¿ Dec 13, 2020 16:35

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

LochNessMonster posted:

It's fully on prem unfortunately and working with ansible is considered black magic around here.

Also one of the reasons why I didn't renew my contract with this client. There is virtually no automation, provisioning VMs can take months (I only wish I was joking) as there is no spare capacity.

It's my last feature to deliver before focussing on some knowledge transfer and I have a strong urge to do it properly. But at this point I'm considering just running the whole thing in serial and advising them to change it to percentile in the future so I can be done with it.

Honestly it might make sense to just write something quick and dirty that spins until the cluster is healthy, acquires a lock somewhere (DB, consul, etcd, redis, whatever), restarts the service, then releases the lock.

Run that on every node at the same time. It'll take a while but at least you won't have to coordinate everything manually.

# ¿ Dec 13, 2020 20:33

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Gyshall posted:

Imagine wanting to use interpolation in providers/versions.

At some point you have to stop abstraction. Terraform began life as a declarative provisioning tool, it's miles ahead of where it was and is still loads better and accessible. I'm of the opinion that it doesn't need try/catch stuff and if you need that or backend interpolation then just write a thin wrapper for it or use something like terragrunt.

Yeah, dynamic versioning is a pretty awful smell that you're doing something fundamentally at odds with what terraform is designed to do.

12 rats tied together posted:

e: It's also possible I'm doing this wrong because this is way harder to test, but it seems like the "conditionally null parameter key and value" problem is still here:
code:
Error: Invalid index

  on main.tf line 17, in resource "null_resource" "test_nullable":
  17:     test = "${null_resource.is_null[*].id != "" ? null_resource.is_null[0].id : null}"
    |----------------
    | null_resource.is_null is empty tuple
If I set count=0 on "is_null" I get an invalid index error -- in a production scenario this would be a situation where count = 0 or 1 based on whether or not I want that aforementioned nat gateway or internet gateway, or "public vs private" in AWS terms. If I set count=1 this plans and applies just fine. Back in 0.11/early 0.12 days there was a github issue about how the ternary evaluates both sides of an expression, so "this or null" will never work because "this" must always exist.

The Terraform solution to this problem would probably be operator chaining so you'd have some awful like 200 character nested ternary/list coalesce nonsense. I imagine that _probably_ works now, but I don't care enough to find the github issue where apparentlymart explains it in a way that someone who doesn't work at hashicorp can understand.

The established pattern since forever is to have a boolean variable at the module level and then define a resource's count based on the variable's value. Of course it gets complicated with dependent or linked resources but that's kind of what you'd expect with a declarative graph.
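
For reference, a minimal sketch of that pattern using the nat gateway case from the quote. The variable names are hypothetical, and `one()` needs Terraform 0.15 or later (on older versions `try(aws_nat_gateway.this[0].id, null)` should do the same job):

```hcl
variable "create_nat_gateway" {
  # Hypothetical module-level switch.
  type    = bool
  default = false
}

variable "public_subnet_id" {
  # Hypothetical: subnet the gateway would live in.
  type = string
}

resource "aws_eip" "nat" {
  count = var.create_nat_gateway ? 1 : 0
}

resource "aws_nat_gateway" "this" {
  # Same condition as the EIP, so [0] is only ever evaluated when it exists.
  count         = var.create_nat_gateway ? 1 : 0
  allocation_id = aws_eip.nat[0].id
  subnet_id     = var.public_subnet_id
}

output "nat_gateway_id" {
  # Null when the gateway was not created, the single ID otherwise.
  value = one(aws_nat_gateway.this[*].id)
}
```

The point is that nothing ever indexes into a resource whose count might be zero unless it shares the same condition, which sidesteps the "empty tuple" error entirely.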

# ¿ Feb 4, 2021 14:28

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Man you have a lot of opinions about how to use a tool you admit that you've only used minimally

# ¿ Feb 7, 2021 21:46

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Nm then, I thought I'd read a post from you where you said you hadn't used terraform in years.

# ¿ Feb 7, 2021 22:15

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Methanar posted:

spinnaker sux

It's overengineered nonsense but it shouldn't surprise you given that it's Asgard's successor.

# ¿ Feb 14, 2021 20:06
