Spring Heeled Jack
Feb 25, 2007

If you can read this you can read
What are people using to manage/implement secrets in k8s? The built in method of "creating the secret > mounting the secret as a volume in the pod > defining all secret values you will need in the mount > referencing the secrets in env values" seems like a huge goddamn hassle for a pod that's going to have like 10+ secret env variables.
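For reference, the one built-in shortcut that helps with the 10+ env var case is envFrom, which injects every key of a Secret as env vars in one go instead of one env entry per value — a minimal sketch, all names made up:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets        # hypothetical
type: Opaque
stringData:
  DB_PASSWORD: hunter2
  API_KEY: abc123
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: example/app:latest
    envFrom:               # every key in the Secret becomes an env var;
    - secretRef:           # no volume mount or per-key env entries needed
        name: app-secrets
```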


ThePeavstenator
Dec 18, 2012

:burger::burger::burger::burger::burger:

Establish the Buns

:burger::burger::burger::burger::burger:
terraform 0.12 is going to include a lot of features that are lacking in terraform now. It's going to have stronger types, loops, not require everything to be string interpolation, basically make terraform HCL look more like code and not scripts that just assemble strings
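A rough before/after sketch of what that change looks like (syntax per the 0.12 pre-release announcements, so treat it as approximate):

```hcl
# 0.11: everything funnels through string interpolation
resource "aws_instance" "web" {
  count = "${length(var.names)}"
  ami   = "${var.ami}"
  tags {
    Name = "${element(var.names, count.index)}"
  }
}

# 0.12: first-class expressions and real types, no interpolation quoting
resource "aws_instance" "web" {
  count = length(var.names)
  ami   = var.ami
  tags = {
    Name = var.names[count.index]
  }
}
```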

New Yorp New Yorp
Jul 18, 2003

Only in Kenya.
Pillbug

Cancelbot posted:

I don't get the Terraform (mild?) hate - We have 24 AWS accounts with over 200 servers managed entirely in Terraform. Recently our on-premises subnets changed and it took all of 10 minutes to roll it out to everyone's code.

The only people I work with who actually hate it are the old infrastructure teams who cry whilst cuddling their old Dell blades.

In my case, it's because the Azure provider sucks and doesn't work very well for the scenarios where my client wants to use it. The Terraform fans I've discussed it with have a surprising tendency to blame the Azure platform instead of accepting that the tooling itself may be lacking or poorly implemented.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

New Yorp New Yorp posted:

In my case, it's because the Azure provider sucks and doesn't work very well for the scenarios where my client wants to use it. The Terraform fans I've discussed it with have a surprising tendency to blame the Azure platform instead of accepting that the tooling itself may be lacking or poorly implemented.
Having written (tiny) parts of the Azure provider, it can definitely be both.

Docjowles
Apr 9, 2009

Bhodi posted:

Anyone have a good aws terraform example config for something like that? We're bringing up a new vpc from scratch and want to switch to autogenerating the subnets, sgs, iam roles and such and I don't really want to fall into any obvious traps.

Any good whitepapers or blogs on this? Like, some VPCs are nearly permanent, so you might want a different state store than your apps' just to prevent accidents? Stuff like that.

Disclaimer: we are still figuring out terraform so everyone please feel free to tell me this is wrong and dumb. But we already did it once, and had everything suck, and have now learned from those mistakes so are hopefully redoing it better.

The community VPC module is pretty solid for stamping out VPCs quickly and uniformly. At least the VPC itself and subnets and NAT gateways, all the networking pieces. IAM and SGs you're on your own.

The most important trap to avoid is putting all your poo poo into one state file. Just as you said, you don't want to be updating the tag on a dev instance and somehow accidentally delete your whole VPC. TF will only consider .tf files in the current directory as in scope for changes. So if you dump ALL of your code in one directory, the whole thing is going to get evaluated and applied on every run. Instead, break your state up into logical units. We have our git repo broken down like

code:
terraform
|-team1devaccount
  |-vpc
  |-ec2
  |-s3
|-team1prdaccount
  |-vpc
...
You can pick a structure that makes sense for your needs, but for the love of god don't dump all your code into one directory. If you need to reference something built in another directory (like to get the VPC ID to build an EC2 instance) use remote state to query it dynamically.
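A sketch of that remote state lookup in 0.11-era syntax, assuming an S3 backend and that the vpc directory exports the output being read (bucket, key, and output names are all made up):

```hcl
# in team1devaccount/ec2 — pull a value out of the vpc directory's state
data "terraform_remote_state" "vpc" {
  backend = "s3"
  config {
    bucket = "team1-tf-state"
    key    = "team1devaccount/vpc/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  ami           = "ami-12345678"  # placeholder
  instance_type = "t2.micro"
  subnet_id     = "${data.terraform_remote_state.vpc.private_subnet_id}"
}
```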

Charity Majors has an excellent rant on the subject.

Always do a plan, save it to a file, and then apply that plan. That way you're (more likely to be) just applying the changes you really intended to make, and can fully review what it's going to do beforehand.
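i.e., roughly:

```shell
terraform plan -out=tfplan   # review the diff here before going further
terraform apply tfplan       # applies exactly the reviewed plan, nothing newer
```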

There are also some nice wrapper tools like Terragrunt that can cut down on the boilerplate poo poo you need to recreate in every directory.

necrobobsledder
Mar 21, 2005
Lay down your soul to the gods rock 'n roll
Nap Ghost
Terraform provisioners and other kinds of hooks-as-resources mechanics are what you should be using instead of custom resources in CloudFormation if at all possible. I wrote a 4-line AWS CLI shell script that turned into a monstrosity of a lambda function because a resource created within CF needed to be peered with another two VPCs. I recommend no more than one VPC created within a stack as a result of the interlinking mess.

Terraform really shouldn’t be used for application deployments though. Blue green deploys and rolling deploys are actually supported in CloudFormation at least, not some “you get to build this crap again with shell scripts!” solution like in Terraform.

New Yorp New Yorp
Jul 18, 2003

Only in Kenya.
Pillbug

Vulture Culture posted:

Having written (tiny) parts of the Azure provider, it can definitely be both.

Oh yeah, I wasn't claiming otherwise! I've just gotten a lot of weirdly defensive "well it can't be a problem with Terraform, it must be Azure's fault" when discussing issues that I've ONLY experienced while using Terraform, and not the portal, ARM templates, Azure PowerShell, or the Azure CLI.

Warbird
May 23, 2012

America's Favorite Dumbass

The NPC posted:

Is the answerfile included in the package? Can you use Resolve-Path on ./tools/foo.rsp to get the absolute path and pass that in? If you're passing it in through $silentArgs you might have to watch how quotes and variables get escaped too.

Remember, ChocolateyInstall.ps1 is just a Powershell script, so any pre-processing or input sanitation you could do in PS you can do here.

Mother fucker I hate Oracle. So I think I found out what the problem was:

Oracle uses a universal installer to do its, well, installs. Thing is, said installer isn't the first stop on the journey; it's the 12c setup executable in this case (Program A). Chocolatey only knows and cares about Program A and doesn't have the slightest worry in the world about what's going on with the universal installer (Program B). Near as I can tell, A calls B and passes it some info and closes out with a nonstandard exit code, because Oracle. Chocolatey sees the non 0 exit code, assumes everything's FUBAR and starts purging the install files that B needs so things get cleaned up. This causes B to error out randomly depending on what got deleted first. This doesn't happen when doing things "normally" with the same flags via a Powershell window because there's nothing there to really care.

Resolution: Pass a flag to have A hang around until B finishes, resulting in an acceptable 0 exit code.
Possible Alternative Resolution: Have 259 as an acceptable code for this one package and ignore the hang around flag.
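In ChocolateyInstall.ps1 terms, that alternative resolution would look roughly like this (paths and args are made up, not the real Oracle package):

```powershell
$packageArgs = @{
  packageName    = 'oracle12c'
  fileType       = 'exe'
  file           = Join-Path $PSScriptRoot 'setup.exe'   # hypothetical installer path
  silentArgs     = '-silent -waitforcompletion'          # the "hang around" flag
  validExitCodes = @(0, 259)   # accept 259 as success for this one package
}
Install-ChocolateyInstallPackage @packageArgs
```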

I'm starting to understand why everyone I've ever worked with that's been in Ops beyond a certain year range sounds like they hate their jobs/lives.

Cancelbot
Nov 22, 2006

Canceling spam since 1928

freeasinbeer posted:

Small tasks become cumbersome, like making a small tweak is very easy to do in the UI but maybe hard and time consuming in tf. And if you do use tf since it can reconcile global state, it sometimes starts editing things you don’t want. Peering for example is easier to do in the UI across accounts, but then adding those routes in is a nightmare.


Kubernetes also makes it semi-useless, because Kubernetes duplicates a ton of Terraform's functionality, or in some cases will fight Terraform and cause things to break.

It’s also always made me nervous with databases like rds, as it thinks it needs to regenerate things sometimes and if you aren’t paying attention oops!

24 accounts is a lot for only 200 servers? We have like 15 and run thousands(although I am coming to understand that this is because devs have no idea what they are running)

We give each team an account and they look after their own resources (fully autonomous woo!). We set hard borders where they can only talk HTTPS cross-account unless it's old legacy crap like SQL, in which case we use PrivateLink/VPC peering only where necessary; this saves on shared state and we don't have to gently caress about with routing or IP clashes unless they need to hit the stupidly monolithic SQL cluster or get to our on-premises infrastructure. Oddly enough for the scale we're at I feel as though we are light/efficient on server resources; our website runs on 4x m4.2xlarge servers as CloudFlare sucks most of it up and our back-end hasn't been fully moved to AWS yet, which is another 200 servers in waiting. We also have some bad habits such as 30 services on a single node, which is why we're pushing hard on the ECS end of things.

For Terraform & containers we are having some fun; we did once define tasks & services with Terraform, but the developers love Octopus deploy too drat much and it was breaking all sorts of things. So instead we made enough Octopus plugins for it to play nice with ECS at a level the devs like; "Deploy my app", "Open a port", "Scale up" etc. all Terraform does there is build the cluster using a module.

The idea behind us Terraforming everywhere is that these teams can then own their full stack in as faithful a way as we can manage, and they can help each other out by looking at code instead of trawling through the inconsistent AWS console experience.

Cancelbot fucked around with this message at 15:17 on Feb 22, 2019

Qtotonibudinibudet
Nov 7, 2011



Omich poluyobok, skazhi ty narkoman? ya prosto tozhe gde to tam zhivu, mogli by vmeste uyobyvat' narkotiki
Today in Kubernetes adventures: the mysterious Service that doesn't want to use one of its ports. The ELB for the LoadBalancer comes up fine and maps the external ports to the correct nodePort. Traffic arrives successfully at nodes destined for the correct nodePort, but when directed into the Pod container itself, somehow both ports end up going to the same targetPort, apparently ignoring what's in the actual Service definition. This happens even when that port/targetPort combination is flat-out removed from the Service, which makes no sense.

Either there's actually some bug in that part of the network provisioning code, or I and a bunch of other people are missing something really dumb in the Service/Deployment definitions.
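For reference, the shape of the Service being described (names and port numbers made up) — each entry is supposed to map its own nodePort through to its own targetPort:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: app
spec:
  type: LoadBalancer
  selector:
    app: app
  ports:
  - name: http
    port: 80          # ELB-facing port
    targetPort: 8080  # container port this traffic should land on
  - name: admin
    port: 9090
    targetPort: 9091  # the mapping that was seemingly being ignored
```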

Docjowles
Apr 9, 2009

Crossposting from the AWS thread since I know there's a bunch of terraform geeks in here

Docjowles posted:

Ok I have my own Route53 question. We were hoping to switch from managing our own internal resolvers to using route53. We created a new private hosted zone with like 1000 records in it using Terraform. It took ages to complete which I kind of expected. But it also takes ages to do a subsequent plan/apply even if there are no changes. Like 15 minutes per no-op run. Which is uh not going to fly for a frequently changing zone.

Anyone found a way to reasonably manage large route53 zones with terraform?

We can come up with other solutions, including just keeping our own resolvers. Or writing a smarter script that calls the API directly and only handles records that actually need to change. It's just super nice to have everything in Terraform for a variety of reasons. But if it's the wrong tool for this job, then oh well.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

Docjowles posted:

Crossposting from the AWS thread since I know there's a bunch of terraform geeks in here

I'm curious what it's spending most of the time doing, are you able to tell from the output? If not, maybe need to turn up the logging level.

Cancelbot
Nov 22, 2006

Canceling spam since 1928

fletcher posted:

I'm curious what it's spending most of the time doing, are you able to tell from the output? If not, maybe need to turn up the logging level.

It'll be the "refreshing state" part of the plan. I think Terraform just has a list of "this R53 record should exist here" in its state file, which then fires off a metric poo poo-ton of AWS API calls to verify that is indeed the case. It'll then do a diff based on what is consistent with the AWS state & the new computed state, rather than be smarter by looking at the HCL that changed prior to doing the refresh.
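If the refresh really is the bottleneck, it can be skipped for a quick diff against the last-known state (with the obvious caveat that drift won't be detected):

```shell
terraform plan -refresh=false -out=tfplan   # diff config against cached state only
terraform apply tfplan
```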

Erwin
Feb 17, 2006

Cancelbot posted:

It'll be the "refreshing state" part of the plan. I think Terraform just has a list of "this R53 record should exist here" in its state file, which then fires off a metric poo poo-ton of AWS API calls to verify that is indeed the case. It'll then do a diff based on what is consistent with the AWS state & the new computed state, rather than be smarter by looking at the HCL that changed prior to doing the refresh.

Regardless of whether the configuration actually changed, it would need to refresh every resource anyway, since it needs to detect and correct any drift.

Docjowles
Apr 9, 2009

Yeah, it’s that. It spends 15 minutes refreshing the state. And we haven’t even imported all the zones we would have in production yet, lol, this is just a subset for a test.

Probably going to end up writing our own tool to do this which isn’t terribly hard. I just always prefer to use popular off the shelf stuff first if possible.

I was wondering if there was some obvious workaround or something since I assume we are not the first team wanting to manage large zones via terraform. But maybe I am uniquely dumb :pseudo:

Vanadium
Jan 8, 2005

You don't have to have all the records in a single terraform config.

Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug
That terraform blog post was amazing, I need more words like that.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.
The Route53 API is also pretty hostile if you're not batching requests, and that's something that Terraform's architecture makes really hard.

If you needed to hack around this, instead of route53_record resources, you could consider using local-exec provisioners to call out to awscli or something to keep it out of the state graph. This will work fine for resource creation, but will be murderously slow on deletes if you have a zone with a lot of records.
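Something like this, sketched in 0.11-era syntax (the change file and variable names are made up; change-resource-record-sets also lets you batch many records per call, which is the whole point):

```hcl
resource "null_resource" "bulk_records" {
  # re-run the batch whenever the change file changes
  triggers {
    batch_hash = "${md5(file("records.json"))}"
  }

  provisioner "local-exec" {
    command = "aws route53 change-resource-record-sets --hosted-zone-id ${var.zone_id} --change-batch file://records.json"
  }
}
```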

Hadlock
Nov 9, 2004

Spring Heeled Jack posted:

What are people using to manage/implement secrets in k8s? The built in method of "creating the secret > mounting the secret as a volume in the pod > defining all secret values you will need in the mount > referencing the secrets in env values" seems like a huge goddamn hassle for a pod that's going to have like 10+ secret env variables.

I implemented this in vault, but we're not operationally mature enough to auto roll vault keys, so that got scrapped for helm secrets that is backed by aws key value store and we can lean heavily on IAM instead, but also supports other back ends so we're not fully tied to Amazon.

Nice thing about helm secrets is that it pulls any orchestration out of the container and now it's just depending on those values passed in as env vars, which is a lot more pure 12 factor

Downside to helm secrets is that i haven't gotten it to cleanly helm lint/helm secrets lint yet, but that's more of a time/priority problem than a technical hurdle

Edit: also the other big win was to put as many service dependencies into containers as possible, database, message queue, etc, and then put everything in its own namespace. That allows you to hard-code a lot of things like user names and service DNS names (our primary db used to be primarydb-dev1.example.com) but now it's simply "primarydb" and pointing the app at primarydb:5432/primarydb, inside the namespace, resolves to the correct database. Significantly reduces the amount of config, since every namespace can have its own service with the identical name "primarydb"... We went from 20+ config items down to about 6 for our most complex app (the monolith)
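The per-namespace trick looks roughly like this (namespace name made up) — identical Service name in every namespace, so the app's config never changes:

```yaml
# one of these per namespace, all with the same name;
# "primarydb:5432" then resolves correctly inside each namespace
apiVersion: v1
kind: Service
metadata:
  name: primarydb
  namespace: team-dev1   # hypothetical
spec:
  selector:
    app: primarydb
  ports:
  - port: 5432
    targetPort: 5432
```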

Hadlock fucked around with this message at 10:03 on Feb 27, 2019

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
how do y'all do your container registry on gcp?

meaning, if you have N environments, each inside their own gcp project... do you have each separate environment trigger off your repo and build their own containers that they put in their own registries in parallel/on-their-own? or do you create like a designated-registry-project that they all just pull from? if they share a registry-project then where are you putting the deploy hook that runs the 'kubectl set image' after the push?

StabbinHobo fucked around with this message at 18:15 on Feb 27, 2019

Hadlock
Nov 9, 2004

StabbinHobo posted:

how do y'all do your container registry on gcp?

We use one great big container repo, each branch has a jenkinsfile that builds a namespace/monolith-branchname:build-id and then fires off whatever webhooks are relevant to that branch/jenkinsfile and/or a helm upgrade -n mydeploy --set monolithVersion="namespace/monolith-branchname:build-id"

SnatchRabbit
Feb 23, 2006

by sebmojo
My company has automated a PeopleSoft Linux deployment in AWS using mostly CloudFormation, CodeDeploy and Ansible playbooks. We now have to deploy PeopleSoft on Windows and we're running into some issues with Ansible, mainly that Ansible can't be run on a Windows box without a Linux control box. I've seen that there are some workarounds like installing the Windows Subsystem for Linux, or Cygwin, but I don't know how nicely those will work with our current Linux playbooks. I'm just curious if anyone has run into a similar problem and either used Ansible or another orchestration tool like Terraform. Any insight would be appreciated.

goatsestretchgoals
Jun 4, 2011

https://www.macrometa.co/blog/why-global-edge-fabric

This is one of those posts where you’re about to quote but then your eyes scan down and you find something dumber. That said, my current favorite is

quote:

What we need is layered model that decouples its various layers - data storage, data organization, data manipulation and data model.

This allows a single database system to support multiple data models against a single, integrated backend. Document, graph, relational, and key-value are examples of data models that may be supported by a multi-model database.

fluppet
Feb 10, 2009

SnatchRabbit posted:

My company has automated a PeopleSoft Linux deployment in AWS using mostly CloudFormation, CodeDeploy and Ansible playbooks. We now have to deploy PeopleSoft on Windows and we're running into some issues with Ansible, mainly that Ansible can't be run on a Windows box without a Linux control box. I've seen that there are some workarounds like installing the Windows Subsystem for Linux, or Cygwin, but I don't know how nicely those will work with our current Linux playbooks. I'm just curious if anyone has run into a similar problem and either used Ansible or another orchestration tool like Terraform. Any insight would be appreciated.

Ansible runs fine under WSL or even Docker

Hadlock
Nov 9, 2004

I spun up Loki on our test Kubernetes cluster and now I don't need to maintain an Elasticsearch cluster just for logging :dance:

Since everything is deployed using a helm chart everything is already correctly tagged and sortable by custom labels, container name, namespace, etc

You can't do full text search on it, but whatever, this serves 99.99% of our developer logging needs

I set up Loki to have 120gb of disk to start; we're using 500gb for Prometheus and that seems to be gross overkill, we're going to have 50% of the disk left over after the first year.

Bruegels Fuckbooks
Sep 14, 2004

Now, listen - I know the two of you are very different from each other in a lot of ways, but you have to understand that as far as Grandpa's concerned, you're both pieces of shit! Yeah. I can prove it mathematically.

goatsestretchgoals posted:

https://www.macrometa.co/blog/why-global-edge-fabric

This is one of those posts where you’re about to quote but then your eyes scan down and you find something dumber. That said, my current favorite is

Holy crap that article is such total bullshit.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

goatsestretchgoals posted:

https://www.macrometa.co/blog/why-global-edge-fabric

This is one of those posts where you’re about to quote but then your eyes scan down and you find something dumber. That said, my current favorite is
this is completely fine, people have been doing "multi-model databases" for generations (SQL/XML was defined way back in SQL:2003 as one early example that appeared in extremely mainstream DB products)

from a product perspective this is probably not where i would start building an ambitious edge-compute database product, but more power to them i guess

Vulture Culture fucked around with this message at 15:27 on Mar 5, 2019

SnatchRabbit
Feb 23, 2006

by sebmojo

fluppet posted:

Ansible runs fine under wsl or even docker

Would the application we install with WSL Ansible on the Windows box also run inside WSL, or can I install it natively in Windows?

The Fool
Oct 16, 2003


Messing around in my lab and trying to figure out a reliable way to deploy a node app to Windows.

No containers, just a Windows Server VM with nothing installed.

I'm using Azure Devops Pipelines, and deploying the code from repo isn't a problem, I'm just not sure how to reliably ensure Node is present and if it is already running, how to reload the app.

Methanar
Sep 26, 2013

by the sex ghost

The Fool posted:

Messing around in my lab and trying to figure out a reliable way to deploy a node app to Windows.

No containers, just a Windows Server VM with nothing installed.

I'm using Azure Devops Pipelines, and deploying the code from repo isn't a problem, I'm just not sure how to reliably ensure Node is present and if it is already running, how to reload the app.

Do you really need it to be windows server?

You're paying a pretty non-trivial amount in licensing costs to be running stuff on windows if you don't need to.

The Fool
Oct 16, 2003


Methanar posted:

Do you really need it to be windows server?

I'm attempting to combine a hobby (the node app) with work (Windows Server Admin). If I was doing this 100% for myself I would just build a container on a linux vm and rebuild when I needed to deploy an update.

Zephirus
May 18, 2004

BRRRR......CHK

The Fool posted:

Messing around in my lab and trying to figure out a reliable way to deploy a node app to Windows.

No containers, just a Windows Server VM with nothing installed.

I'm using Azure Devops Pipelines, and deploying the code from repo isn't a problem, I'm just not sure how to reliably ensure Node is present and if it is already running, how to reload the app.

Put a devops agent on it

powershell release task

- Install chocolatey
- Choco install nodejs

npm/yarn/whatever to deploy packages or distribute files from build artifacts via artifact download task
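The first two steps of that release task, roughly (the Chocolatey bootstrap one-liner is the standard one from their docs, but double-check it against the current install instructions):

```powershell
# inline PowerShell release task (sketch)
if (-not (Get-Command choco -ErrorAction SilentlyContinue)) {
  Set-ExecutionPolicy Bypass -Scope Process -Force
  Invoke-Expression ((New-Object System.Net.WebClient).DownloadString('https://chocolatey.org/install.ps1'))
}
choco install nodejs -y

# then pull the build artifact down and npm install / start as appropriate
```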

JHVH-1
Jun 28, 2002
You can use pm2 to manage the node process and it looks like it has a windows service package to support it. So all you need is to get nodejs and npm on a machine and then install everything via npm
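A sketch of that setup (pm2-windows-service is a community package, so verify the command names against its README):

```shell
npm install -g pm2
npm install -g pm2-windows-service
pm2-service-install              # registers pm2 itself as a Windows service
pm2 start app.js --name myapp    # app.js is a placeholder
pm2 save                         # persist the process list so it survives reboots
```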

FISHMANPET
Mar 3, 2007

Sweet 'N Sour
Can't
Melt
Steel Beams
I work for basically an MSP that provides IaaS to our (internal) customers, so devops is kind of tricky for us because there's no dev, only ops. Nonetheless we've automated a ton of stuff with PowerShell and Azure runbooks, and have written and open-sourced a number of PowerShell modules that we use all over our code. We ended up hacking together the very barest form of a build "pipeline" before any of us knew what that even meant. Everything is in GitHub (internal code in our on-prem enterprise GitHub, open-sourced in public GitHub) with everything hooked into an Azure runbook via webhook, and whenever stuff gets updated it gets deployed: either copied to a set of servers to run or copied automatically into the Azure runbook.

Well I've been aware of CI for a little bit now and finally found enough info about building a pipeline for Powershell modules and I spent all day Friday on it and I've now got one of our modules plugged into both Appveyor and Azure Pipelines (used a tutorial that used Appveyor but then found Azure and got that working as well). It doesn't deploy yet but it does run some pretty basic Pester tests to ensure the code is valid Powershell and that it passes PSScriptAnalyzer tests. So now I'm one of you I guess.
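Those validity checks can be surprisingly small — something in this shape (module path made up, Pester 4 syntax):

```powershell
Describe 'MyModule' {
  It 'is valid PowerShell' {
    $content = Get-Content -Path "$PSScriptRoot\MyModule.psm1" -Raw
    $errors = $null
    [System.Management.Automation.PSParser]::Tokenize($content, [ref]$errors) | Out-Null
    $errors.Count | Should -Be 0
  }
  It 'passes PSScriptAnalyzer' {
    Invoke-ScriptAnalyzer -Path "$PSScriptRoot\MyModule.psm1" -Severity Error |
      Should -BeNullOrEmpty
  }
}
```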

Pile Of Garbage
May 28, 2007



FISHMANPET posted:

pipeline for Powershell modules :words:

Nice, sounds pretty cool. Where are the tests being executed, inside containers or some serverless dealio?

swampcow
Jul 4, 2011

Cancelbot posted:

I don't get the Terraform (mild?) hate - We have 24 AWS accounts with over 200 servers managed entirely in Terraform. Recently our on-premises subnets changed and it took all of 10 minutes to roll it out to everyone's code.

The only people I work with who actually hate it are the old infrastructure teams who cry whilst cuddling their old Dell blades.

There are some pitfalls you can run into because either Hashicorp hewed too strictly to being declarative (iterating over resources sucks, reminds me of early Puppet), the language wasn't created very well (doesn't support nested maps, I think this is fixed in 0.12), or there's just pervasive annoying bugs (such as weirdness with variables in nested modules).

For toy setups or rubberstamped environments, Terraform seems perfect. It just starts to suck when you try to add complexity in certain places. One group at my company wrote something to generate Terraform. This gets around some of the pitfalls.

FISHMANPET
Mar 3, 2007

Sweet 'N Sour
Can't
Melt
Steel Beams

Pile Of Garbage posted:

Nice, sounds pretty cool. Where are the tests being executed, inside containers or some serverless dealio?

I... don't know? They execute in whatever Appveyor or Azure Pipelines runs in.

This is the module I've been working with: https://github.com/umn-microsoft-automation/UMN-SCOM/tree/build-pipeline
My YAML files for both Appveyor and Azure are the same: they specify a Windows server image and then call build.ps1 in the Build directory, which calls psake.ps1, which runs the tests.

Mr Shiny Pants
Nov 12, 2012
Let's say I will have some Linux machines that will be deployed around the world with intermittent connectivity ( think satellites ) running docker on some Linux variant.

What would be a good way to keep these under control and up to date?

Would it be possible to just run regular docker on them and push updates using something like Ansible?

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
I think it comes down to splitting the update into 2 phases:
1) Ensure the desired state description and necessary assets are on the machine.
2) Initiate a server-local process to update the desired state from those local assets.

1 can be retried until successful with no side effects, which mitigates the issue of poor or intermittent connectivity. And 2 is likely to succeed because all the assets are in place and no connectivity is required.
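A rough shape of those two phases with rsync and compose-style local assets (the host and paths are placeholders):

```shell
# phase 1: idempotent sync, safe to retry forever; no side effects yet
until rsync -az --partial ./release/ edge-host:/opt/app/staged/; do
  sleep 60   # connectivity is intermittent; just keep trying
done

# phase 2: runs entirely on the box, needs no further connectivity
ssh edge-host 'cd /opt/app/staged && docker load -i images.tar && docker-compose up -d'
```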


Mr Shiny Pants
Nov 12, 2012

minato posted:

I think it comes down to splitting the update into 2 phases:
1) Ensure the desired state description and necessary assets are on the machine.
2) Initiate a server-local process to update the desired state from those local assets.

1 can be retried until successful with no side effects, which mitigates the issue of poor or intermittent connectivity. And 2 is likely to succeed because all the assets are in place and no connectivity is required.

So something like: rsync the required files or updates and update from the local files?

Is this something I need to build myself or are there some nice utilities available, I can't be the only one dealing with something like this.
