necrobobsledder
Mar 21, 2005
Lay down your soul to the gods rock 'n roll
Nap Ghost
I can’t be the only one that’s used CF and TF extensively, but they’re not mutually exclusive to me, and there’s a lot of good that can happen when using TF for basic infrastructure scaffolding and CF for more application-centric management. ASG management with CF is miles better than Terraform because CF implies a hermetically sealed, remote state, so it can do some things better. Stacksets are oftentimes easier to work with than TF modules, and cross-account provisioning is easier with CF than TF, which almost has me pining for CF as I’m working on a large TF codebase now.

Being able to have a lot of tooling that generates CF for you is one advantage that CF does have, though, and TF provides you with more helper functions to overcome the disadvantages of HCL. But for the most part I’ve written CF with Troposphere for the better part of my time with CF, and it’s just easier to deal with than getting frustrated at the lack of loops in raw CF. Meanwhile, looping in TF is hilariously anemic and reminds me of the early days of Puppet (it sucked, so I moved on to Chef quickly while Puppet worked toward a new set of features).
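To be concrete, a tiny Troposphere sketch of the kind of looping I mean (the bucket names are made up):

code:
from troposphere import Template
from troposphere.s3 import Bucket

t = Template()

# a plain Python loop where raw CloudFormation would need copy-paste
for name in ["logs", "artifacts", "backups"]:
    t.add_resource(Bucket(
        "{0}Bucket".format(name.capitalize()),
        BucketName="example-{0}-bucket".format(name),
    ))

print(t.to_json())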

I still very much prefer TF’s plan / apply cycle to using Sceptre to generate a changeset, retrieve it, parse it, display it, and then dig deep into each of the properties wondering wtf actually changed.

And if you’re trying to do larger migrations with CF where you just want to rename things or similar, you’ll be begging and praying for TF import and state operations from CF.
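For anyone who hasn’t touched them, the operations I mean look like this (the resource addresses and instance id are made up):

code:
# adopt an existing, hand-built resource into terraform state
terraform import aws_instance.my_instance i-0123456789abcdef0

# rename a resource in state without destroying and recreating it
terraform state mv aws_instance.old_name aws_instance.new_name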

Nomnom Cookie
Aug 30, 2009



Note that CF added changesets five years after GA. AWS seems to base their GA point on the soonest time they can feasibly start charging for something without getting sued, rather than what would make the service "good" or "usable". Any time AWS pushes out a new thing that looks cool and useful, I make two lists: ways thing could benefit us, and features thing needs before it can be considered for production. EKS clusters were completely immutable at GA, for example. Point release updates required a new cluster. It also didn't support metrics-server.

necrobobsledder
Mar 21, 2005
Lay down your soul to the gods rock 'n roll
Nap Ghost
Oh yeah, CF 5 years ago vs CF now is hilarious. Another reality is that you can go from CF to TF, but the opposite direction simply isn't possible. There were some vague stirrings of importing resources into CF or letting CF scan your infrastructure and generate stacks live that can manage them but those efforts seem to have died on the vine.

Maybe Azure and GCP have better offerings worth a try?

Docjowles
Apr 9, 2009

Kevin Mitnick P.E. posted:

Note that CF added changesets five years after GA. AWS seems to base their GA point on the soonest time they can feasibly start charging for something without getting sued, rather than what would make the service "good" or "usable". Any time AWS pushes out a new thing that looks cool and useful, I make two lists: ways thing could benefit us, and features thing needs before it can be considered for production. EKS clusters were completely immutable at GA, for example. Point release updates required a new cluster. It also didn't support metrics-server.

God yeah. We were excited to see they were rolling out managed Kafka, but it currently has a hilariously bad list of limitations that immediately ruled it out for us. Like you can't scale the cluster at all, and it has no support for Kafka's security and authentication features. Granted it's only in preview and not GA yet. But I'm sure at least some of those blockers will remain when it does go to production.

LochNessMonster
Feb 3, 2005

I need about three fitty


Pretty happy I stumbled into this thread and read about the Caddyserver. I was trying to build a forwarding proxy for some legacy apps that aren't maintained anymore but need to send data to an API that (rightfully) only takes data from a cert-authenticated source. I was trying to do this with nginx but found out that it doesn't play nice with https when trying to do this, so someone built an nginx mod for it: https://github.com/chobits/ngx_http_proxy_connect_module

Caddyserver sounds like a way better alternative. I'm going to give that a go.

JHVH-1
Jun 28, 2002
If you got nothing to do today you can spend it reading the ansible 2.8 changelog https://github.com/ansible/ansible/blob/stable-2.8/changelogs/CHANGELOG-v2.8.rst

Docjowles
Apr 9, 2009

JHVH-1 posted:

If you got nothing to do today you can spend it reading the ansible 2.8 changelog https://github.com/ansible/ansible/blob/stable-2.8/changelogs/CHANGELOG-v2.8.rst

lmao jesus christ

chutwig
May 28, 2001

BURLAP SATCHEL OF CRACKERJACKS

JHVH-1 posted:

If you got nothing to do today you can spend it reading the ansible 2.8 changelog https://github.com/ansible/ansible/blob/stable-2.8/changelogs/CHANGELOG-v2.8.rst

Unironically excited about collections and interpreter auto discovery, but in typical Ansible style they won’t be stable for another 3 releases and will have permanent weird behavior at the margins that everyone just gets used to.

12 rats tied together
Sep 7, 2006

The changes to attributes of undefined j2 objects are going to clean up a lot of bullshit; the changes to the k8s set of modules are also really nice.

e: the porting guide for a release is usually a better reference than the full changelog.

necrobobsledder
Mar 21, 2005
Lay down your soul to the gods rock 'n roll
Nap Ghost
Every time I see Ansible release notes I feel better about using Saltstack instead. The start-up cost is higher but the lack of BS during maintenance throughout the lifecycle of my infrastructure is well worth it. Also, there’s nothing like Salt Reactor built into Ansible, and it’s been handy from time to time.

tortilla_chip
Jun 13, 2007

k-partite
Is it still just netmiko proxy support for network devices that aren't run at Cloudflare?

12 rats tied together
Sep 7, 2006

necrobobsledder posted:

Also, there’s nothing like Salt Reactor built into Ansible, and it’s been handy from time to time.

Thanks for the note, this is a really cool feature.

I have a PoC running at work of cloudwatch events -> awx which is similar, but requires you to run awx and have a cloudwatch events bus. Considering awx as a part of ansible gives it the ability to compete with something like salt, puppet, or chef, but it also increases the setup burden quite a bit, which cuts against what is generally considered one of the huge advantages of ansible. You still don't have to install an agent, though.

This release for ansible though is only good things -- there are no breaks in functionality, only the removal of some features that have been deprecated for a bunch of versions now. The only time ansible releases have given me trouble is 1.9.x to 2.x, otherwise they have all been really painless.

FlapYoJacks
Feb 12, 2009
I just figured out how to start a gitlab-client docker image, start docker within that container, then pull another image from AWS/ECR, and start THAT image with systemd, allowing me to run some unit tests within docker + systemd.

Not too shabby.
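In case anyone wants to try the same thing, a rough .gitlab-ci.yml sketch of that pattern (registry path, image name, and test script are all made up, and it assumes the aws cli is available in the job image):

YAML code:
run-tests:
  image: docker:stable
  services:
    - docker:dind
  variables:
    DOCKER_HOST: tcp://docker:2375
  script:
    # log in to ECR and pull the image under test
    - $(aws ecr get-login --no-include-email --region us-east-1)
    - docker pull 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:latest
    # systemd wants a privileged container with the host cgroup hierarchy mounted read-only
    - docker run -d --privileged -v /sys/fs/cgroup:/sys/fs/cgroup:ro --name sut 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:latest
    - docker exec sut ./run-unit-tests.sh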

crazypenguin
Mar 9, 2005
nothing witty here, move along

necrobobsledder posted:

There were some vague stirrings of importing resources into CF or letting CF scan your infrastructure and generate stacks live that can manage them but those efforts seem to have died on the vine.

I ran into this recently. Still trying to figure out how I'm going to bring this "built by clicking" environment under the control of CF.

I'm told this feature isn't dead and is coming, but who knows.

Hadlock
Nov 9, 2004

What is the likelihood that terraform 0.12 with its templating and whatnot will leave RC status by July 1? Looks like it exited Beta 2 weeks ago.

Spring Heeled Jack
Feb 25, 2007

If you can read this you can read
Given the non-existence of a k8s thread, I'll ask here. Is anyone using an oauth2 proxy with nginx ingress? I'm attempting to set up external auth for a public status page and after 2 days of troubleshooting the ingress I have to say it doesn't seem to like me.

Ingress config: https://pastebin.com/TPajxj2H

If I browse to https://sub.domain.com/oauth2 directly I get a prompt to authenticate as expected. However browsing to https://sub.domain.com/service/hangfire gives me a 504 gateway error at the ingress. Taking out the auth annotations makes the public page load fine. I strongly suspect something is wrong with my ingress config but I don't know enough about nginx to say what.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Spring Heeled Jack posted:

Given the non-existence of a k8s thread, I'll ask here. Is anyone using an oauth2 proxy with nginx ingress? I'm attempting to set up external auth for a public status page and after 2 days of troubleshooting the ingress I have to say it doesn't seem to like me.

Ingress config: https://pastebin.com/TPajxj2H

If I browse to https://sub.domain.com/oauth2 directly I get a prompt to authenticate as expected. However browsing to https://sub.domain.com/service/hangfire gives me a 504 gateway error at the ingress. Taking out the auth annotations makes the public page load fine. I strongly suspect something is wrong with my ingress config but I don't know enough about nginx to say what.
Post the generated Nginx config

post hole digger
Mar 21, 2011

I'm working on learning myself a little terraform for aws management and right now I am trying to taint an ec2 instance to kick off the recreation of some server infrastructure, including a baseline ami created from the snapshot of said ec2 instance. the ec2 instance has the resource aws_ami_from_instance applied to it after it is launched. Basically, when I taint the ec2 instance aws_instance.my_instance, I see the following objects be recreated:

aws_ami_from_instance.my_golden_ami

aws_scaling_auto_group.my_asg

aws_elb.my_elb

aws_instance.my_instance

aws_launch_configuration.my_lc

Everything goes fine once I taint and re-apply except for the aws_ami_from_instance.

my aws_ami_from_instance config looks like this

code:
resource "random_id" "golden_ami" {
  byte_length = 3
}

resource "aws_ami_from_instance" "my_golden_ami" {
  name               = "golden_ami-${random_id.golden_ami.b64}"
  source_instance_id = "${aws_instance.my_instance.id}"
}

Once it gets there, I get hit with:



code:
Error: Error applying plan:

1 error(s) occurred:

* aws_ami_from_instance.my_golden_ami: 1 error(s) occurred:

* aws_ami_from_instance.my_golden_ami: InvalidAMIName.Duplicate: AMI name golden_ami-CedV is already in use by AMI ami-01fffffffffffffff
Which doesn't make a ton of sense... shouldn't it delete the old ami first and then create a new one? I've tried using a static name (rather than the random one I have now) and tainting the golden AMI too, but neither one of those moves gets me any closer to it working. They both error out on the same thing.

Any idea what I'm doing wrong? From the documentation, it seems like this should work:

quote:

Note that the source instance is inspected only at the initial creation of this resource. Ongoing updates to the referenced instance will not be propagated into the generated AMI. Users may taint or otherwise recreate the resource in order to produce a fresh snapshot.
So I am hoping that this is just a simple learning-curve mistake in how I am trying to re-create it.

Spring Heeled Jack
Feb 25, 2007

If you can read this you can read

Vulture Culture posted:

Post the generated Nginx config

I believe this should be all for the subdomain I'm using: https://pastebin.com/rpRbQPeA

I found the solution buried in an old closed github issue. For the auth URL I needed to use the internal service address: nginx.ingress.kubernetes.io/auth-url: "http://oauth2-proxy.development.svc.cluster.local:4180/oauth2/auth"
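For anyone who hits the same thing, the relevant chunk of the ingress ends up looking roughly like this (the signin URL is the standard oauth2-proxy pattern, adjust hostnames to taste):

YAML code:
metadata:
  annotations:
    # auth-url is called from the ingress controller itself, so it needs the
    # in-cluster service DNS name rather than the public hostname
    nginx.ingress.kubernetes.io/auth-url: "http://oauth2-proxy.development.svc.cluster.local:4180/oauth2/auth"
    nginx.ingress.kubernetes.io/auth-signin: "https://sub.domain.com/oauth2/start?rd=$escaped_request_uri"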

Spring Heeled Jack fucked around with this message at 18:39 on May 23, 2019

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

my bitter bi rival posted:

I'm working on learning myself a little terraform for aws management and right now I am trying to taint an ec2 instance to kick off the recreation of some server infrastructure, including a baseline ami created from the snapshot of said ec2 instance. the ec2 instance has the resource aws_ami_from_instance applied to it after it is launched. Basically, when I taint the ec2 instance aws_instance.my_instance, I see the following objects be recreated:
Have you verified that the AMI is actually gone at the time you receive this error? Many AWS object types respond to delete operations by marking as deleted, then garbage-collecting later.

post hole digger
Mar 21, 2011

Vulture Culture posted:

Have you verified that the AMI is actually gone at the time you receive this error? Many AWS object types respond to delete operations by marking as deleted, then garbage-collecting later.

So that seems to be the problem. It's not actually gone; it's not even marked as deleted. It seems like instead of deleting the old AMI and then making the new one, Terraform just tries to create a new AMI with the same name. I can't figure out why or what I can do to get it to work that way. Maybe my whole workflow is just wrong...

12 rats tied together
Sep 7, 2006

The dependency chain that you posted is a little strange; one of the nice things about working in AWS is that a lot of these things are decoupled from each other. I usually see people have an AMI build process that is totally separate from their instance provisioning process, for example, and you don't need to recreate your ELB and ASG every time you recreate an instance.

I think a more typical workflow would be that you'd have one terraform state file that manages your "create ami from instance snapshot" setup. You can use the aws_instances data source to query your ASG for any of the instances in it, assuming they are identical, and then use the returned instance id as a source for your aws_ami_from_instance resource. In this state file you'd probably also want to have an output for your created ami's id.

Back in your main state file, you can consume the output from your ami management state file by using the terraform_remote_state data source, and use the ami id as an input for your autoscaling group's launch configuration.
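A rough sketch of the two halves, in 0.12 syntax, reusing the random_id from your config (the state bucket/key, tag, and instance type are made up):

code:
# --- ami state (producer) ---
data "aws_instances" "asg_members" {
  # any tag that identifies the instances in your ASG works here
  instance_tags = {
    "aws:autoscaling:groupName" = "my_asg"
  }
}

resource "aws_ami_from_instance" "my_golden_ami" {
  name               = "golden_ami-${random_id.golden_ami.b64}"
  source_instance_id = data.aws_instances.asg_members.ids[0]
}

output "golden_ami_id" {
  value = aws_ami_from_instance.my_golden_ami.id
}

# --- main state (consumer) ---
data "terraform_remote_state" "ami" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state"
    key    = "ami/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_launch_configuration" "my_lc" {
  name_prefix   = "my-lc-"
  image_id      = data.terraform_remote_state.ami.outputs.golden_ami_id
  instance_type = "t3.micro"

  lifecycle {
    create_before_destroy = true
  }
}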

To "roll a new ami" you would have to do something like:
    - terraform taint your aws_ami_from_instance resource in your ami state
    - terraform apply to grab an instance from your ASG and create a new ami from it
    - terraform get -update in your main state to pick up the newly output ami id from the ami state
    - terraform apply in your main state to apply your new ami id to your configuration

The only changes that should need to happen for this are a replacement of the launch configuration, and an update (no replacement) to the ASG to configure it to use the new launch configuration. The last time I worked with terraform and ASGs, you had to roll new instances on the group yourself, there was no analogue to CloudFormation's UpdatePolicy ASG attribute.

You can probably do something with a local-exec creation provisioner on your launch configuration resource that changes your ASG's desired capacity to 0, waits for your nodes to disappear, and then sets it back up to whatever you configured in terraform.
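Something along these lines, as a very rough sketch (ASG name and capacity are made up, and the sleep is doing a lot of lifting here):

code:
resource "aws_launch_configuration" "my_lc" {
  name_prefix   = "my-lc-"
  image_id      = data.terraform_remote_state.ami.outputs.golden_ami_id
  instance_type = "t3.micro"

  # when a new launch configuration is created, cycle the ASG by scaling it
  # to zero and back; the sleep is a crude stand-in for "wait until the old
  # nodes are gone"
  provisioner "local-exec" {
    command = <<-EOF
      aws autoscaling set-desired-capacity --auto-scaling-group-name my_asg --desired-capacity 0
      sleep 180
      aws autoscaling set-desired-capacity --auto-scaling-group-name my_asg --desired-capacity 2
    EOF
  }
}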

e: added links

12 rats tied together fucked around with this message at 18:09 on May 23, 2019

12 rats tied together
Sep 7, 2006

I'm sorry -- I didn't read your post closely enough and was too focused on the weirdness about recreating the ELB and ASG resources. It's more likely that your random id resource is not being deleted and recreated when you taint -- I believe that the random provider has a "keepers" concept for this use case.

Generally you want to let terraform uniquely name everything it can but it seems like the aws_ami_from_instance resource is weird in this case as it does not have a name_prefix attribute and it _requires_ that you specify a name. You can also probably taint your random id resource along with the instance, if you don't want to use keepers.
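A minimal sketch of the keepers approach, using the resource names from your config (0.11-style interpolation to match it):

code:
resource "random_id" "golden_ami" {
  byte_length = 3

  # regenerate the suffix whenever the source instance is replaced, so every
  # taint/recreate cycle gets a unique AMI name
  keepers = {
    instance_id = "${aws_instance.my_instance.id}"
  }
}

resource "aws_ami_from_instance" "my_golden_ami" {
  name               = "golden_ami-${random_id.golden_ami.b64}"
  source_instance_id = "${aws_instance.my_instance.id}"
}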

post hole digger
Mar 21, 2011

Thank you so much for your helpful advice. I am in the pretty early stages, so I am starting to understand how Terraform works but don't know the theory/best practices for designing my workflows yet. That info is all very useful for me.

12 rats tied together
Sep 7, 2006

This comes up a lot with terraform but it runs into the (recently very common) trap of trying to provide a purely declarative interface, which works great until something doesn't behave as expected, and then you need to wrap your mind around where your expectations differed from terraform's and then unroll your probably somewhat complicated declarations into something terraform can manage intelligently.

In my experience you split up your state files for primarily 2 reasons. One, terraform tells you to, and it takes like 15+ minutes to run a terraform plan with >300 resources, which is barely acceptable. Two, having two different terraform states gives you a really obvious and explicit "this, then that" interface, which you can use to isolate dependencies or provide some kind of inheritance chain or similar logical (ideally intuitive) topology.

The example of "create an ami" and "use an ami" is a really good example of a producer state and a consumer state, identifying which parts of your terraform stack produce shared resources vs which parts of your stack consume those resources is a great first step towards taking your terraform out of "can tweak the examples" territory into "can talk about terraform in a job interview" territory.

After splitting up producers vs consumers you could also think about how you'd organize repository growth: what if you needed to build a second type of ami, where would that go? What if you needed another ASG that used the same ami? What if you needed to build both amis and ASGs in a new AWS region, or a different AWS account?

A feature in 0.12 that I'm really excited about is the ability to pass an entire module to another module:
code:
module "consul_cluster" {
  source = "./modules/aws-consul-cluster"

  network = module.network
}
I definitely recommend you experiment with this feature asap, doing this was indescribably worse for the past ~3 years.
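On the receiving side the module just declares a plain variable and reaches into the passed module's outputs as attributes (names made up, and it assumes the network module actually outputs vpc_id):

code:
# inside ./modules/aws-consul-cluster
variable "network" {
  description = "the entire network module, passed in from the root module"
}

resource "aws_security_group" "consul" {
  name_prefix = "consul-"
  # any output the network module declares is reachable as an attribute
  vpc_id      = var.network.vpc_id
}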

The last pending question for me was, given the linear-to-exponential growth of terraform state folders, how do you manage the ordering and execution of apply actions across them? That's pretty much when I gave up on the tool and switched to ansible + cloudformation, but you could totally still use ansible as an orchestrator for your terraform states as well. There are also a couple of third party tools I'm aware of such as terragrunt and pulumi which might have answers for you here too.

LochNessMonster
Feb 3, 2005

I need about three fitty


Probably a stupid question, but is anyone familiar with the placement preferences and constraints of a swarm cluster?

What I’d like to do is place a specific service with just 1 container always on host1 of our dev swarm cluster. If host1 is not available it can run on any other node. Preferably it gets placed back on host1 when it’s up again but I’m not bothered if that last step doesn’t happen until the container is killed.

Something like this should do that right? Deploy to any swarm host with the dev label and try to place it on host1 if possible.


YAML code:
deploy:
      placement:
        constraints: 
          - node.labels.type == dev
        preferences:
          - spread: unique.label.host1

Docjowles
Apr 9, 2009

Hadlock posted:

What is the likelihood that terraform 0.12 with its templating and whatnot will leave RC status by July 1? Looks like it exited Beta 2 weeks ago.

Oh god, it actually came out last week :stare: Was traveling and missed the announcement. Never thought we'd see the day.

This is gonna be a lot to take in. lmao at this being a 0.01 version bump.

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

LochNessMonster posted:

Probably a stupid question, but is anyone familiar with the placement preferences and constraints of a swarm cluster?

What I’d like to do is place a specific service with just 1 container always on host1 of our dev swarm cluster. If host1 is not available it can run on any other node. Preferably it gets placed back on host1 when it’s up again but I’m not bothered if that last step doesn’t happen until the container is killed.

Something like this should do that right? Deploy to any swarm host with the dev label and try to place it on host1 if possible.


YAML code:
deploy:
      placement:
        constraints: 
          - node.labels.type == dev
        preferences:
          - spread: unique.label.host1

That sounds almost exactly like the use case described in the Docker Compose reference. Is it not working as expected?

LochNessMonster
Feb 3, 2005

I need about three fitty


NihilCredo posted:

That sounds almost exactly like the use case described in the Docker Compose reference. Is it not working as expected?

Need to see for myself but haven’t had time to test yet. Coworker has been fiddling with this for a few days and claims this isn’t working. Figured I’d ask here if I’m missing something obvious or not.

Warbird
May 23, 2012

America's Favorite Dumbass

Me: We should set up versioning for our Choco package development and not just have everything on a public network share. Also, a pipeline to compile these packages and push them into Artifactory would be a good idea instead of everything being manual.

Everyone else: gently caress off.


I’m starting to think a career change may be in order.

Helianthus Annuus
Feb 21, 2006

can i touch your hand
Grimey Drawer
ya, get a better job

but in the meantime, set that automation up regardless, and change the access creds so your devs can't keep doing it the dumb old way

qsvui
Aug 23, 2003
some crazy thing
why bother, gently caress em

JHVH-1
Jun 28, 2002
This is neat. Was watching aws twitch stream and never heard of it before. https://github.com/awslabs/aws-cdk

Che Delilas
Nov 23, 2009
FREE TIBET WEED

JHVH-1 posted:

This is neat. Was watching aws twitch stream and never heard of it before. https://github.com/awslabs/aws-cdk

Couple people in my company have been using it and like it so far (they're using C#). I doubt they've run into the hardest edges yet but it seems promising.

xpander
Sep 2, 2004

JHVH-1 posted:

This is neat. Was watching aws twitch stream and never heard of it before. https://github.com/awslabs/aws-cdk

Another team at my place of employment uses it and speaks highly of it. My own crew isn't quite ready to put all our eggs in the CDK basket yet as it can't guarantee forward-compatibility at this point, but we're watching it with much interest.

LochNessMonster
Feb 3, 2005

I need about three fitty


Did a quick test on the swarm stack deploy; below is the exact compose file I'm testing with.

code:
version: '3.7'
services:
  preftest:
    image: nginx:alpine
    deploy:
      placement:
        preferences:
          - spread: node.labels.rack == 1
        constraints:
          - node.labels.acceptance == true
It will ignore the preference completely and deploy to any node that matches the constraint. The node it deployed to did not have a rack label at all, and the node that does have the rack label (with value 1) was up and running and not starved for resources.

When removing the constraint it'll deploy to any node in the cluster.

Tried to switch the order of preferences/constraints but it has the same result. Either I'm missing something obvious or the preferences are not working properly. I've removed the label entirely and re-added it without a value, and changed the placement preference to spread: node.labels.rack == true, but that didn't change anything either. The strange thing about the label is that a docker node ls -f "Label=rack" doesn't return anything either. It might be that the label is not being recognized or something, but it certainly shows up when doing a node inspect.

edit:

Saw a PR on github concerning documentation for this feature, mentioning that a node with no label assigned is treated as if it had the label with no value. So I assigned rack=false to all other nodes in the cluster and it still placed the container on nodes other than the preferred one. Adding a constraint on rack=1 to the compose file works properly, so it's not a label thing.

The preference functionality just seems to be not working at all.

edit 2:

Figured that "spread" might only work for n > 1 and gave some other nodes the true value for this label, but that didn't change anything at all.
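edit 3: for what it's worth, the examples in the compose reference only ever show spread with a bare label key (no == comparison), and it spreads tasks across the label's values rather than pinning them to one node, so the by-the-book form would be something like:

YAML code:
deploy:
  placement:
    constraints:
      - node.labels.acceptance == true
    preferences:
      # spread takes a label name, not an expression
      - spread: node.labels.rack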

LochNessMonster fucked around with this message at 13:25 on May 31, 2019

Spring Heeled Jack
Feb 25, 2007

If you can read this you can read
Looking for a little feedback on this: we're currently doing the monorepo thing for all of our newer services we deploy to k8s. Right now when a change is committed and a build is triggered, all services (including those unaffected by the code change) are rebuilt and deployed to our dev environment together.

Recently some of our devs went to Build and got the feeling from the MS engineers that it would be a better practice to trigger builds only for the affected services, and deploy them separately as well. We have like ~10 services currently, so not too much to deal with, however as we modernize some of our legacy apps/introduce new things this will grow.

What is everyone here doing in this regard?

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
Once a monorepo gets to a significant size, you need lots of extra support and tooling to use it effectively. You've just touched the tip of the iceberg. Google & Facebook have written some papers on their monorepos; they're worth a read for all the interesting effects of scale.

E.g. when a changeset lands, what is the dependency tree of build artifacts that need to be rebuilt? Solution: Google built Bazel for this reason, Facebook built Buck. (From experience, Bazel is a world of hurt - the intention and design is great, but the implementation and docs are just horrible)

E.g. What happens when your commit rate is really high?
- Effect 1: you can't land fast enough because the code is moving so fast. Solution: write a system that queues PRs and tries to automatically land them, using automatic rebasing/merging where possible. This will involve integration with whatever code review workflow you're currently using, since devs may need to get involved if a rebase can't be done automatically.
- Effect 2: your build systems can't keep up with the commit rate. Solution: batch (say) 20 PRs together into a single build, and if the build passes all tests then you can rubber stamp them collectively. Otherwise automatically bisect to find the problematic commit(s).

E.g. What happens when your monorepo gets really big and pulls/clones take forever? Solution: Google implemented a goddamn virtual filesystem that would only pull files on demand. Facebook implemented a sparse-checkout system in Mercurial where the developer would clone via a "profile" that indicated the subset of files to actually clone.


If you're a relatively small shop, then you've got to cap the size on your monorepos because you won't have the tooling and support to do all the above.

Spring Heeled Jack
Feb 25, 2007

If you can read this you can read
I would say it will be a good thing if we start to have those problems. The repo size is like maybe 50mb total at this point and we have maybe 3-4 devs total working from it.

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
To answer your specific question: to determine what builds need to be made based on what files changed, you can either come up with some simple directory convention (e.g., a change under /foo means rebuild Foo, and a change under /shared means rebuild everything), or use something like Bazel/Buck, which will tell you exactly what needs to be rebuilt because it accounts for dependencies for every file in the repo. I would try the former before the latter... it's an expensive investment.
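The directory-convention version can be as dumb as a prefix map over git diff output; a sketch (the paths and service names are made up):

code:
import subprocess

# path prefix -> service that needs rebuilding (made-up layout)
PREFIX_TO_SERVICE = {
    "services/frontend/": "frontend",
    "services/billing/": "billing",
}
SHARED_PREFIX = "shared/"  # a change here rebuilds everything


def services_to_build(base_ref="origin/master"):
    changed = subprocess.run(
        ["git", "diff", "--name-only", base_ref, "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()

    to_build = set()
    for path in changed:
        if path.startswith(SHARED_PREFIX):
            return set(PREFIX_TO_SERVICE.values())
        for prefix, service in PREFIX_TO_SERVICE.items():
            if path.startswith(prefix):
                to_build.add(service)
    return to_build


if __name__ == "__main__":
    print(sorted(services_to_build()))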


Another scaling problem I forgot to mention: access control. Either all devs can see everything, or you've got to have automation that rejects PRs from folks trying to touch files they don't have permission to modify, or if there are files that some devs shouldn't even see then you need a bonkers sparse-filesystem solution that prevents them from cloning them at all. Fun!
