Continuous Integration/build engineering/devops thread

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Continuous Integration/build engineering/devops thread

«‹›158 »

Pile Of Garbage: May 28, 2007

abigserve posted:

Azure integrates very well with Terraform now and it's good an insane plugin for vscode apparently (I use intellij personally so I can't vouch but some swear by it).

Yeah the Terraform extension for VS Code is quite useful. It gives you in-line reference details for each data/resource section. Also you can Ctrl+Click resource names and it takes you straight to the doco page for that resource. Plus the usual stuff like auto-completion and syntax highlighting. https://marketplace.visualstudio.com/items?itemName=mauve.terraform

# ? Jan 3, 2019 08:10

Adbot: ADBOT LOVES YOU

# ? Jun 5, 2024 03:20

Hadlock: Nov 9, 2004

How do you go about publishing stuff publicly? We rely on activemq with our monolith, and it appears to be the only queue server without a Prometheus exporter, so I had to write one. JMX exporter exists but due to the random port assignment after negotiation on standard ports, makes it especially useless in a container world; even packaged inside the container the jmx exporter doesn't expose a lot of things directly.

This seems sufficiently generic that I can release it publicly, my boss seems on board with the idea. Never released a work project like this though.

# ? Jan 3, 2019 10:28

12 rats tied together: Sep 7, 2006

abigserve posted:

Other benefits are;
- It interacts very well with Vault, so you can do things like store S3 keys straight into the password vault without having to ever see them
- The state file is JSON so you can easily scrape it for information (within reason, remember there's secrets in there)
- it plays remarkably nicely with existing resources, that is, you don't have to terraform your entire environment but you can still use it without fear of blowing poo poo away.

I'm actually incredibly disappointed with the vault integration as a feature. When using ansible-vault we have the secret obfuscated throughout the entire lifecycle -- it gets pulled out of a vault file with a psk and included as a cloudformation or alicloud ros parameter with NoEcho: True. Persisting the secrets in plaintext is not totally untenable but it's just yet another reason to not use terraform, unfortunately. The state file schema is totally loving insane btw, if you've ever had to write automation for it (we have a "with_tfstate" ansible plugin at $current_job).

Point 3 is fair as long as you take it to "and eventually you can import existing resources into terraform (and then get to worry about blowing poo poo away)". Otherwise using automation for some stacks but not all isn't terribly specific to Terraform.

abigserve posted:

I echo the sentiment that I would absolutely one hundred million percent not waste a second of time on any cloud specific deployment language.

I'm curious how many people echoing this sentiment have actually gone multi cloud, and which providers if so? The terraform repo we have at $current-job was pushing something like 5 aws accounts, probably ~20 regions across them, 3-4 regions in 2 alibaba cloud accounts, and then some experiments with other providers like the ns1 provider (which ended up being buggy as poo poo).

I thought it was still trash. It's not like the DSL is cloud agnostic -- in alicloud you still need to set PrePaid/PostPaid, vswitch id instead of subnet id, data_drives instead of block device mappings, and other cloud specific parameters. You still need different providers (one for each region, even!), and you still can't interpolate any dynamic values in a provider config for some reason. Even with AWS <--> Alicloud where one provider is essentially a copy paste of the other this is way harder to handle in Terraform than it is in any other applicable tooling.

Every relevant cloud provider (even openstack if you want to squint really hard at what 'provider' means) provides document based orchestration tooling. Any weirdness this tooling exposes is generally going to be a reflection of weirdness in the provider API itself, for example, vpc_security_group_ids vs security_groups, ec2 classic vs main vpc vs any other vpc -- Terraform is never going to handle these for you automatically, and the gotchas are generally lifted straight out of RunInstances.

If you're going multi cloud, or if you want some facsimile of true agnostic, you unfortunately still actually need to learn the cloud providers. In my experience learning the providers is learning whatever version of CloudFormation they have, so layering Terraform on top of that has historically been a huge waste of effort for me when other tooling exists with a superset of the functionality and generally way less restrictions on where I can put a for loop.

# ? Jan 3, 2019 10:48

Docjowles: Apr 9, 2009

Hadlock posted:

How do you go about publishing stuff publicly? We rely on activemq with our monolith, and it appears to be the only queue server without a Prometheus exporter, so I had to write one. JMX exporter exists but due to the random port assignment after negotiation on standard ports, makes it especially useless in a container world; even packaged inside the container the jmx exporter doesn't expose a lot of things directly.

This seems sufficiently generic that I can release it publicly, my boss seems on board with the idea. Never released a work project like this though.

This is going to vary a lot according to company policy (if they have even thought through what the policy is before) and what you�ve signed in your employment agreement. In my case, I basically go convince the most senior exec in the engineering org that the code is useful but doesn�t represent anything super amazing that truly differentiates our company, isn�t a trade secret, etc. He does some vetting of his own. Assuming he says OK I push it to our company�s public GitHub org. The company owns the code, though I�m welcome to have my name in the readme or whatever.

# ? Jan 3, 2019 14:22

StabbinHobo: Oct 18, 2002; by Jeffrey of YOSPOS

cloudformation being terrible and terraform being worse (yep) is a lot of the reason i'm going in hard on k8s.

it doesn't completely solve the problem, but it removes 80-90% of it from being that layer's concern. so you can worry less about how much your tool of choice for that layer sucks.

# ? Jan 3, 2019 15:11

ThePeavstenator: Dec 18, 2012; Establish the Buns

abigserve posted:

- The state file is JSON so you can easily scrape it for information (within reason, remember there's secrets in there)

The funny thing is that the docs specifically called out not doing this until very recently because the statefile implementation was supposed to be opaque.

This of course was ignored and now HashiCorp is stuck supporting it with backwards comparability :v:

# ? Jan 3, 2019 15:46

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

StabbinHobo posted:

cloudformation being terrible and terraform being worse (yep) is a lot of the reason i'm going in hard on k8s.

it doesn't completely solve the problem, but it removes 80-90% of it from being that layer's concern. so you can worry less about how much your tool of choice for that layer sucks.

Removes it from a lintable static-ish template language and puts it in a yaml hellscape from which you can never escape.

# ? Jan 3, 2019 23:57

StabbinHobo: Oct 18, 2002; by Jeffrey of YOSPOS

etc has always been hell at least its in mostly one place now

# ? Jan 4, 2019 00:12

abigserve: Sep 13, 2009; this is a better avatar than what I had before

ThePeavstenator posted:

The funny thing is that the docs specifically called out not doing this until very recently because the statefile implementation was supposed to be opaque.

This of course was ignored and now HashiCorp is stuck supporting it with backwards comparability

hashicorp say a lot of things that you shouldn't take at face value. These are the guys that state, in big bold letters in the getting started guide for Vault, "never use static client tokens!", then in the Terraform Vault guide start with "first, provision a static token from Vault"

12 rats tied together posted:

I'm actually incredibly disappointed with the vault integration as a feature. When using ansible-vault we have the secret obfuscated throughout the entire lifecycle -- it gets pulled out of a vault file with a psk and included as a cloudformation or alicloud ros parameter with NoEcho: True. Persisting the secrets in plaintext is not totally untenable but it's just yet another reason to not use terraform, unfortunately. The state file schema is totally loving insane btw, if you've ever had to write automation for it (we have a "with_tfstate" ansible plugin at $current_job).

Point 3 is fair as long as you take it to "and eventually you can import existing resources into terraform (and then get to worry about blowing poo poo away)". Otherwise using automation for some stacks but not all isn't terribly specific to Terraform.

I'm curious how many people echoing this sentiment have actually gone multi cloud, and which providers if so? The terraform repo we have at $current-job was pushing something like 5 aws accounts, probably ~20 regions across them, 3-4 regions in 2 alibaba cloud accounts, and then some experiments with other providers like the ns1 provider (which ended up being buggy as poo poo).

I thought it was still trash. It's not like the DSL is cloud agnostic -- in alicloud you still need to set PrePaid/PostPaid, vswitch id instead of subnet id, data_drives instead of block device mappings, and other cloud specific parameters. You still need different providers (one for each region, even!), and you still can't interpolate any dynamic values in a provider config for some reason. Even with AWS <--> Alicloud where one provider is essentially a copy paste of the other this is way harder to handle in Terraform than it is in any other applicable tooling.

Every relevant cloud provider (even openstack if you want to squint really hard at what 'provider' means) provides document based orchestration tooling. Any weirdness this tooling exposes is generally going to be a reflection of weirdness in the provider API itself, for example, vpc_security_group_ids vs security_groups, ec2 classic vs main vpc vs any other vpc -- Terraform is never going to handle these for you automatically, and the gotchas are generally lifted straight out of RunInstances.

If you're going multi cloud, or if you want some facsimile of true agnostic, you unfortunately still actually need to learn the cloud providers. In my experience learning the providers is learning whatever version of CloudFormation they have, so layering Terraform on top of that has historically been a huge waste of effort for me when other tooling exists with a superset of the functionality and generally way less restrictions on where I can put a for loop.

We have systems deployed across the big 3 but we're primarily an AWS shop. All of your points are valid but I feel like the benefit of a consistent format and process for declaring and instantiating resources is worth the tradeoffs of narrower support and some teething issues like a lack of for loops. It also makes it easier to plan as a devops person because you don't have to have tooling to handle deploying aws, azure and gcp macguffins but rather a single pipeline that runs terraform.

# ? Jan 4, 2019 00:46

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

Docjowles posted:

This is going to vary a lot according to company policy (if they have even thought through what the policy is before) and what you�ve signed in your employment agreement. In my case, I basically go convince the most senior exec in the engineering org that the code is useful but doesn�t represent anything super amazing that truly differentiates our company, isn�t a trade secret, etc. He does some vetting of his own. Assuming he says OK I push it to our company�s public GitHub org. The company owns the code, though I�m welcome to have my name in the readme or whatever.

In addition to the senior exec, also run it by legal

# ? Jan 4, 2019 01:35

FlapYoJacks: Feb 12, 2009

I finally fixed our Jenkins server. I had originally set it up to just poll bitbucket, which was fine when we had one project.

Once it got to over 10 projects, polling was hanging. I should have just setup a webhook to begin with, but :effort:

# ? Jan 4, 2019 14:51

StabbinHobo: Oct 18, 2002; by Jeffrey of YOSPOS

now watch your webhooks randomly silently mysteriously fail once in awhile until you cave and go back to polling

# ? Jan 5, 2019 02:15

Spring Heeled Jack: Feb 25, 2007; If you can read this you can read

Those using docker swarm, what do you do to deploy/update services in a pipeline? We have build, tag, and push to our private repo in bamboo which then creates a release in octopus.

What I�m mostly seeing is to ssh into a manager node and run the stack deploy command from there, which is fine but it seems like there should be a better way. (Or there would be a better way by now if k8s hadn�t eaten its lunch).

# ? Jan 9, 2019 04:28

New Yorp New Yorp: Jul 18, 2003; Only in Kenya.; Pillbug

I've been translating my ARM template for this project into Terraform. Here are my Terraform impressions:

- I like the syntax better than ARM syntax. This is the only thing I like.
- I don't understand the point of a state file. It creates something that's generated by interrogating Azure, but won't work without it? And now I have some extra file generated at deployment time that I have to worry about synchronizing and sharing between deployments? Maybe I'm missing something, but it seems really pointless.
- Creating the resource I need takes upwards of an hour. An equivalent ARM template takes seconds. Literally 5 seconds. I have no idea what Terraform is doing differently.
- It has a hard-coded timeout of an hour for resource creation. If the timeout expires, it just fails. There is no way to change the timeout, at least for the Azure resources that I'm using. This is the dumbest loving thing. Combined with the above point, this means that I have about a 75% chance of failure when creating my resource.
- If Terraform fails creating a resource, my understanding is I can use the "import" command to add it to my state file once it's done, so I should be able to pick up my deployment where I left off. This does not loving work, it tells me that the resource has to be destroyed and recreated, even though it is the exact resource specified in my Terraform configuration.

So, in short: I like the syntax better but it's totally unusable in my specific scenario (usually fails, about 30x slower than ARM), and would complicate deployments by making me persist and share the state file.

[edit] Or we could use a Terraform Enterprise server to persist the state. Because who doesn't love adding additional crap to their toolchain to solve an already-solved problem, but in a slightly different way? :waycool:

New Yorp New Yorp fucked around with this message at 16:04 on Jan 9, 2019

# ? Jan 9, 2019 15:39

LochNessMonster: Feb 3, 2005; I need about three fitty

Spring Heeled Jack posted:

Those using docker swarm, what do you do to deploy/update services in a pipeline? We have build, tag, and push to our private repo in bamboo which then creates a release in octopus.

What I�m mostly seeing is to ssh into a manager node and run the stack deploy command from there, which is fine but it seems like there should be a better way. (Or there would be a better way by now if k8s hadn�t eaten its lunch).

You don�t have to ssh into a manager node, you can run the command from your build server with

code:

docker �tlsverify -H tcp://ip:port stack deploy �compose-file compose.yml

LochNessMonster fucked around with this message at 16:51 on Jan 9, 2019

# ? Jan 9, 2019 16:48

freeasinbeer: Mar 26, 2015; by Fluffdaddy

The default is to use the object store that exists in the cloud that you are using as the backend. There is no need for enterprise to sync remote state. I.e. s3 in AWS, gcs in google, etc etc.

# ? Jan 9, 2019 17:47

The Fool: Oct 16, 2003

What IaaC type tool should I be using to manage Hyper-V guests?

# ? Jan 9, 2019 18:11

my homie dhall: Dec 9, 2010; honey, oh please, it's just a machine

New Yorp New Yorp posted:

I've been translating my ARM template for this project into Terraform. Here are my Terraform impressions:

- I like the syntax better than ARM syntax. This is the only thing I like.
- I don't understand the point of a state file. It creates something that's generated by interrogating Azure, but won't work without it? And now I have some extra file generated at deployment time that I have to worry about synchronizing and sharing between deployments? Maybe I'm missing something, but it seems really pointless.
- Creating the resource I need takes upwards of an hour. An equivalent ARM template takes seconds. Literally 5 seconds. I have no idea what Terraform is doing differently.
- It has a hard-coded timeout of an hour for resource creation. If the timeout expires, it just fails. There is no way to change the timeout, at least for the Azure resources that I'm using. This is the dumbest loving thing. Combined with the above point, this means that I have about a 75% chance of failure when creating my resource.
- If Terraform fails creating a resource, my understanding is I can use the "import" command to add it to my state file once it's done, so I should be able to pick up my deployment where I left off. This does not loving work, it tells me that the resource has to be destroyed and recreated, even though it is the exact resource specified in my Terraform configuration.

So, in short: I like the syntax better but it's totally unusable in my specific scenario (usually fails, about 30x slower than ARM), and would complicate deployments by making me persist and share the state file.

[edit] Or we could use a Terraform Enterprise server to persist the state. Because who doesn't love adding additional crap to their toolchain to solve an already-solved problem, but in a slightly different way?

The state file is like the entire point of the software!! Terraform bills itself as infrastructure provisioning software, which it does, but so do lots of other solutions. Its killer feature is keeping track of provisioned resources so it can determine the operations required to go from the current state to the new desired state. And it does this for a ton of �providers�, which can be things like cloud resources or a deployed helm application or TLS resources. It�s great for creating and keeping track of long-lived, relatively static things in a repeatable manner and falls down when you need custom or complicated transitions between states.

If you don�t care about keeping track of the state/understand why you need to then use another tool. And you should drop the state in whatever azure�s equivalent of s3 is.

# ? Jan 9, 2019 18:17

New Yorp New Yorp: Jul 18, 2003; Only in Kenya.; Pillbug

Ploft-shell crab posted:

Its killer feature is keeping track of provisioned resources so it can determine the operations required to go from the current state to the new desired state.

Why does it need an external tracking mechanism? All I care about is the current state (which it can look at by querying the platform) and the desired state (which is defined in my configuration).

# ? Jan 9, 2019 18:29

Vulture Culture: Jul 14, 2003; I was never enjoying it. I only eat it for the nutrients.

New Yorp New Yorp posted:

Why does it need an external tracking mechanism? All I care about is the current state (which it can look at by querying the platform) and the desired state (which is defined in my configuration).

Because it's a multi-cloud solution, and not all clouds can identify a unique resource based on any combination of known properties. On some, like Azure, it's really easy because things have distinct and typically manually-defined resource names. On others, like AWS, the only identifier that uniquely identifies a resource you've created is an ARN or some constituent ID, which is automatically generated for almost all types of resources on the platform.

Sometimes you also need Terraform to work in reverse�if you remove a resource definition from your configuration, you need your infrastructure provisioner to be smart enough to delete the corresponding resources. You can't do that without tracking state.

Vulture Culture fucked around with this message at 18:37 on Jan 9, 2019

# ? Jan 9, 2019 18:35

my homie dhall: Dec 9, 2010; honey, oh please, it's just a machine

New Yorp New Yorp posted:

Why does it need an external tracking mechanism? All I care about is the current state (which it can look at by querying the platform) and the desired state (which is defined in my configuration).

If you store the state in azure�s s3 it is querying the platform :grin:

To respond to what you mean though, �querying the platform� is not a good way to figure out what it�s responsible for. What is used to uniquely identify a resource varies from resource to resource. Resource names are not guaranteed to be unique for every resource and tags are not available on all types of resources Terraform wants to provision.

And, as mentioned, there are many providers other than the cloud ones which may have no means for being queried.

e: Also, resource deletion.

# ? Jan 9, 2019 18:44

New Yorp New Yorp: Jul 18, 2003; Only in Kenya.; Pillbug

Vulture Culture posted:

Because it's a multi-cloud solution, and not all clouds can identify a unique resource based on any combination of known properties.

Okay, that's fair. My other complaints still stand. Unfortunately, I just don't have time to learn Go well enough to fix the broken things I've encountered.

# ? Jan 9, 2019 18:51

The Fool: Oct 16, 2003

In addition, terraform maintaining it's own state allows you to have terraformed resources along side other resources and not worry too much about terraform loving poo poo up.

# ? Jan 9, 2019 18:55

Erwin: Feb 17, 2006

New Yorp New Yorp posted:

Okay, that's fair. My other complaints still stand. Unfortunately, I just don't have time to learn Go well enough to fix the broken things I've encountered.

Terraform taking an hour to create a resource is probably the Azure provider's fault. If you're sure you're not doing something wrong in your configuration, then go look at the provider's repo for issues related to whatever you're seeing. The Azure provider is what defines how Terraform interacts with the Azure API to kick off the resource creation and to know when it's finished. If it was working correctly, it would take the same amount of time as your ARM template. There's nothing about Terraform that would make it take longer for Azure to do things. Solve that issue and your other points are moot.

Terraform sucks in a lot of ways, but not in any of the ways you think it does. It's the best tool for what it does, and it's one of those things that you grow to hate because it's indispensable.

# ? Jan 9, 2019 19:12

The Fool: Oct 16, 2003

Erwin posted:

Terraform taking an hour to create a resource is probably the Azure provider's fault. If you're sure you're not doing something wrong in your configuration, then go look at the provider's repo for issues related to whatever you're seeing. The Azure provider is what defines how Terraform interacts with the Azure API to kick off the resource creation and to know when it's finished. If it was working correctly, it would take the same amount of time as your ARM template. There's nothing about Terraform that would make it take longer for Azure to do things. Solve that issue and your other points are moot.

Terraform sucks in a lot of ways, but not in any of the ways you think it does. It's the best tool for what it does, and it's one of those things that you grow to hate because it's indispensable.

The azure provider is heavily contributed to by Microsoft, there is really no reason for it to have this kind of issue.

# ? Jan 9, 2019 19:23

Vulture Culture: Jul 14, 2003; I was never enjoying it. I only eat it for the nutrients.

The Fool posted:

The azure provider is heavily contributed to by Microsoft, there is really no reason for it to have this kind of issue.

I've been there and this is an issue with the Azure API for Golang itself. Philosophically, it's somewhat part of how Azure is designed. Other clouds' APIs return when the request to provision a resource has been accepted. Azure's returns when the resource is actually provisioned and passes a health check. (Ask me about trying to start two thousand VM instances concurrently, in light of the previous statement.)

The Fool posted:

In addition, terraform maintaining it's own state allows you to have terraformed resources along side other resources and not worry too much about terraform loving poo poo up.

This is a great point. Terraform expects one single project to own a resource. It won't skip creating a resource just because someone else in your org happened to make one with the same name (that doesn't actually do any of the things you expect it to).

The downside is that importing is a footgun, because if you import a resource that someone else created as part of another project, you can end up deleting it with your own terraform destroy.

Vulture Culture fucked around with this message at 21:35 on Jan 9, 2019

# ? Jan 9, 2019 21:29

Vanadium: Jan 8, 2005

You shouldn't import it, you should have the other project output the relevant attributes of the resource and include its state file via the terraform remote state data source. :colbert:

# ? Jan 10, 2019 03:28

kitten emergency: Jan 13, 2008; get meow this wack-ass crystal prison

last time i messed with them i found all of the azure management SDK and APIs to be absolutely terrible and generally slower than using the UI or powershell for some reason.

# ? Jan 10, 2019 03:41

Lily Catts: Oct 17, 2012; Show me the way to you
(Heavy Metal)

Are AWS CodeCommit, CodePipeline, and CodeDeploy services worth using? We have a web app we want to deploy to AWS but we already have Jenkins.

# ? Jan 11, 2019 10:34

New Yorp New Yorp: Jul 18, 2003; Only in Kenya.; Pillbug

I swear I'm not an idiot, but Terraform's state tracking behavior makes no sense.

I have an Azure resource group with a bunch of stuff in it. I want to recreate it all, so I "taint" my resource group and apply the change. It deletes and recreates the resource group. Only the resource group. None of the resources that were in the resource group. That's not what I was expecting, but fine. Is there a command for "just recreate everything" that I should be using instead?

Okay, I run apply again. Now I get a 404 error because it expects the resources to be there. I thought the point of the state file is to track drift. There's been a big drift, my poo poo is gone and I want it back. If I delete stuff via the portal, why do I have to manually update my state? Shouldn't it detect that something was being managed by Terraform, and it's drifted out of existence and recreate it? Hell, I have to manually update my state using TERRAFORM to manage the resources. I don't get it.

Is this just because the Azure provider sucks? I can accept that.

New Yorp New Yorp fucked around with this message at 16:07 on Jan 11, 2019

# ? Jan 11, 2019 16:05

Erwin: Feb 17, 2006

New Yorp New Yorp posted:

Is this just because the Azure provider sucks? I can accept that.

Yes. And more importantly, the Azure API.

I've never had to deal with the Azure API, but it sounds like the Azure API gives a 404 if provided a non-existent resource ID, instead of a more informative message about no resource existing with that ID. Therefore the Azure provider needs to guess at the meaning of a 404 and whether it indicates a missing resource or an actual problem. This is just conjecture, but there are quite a few issues around various 404 errors on the provider's github repo. It sounds like they have to provide 404 interpretation logic for each resource type.

Also this issue might be related to what you're seeing with resource groups: https://github.com/terraform-providers/terraform-provider-azurerm/issues/2629 It seems Azure identifies things by names? Yikes. So the creator of that issue is creating a resource group with a name built from some variables. Then he creates another resource to add to that group, only he provides the resource_group_name as a string built by the same variables instead of actually referencing the created resource group data source. So maybe the resources in your resource group weren't assigned to the resource group by referencing the resource group (terraform) resource.name but just the same string value?

It really just sounds like Azure is an all-around shitshow so hate Azure. Is Azure the only thing you're targeting with Terraform? If so why use Terraform?

# ? Jan 11, 2019 16:43

New Yorp New Yorp: Jul 18, 2003; Only in Kenya.; Pillbug

Erwin posted:

Is Azure the only thing you're targeting with Terraform? If so why use Terraform?

Yes. Long story short, client had a management shakeup and now there's a ton of people jockeying for power by seeing who can introduce the most buzzwords. One of the buzzwords that has landed on my plate is "terraform". I've been saying exactly what you just said all along, but I figured I'd put my money where my mouth is and do a technical evaluation, while also getting to pick up some new skills on someone else's dime. So I spent some time this week reimplementing one of the standard ARM templates in Terraform. I've been less than thrilled with the results.

# ? Jan 11, 2019 16:51

ThePeavstenator: Dec 18, 2012; Establish the Buns

New Yorp New Yorp posted:

I swear I'm not an idiot, but Terraform's state tracking behavior makes no sense.

I have an Azure resource group with a bunch of stuff in it. I want to recreate it all, so I "taint" my resource group and apply the change. It deletes and recreates the resource group. Only the resource group. None of the resources that were in the resource group. That's not what I was expecting, but fine. Is there a command for "just recreate everything" that I should be using instead?

it sounds like you're looking for workspaces

ThePeavstenator fucked around with this message at 02:01 on Jan 12, 2019

# ? Jan 12, 2019 01:58

StabbinHobo: Oct 18, 2002; by Jeffrey of YOSPOS

Erwin posted:

Yes. And more importantly, the Azure API.

I've never had to deal with the Azure API, but it sounds like the Azure API gives a 404 if provided a non-existent resource ID, instead of a more informative message about no resource existing with that ID. Therefore the Azure provider needs to guess at the meaning of a 404 and whether it indicates a missing resource or an actual problem. This is just conjecture, but there are quite a few issues around various 404 errors on the provider's github repo. It sounds like they have to provide 404 interpretation logic for each resource type.

what? it means Not Found, its the perfect http status code for that scenario.

loving web developers need everything in javascript

edit: <grumpy old man points to rfc> https://tools.ietf.org/html/rfc7231#section-6.5.4

# ? Jan 12, 2019 02:07

New Yorp New Yorp: Jul 18, 2003; Only in Kenya.; Pillbug

ThePeavstenator posted:

it sounds like you're looking for workspaces

I don't see how. I was just trying to tear everything down so I could test recreating it from scratch.

# ? Jan 12, 2019 02:35

ThePeavstenator: Dec 18, 2012; Establish the Buns

New Yorp New Yorp posted:

I don't see how. I was just trying to tear everything down so I could test recreating it from scratch.

terraform destroy then?

Unless this is all existing infrastructure created outside of terraform. It doesn't magically track services created outside of it without importing it to your state first. It also doesn't update your scripts when you import things into your state, you still have to define your infrastructure in code even if it already exists.

# ? Jan 12, 2019 02:58

Docjowles: Apr 9, 2009

New Yorp New Yorp posted:

I don't see how. I was just trying to tear everything down so I could test recreating it from scratch.

I think what you want is terraform destroy, followed by a fresh apply. Taint is for targeting individual resources, as you found, and is kind of a niche command. Destroy nukes everything TF knows about from orbit in one go.

# ? Jan 12, 2019 02:59

necrobobsledder: Mar 21, 2005; Lay down your soul to the gods rock 'n roll; Nap Ghost

Give https://github.com/newcontext-oss/kitchen-terraform a try as well if you can stomach Ruby

# ? Jan 12, 2019 05:18

Harik: Sep 9, 2001; From the hard streets of Moscow
First dog to touch the stars; Plaster Town Cop

I've been working my way up the automation chain, got the CI half setup a while back with full sets of unit tests and all the reporting.

Now I'm working on the deployment. This is about two days of evaluation here, and the product isn't deployed so I've got freedom to experiment a bit since it's just internal testers getting booted.

gitlab->docker image->docker registry->docker stack deploy

Two painpoints:
I want the git version information in the container but docker build can't run run host commands. I don't want to copy-in the entire git history and add git as a dependency inside the container just to run git-version, so for now I have a dumb shell script that just creates a file then runs docker-build which copies it in with the rest.

Is there a better way to get that information into the image?

E: Of course there is, gitlab provides all of that as environment variables. I can set those in a shell-script for local testing builds, but inside CI it will just work.

The second part is the 2-5 second downtime during an upgrade.

code:

services:
  redis:
    image: redis:latest
  web:
    build: .
    image: company.registry/image:production
    command: goes here
    tty: true # this is a bug on my end with logging buffering forever
    ports:
      - "8099:8000"
    depends_on:
      - redis
    links:
      - redis
    secrets:
      - json
    volumes:
      - ./config.json:/srv/config.json:ro
    deploy:
      replicas: 2
      update_config:
        parallelism: 2
        delay: 60s
        order: start-first

Problem #1:
default order: stop-first. the gently caress, docker, why the hell would you ever kill the old service before even downloading the new one. I understand "shut down the old first", not "I don't even know if this image is available but sure, what the gently caress time to die.". I've got the registry server deliberately throttled hard to see what happens with resource contention so this would lead from 1-10 minutes of downtime depending on how many layers needed updating.

order: start-first still leads to downtime because docker switches the internal routing to the new container the microsecond it launches, without waiting for the service to actually start answering on the port. It's lightweight but there's no reason to have any downtime.

Do I need service discovery and external HA, or is it possible to do zero-downtime rollover with docker swarm?

Things that worked really well:
mounting a config file as part of the service launch. Same image from dev to production but pointed at different ACLs and database endpoints.
Stripping all the passwords out of those config files and putting them in docker secrets
Deployment in general. All the services were easy to create, starting with the swarm manager, adding a test worker to it, putting the registry in it, pushing the build and running docker stack deploy.
management of scaling.

Deployment is on a rack and a half of servers we own, not anyone's cloud. I had thought k8s was for bigger deployments only, but from reading the thread maybe not? I'm not committed to docker swarm at this point, but I do want to make sure this is as automated as possible.

Harik fucked around with this message at 07:33 on Jan 16, 2019

# ? Jan 15, 2019 07:14

Adbot: ADBOT LOVES YOU

# ? Jun 5, 2024 03:20

Spring Heeled Jack: Feb 25, 2007; If you can read this you can read

I�m using swarm as well and I think with replicas:2 and parallelism:2 it is killing both replicas at the same time. Try removing parallelism? When I do stack updates it shows in the console that it is only updating one replica at a time. Obviously this wouldn�t be an issue if you had like 8 replicas.

We have an upstream load balancer (F5) to direct traffic into our api manager running in the swarm, which in turns directs traffic to the apis. We�re still figuring some things out but it seems to works well enough for our needs and is pretty straightforward. This was my main reason for going with Swarm at this time, it was very quick to get up and running on-prem (and we're running Windows containers). Now that K8s support is supposedly better with 2019 I'm going to do more time investigating what is involved with running it.

Spring Heeled Jack fucked around with this message at 13:46 on Jan 16, 2019

# ? Jan 16, 2019 11:52

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Continuous Integration/build engineering/devops thread

«‹›158 »