StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

Vanadium posted:

There shouldn't be a human. My understanding based on doing something analogous with terraform and ECS and reading through a coworker's k8s setup is: Your automation/build script should not only generate a docker image and tag it with something like the git commit ID, it should at the same time also generate the updated yaml files (maybe from a template in the repo) referencing the new image tagged with a unique ID, and then shove that file into k8s or whatever.
I'm interpreting that as "add a step to the end of your cloudbuild.yaml" but what does that step do?
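
In practice, that "generate the yaml and shove it into k8s" step can be a few lines of shell at the end of the build; a hedged sketch, where the template file and placeholder names are made up:
code:
# hypothetical final build step: render the manifest with the image that
# was just pushed, then hand it to the cluster
IMAGE="gcr.io/$PROJECT_ID/foo:$SHORT_SHA"
sed "s|__IMAGE__|$IMAGE|" k8s/deployment.tmpl.yaml > /tmp/deployment.yaml
kubectl apply -f /tmp/deployment.yaml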

Methanar posted:

What I mean more is: when your build job completes, your code has had its dependencies pulled down, modules compiled, docker images built and pushed to a registry, tests run. Your task runner completes the build step and then executes another job for the deploy step. That next step might just be a helm upgrade.
ok, second vote for "add a step to the end of your cloudbuild.yaml"
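
For the helm flavor of that deploy step, it would roughly boil down to something like this (chart path, release name, and value keys are assumptions, not anything from the repo):
code:
# hypothetical deploy step after build/push: roll the release forward to the
# image tag that was just built
helm upgrade --install foo-bar ./chart \
  --set image.repository=gcr.io/$PROJECT_ID/foo \
  --set image.tag=$SHORT_SHA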

Scikar posted:

you can have multiple tags per image, updated on different cycles. So when you push foo_bar:ABCD1234 it can also update foo_bar:latest in the registry to point to the same image digest. That would give you a reference point for your deployments, so when you press deploy in your webapp it looks up foo_bar:latest, reads the digest, and creates a new deployment with that digest directly, kubernetes doesn't have to know or care what the tag is. Or if you need to upgrade existing deployments, get and patch them with the updated digest so kubernetes can do a rolling upgrade (I think, I'm not quite at that point yet myself).
so I should use ":latest" but not actually in the deployment, just as a thing to query the registry on? how do you query GCR like that?
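
For what it's worth, GCR can be asked what :latest currently resolves to; both of these are hedged sketches of that query rather than anything from the thread:
code:
# resolve the digest that foo:latest currently points at
gcloud container images describe gcr.io/$PROJECT_ID/foo:latest \
  --format='value(image_summary.digest)'

# or, via list-tags: filter to the "latest" tag and print its digest
gcloud container images list-tags gcr.io/$PROJECT_ID/foo \
  --filter='tags:latest' --format='get(digest)'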

quote:

Lastly, if you do have a hook I would put it on your registry rather than the CI build. The build process just has to get the new image up to the registry and it's done. The registry itself (I assume, we use the Azure one which can at least) can then notify your webapp that a new version was pushed and trigger that workflow, but if your webapp isn't running your build still succeeds (and presumably your webapp can just run the workflow when it does get started again).
ok so that's a vote for "don't add a step to the end of your cloudbuild.yaml, figure out how to have GCR trigger its own separate thing"
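
If the registry-hook route wins, GCR's "separate thing" is usually Pub/Sub: pushes get published to a per-project topic named gcr, and whatever should react (the webapp, a Cloud Function) subscribes to it. A hedged sketch of the plumbing, with a made-up subscription name:
code:
# GCR publishes push notifications to the "gcr" topic (create it if missing)
gcloud pubsub topics create gcr
gcloud pubsub subscriptions create gcr-pushes --topic=gcr
# whatever drives deploys consumes the messages, e.g. to peek at one:
gcloud pubsub subscriptions pull gcr-pushes --auto-ack --limit=1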

Vulture Culture posted:

I'm having a little bit of trouble understanding what "my tool/webapp" actually does, or why it's important to your workflow that this application does it. Could you elaborate? I want to make sure I'm providing good recommendations and not inundating you with general K8s guidance that for some reason doesn't fit your specific business requirements at all.
basically think of a blank webpage with a single submit button in the middle, you click that button and you get back a hostname. you click that hostname and your browser loads your "instance" (deployment) of the image. the most rudimentary self-service "provision me a copy of the x app" tool you can fathom.

quote:

An image isn't a deployment construct, it's an image, in the same way that a VMware template isn't a cluster of VMs, it's a template. If your deployment approach conflates the two, you're going to have a bad time. You don't need to do something crazy like put a Spinnaker CD system into production, or even a "package manager" like Helm, but you should leverage things like Kubernetes deployments where they make sense.
yea maybe I phrased that wrong somewhere because these are single-container-pods that all run the same image, but yes I'm using the Deployment construct for this.

quote:

The problem you're trying to deal with—upgrading a deployed K8s application to a new image tag
no, i don't care (yet) about updating any of the previously deployed ones, I just want to know that the *next* one (when someone mashes that button) deploys the latest image

quote:

One thing you might not have considered is that if you use tags for this approach, when an application is automatically respawned—say, because you have a host fall over and your pod gets automatically rescheduled onto another host—that host will start the deployment process by pulling down the latest version of the image, giving you an upgrade you weren't prepared for and might not want right now. In the case of something like Jenkins (an intentionally convoluted, triggering example of an app), this can be painful because it will grab an image that might be completely incompatible with the configuration or the on-disk volume data that you've configured your app to use. In almost all cases, it's better to be explicit about the version you want within Kubernetes, and use something else to drive the policy of ensuring your applications are at some specific version or other.
yea so far all our changes are backward-compatible additions, so even though I'm aware of this possible issue it's not an issue for us (yet). also the "use something else" thing is basically the question here :)
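
For reference, "being explicit about the version within Kubernetes" mostly just means the Deployment points at an immutable digest (or a unique tag) instead of a floating one; a hedged one-liner where the deployment name, container name, and digest are all placeholders:
code:
# pin the pod template to a digest rather than a mutable tag
kubectl set image deployment/foo-bar \
  foo=gcr.io/$PROJECT_ID/foo@sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef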


StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
here i tried to rephrase and really distill it:
------------------------------------------------------------------------------------------------------------------------

git repo "foo-bar" has two files, a Dockerfile and a cloudbuild.yaml
the first line of the dockerfile is "FROM foo:bar"; after that we're just dropping a glorified readme file in a docroot
the cloudbuild.yaml was pasted earlier; its last line pushes "foo:bar" into GCR

git repo "push-butan" is a web interface for self service provisioning of the foo:bar app
inside of its source code is the equivalent of:
- randomly generate a name
- template that name into the api equivalent of a 'kubectl apply -f foobar-deployment.yaml'
- foobar-deployment.yaml sets the image to 'gcr.io/project/foo:bar'
- user is presented with a link to random_name.example.org (via wildcard dns and nginx-ingress fwiw)
- push-butan is already deployed and running

we decide it would be nice if the readme had a "thank jeebus" section
- add a "RUN echo '<marquee>thank you jeebus for this big rear end check</marquee>' >> /some/dir/index.html" to the dockerfile
- git commit to the "foo-bar" repo
- that triggers cloudbuild
- that pushes a new image to the GCR repo with the "foo:bar" tag
- pull up the push-butan web interface
- mash the button
- click the link
- no sign of the jeebus update

without involving any tools beyond what we already have (gke/k8s, gcr, cloudbuild, github, build triggers), is there a "right" or even just "clean and simple" way of solving this?

StabbinHobo fucked around with this message at 18:10 on Nov 8, 2018

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
for anyone that might care, here's what I wound up with:

code:
# grab the most recently pushed tag for the foo image in GCR
LATEST_TAG=$(gcloud container images list-tags "gcr.io/$GOOGLE_CLOUD_PROJECT/foo" --sort-by=~timestamp --limit=1 --format='value(TAGS)')
# hand that tag to push-butan as an env var so the next provision uses it
kubectl set env deployment/push-butan foo=$LATEST_TAG
edit: oh and cloudbuild.yaml went
code:
-   args: ['build', '-t', 'gcr.io/$PROJECT_ID/foo:bar', '.']
+   args: ['build', '-t', 'gcr.io/$PROJECT_ID/foo:$SHORT_SHA', '.']
- images: ['gcr.io/$PROJECT_ID/foo:bar']
+ images: ['gcr.io/$PROJECT_ID/foo:$SHORT_SHA']

StabbinHobo fucked around with this message at 20:36 on Nov 13, 2018

mr_package
Jun 13, 2000
Linking on my old Xeons is taking forever. What can I buy that has better single-threaded performance? I need something like an i9-9900K but in a dual-CPU rackmount box. I'm not sure such a thing exists?

I am tempted to replace build VMs with desktop components at this point. Is there something I'm missing? Is there a better way to address the problem of Windows C++ link time? (Other than pushing back on the engineering side to fix it at the project/code level.)

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed
lld now works well enough on Windows for Chrome to use it, and it's dramatically faster than Microsoft's linker, so it may be worth looking into.
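
One hedged way to try it, assuming a clang-cl + CMake/Ninja setup (there are other ways to wire lld-link into an MSVC build):
code:
# use LLVM's link.exe-compatible lld-link instead of Microsoft's link.exe
cmake -G Ninja \
  -DCMAKE_C_COMPILER=clang-cl -DCMAKE_CXX_COMPILER=clang-cl \
  -DCMAKE_LINKER=lld-link \
  -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
cmake --build .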

Cancelbot
Nov 22, 2006

Canceling spam since 1928

Has anyone tackled service discovery in a multi-VPC/account environment in AWS? Consul is great but is very much focused around servers running agents that all talk to each other. As such this doesn't work when we also have lambda services and a mix of public and private DNS zones.

Ideally the solution is to be consul-like in that a service will self announce and that replicates in some way across all accounts and computing resources. Which suggests some sort of automation around Route53?
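
If it does end up being Route53 automation, the self-announce half can be as small as each service (or something acting on its behalf) doing an UPSERT into a shared private hosted zone at startup; a hedged sketch with a made-up zone ID, name, and IP:
code:
# register/refresh this service's record in the shared discovery zone
aws route53 change-resource-record-sets \
  --hosted-zone-id Z123EXAMPLE \
  --change-batch '{
    "Changes": [{
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "billing.internal.example.com",
        "Type": "A",
        "TTL": 30,
        "ResourceRecords": [{"Value": "10.0.12.34"}]
      }
    }]
  }'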

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Cancelbot posted:

Has anyone tackled service discovery in a multi-VPC/account environment in AWS? Consul is great but is very much focused around servers running agents that all talk to each other. As such this doesn't work when we also have lambda services and a mix of public and private DNS zones.

Ideally the solution is to be consul-like in that a service will self announce and that replicates in some way across all accounts and computing resources. Which suggests some sort of automation around Route53?
Check out Consul Connect in version 1.4, which was just released yesterday
https://www.hashicorp.com/blog/consul-1-4-multi-data-center-service-mesh

New Yorp New Yorp
Jul 18, 2003

Only in Kenya.
Pillbug
Under what circumstances is it "good" or "correct" to build a different version of your container for different environments? I just saw a company building a container-per-environment, instead of building one container and having a testing/delivery pipeline for that image. It seemed like it was defeating the purpose of containers, but I'm not enough of an authority in the area to say anything.

minato
Jun 7, 2004

cutty cain't hang, say 7-up.
Taco Defender
It's debatable. There's a tension between (say) security, who want the container to have the least stuff inside it to reduce the attack surface, and devs (or SREs), who want some basic tooling in the container so they can debug stuff. We solved the issue by forcing devs to bind-mount their debug tools into the dev containers, but the SREs were SOL because (at least at the time) there was no easy way to bind-mount debug tools into a running container.

By "environment" I assume you're talking about dev->test->stage->prod. It does make sense to have multiple build flavors for dev (add debug tools) and test (builds with full debug info, or built with address/thread sanitizer guard code). But there should also be a vanilla flavor that doesn't change from test->stage->prod. Otherwise, the reasoning goes, you're not testing exactly what you're deploying.

I've seen situations where the build was re-done just for stg->prod, because (for security reasons) the stg/prod build environments were more tightly locked down and free of external dependencies (like network access to GitHub). That also made them slower, so it was inefficient to put them at the front of the pipeline. Since the only significant differences were factors unlikely to affect the build itself, the rebuild was deemed low-risk.
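
One common way to keep that vanilla flavor honest is to hold the image constant and push the per-environment differences into config applied at deploy time; a hedged kubectl sketch with invented names:
code:
# same image in every environment; only the ConfigMap differs
kubectl create configmap myapp-config \
  --from-literal=API_URL=https://api.stage.example.com \
  --dry-run=client -o yaml | kubectl apply -f -
kubectl set env deployment/myapp --from=configmap/myapp-config
kubectl set image deployment/myapp myapp=registry.example.com/myapp:1.4.2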

New Yorp New Yorp
Jul 18, 2003

Only in Kenya.
Pillbug

minato posted:

By "environment" I assume you're talking about dev->test->stage->prod. ... But there should also be a vanilla flavor that doesn't change from test->stage->prod. Otherwise, the reasoning goes, you're not testing exactly what you're deploying.


Yes, that's what I meant. And thank you, that's exactly what I thought.

Hadlock
Nov 9, 2004

Our CTO (we're a small company, he's a co-founder) had us add a feature where there's a set of folders inside the Java WAR artifact; based on an environment parameter, it pulls file(s) out of a subfolder in the WAR, and we use those files to amend the config file(s) or run SQL against the DB in ephemeral environments. This is working out...ok so far. This is mostly just for third party integrations.

I'm a big proponent of the same container everywhere: what gets deployed to dev and QA is ultimately the same artifact that is deployed to production, which ensures there's little chance for unexpected things to go wrong.

If a special tool is needed in a container, it's installed at debug time from a container exec command line. This doesn't happen every day though, especially once the container pattern stabilizes.

This all goes out the window for statically compiled binary apps though.

CTO wants feature flags for our monolith to be controlled 100% by our deployment app, but we're trying to talk him away from the ledge on that one and make feature flags handled by developers/product in the database instead. Making feature flags an ops problem seems like a terrible idea.

Hadlock fucked around with this message at 08:42 on Nov 24, 2018

Nomnom Cookie
Aug 30, 2009



Hadlock posted:

Our CTO (we're a small company, he's a co-founder) had us add a feature where there's a set of folders inside the Java WAR artifact; based on an environment parameter, it pulls file(s) out of a subfolder in the WAR, and we use those files to amend the config file(s) or run SQL against the DB in ephemeral environments. This is working out...ok so far. This is mostly just for third party integrations.

I'm a big proponent of the same container everywhere: what gets deployed to dev and QA is ultimately the same artifact that is deployed to production, which ensures there's little chance for unexpected things to go wrong.

If a special tool is needed in a container, it's installed at debug time from a container exec command line. This doesn't happen every day though, especially once the container pattern stabilizes.

This all goes out the window for statically compiled binary apps though.

CTO wants feature flags for our monolith to be controlled 100% by our deployment app, but we're trying to talk him away from the ledge on that one and make feature flags handled by developers/product in the database instead. Making feature flags an ops problem seems like a terrible idea.

Your CTO thinks it’s a good idea to shove all the configs in the artifact and then dynamically mangle the world? That’s horrifying. The whole point of trying to build the artifact once is that it shouldn’t care where it’s being deployed, not that it should embody all possible configurations.

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine
Let’s say you want to have a version endpoint or string somewhere in your app. Isn’t that something that’s going to be static in your artifacts? If so, how do you keep the pattern of using the same artifact in QA/Staging/Prod if you’ll need to modify this as it gets promoted?

necrobobsledder
Mar 21, 2005
Lay down your soul to the gods rock 'n roll
Nap Ghost
Feature flags should be controlled by whoever is willing to take the Pagerduty when it breaks during the break-in period. If I am not the appropriate person to resolve it, why should I get paged first? Sure, you should have automated rollbacks of some type but not all systems are capable of doing that either.

Configuring applications around promoted artifacts has a lot in common with the motivation for dependency injection - you invert the process and the risk from "here's my configuration, is it ok?" to "what's my configuration? I presume it's good." By making the configuration steps dynamic, you add more room for error and for exception logic / hacks. Each and every difference between environments is another variable that can cause a problem, and enshrining it in code is a great way to make the differences permanent.

We nearly tanked a major release because our configuration behaved differently inside AWS and outside AWS over two whitespace characters (one level of indentation too many in YAML). None of the developers caught it in code review of the 1000+ line environment-specific YAML config, so presuming "because it's in code, it will be reviewed and problems caught early" is completely erroneous. It's no different from "because we have infrastructure as code, we'll reliably deploy releases now!" Our release de-risking was done through canaries, but none of our canaries had the exact configuration combo that caused the problem, and we log so many errors that we can't distinguish normal operational behavior from unexpected behavior.

For a real-world case: shoving configuration into build artifacts and leaning on dynamic configuration behavior (to be convenient for developers) is a major reason we had a critical production incident at my last job. If there was no database connection made within 30 seconds, the code presumed you were a developer and granted root view ACLs for the session. Nobody in management knew such a mode even existed until they were on a bridge and a developer said "oh yeah, that's the mode we use all the time for local work, X put it in 4 years ago." The resulting contract breach and lawsuit has sunk the company (on top of the other bad things going on, granted).

Lastly, baking configuration into artifacts makes rotating secrets and strings harder and, at best, slower. If you pull secrets from variables or an externally managed service like Vault, you rotate from a central point of management. The instant you sometimes pull from the artifact and sometimes from another source, you start second-guessing which value is live, and you open up more potential for configuration errors and outdated, environment-dependent behavior. And if all configuration is baked into the artifact, the build time for that artifact becomes the bottleneck for updating settings.

So I implore whoever this CTO is to take these stories of trying to do anything dynamic with configuration as warnings, and to read the 12-factor manifesto. Until you can give really, really good reasons for exceptions to those rules, it's a good set of guidelines to follow, even for stateful applications, just from a process / policy standpoint (by and large it will be better understood and adhered to by newcomers than whatever balkanized hellscape of documentation and policy exists at your company). I have had a lot more headaches from not following those guidelines than from trying to force applications to follow those patterns. The exceptions are legacy applications that I was forced to lift and shift.
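
On the secrets-rotation point specifically, the central-rotation workflow described above looks roughly like this with Vault's KV commands (the paths and field names are made up):
code:
# rotate once, centrally; apps read the current value at startup instead of
# baking it into the artifact
vault kv put secret/myapp/db password="$(openssl rand -base64 24)"
DB_PASSWORD="$(vault kv get -field=password secret/myapp/db)"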

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

necrobobsledder posted:

If there was no database connection made within 30 seconds, the code presumed you were a developer and granted root view ACLs for the session. Nobody in management knew such a mode even existed until they were on a bridge and a developer said "oh yeah, that's the mode we use all the time for local work, X put it in 4 years ago." The resulting contract breach and lawsuit has sunk the company (on top of the other bad things going on, granted).


what the gently caress

Doom Mathematic
Sep 2, 2008

necrobobsledder posted:

If there was no database connection made within 30 seconds, the code presumed you were a developer and granted root view ACLs for the session. Nobody in management knew such a mode even existed until they were on a bridge and a developer said "oh yeah, that's the mode we use all the time for local work, X put it in 4 years ago." The resulting contract breach and lawsuit has sunk the company (on top of the other bad things going on, granted).

I'm sorry, what?

necrobobsledder
Mar 21, 2005
Lay down your soul to the gods rock 'n roll
Nap Ghost
I'll leave it that the incident was very public and the company had a history of similar blunders with root causes all pointing to simply bad / ineffectively tested code.

Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug
Anyone else going to re:invent? Can't wait to look at all the stuff I can't / won't use :confuoot:

freeasinbeer
Mar 26, 2015

by Fluffdaddy
Nah I am looking forward to kubecon where I am just going to be pitched so much random poo poo by vendors.

Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug
Ah yeah, that one's next month. Our group split, half are going to reinvent and half are going there.

Docjowles
Apr 9, 2009

Yeah I am going to reinvent. Went last year as well. Looking forward to a day or two of ambitiously waiting in huge rear end lines before getting all :effort: and switching to day drinking with a pinky swear to watch the sessions later online.

Nomnom Cookie
Aug 30, 2009



Ploft-shell crab posted:

Let’s say you want to have a version endpoint or string somewhere in your app. Isn’t that something that’s going to be static in your artifacts? If so, how do you keep the pattern of using the same artifact in QA/Staging/Prod if you’ll need to modify this as it gets promoted?

If it changes based on deployment environment then it's not a version.

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine

Kevin Mitnick P.E. posted:

If it changes based on deployment environment then it's not a version.

It doesn’t change based on deployment environment, but when it’s in QA it’s still just a release candidate for us, not a real release yet. We promote it to being a real release once it goes through a bunch of integration tests with other systems. (having a “QA process” instead of real CICD is non-ideal, but it’s unfortunately not my decision to make)

Maybe our endpoint should just return a build number?

Nomnom Cookie
Aug 30, 2009



Ploft-shell crab posted:

It doesn’t change based on deployment environment, but when it’s in QA it’s still just a release candidate for us, not a real release yet. We promote it to being a real release once it goes through a bunch of integration tests with other systems. (having a “QA process” instead of real CICD is non-ideal, but it’s unfortunately not my decision to make)

Maybe our endpoint should just return a build number?

Build a thing, do a bunch of QA on it, then build a different (but hopefully close enough) thing and release it is not a process I would be comfortable with.

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine

Kevin Mitnick P.E. posted:

Build a thing, do a bunch of QA on it, then build a different (but hopefully close enough) thing and release it is not a process I would be comfortable with.

Right, that’s something I don’t think I want to do. Here’s what I think are my options, but I’m wondering if there’s maybe another
- rebuilding/repackaging artifact at promote time, after QA (bad for the reasons stated)
- not having release candidates and only ever sending real releases to QA (would result in a ton of versions for us because we always find stuff in QA)
- not having version information available in the artifact

ThePeavstenator
Dec 18, 2012

:burger::burger::burger::burger::burger:

Establish the Buns

:burger::burger::burger::burger::burger:

Ploft-shell crab posted:

Right, that’s something I don’t think I want to do. Here’s what I think are my options, but I’m wondering if there’s maybe another
- rebuilding/repackaging artifact at promote time, after QA (bad for the reasons stated)
- not having release candidates and only ever sending real releases to QA (would result in a ton of versions for us because we always find stuff in QA)
- not having version information available in the artifact

I'm not sure if I've missed something but why can't you just do the old <Major>.<Minor>.<Release>.<Build> ?
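
A hedged sketch of what stamping that four-part version into the artifact at build time might look like (the build-arg name, registry, and numbers are invented):
code:
# stamp Major.Minor.Release.Build into the image at build time; the version
# endpoint then just echoes the baked-in value
BUILD_VERSION="2.3.1.${BUILD_NUMBER:-0}"
docker build \
  --build-arg APP_VERSION="$BUILD_VERSION" \
  -t registry.example.com/myapp:"$BUILD_VERSION" .
# in the Dockerfile: ARG APP_VERSION, then ENV APP_VERSION=$APP_VERSION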

FlapYoJacks
Feb 12, 2009
Our place just does year.month-revision

my homie dhall
Dec 9, 2010

honey, oh please, it's just a machine

ThePeavstenator posted:

I'm not sure if I've missed something but why can't you just do the old <Major>.<Minor>.<Release>.<Build> ?

actually I think this would totally work and I don’t know why I didn’t consider doing this. we have been sticking “RC” in our version strings for stuff that hasn’t gone through QA, but dropping that would alleviate all the pain. 🤦‍♂️ thanks!

Gyshall
Feb 24, 2009

Had a couple of drinks.
Saw a couple of things.
build once or die trying

Nomnom Cookie
Aug 30, 2009



Gyshall posted:

build once or die trying

Retry failed jobs

none shall ever know

poemdexter
Feb 18, 2005

Hooray Indie Games!

College Slice

Ploft-shell crab posted:

actually I think this would totally work and I don’t know why I didn’t consider doing this. we have been sticking “RC” in our version strings for stuff that hasn’t gone through QA, but dropping that would alleviate all the pain. 🤦‍♂️ thanks!

Semantic versioning saves the day! Just be prepared to deal with people that are real pissy about having a high number in a build version because of "reasons".

toadoftoadhall
Feb 27, 2015
When commissioning a new server, I run through a mental checklist I've cobbled together from linode and digital ocean tutorials for CentOS or Ubuntu machines. Eg:
(1) upload public key
(2) add non-root user
(3) add user to groups
(4) configure firewall
(5) ...
(6) N

Hasn't been an issue, because they've been my approaching-nil traffic personal VPSs. Weaknesses are obvious. What does "doing it right" involve?

Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug

toadoftoadhall posted:

When commissioning a new server, I run through a mental checklist I've cobbled together from linode and digital ocean tutorials for CentOS or Ubuntu machines. Eg:
(1) upload public key
(2) add non-root user
(3) add user to groups
(4) configure firewall
(5) ...
(6) N

Hasn't been an issue, because they've been my approaching-nil traffic personal VPSs. Weaknesses are obvious. What does "doing it right" involve?
Not doing this every time

Less pithy answer: pick your favorite stig/hardening, make it into an image / kickstart / autoinstall. Basically don't do it by hand

toadoftoadhall
Feb 27, 2015

Bhodi posted:

Not doing this every time

Less pithy answer: pick your favorite stig/hardening, make it into an image / kickstart / autoinstall. Basically don't do it by hand

Stig? I may have misunderstood, but you're saying, take a distribution image, do your initial configuration, freeze it into a new image, and deploy that new image going forward?

In my case the virtualisation platform is managed by an independent team with some standard ISOs that I can choose from; so all configuration must be done for each deployment, using something like...fab, or ansible (I'm guessing...I've used the former a bit; it's a python package for scripting ssh. I haven't used ansible).

Gyshall
Feb 24, 2009

Had a couple of drinks.
Saw a couple of things.
Yeah if you can't bake an image, use Ansible imo.
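
A hedged, minimal version of that checklist as an Ansible run against a fresh box (the host, user, and module arguments are just placeholders):
code:
# write a tiny hardening playbook and run it against the new host as root
cat > harden.yml <<'EOF'
- hosts: all
  become: true
  tasks:
    - name: add non-root user in the sudo group
      user: { name: deploy, groups: sudo, append: true }
    - name: install my public key for that user
      authorized_key: { user: deploy, key: "{{ lookup('file', '~/.ssh/id_ed25519.pub') }}" }
    - name: allow ssh through ufw
      ufw: { rule: allow, name: OpenSSH }
    - name: enable the firewall
      ufw: { state: enabled }
EOF
ansible-playbook -i 'my.new.host,' -u root harden.yml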

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
out of curiosity, what even still has you monkey loving vps configs at all? is there a technical limitation of the many PaaS vendors, a pricing issue, or just a cultural/old-ways thing?

Docjowles
Apr 9, 2009

Current re:Invent status: Waiting in an hour long line to even register for the conference. Going to miss my first session. No food or coffee because everyplace that serves those also has an hour long line.

My coworker went to a different venue, was registered and had coffee in like 10 minutes. Currently researching the legality of murder in Nevada.

toadoftoadhall
Feb 27, 2015

StabbinHobo posted:

out of curiosity, what even still has you monkey loving vps configs at all? is there a technical limitation of the many PaaS vendors, a pricing issue, or just a cultural/old-ways thing?

habit. not being plugged into many technical communities (lowendbox being one of the few I am plugged into). also the fact that up to now, they've been servers for not-very-important personal use. an impetus is that now it's for work.

toadoftoadhall fucked around with this message at 18:37 on Nov 26, 2018

Hughlander
May 11, 2005

StabbinHobo posted:

out of curiosity, what even still has you monkey loving vps configs at all? is there a technical limitation of the many PaaS vendors, a pricing issue, or just a cultural/old-ways thing?

My answer. I want docker zfs running with snapshots.
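
For context, that setup is roughly: put Docker's data root on a ZFS dataset, flip the storage driver, and snapshot the dataset; a hedged sketch with a made-up pool/dataset name:
code:
# assumes /var/lib/docker already lives on a zfs dataset
cat > /etc/docker/daemon.json <<'EOF'
{ "storage-driver": "zfs" }
EOF
systemctl restart docker
zfs snapshot tank/docker@before-upgrade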


Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug

Docjowles posted:

Current re:Invent status: Waiting in an hour long line to even register for the conference. Going to miss my first session. No food or coffee because everyplace that serves those also has an hour long line.

My coworker went to a different venue, was registered and had coffee in like 10 minutes. Currently researching the legality of murder in Nevada.
it's pretty nuts. I got in last night at 2am and got up early because I knew the badge line was gonna be long

the shuttle buses are the silliest thing to me; it takes an hour to get ferried across the street because this city is hell on earth, it's like everything is done and designed in the worst way possible, deliberately

it's 4pm and feels like 10pm, i spent all day getting pitched to and what little I learned could have been conveyed in a 10-minute blog post; i want to pull the plug on this entire week

  • Reply