|
Vanadium posted:
There shouldn't be a human. My understanding, based on doing something analogous with Terraform and ECS and reading through a coworker's k8s setup: your automation/build script should not only generate a Docker image and tag it with something like the git commit ID, it should at the same time generate the updated YAML files (maybe from a template in the repo) referencing the new image with its unique tag, and then shove that file into k8s or whatever.

Methanar posted:
What I mean is: when your build job completes, your code has had its dependencies pulled down, modules compiled, Docker images built and pushed to a registry, tests run. Your task runner completes the build step and then executes another job for the deploy step. That next step might just be a helm upgrade.

Scikar posted:
You can have multiple tags per image, updated on different cycles. So when you push foo_bar:ABCD1234 it can also update foo_bar:latest in the registry to point to the same image digest. That gives you a reference point for your deployments: when you press deploy in your webapp it looks up foo_bar:latest, reads the digest, and creates a new deployment with that digest directly; Kubernetes doesn't have to know or care what the tag is. Or if you need to upgrade existing deployments, get and patch them with the updated digest so Kubernetes can do a rolling upgrade (I think; I'm not quite at that point yet myself).

quote:
Lastly, if you do have a hook, I would put it on your registry rather than the CI build. The build process just has to get the new image up to the registry and it's done. The registry itself (I assume; we use the Azure one, which at least can) can then notify your webapp that a new version was pushed and trigger that workflow, but if your webapp isn't running your build still succeeds (and presumably your webapp can just run the workflow when it does get started again).
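A minimal sketch of that build-then-template flow (project and file names are hypothetical; the build and deploy commands are shown commented out):

```shell
#!/bin/sh
set -eu
# Hypothetical names throughout; in CI the SHA would come from git.
GIT_SHA="abc1234"                        # normally: $(git rev-parse --short HEAD)
IMAGE="gcr.io/my-project/foo:${GIT_SHA}"

# docker build -t "$IMAGE" . && docker push "$IMAGE"   # build + push (not run here)

# Render the deployment manifest from a template kept in the repo.
cat > deploy.yaml.tmpl <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
spec:
  template:
    spec:
      containers:
      - name: foo
        image: __IMAGE__
EOF
sed "s|__IMAGE__|${IMAGE}|" deploy.yaml.tmpl > deploy.yaml
# kubectl apply -f deploy.yaml                         # deploy (not run here)
grep 'image:' deploy.yaml
```

Because the rendered manifest changes on every commit, the apply always triggers a rollout; nothing ever re-pushes the same tag.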
Vulture Culture posted:
I'm having a little bit of trouble understanding what "my tool/webapp" actually does, or why it's important to your workflow that this application does it. Could you elaborate? I want to make sure I'm providing good recommendations and not inundating you with general K8s guidance that for some reason doesn't fit your specific business requirements at all.

quote:
An image isn't a deployment construct, it's an image, in the same way that a VMware template isn't a cluster of VMs, it's a template. If your deployment approach conflates the two, you're going to have a bad time. You don't need to do something crazy like put a Spinnaker CD system into production, or even a "package manager" like Helm, but you should leverage things like Kubernetes deployments where they make sense.

quote:
The problem you're trying to deal with—upgrading a deployed K8s application to a new image tag

quote:
One thing you might not have considered: if you use tags for this approach, then when an application is automatically respawned—say, because a host falls over and your pod gets rescheduled onto another host—that host will start the deployment process by pulling down the latest version of the image, giving you an upgrade you weren't prepared for and might not want right now. In the case of something like Jenkins (an intentionally convoluted, triggering example of an app), this can be painful because it will grab an image that might be completely incompatible with the configuration or the on-disk volume data you've configured your app to use. In almost all cases it's better to be explicit within Kubernetes about the version you want, and use something else to drive the policy of ensuring your applications are at some specific version or other.
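Pinning by digest is one way to be that explicit; a sketch (the digest value is a placeholder, and the gcloud/kubectl calls are shown commented out):

```shell
#!/bin/sh
set -eu
# Placeholder digest; in real life resolve it from the registry, e.g.:
#   DIGEST=$(gcloud container images describe gcr.io/my-project/foo:bar \
#              --format='value(image_summary.digest)')
DIGEST="sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
PINNED="gcr.io/my-project/foo@${DIGEST}"

# Pointing the deployment at the digest means a rescheduled pod can never
# silently pull a newer build: digests are immutable, tags are not.
# kubectl set image deployment/foo foo="$PINNED"
echo "$PINNED"
```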
|
# ? Nov 8, 2018 02:28 |
|
|
|
here i tried to rephrase and really distill it:

git repo "foo-bar" has two files, a Dockerfile and a cloudbuild.yaml
- the first line of the Dockerfile is "FROM foo:bar"
- after that we're just dropping a glorified readme file in a docroot
- the cloudbuild.yaml was pasted earlier; its last line pushes "foo:bar" into GCR

git repo "push-butan" is a web interface for self-service provisioning of the foo:bar app. inside its source code is the equivalent of:
- randomly generate a name
- template that name into the API equivalent of a 'kubectl apply -f foobar-deployment.yaml'
- foobar-deployment.yaml sets the image to 'gcr.io/project/foo:bar'
- user is presented with a link to random_name.example.org (via wildcard dns and nginx-ingress fwiw)
- push-butan is already deployed and running

we decide it would be nice if the readme had a "thank jeebus" section:
- add a "RUN echo '<marquee>thank you jeebus for this big rear end check</marquee>' >> /some/dir/index.html" to the Dockerfile
- git commit to the "foo-bar" repo
- that triggers cloudbuild
- that pushes a new image to the GCR repo with the "foo:bar" tag
- pull up the push-butan web interface
- mash the button
- click the link
- no sign of the jeebus update

without involving another tool than what we already have (gke/k8s, gcr, cloudbuild, github, build triggers), is there a "right" or even just "clean and simple" way of solving this?

StabbinHobo fucked around with this message at 18:10 on Nov 8, 2018 |
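One answer that stays within the existing toolbox, per the digest suggestion earlier in the thread: have push-butan resolve the moving tag to a digest at button-press time. A sketch, with all names hypothetical and the registry call shown commented:

```shell
#!/bin/sh
set -eu
# What push-butan could do at button-press time, with no new tools:
# resolve what the foo:bar tag points at *right now*, then template that
# digest (not the tag) into foobar-deployment.yaml before the apply.
REPO="gcr.io/project/foo"
# DIGEST=$(gcloud container images describe "${REPO}:bar" \
#            --format='value(image_summary.digest)')
DIGEST="sha256:0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"  # placeholder
IMAGE="${REPO}@${DIGEST}"
echo "deploying ${IMAGE}"
# Every press now creates a deployment pinned to whatever foo:bar meant at
# that moment, so a fresh cloudbuild push + button mash always rolls out.
```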
# ? Nov 8, 2018 02:57 |
|
for anyone that might care, here's what I wound up with:code:
code:
StabbinHobo fucked around with this message at 20:36 on Nov 13, 2018 |
# ? Nov 12, 2018 19:23 |
|
Linking on my old Xeons is taking forever. What can I buy that has better single-threaded performance? I need something like an i9-9900K but in a dual-CPU rack chassis; I'm not sure such a thing exists. I'm tempted to replace the build VMs with desktop components at this point. Is there something I'm missing? Is there a better way to address the problem of Windows C++ link time (other than pushing back on the engineering side to fix it at the project/code level)?
|
# ? Nov 13, 2018 21:36 |
|
lld now works well enough on Windows for Chrome, and it's dramatically faster than Microsoft's linker, so it may be worth looking into using that.
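For reference, opting into lld-link is usually a single flag or CMake cache variable; a hedged sketch (spellings are the upstream LLVM ones — verify against your toolchain version):

```shell
# Two common ways to swap in lld-link for link.exe:

# 1. clang-cl driving the link step:
#      clang-cl main.cpp -fuse-ld=lld

# 2. CMake + Ninja with the MSVC-compatible driver:
#      cmake -G Ninja -DCMAKE_CXX_COMPILER=clang-cl -DCMAKE_LINKER=lld-link ..
LINK_FLAG="-fuse-ld=lld"
echo "selected linker flag: ${LINK_FLAG}"
```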
|
# ? Nov 14, 2018 19:48 |
|
Has anyone tackled service discovery in a multi-VPC/multi-account environment in AWS? Consul is great but is very much built around servers running agents that all talk to each other, which doesn't work when we also have Lambda services and a mix of public and private DNS zones. Ideally the solution would be Consul-like, in that a service self-announces and that announcement replicates in some way across all accounts and compute resources. Which suggests some sort of automation around Route53?
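For the Route53 automation idea, a self-announce could be as small as an UPSERT against a shared private zone; a sketch (zone ID and names are made up, and cross-account writes would need an assumed role in the zone-owning account):

```shell
#!/bin/sh
set -eu
# Self-announcement as a Route53 UPSERT (zone ID and names are made up).
SERVICE="billing"
IP="10.0.12.34"
ZONE_ID="Z0HYPOTHETICAL"
BATCH=$(cat <<EOF
{"Changes":[{"Action":"UPSERT","ResourceRecordSet":{
  "Name":"${SERVICE}.internal.example.com","Type":"A","TTL":30,
  "ResourceRecords":[{"Value":"${IP}"}]}}]}
EOF
)
echo "$BATCH"
# aws route53 change-resource-record-sets \
#   --hosted-zone-id "$ZONE_ID" --change-batch "$BATCH"
```

Run from an instance boot script or a Lambda wrapper, this gives every compute flavor the same announce path, at the cost of doing health checking yourself.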
|
# ? Nov 15, 2018 12:46 |
|
Cancelbot posted:Has anyone tackled service discovery in a multi-VPC/account environment in AWS? Consul is great but is very much focused around servers running agents that all talk to each other. As such this doesn't work when we also have lambda services and a mix of public and private DNS zones. https://www.hashicorp.com/blog/consul-1-4-multi-data-center-service-mesh
|
# ? Nov 15, 2018 22:51 |
|
Under what circumstances is it "good" or "correct" to build a different version of your container for different environments? I just saw a company building a container-per-environment, instead of building one container and having a testing/delivery pipeline for that image. It seemed like it was defeating the purpose of containers, but I'm not enough of an authority in the area to say anything.
|
# ? Nov 21, 2018 18:38 |
|
It's debatable. There's a tension between (say) security, who want the container to have the least stuff inside it to reduce the attack surface, and devs (or SREs), who want some basic tooling in the container so they can debug stuff. We solved the issue by forcing devs to bind-mount their debug tools into the dev containers, but the SREs were SOL because (at least at the time) there was no easy way to bind-mount debug tools into a running container.

By "environment" I assume you're talking about dev->test->stage->prod. It does make sense to have multiple build flavors for dev (add debug tools) and test (builds with full debug info, or built with address/thread sanitizer guard code). But there should also be a vanilla flavor that doesn't change from test->stage->prod; otherwise, the reasoning goes, you're not testing exactly what you're deploying.

I've seen situations where the build was re-done just for stg->prod, because (for security reasons) the prod build environments were more tightly locked down and free from external dependencies (like network access to GitHub). This also made them slower, so it was inefficient to put them at the front of the pipeline. Since the only significant differences were factors unlikely to affect the build itself, the practice was deemed low-risk.
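One way to get those flavors without diverging Dockerfiles is a build arg; a hedged sketch (image names and packages are hypothetical):

```shell
#!/bin/sh
set -eu
# One Dockerfile, multiple flavors via a build arg (names hypothetical);
# the vanilla image that goes test->stage->prod is the DEBUG=0 default.
cat > Dockerfile.example <<'EOF'
FROM alpine:3.8
ARG DEBUG=0
# Only the dev/test flavor gets the debug tooling baked in.
RUN if [ "$DEBUG" = "1" ]; then apk add --no-cache strace gdb; fi
EOF
# docker build --build-arg DEBUG=1 -t myapp:dev -f Dockerfile.example .
# docker build                     -t myapp:1.0 -f Dockerfile.example .
grep 'ARG' Dockerfile.example
```

The flavors stay in one file, so the dev image can't silently drift from the one you actually ship.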
|
# ? Nov 21, 2018 19:20 |
|
minato posted:By "environment" I assume you're talking about dev->test->stage->prod. ... But there should also be a vanilla flavor that doesn't change from test->stage->prod. Otherwise, the reasoning goes, you're not testing exactly what you're deploying. Yes, that's what I meant. And thank you, that's exactly what I thought.
|
# ? Nov 21, 2018 19:24 |
|
Our CTO (we're a small company; he's a co-founder) had us add a feature where there's a set of folders inside the Java WAR artifact, and based on an environment parameter we pull file(s) out of a subfolder in the WAR and use them to amend the config file(s) or run SQL against the db in ephemeral environments. This is working out... ok so far. It's mostly just for third-party integrations.

I'm a big proponent of same-container-everywhere: what gets deployed to dev and qa is ultimately the same artifact that is deployed to production, which ensures there is little chance for unexpected things to go wrong. If a special tool is needed on a container, it's installed at debug time via a container exec command line. This doesn't happen every day, though, especially once the container pattern stabilizes. This all goes out the window for statically compiled binary apps, though.

The CTO wants feature flags for our monolith to be controlled 100% by our deployment app, but we're trying to talk him away from the ledge on that one and have feature flags handled by developers/product in the database instead. Making feature flags an ops problem seems like a terrible idea.

Hadlock fucked around with this message at 08:42 on Nov 24, 2018 |
# ? Nov 24, 2018 08:38 |
|
Hadlock posted:Our CTO (we're a small company, he is a co-founder) had us add in a feature where there's a set of folders inside the Java WAR artifact, and then based on the environment parameter pulls file(s) out of a subfolder in the war and we use those files to amend the config file/s or run sql against the db in ephemeral environments. This is working out...ok so far. This is mostly just for third party integrations. Your CTO thinks it’s a good idea to shove all the configs in the artifact and then dynamically mangle the world? That’s horrifying. The whole point of trying to build the artifact once is that it shouldn’t care where it’s being deployed, not that it should embody all possible configurations.
|
# ? Nov 24, 2018 10:15 |
|
Let’s say you want to have a version endpoint or string somewhere in your app. Isn’t that something that’s going to be static in your artifacts? If so, how do you keep the pattern of using same artifact in QA/Staging/Prod if you’ll need to modify this as it gets promoted?
|
# ? Nov 24, 2018 15:27 |
|
Feature flags should be controlled by whoever is willing to take the PagerDuty page when it breaks during the break-in period. If I'm not the appropriate person to resolve it, why should I get paged first? Sure, you should have automated rollbacks of some type, but not all systems are capable of that either.

Configuring applications via promoted artifacts is very similar in motivation to dependency injection: you invert the process and the risk from "here's my configuration, is it ok?" to "what's my configuration? I presume it's good." By making the configuration steps dynamic, you add more room for error and exception logic / hacks. Each difference between environments is another variable that can cause a problem, and enshrining it in code is a great way to make the differences permanent.

We nearly tanked a major release because our configuration behaved differently inside AWS and outside it over two whitespace characters (one level of YAML indentation too many). None of the developers caught this in their review of the environment-specific, 1000+ line YAML config, so presuming "because it's in code, it will be reviewed and problems caught early" is completely erroneous; this is no different from "because we have infrastructure as code, we'll reliably deploy releases now!" We de-risked releases with canaries, but none of our canaries had the exact configuration combo that caused the problem, and we log so many errors that we can't distinguish most normal operational behavior from unexpected behavior.

For a real-world case: shoving configuration into build artifacts plus dynamic configuration behavior was a major reason we had a critical production incident at my last job, specifically from trying to be convenient for developers. If there was no database connection made within 30 seconds, the code presumed you were a developer and granted root view ACLs for the session. Nobody in management knew such a mode even existed until they were on a bridge call and a developer said "oh yeah, that's the mode we use all the time for local work, X put it in 4 years ago." The resulting contract breach and lawsuit has sunk the company (on top of the other bad things going on, granted).

Lastly, baking configuration into artifacts makes rotating secrets and strings harder and, at best, slower. If you pull secrets from variables or an externally managed service like Vault, you rotate from a central point of management. The instant you sometimes pull from the artifact and sometimes from another source, you start second-guessing the strings, with more potential for configuration errors and outdated environment-dependent behavior. And once all configuration is baked into the artifact, your artifact build times become the bottleneck for updating settings.

So I implore whoever this CTO is to take warning from my stories of attempts to do anything dynamic with configuration, and to read the 12-factor manifesto. Until you can give really, really good reasons for exceptions to those rules, it's a good set of guidelines to follow even for stateful applications, just from a process/policy standpoint (because by and large it'll be better understood and adhered to by newcomers than whatever balkanized hellscape of documentation and policy your company has). I've had far more headaches from not following those guidelines than from forcing applications to follow those patterns. My exceptions are legacy applications I was forced to lift and shift.
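The 12-factor point can be made concrete: fail fast when required configuration is absent, instead of silently entering a fallback mode. A minimal sketch (the variable name and value are hypothetical):

```shell
#!/bin/sh
set -eu
# Fail fast on missing config instead of silently entering a fallback mode.
# The variable name and value are hypothetical; in production this would be
# injected by the environment or fetched from something like Vault.
DB_URL="postgres://db.internal:5432/app"
: "${DB_URL:?DB_URL must be set}"   # would abort loudly here if it were missing
echo "connecting to ${DB_URL}"
```

A process that refuses to start is an obvious page; a process that quietly grants itself developer-mode ACLs is a lawsuit.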
|
# ? Nov 24, 2018 19:23 |
|
necrobobsledder posted:If there was no database connection made within 30 seconds, the code presumed you were a developer and granted root view ACLs for the session. Nobody what the gently caress
|
# ? Nov 24, 2018 19:54 |
|
necrobobsledder posted:If there was no database connection made within 30 seconds, the code presumed you were a developer and granted root view ACLs for the session. Nobody in management knew such a mode even existed until they were on a bridge and a developer said "oh yeah, that's the mode we use all the time for local work, X put it in 4 years ago." The resulting contract breach and lawsuit has sunk the company (on top of the other bad things going on, granted). I'm sorry, what?
|
# ? Nov 24, 2018 20:21 |
|
I'll leave it that the incident was very public and the company had a history of similar blunders with root causes all pointing to simply bad / ineffectively tested code.
|
# ? Nov 24, 2018 23:53 |
|
Anyone else going to re:invent? Can't wait to look at all the stuff I can't / won't use
|
# ? Nov 25, 2018 00:04 |
|
Nah I am looking forward to kubecon where I am just going to be pitched so much random poo poo by vendors.
|
# ? Nov 25, 2018 00:09 |
|
Ah yeah, that one's next month. Our group split, half are going to reinvent and half are going there.
|
# ? Nov 25, 2018 00:09 |
|
Yeah, I'm going to re:Invent. Went last year as well. Looking forward to a day or two of ambitiously waiting in huge rear end lines before giving up and switching to day drinking, with a pinky swear to watch the sessions later online.
|
# ? Nov 25, 2018 01:53 |
|
Ploft-shell crab posted:Let’s say you want to have a version endpoint or string somewhere in your app. Isn’t that something that’s going to be static in your artifacts? If so, how do you keep the pattern of using same artifact in QA/Staging/Prod if you’ll need to modify this as it gets promoted? If it changes based on deployment environment then it's not a version.
|
# ? Nov 25, 2018 02:24 |
|
Kevin Mitnick P.E. posted:If it changes based on deployment environment then it's not a version. It doesn’t change based on deployment environment, but when it’s in QA it’s still just a release candidate for us, not a real release yet. We promote it to being a real release once it goes through a bunch of integration tests with other systems. (having a “QA process” instead of real CICD is non-ideal, but it’s unfortunately not my decision to make) Maybe our endpoint should just return a build number?
|
# ? Nov 25, 2018 03:06 |
|
Ploft-shell crab posted:It doesn’t change based on deployment environment, but when it’s in QA it’s still just a release candidate for us, not a real release yet. We promote it to being a real release once it goes through a bunch of integration tests with other systems. (having a “QA process” instead of real CICD is non-ideal, but it’s unfortunately not my decision to make) Build a thing, do a bunch of QA on it, then build a different (but hopefully close enough) thing and release it is not a process I would be comfortable with.
|
# ? Nov 25, 2018 04:22 |
|
Kevin Mitnick P.E. posted:Build a thing, do a bunch of QA on it, then build a different (but hopefully close enough) thing and release it is not a process I would be comfortable with.

Right, that's something I don't think I want to do. Here's what I think are my options, but I'm wondering if there's maybe another:
- rebuilding/repackaging the artifact at promote time, after QA (bad for the reasons stated)
- not having release candidates and only ever sending real releases to QA (would result in a ton of versions for us, because we always find stuff in QA)
- not having version information available in the artifact
|
# ? Nov 25, 2018 05:06 |
|
Ploft-shell crab posted:Right, that’s something I don’t think I want to do. Here’s what I think are my options, but I’m wondering if there’s maybe another I'm not sure if I've missed something but why can't you just do the old <Major>.<Minor>.<Release>.<Build> ?
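Composing that four-part version in CI is a one-liner; a sketch (the numbers are stand-ins, and `BUILD_NUMBER` is assumed to come from the CI system):

```shell
#!/bin/sh
set -eu
# Compose <Major>.<Minor>.<Release>.<Build>; only BUILD changes per CI run,
# so the artifact QA signed off on is byte-identical to what ships.
MAJOR=1; MINOR=4; RELEASE=2
BUILD="${BUILD_NUMBER:-137}"        # CI-provided; 137 is a stand-in
VERSION="${MAJOR}.${MINOR}.${RELEASE}.${BUILD}"
echo "$VERSION"
```

"Promotion" is then just recording which build number passed QA; the artifact and its embedded version never change.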
|
# ? Nov 25, 2018 05:12 |
|
Our place just does year.month-revision
|
# ? Nov 25, 2018 05:23 |
|
ThePeavstenator posted:I'm not sure if I've missed something but why can't you just do the old <Major>.<Minor>.<Release>.<Build> ? actually I think this would totally work and I don’t know why I didn’t consider doing this. we have been sticking “RC” in our version strings for stuff that hasn’t gone through QA, but dropping that would alleviate all the pain. 🤦♂️ thanks!
|
# ? Nov 25, 2018 05:46 |
|
build once or die trying
|
# ? Nov 25, 2018 05:56 |
|
Gyshall posted:build once or die trying Retry failed jobs none shall ever know
|
# ? Nov 25, 2018 07:32 |
|
Ploft-shell crab posted:actually I think this would totally work and I don’t know why I didn’t consider doing this. we have been sticking “RC” in our version strings for stuff that hasn’t gone through QA, but dropping that would alleviate all the pain. 🤦♂️ thanks! Semantic Versioning saves the day! Just be prepared to deal with people that are real pissy about having a high number in a build version because of "reasons".
|
# ? Nov 26, 2018 16:51 |
|
When commissioning a new server, I run through a mental checklist I've cobbled together from Linode and DigitalOcean tutorials for CentOS or Ubuntu machines, e.g.: (1) upload public key (2) add non-root user (3) add user to groups (4) configure firewall (5) ... (6) N. It hasn't been an issue, because they've been my approaching-nil-traffic personal VPSs. The weaknesses are obvious. What does "doing it right" involve?
|
# ? Nov 26, 2018 17:03 |
|
toadoftoadhall posted:When commissioning a new server, I run through a mental checklist I've cobbled together from linode and digital ocean tutorials for CentOS or Ubuntu machines. Eg:

Less pithy answer: pick your favorite STIG/hardening guide and make it into an image / kickstart / autoinstall. Basically, don't do it by hand.
|
# ? Nov 26, 2018 17:12 |
|
Bhodi posted:Not doing this every time

STIG? I may have misunderstood, but you're saying: take a distribution image, do your initial configuration, freeze it into a new image, and deploy that new image going forward? In my case the virtualisation platform is managed by an independent team with some standard ISOs I can choose from, so all configuration must be done for each deployment, using something like... fab, or Ansible (I'm guessing... I've used the former a bit; it's a Python package for scripting ssh. I haven't used Ansible).
|
# ? Nov 26, 2018 17:43 |
|
Yeah if you can't bake an image, use Ansible imo.
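That checklist maps almost one-to-one onto a small playbook; a minimal sketch (the user, group, key path, and ufw usage are assumptions, Ubuntu-flavored — adjust per distro):

```yaml
- hosts: new_servers
  become: true
  tasks:
    - name: Add a non-root user in the sudo group
      user:
        name: deploy
        groups: sudo
        append: yes
    - name: Install the public key
      authorized_key:
        user: deploy
        key: "{{ lookup('file', '~/.ssh/id_rsa.pub') }}"
    - name: Allow SSH through the firewall
      ufw:
        rule: allow
        name: OpenSSH
```

Since it's idempotent, you can re-run it against every box instead of keeping the checklist in your head.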
|
# ? Nov 26, 2018 17:56 |
|
out of curiosity, what even still has you monkey loving vps configs at all? is there a technical limitation of the many PaaS vendors, a pricing issue, or just a cultural/old-ways thing?
|
# ? Nov 26, 2018 18:13 |
|
Current re:Invent status: Waiting in an hour long line to even register for the conference. Going to miss my first session. No food or coffee because everyplace that serves those also has an hour long line. My coworker went to a different venue, was registered and had coffee in like 10 minutes. Currently researching the legality of murder in Nevada.
|
# ? Nov 26, 2018 18:32 |
|
StabbinHobo posted:out of curiosity, what even still has you monkey loving vps configs at all? is there a technical limitation of the many PaaS vendors, a pricing issue, or just a cultural/old-ways thing?

habit. not being plugged into many technical communities (and lowendbox being one of those I am plugged into). also the fact that heretofore they've been servers for not-very-important personal use. an impetus is that now it's for work.

toadoftoadhall fucked around with this message at 18:37 on Nov 26, 2018 |
# ? Nov 26, 2018 18:33 |
|
StabbinHobo posted:out of curiosity, what even still has you monkey loving vps configs at all? is there a technical limitation of the many PaaS vendors, a pricing issue, or just a cultural/old-ways thing?

My answer: I want Docker running on ZFS with snapshots.
|
# ? Nov 26, 2018 21:11 |
|
|
Docjowles posted:Current re:Invent status: Waiting in an hour long line to even register for the conference. Going to miss my first session. No food or coffee because everyplace that serves those also has an hour long line.

the shuttle buses are the silliest thing to me: it takes an hour to get ferried across the street because this city is hell on earth. it's like everything is done and designed in the worst way possible, deliberately. it's 4pm and feels like 10pm. i spent all day getting pitched to, and what little I learned could have been conveyed in a 10-minute blog post. i want to pull the plug on this entire week
|
# ? Nov 27, 2018 00:53 |