StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
Has anyone built a CI/CD pipeline for a Unity 3D app? I've built LAMP and JVM pipelines but never a Windows/C# one. I just need a good blog post that walks through the options at the different steps.

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
vr app, targeting gear-vr/cardboard. total greenfield.

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

wins32767 posted:

Rails/MySQL on Centos. To start with: Rebuild our infrastructure to be fault tolerant, get us tooling to support Continuous Delivery, troubleshoot and resolve production issues. Me and the CTO to start.
you'd rather pay a salary than heroku

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

Virigoth posted:

Jenkins has a new blog post up for GC tuning on large instances. I'm going to put it on our test server and throw some load at it.
GC tuning blog post
this is great, ty

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
I started trying to rig up a combo terraform+ansible aws setup... does everyone really just use a local-exec provisioner to run ansible-playbook? feels too easy, like a trap

it's strange, i like both tools so far but i cannot get my head around the notion of arbitrary clients running arbitrary versions of them connecting into live poo poo, seems bananas. but if you try to have a dedicated node for it, all of a sudden you're in a recursive spiral.
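a wrapper script in CI (which is basically what local-exec hands you anyway, just from inside terraform) at least pins the versions in one place instead of everyone's laptops. a minimal sketch of that shape, with a made-up terraform output "web_ips" and playbook "site.yml":

code:
#!/usr/bin/env bash
set -euo pipefail

# one pinned toolchain (the CI image) runs both tools, instead of arbitrary laptops
terraform init -input=false
terraform apply -auto-approve -input=false

# turn the terraform output into a throwaway ansible inventory (one IP per line)
terraform output -json web_ips | jq -r '.[]' > inventory.ini

# configure the hosts terraform just stood up
ansible-playbook -i inventory.ini site.yml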

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

necrobobsledder posted:

The deployment patterns I'm seeing still are lift and shifts taking 3+ years and rewrites that will take another 5+, so in 8 years they will be where most companies that use cloud somewhat right will have been 5 years ago while the technology gap between laggards and fast movers has grown larger.
this is simultaneously killing me and making me a ton of money. you can consult on these projects drat near in perpetuity, but all you've gone and done is spent 5 years making decent money off of dying/failing orgs.

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

Hadlock posted:

I've been selected to put our key monitoring stuff in to a unified dashboard that's going to be powered by a raspberry pi 3.

Our current monitoring collectors are,

Nagios
Datadog
Munin
Graylog
Splunk

I'd love to feed all this crap in to something like prometheus and then report on it via grafana, and have it rotate through various dashboards but that seems like a ton of work

Option B, I guess write up one or two dashboards per service, then have the Pi browse to each board? Seems super clunky though.

What's best practice here? This org is about a decade behind so whatever is current is going to be a million times better than whatever they have now.

Googling for advice on this is useless, every dashboard company has SEO'd good modern discussion off the first couple of pages.
you're doing this: https://xkcd.com/927/

just put up a datadog dashboard that includes splunk data. delete nagios, munin and graylog from your environment.

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

necrobobsledder posted:

I have tried to do the migration steps just to get applications somewhat stateless and monitored at about 15 different companies / customers now and basically all of them are failures for cultural reasons rather than some technical reason
i'm 0 for 3 as well.

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

AWWNAW posted:

I migrated dozens of JVM services to Kubernetes without much effort or pain, but they were all stateless to begin with and containerizing them was easy too.

java -jar thing.jar was containerization before it was cool
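the whole containerization step for an app like that is barely more ceremony than the command itself. a minimal sketch, base image / jar name / port all made up:

code:
# wrap an existing fat jar in an image; openjdk:8-jre-slim, thing.jar and 8080 are placeholders
cat > Dockerfile <<'EOF'
FROM openjdk:8-jre-slim
COPY thing.jar /opt/thing.jar
CMD ["java", "-jar", "/opt/thing.jar"]
EOF

docker build -t thing:local .
docker run --rm -p 8080:8080 thing:local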

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

jaegerx posted:

I do openshift if anyone wants help with that.
is there a "centos" of openshift? (or just like, explain to me how i would go about freeloading without it turning into a fedora like freakshow)

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
i can't fathom how terrible someone must be at making vendor decisions if they let a douchey sales rep even loving register in the thought process let alone be the key takeaway

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
you can only skip the phoenix project if you have 100% found religion about bottlenecks and local vs systemic optimization

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
edit: oh crap i was pages back

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
what are people using for build pipelines on greenfield GKE projects? I have a use case where I want to add very thin custom layers on top of pre-existing container images as fast as conceivably possible (ideally closer to hundreds of milliseconds than seconds).

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

necrobobsledder posted:

Trying to get a rough idea of what’s expected stress / responsibilities compared to others that have broader experience than myself.

Is it normal for companies to hire “devops” engineers as a hero engineer that are expected to take completely garbage, stateful, poorly documented, unautomated legacy (5 - 15 years old) software and have exactly one engineer out of 8 - 30 engineers take over most of infrastructure ownership, deployments, release management, and deliver a CI/CD pipeline in less than half a year while being on-call? I’ve talked to dozens of companies (large, small, b2c, enterprise - the full gamut) in several non-tech hubs for years and all but 3 companies seem to want / need exactly this (in veiled or not so veiled intent) while paying maybe 20% more for said engineer(s). It’s getting super old being deployment dave when I spend 30% of my time documenting and making deployments push-button easy for others and getting stuck with marching orders like Dockerizing super stateful, brittle software intended to be pushed into a K8S cluster.

yep, I call it "the devops trap". I've turned down three "director of devops" jobs just in the last year because I simply can't bring myself to step into that awful loving life again. It's not just a lovely work dynamic, it really fucks up your brain and your life to spend the majority of your waking hours mentally crouching in the fetal position, being interrupt-bombarded with broken poo poo that the onus is somehow now on you for. you inevitably grow to resent everyone around you, and then your behavior changes such that they inevitably grow to resent you. this is why every "devops guy" is a 1 - 3 year per company job-hopper, and some kind of functional-alcoholic/stoner.

the naive think that if they just job hop another time or two they'll find the place that does it right. I'm 0 for 9.

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
been there a hundred times. you don't get to have the process you didn't build before the emergency. if your pipeline takes 15 minutes, and you're not even sure if the first try will fix it, well then you're looking at a 30+ minute incident. that's what you built, that's what you get.

this is where so many shops get stuck on ci/cd. they build a massive amount of rear end-covering junk into the build & deploy cycle, because that's what it took to get all the cowards and middle management onboard, and because most developers never met a widget they wouldn't gladly bolt onto the contraption. it leaves you incapable of timely responses to real world events. poo poo like extraneous tiers (stage/qa/uat), manual/human-intervention blocking, serialized tests, lots of fetches to external poo poo like github/apt/docker-hub/etc, slow ramps because no one ever bothers to profile & tune the startup/warmup phase.

your options are to speed up your deploys to take seconds/minutes, or to religiously wrap every discrete feature in a flag so that it can be disabled live without a code deploy (in your case whatever sub-section of your status dashboard is making that expensive redis call).
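if you're on k8s the cheapest version of the flag route is a configmap the app re-reads at runtime; a minimal sketch with made-up names (only works if the app actually polls the value instead of caching it at startup):

code:
apiVersion: v1
kind: ConfigMap
metadata:
  name: feature-flags
data:
  # killswitch for the expensive status-dashboard/redis widget
  status-redis-widget: "on"

flipping it live is then one command, no code deploy:

code:
kubectl patch configmap feature-flags -p '{"data":{"status-redis-widget":"off"}}'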

imho you should do both, but if you did neither there isn't really a workaround anyone can tell you other than "go full cowboy with whatever poo poo you got", and that's a bad plan.

code is data. you can read and write to a db in milliseconds; if you can't deploy code within five orders of magnitude of that ur doin it rong.

StabbinHobo fucked around with this message at 03:28 on Apr 25, 2018

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
you cannot jenkins your way out of the devops trap

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
google cloud build is very rudimentary but also very solid and economical, so I'm sticking with it for now.

the gcloud cli support for it is pretty weak: you can't trigger a (remote) build with it, you have to curl the rest api.

however, once we got the hang of the container-builder-local plugin for gcloud, we almost never do the remote/cloud build anymore.
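for anyone else poking at it, the local build invocation is roughly this (the binary ships as container-builder-local, later renamed cloud-build-local; flags from memory, check --help):

code:
# run the same cloudbuild.yaml locally; --dryrun=false actually executes the steps,
# --push=false keeps the resulting image on the workstation instead of pushing to GCR
container-builder-local --config=cloudbuild.yaml --dryrun=false --push=false .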

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

Methanar posted:

<roomba stuck in a corner>
can you believe they pay you for this poo poo

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
nginx-ingress does too but i have no idea how

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
CI/CD has a slash for a reason. Ansible was named ansible because mpdehaan couldn't name it "func 2.0: gently caress puppet edition". He left puppet labs thinking the declarative model was wrong and what people really wanted was a combined config and deploy tool that simply ran the checklist of steps automatically, and he no longer worked at redhat who held the func name copyright.

It was and is a "CD" tool from inception. No, it is not a "CI" (build) server. Yes, it's annoying to draw this distinction as you can usually bleed 80% of one side into the other and get by in a lot of scenarios.

fuckin yoots

StabbinHobo fucked around with this message at 00:36 on Sep 24, 2018

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
lol i'd start ranting about the awsvpc cni for eks being the same drat thing as the one for ecs so you're still pod-count limited to the number of enis if you want security groups to mean anything, flip the table, and storm out

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
if you're taking a paycut to get into devops you're either a surgeon or doing it wrong

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
is it possible to have a cloudbuild.yaml in your git repo that works for both your prod ci/cd pipeline *and* devs using "cloud-build-local" against their own test envs/gcp-projects?

related q: y'all just yolo a kubectl command at the end of the steps or have a better deploy method?
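for context, the shape i'm imagining is roughly this; _CLUSTER/_ZONE and the deployment/container names are made up, and $SHORT_SHA only gets populated on triggered builds so local runs would have to pass it in via --substitutions (i think):

code:
substitutions:
  _CLUSTER: 'dev-cluster'   # prod trigger and devs override these per env
  _ZONE: 'us-central1-a'
steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/foo:$SHORT_SHA', '.']
- name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'gcr.io/$PROJECT_ID/foo:$SHORT_SHA']
- name: 'gcr.io/cloud-builders/kubectl'
  args: ['set', 'image', 'deployment/foo', 'foo=gcr.io/$PROJECT_ID/foo:$SHORT_SHA']
  env:
  - 'CLOUDSDK_COMPUTE_ZONE=$_ZONE'
  - 'CLOUDSDK_CONTAINER_CLUSTER=$_CLUSTER'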

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

Vulture Culture posted:

I've got a five-digit number of server instances booting off of read-only NFS and running a bootstrap script to deploy services into tmpfs. We don't even cloud-init, we just key off a couple of DHCP fields.

nfs is making a comeback

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
you're pretty much on the right track, only thing i'd say is have the f5 point to the nginx ingress controller instead of rolling your own nodeport/proxy-container solution:
https://github.com/kubernetes/ingress-nginx
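(the controller itself still needs a way in from the f5; the stock bare-metal pattern is a NodePort service in front of it, roughly like the below. namespace/labels/ports here assume a stock ingress-nginx install, match them to whatever yours actually uses.)

code:
apiVersion: v1
kind: Service
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
spec:
  type: NodePort
  selector:
    app.kubernetes.io/name: ingress-nginx
  ports:
  - name: http
    port: 80
    targetPort: 80
    nodePort: 30080   # fixed ports so the f5 pool members stay stable
  - name: https
    port: 443
    targetPort: 443
    nodePort: 30443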

also start with some kind of rancher/openshift distro for a datacenter, doing it all yourself is just too much.

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

Warbird posted:

the architects who designed the DevOps ecosystem ... apparently never discussed the matter with the Ops team that owns the environment
https://www.youtube.com/watch?v=GibiNy4d4gc

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
no tool can cleanly abstract the architectural landfill heap produced by the standard corporate model of "a gaggle of people changing their mind every two weeks" over a meaningful time horizon

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
<3 charity

on another note... i'm stumped on docker registry image names and tags, how they relate to k8s imagePullPolicy settings, and how I can tie this all together.

say you have an upstream docker hub image named "foo" and a tagged version of it we want named ":bar"

then you have a git repo with a dockerfile that starts with "FROM foo:bar" and a half dozen tweaks.

then you have a cloudbuild.yaml in that repo that looks roughly like

code:
steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/foo:bar', '.']
images: ['gcr.io/$PROJECT_ID/foo:bar']

then a build trigger is set up so that any commits to that dockerfile push the new image into your GCR

great, so we add a line to our dockerfile, the build gets triggered, and the new image gets pushed to the registry and it steals the name from or overwrites the previous one somehow (don't get this part).

that all "works" but then i have no idea how to get newly deployed instances of foo:bar to pick up the new image. they keep using the old one, even for new deployments.

googling around and reading it seems like there are three lovely options:
- change the ImagePullPolicy on all new deployments to be Always, live with whatever slowdown this causes (these things already take too long, ~30 seconds, I want that to go down not up)
- somehow involve the tag ":latest" which seems to have some hacked in corner case support, but also all the docs warn you against doing it, also I have no idea how I would make a "foo:bar:latest"
- stuff the short-sha in as a tag somehow, say foo_bar:ABCD1234, and only ever deploy the specific image (tag? wtf is the diff) for each new deployment.

I instinctively like the third option because it's more explicit, however that leaves me with a random metadata dependency now. my utility that fires off new deployments (a webapp) now has to somehow what? scan the registry for new images? accept a hook at the end of the cloudbuild to know the tag and store that somewhere? That would create a weird lovely circular dependency where my webapp has to be up and working for my builds to succeed. And I have to somehow provide or pass it said tag at startup. Seems like a mess.

so, to recap:
- how do i make it so that my push-butan webapp can always deploy the latest version of our tweaked version of an upstream docker hub image, without making GBS threads where i sleep

edit: also i have no idea how it's working now: i just refer to the upstream "foo:bar" in my dockerfile, and that works, but then my deployments say "foo:bar" and it somehow magically knows to use my "foo:bar" from GCR and not the "foo:bar" from docker hub.

StabbinHobo fucked around with this message at 19:15 on Nov 7, 2018

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
thank you mr. vulture

Vulture Culture posted:

I'd just tell it explicitly what version needs to be deployed, personally, then you can wire that into any other process you want. I personally push to keep the deployment mechanics separate from the policy ("only the latest version should ever be deployed"), because keeping the two tightly coupled complicates the process of changing either one.

TBH it's pretty weird that your tool both doesn't know what version it's deploying and can't be told what version to deploy. How do you track what versions are being deployed where at what times? This design seems like a liability at multiple levels that's worth spending time to fix.

ok sorry but I don't understand what you're saying. this is all new and not meaningfully live yet. when I started I just had my tool/webapp hardcoded to use the "foo:bar" image for each new deployment. for a good while we didn't have to change the image so it didn't matter. then a few times we did have to change it, but we were tearing down and rebuilding the cluster so frequently anyway for other stuff that it was easier to just do that and not think about it further. now we're getting to the point where we actually need to update it without a teardown so i'm here futzing. lol at even calling it a "design" I'm just playing k8s-docs+stack-overflow whack-a-mole.

so now if I go change the code in my webapp to deploy foo_bar:12345 then i've *one time* solved the problem but just moved the problem down the road a little farther. or i guess i've created an "every time you update the image you also have to redeploy the tool" workflow. not the end of the world, just more dependencies and schlock work. I could automate it by having the cloudbuild job post it to an endpoint on my webapp and then have my webapp store it in a datastore or something, but that seems wrong in a way I can't quite articulate. assume for the moment my webapp is ephemeral and doesn't even have a persistent datastore; when it gets redeployed it doesn't know what image to use until the next cloudbuild job is run to tell it.

conceptually "latest" is exactly what I want, its just this [1] [2] [3] [4] make that seem like a bad road.

[1] https://github.com/kubernetes/kubernetes/issues/33664
[2] https://kubernetes.io/docs/concepts/containers/images/#updating-images
[3] https://discuss.kubernetes.io/t/use-latest-image-tag-to-update-a-deployment/2929
[4] https://github.com/kubernetes/kubernetes/issues/13488

much like an "apt install foo" will just always give you the latest foo, I just want my deployment of an image to always use the latest image.

StabbinHobo fucked around with this message at 00:22 on Nov 8, 2018

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

Hadlock posted:

We use core roller as our source of truth for container image versions. The web app, broker and database each have an application, and each app has three channels, dev/nightly, release (candidate)/qa and stable/prod. When the ci does a new build, it updates the dev/nightly channel, qa will cherry pick a release for testing, and then when they, or their tests bless a release, stable gets updated.

Coreroller has a simple restful api and all our scripts/deploy apps reference it as the up to date source of truth. Works great. We used the official CoreOS version, coreupdate at another company to do the same thing. Also helps keep your release manager sane as it generates an audit log in the db.

You could arguably handle this by managing it in a csv file on an http endpoint somewhere using basic auth, but restful json stuff is very portable and pretty much universally compatible, and they already did the work for you

I googled core roller

and I'm pretty sure you just told me i'm fat and need to work out more. Sure, true, but I don't see how that helps :)

Nah, I think I get what you mean, but pretend this isn't some pre-existing professional IT environment with lots of tools and people to do work and poo poo. I'm trying to configure the basic bare minimum components for a PoC, not hook into an existing enterprise. Just looking at coreroller's screenshots screams overkill.

StabbinHobo fucked around with this message at 00:45 on Nov 8, 2018

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

Methanar posted:

Can you make your build system output the sha of the branch you built somewhere and inject it into a kubectl/helm/whatever command to upgrade as part of your CD, if you're doing CD

For me that looks like this where image.tag is a variable in my deployment.yaml

code:
    spec:
      containers:
      - name: {{ .Values.name }}
        image: "gcr.io/thing/thing/thing:{{ .Values.image.tag }}"

code:
helm upgrade thing thing.tgz --set image.tag=f6e5810
If you really just want :latest to always be the latest and expect it to work, I think you will need to just set your imagepullpolicy to always. Which maybe isn't the end of the world if your stuff does deploy quick.

sure but what is that "somewhere" and then who's running that helm command? that just dragged another dependency *and* human into the flow.

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

Vanadium posted:

There shouldn't be a human. My understanding based on doing something analogous with terraform and ECS and reading through a coworker's k8s setup is: Your automation/build script should not only generate a docker image and tag it with something like the git commit ID, it should at the same time also generate the updated yaml files (maybe from a template in the repo) referencing the new image tagged with a unique ID, and then shove that file into k8s or whatever.
I'm interpreting that as "add a step to the end of your cloudbuild.yaml" but what does that step do?

Methanar posted:

What I mean more is when your build job completes, your code has had its dependencies pulled down, modules compiled, docker images built and pushed to a registry, tests ran. Your task runner completes the build step and then executes another job for the deploy step. That next step might just be a helm upgrade.
ok, second vote for "add a step to the end of your cloudbuild.yaml"

Scikar posted:

you can have multiple tags per image, updated on different cycles. So when you push foo_bar:ABCD1234 it can also update foo_bar:latest in the registry to point to the same image digest. That would give you a reference point for your deployments, so when you press deploy in your webapp it looks up foo_bar:latest, reads the digest, and creates a new deployment with that digest directly, kubernetes doesn't have to know or care what the tag is. Or if you need to upgrade existing deployments, get and patch them with the updated digest so kubernetes can do a rolling upgrade (I think, I'm not quite at that point yet myself).
so I should use ":latest" but not actually in the deployment, just as a thing to query the registry on? how do you query GCR like that?
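fwiw gcloud can do that lookup; a sketch with made-up names (double-check that value(digest) gives back the full "sha256:..." string before wiring anything to it):

code:
# find the digest the :latest tag currently points at in GCR
DIGEST=$(gcloud container images list-tags gcr.io/$GOOGLE_CLOUD_PROJECT/foo \
  --filter='tags:latest' --format='value(digest)' --limit=1)

# pin the deployment to that exact image by digest; the tag itself never gets deployed
kubectl set image deployment/foo foo=gcr.io/$GOOGLE_CLOUD_PROJECT/foo@$DIGEST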

quote:

Lastly, if you do have a hook I would put it on your registry rather than the CI build. The build process just has to get the new image up to the registry and it's done. The registry itself (I assume, we use the Azure one which can at least) can then notify your webapp that a new version was pushed and trigger that workflow, but if your webapp isn't running your build still succeeds (and presumably your webapp can just run the workflow when it does get started again).
ok so thats a vote for "don't add a step to the end of your cloudbuild.yaml, figure out how to have GCR trigger its own separate thing"

Vulture Culture posted:

I'm having a little bit of trouble understanding what "my tool/webapp" actually does, or why it's important to your workflow that this application does it. Could you elaborate? I want to make sure I'm providing good recommendations and not inundating you with general K8s guidance that for some reason doesn't fit your specific business requirements at all.
basically think of a blank webpage with a single submit button in the middle, you click that button and you get back a hostname. you click that hostname and your browser loads your "instance" (deployment) of the image. the most rudimentary self-service "provision me a copy of the x app" tool you can fathom.

quote:

An image isn't a deployment construct, it's an image, in the same way that a VMware template isn't a cluster of VMs, it's a template. If your deployment approach conflates the two, you're going to have a bad time. You don't need to do something crazy like put a Spinnaker CD system into production, or even a "package manager" like Helm, but you should leverage things like Kubernetes deployments where they make sense.
yea maybe I phrased that wrong somewhere because these are single-container-pods that all run the same image, but yes I'm using the Deployment construct for this.

quote:

The problem you're trying to deal with—upgrading a deployed K8s application to a new image tag
no, i don't care (yet) about updating any of the previously deployed ones, I just want to know that the *next* one (when someone mashes that button) deploys the latest image

quote:

One thing you might not have considered is that if you use tags for this approach, when an application is automatically respawned—say, because you have a host fall over and your pod gets automatically rescheduled onto another host—that host will start the deployment process by pulling down the latest version of the image, giving you an upgrade you weren't prepared for and might not want right now. In the case of something like Jenkins (an intentionally convoluted, triggering example of an app), this can be painful because it will grab an image that might be completely incompatible with the configuration or the on-disk volume data that you've configured your app to use. In almost all cases, it's better to be explicit about the version you want within Kubernetes, and use something else to drive the policy of ensuring your applications are at some specific version or other.
yea so far all our changes are backward-compatible additions so even though I'm aware of this possible issue it's not an issue for us (yet). also the "use something else" thing is basically the question here :)

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
here i tried to rephrase and really distill it:
------------------------------------------------------------------------------------------------------------------------

git repo "foo-bar" has two files, a Dockerfile and a cloudbuild.yaml
the first line of the dockerfile is "FROM foo:bar"; after that we're just dropping a glorified readme file in a docroot
the cloudbuild.yaml was pasted earlier, its last line pushes "foo:bar" into GCR

git repo "push-butan" is a web interface for self service provisioning of the foo:bar app
inside of its source code is the equivalent of:
- randomly generate a name
- template that name into the api equivalent of a 'kubectl apply -f foobar-deployment.yaml'
- foobar-deployment.yaml sets the image to 'gcr.io/project/foo:bar'
- user is presented with a link to random_name.example.org (via wildcard dns and nginx-ingress fwiw)
- push-butan is already deployed and running

we decide it would be nice if the readme had a "thank jeebus" section
- add a "RUN echo '<marquee>thank you jeebus for this big rear end check</marquee>' >> /some/dir/index.html" to the dockerfile
- git commit to the "foo-bar" repo
- that triggers cloudbuild
- that pushes a new image to the GCR repo with the "foo:bar" tag
- pull up the push-butan web interface
- mash the button
- click the link
- no sign of the jeebus update

without involving any tool other than what we already have (gke/k8s, gcr, cloudbuild, github, build triggers), is there a "right" or even just "clean and simple" way of solving this?

StabbinHobo fucked around with this message at 18:10 on Nov 8, 2018

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
for anyone that might care, here's what I wound up with:

code:
# grab the most recently pushed tag for the foo image out of GCR
LATEST_TAG=$(gcloud container images list-tags "gcr.io/$GOOGLE_CLOUD_PROJECT/foo" --sort-by=~timestamp --limit=1 --format='value(TAGS)')
# hand it to the push-butan deployment as an env var so the next button-mash uses that tag
kubectl set env deployment/push-butan foo=$LATEST_TAG
edit: oh and cloudbuild.yaml went
code:
-   args: ['build', '-t', 'gcr.io/$PROJECT_ID/foo:bar', '.']
+   args: ['build', '-t', 'gcr.io/$PROJECT_ID/foo:$SHORT_SHA', '.']
- images: ['gcr.io/$PROJECT_ID/foo:bar']
+ images: ['gcr.io/$PROJECT_ID/foo:$SHORT_SHA']

StabbinHobo fucked around with this message at 20:36 on Nov 13, 2018

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
out of curiosity, what even still has you monkey loving vps configs at all? is there a technical limitation of the many PaaS vendors, a pricing issue, or just a cultural/old-ways thing?

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
joiners gotta join

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
you're probably thinking of the australian one

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS
there are few dirtier words than "hybrid". it's the "have your cake and eat it too" of tech.

StabbinHobo
Oct 18, 2002

by Jeffrey of YOSPOS

necrobobsledder posted:

How do you guys do continuous prod deployments to systems that have message queue based communication and handle heterogeneous application component versions consuming from shared queues? In a synchronous processing world that'd be an endpoint handled by different versions of the service. We have a web request frontend, clients upload large artifacts separately (S3, their own hosting service, etc.), reference them in their API request, and processing is picked up asynchronously via SQS queues serialized as < 1 KB XML messages across several upstream services that self-report the status of their tasks to the primary Aurora MySQL DB. I'm trying to setup an architecture in AWS using a canary / blue-green approach using environment-specific SQS queues, load balancers, and instances but shared data stores like S3 buckets and DBs. DB updates to apps will be done by mutating their views, not by changing the actual underlying DB structures (the latency hit isn't measurable for us in tests so far). This would allow us to make a bunch of changes in production as necessary, cherry pick messages from queues to run through a deployment candidate's queues, and rollback changes faster than we do now (a deployment process straight out of 1995 but in AWS and with 90% of our services that can't be shut down on demand without losing customer data, which really, really, really is a pain in the rear end)
this somewhat impossible recursive chasing of a way to abstract away a state assumption is, in large part, why kafka was invented.

jury is still out on whether that's a good thing (i lean yes).

sorry that doesn't really help you though because "rewrite everything to upgrade from rmq to kafka" is about as helpful as "install linux problem solved".
