|
The Fool posted:you don't Agreed and thanks for the advice. I'm desperately trying to convince people that making a separate opinionated module for every single resource is a god-awful idea but I'm not getting any traction.
|
# ? Jan 26, 2024 01:07 |
|
|
# ? May 22, 2024 20:45 |
|
Joining in on the cloud vs on-prem discussion: I work in IT for a largish university and we have relatively minimal cloud usage. I often wonder how wrong my views on cloud are. The way I see it, a large switch to cloud would produce a lot of cost and work without a similar reduction on our on-prem side. It would probably be impossible to get rid of our multiple data centers, and reducing the servers running in them would give minimal savings. We would still have too many services and systems we'd need to run on-prem, and some of them we'd want to spread across at least two data centers. We've had multiple data centers since the 60s; it's easy to keep up the momentum. And we're hesitant to put a lot of our data somewhere we can't see it. The understanding in our organisation is that some of our data would be illegal to take out of the country/EU, and no one was ever jailed for keeping data on-prem. 
I currently have a case where a virtual server storing a PhD student's data is reaching EOL and needs to be replaced. The current one runs on our main VMware cluster, and when I heard what kind of data it contained I started thinking it should have been in our restricted cluster all along. I'm considering consulting our IT security team. I certainly wouldn't dare put it on MS Azure without consulting a lawyer. Another case last month: a research group requested an 8-core, 32GB virtual server for a six-month project. I assume something like this would accrue quite a bit of cost in the cloud. It could have fit in our VMware, but the VMware admin was concerned it might have detrimental effects if it ran at full blast regularly, so as a compromise I installed it on a recently decommissioned VMware ESXi host. I'm also thinking about the HPC and Lustre clusters the guys on the research side built using old servers and 100Gb Infiniband hardware they bought used. Sure, the Lustre blows up in their face every other week, but it's something interesting to tinker with, and something similar would probably be hellishly expensive in the cloud. 12 rats mentioned "renting the spike," and that's not really a concern for us. Either we have computation systems that run heavily all the time, or we have services running at a pretty steady load. Our most important customers are prospective students, and loads may increase during the application period, but I doubt many would choose another university just because our website was running a bit slow.
|
# ? Jan 26, 2024 03:54 |
|
I just left a job in central IT at a big research university where we were just getting started with cloud, and I had a lot of the same thoughts you did. Our hardware and licensing costs were low enough, we had enough inertia with existing data centers, and we had enough use cases that were just too crazy for cloud, that I could never see a reason to "move" everything to the cloud. I'm sure many of the problems I saw were solvable, but we couldn't pay enough to get or keep the talent that could properly envision it or pull any of it off, and our decentralized nature meant we couldn't focus on building any kind of platform or process on top of the cloud. So we were pretty much left to sign the agreements with cloud providers, deal with billing, and let departments do their own thing in the cloud.
|
# ? Jan 26, 2024 04:36 |
|
As an example of a workload that is able to leverage the real advantage of cloud services: I work for a tax prep company, we scale down an average of 80% during the off season.
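For a sense of scale, that seasonal pattern can be sketched with some back-of-the-envelope arithmetic. Every number below is a made-up placeholder (fleet sizes, hourly rate, season length) — not the poster's real figures — but the shape of the saving is the point:

```python
# Rough, illustrative arithmetic for an 80% off-season scale-down versus
# provisioning fixed capacity for the peak all year. Every number here is
# a hypothetical placeholder, not a real fleet size or price.

PEAK_INSTANCES = 100        # hypothetical fleet during tax season
OFF_SEASON_INSTANCES = 20   # the 80% scale-down
HOURLY_COST = 0.40          # hypothetical $/instance-hour
PEAK_MONTHS, OFF_MONTHS = 4, 8
HOURS_PER_MONTH = 730

def yearly_cost(peak, off):
    """Total yearly spend: `peak` instances during peak months,
    `off` instances the rest of the year."""
    return HOURLY_COST * HOURS_PER_MONTH * (peak * PEAK_MONTHS + off * OFF_MONTHS)

fixed = yearly_cost(PEAK_INSTANCES, PEAK_INSTANCES)          # sized for peak year-round
elastic = yearly_cost(PEAK_INSTANCES, OFF_SEASON_INSTANCES)  # scales down off-season
print(f"fixed: ${fixed:,.0f}  elastic: ${elastic:,.0f}  saved: {1 - elastic / fixed:.0%}")
# → fixed: $350,400  elastic: $163,520  saved: 53%
```

On-prem you pay the "fixed" line no matter what, because the hardware for the peak has to exist all year.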
|
# ? Jan 26, 2024 04:54 |
|
Yeah, cloud providers let you scale up and down, or experiment without committing capital. Separately, there are a lot of useful services and such; it's not just "rent a server in the cloud".
|
# ? Jan 26, 2024 06:11 |
|
That was a big problem at my old place: most people couldn't conceive of cloud as anything other than VMs, and it would never make sense for them to just run general VMs in the cloud. The groups that were most successful were using managed services rather than just pure IaaS. Meanwhile, the new place is a young tech company where I think the biggest benefit of the cloud early on was that they could start without a huge upfront investment. Now they're cloud native and worldwide but still have a relatively small compute footprint, and it just won't ever make sense for them to go physical.
|
# ? Jan 26, 2024 06:51 |
|
The problem with most of the managed services in most major clouds is that, aside from opaque repackagings of other services, most of what they give you aren't fully-baked products. They're building blocks with few opinions, which is great when you're trying to construct business processes and absolutely awful when you're trying to build humane user experiences for your developers. The most valuable products are still the things that could have come from a 2004-era Google paper: S3, DynamoDB, anything built intentionally to let you forget about scale. Once you get cross-cutting concerns like backup and data retention involved, every other product manages to somehow make scaling harder just by existing. This isn't a feather in the cap of on-prem in any way; it's more that cloud fails to add value more often than it actually adds it. Vulture Culture fucked around with this message at 14:29 on Jan 26, 2024 |
# ? Jan 26, 2024 14:23 |
|
I'll fight anyone who claims there's a better production-ready queue than SQS, taking into account performance, usability, scalability, and maintenance. It may have its warts, but I've been doing this long enough that I remember having to run clustered RabbitMQ, and that was an absolute nightmare.
|
# ? Jan 26, 2024 14:47 |
|
Vulture Culture posted:The problem with most of the managed services in most major clouds is that aside from opaque repackaging of other services, most of what they give you aren't fully-baked products. They're building blocks with few opinions, which are great when you're trying to construct business processes, and absolutely awful when you're trying to build humane user experiences for your developers. The most valuable products are still the things that could have come from a 2004-era Google paper: S3, DynamoDB, anything you build for intentionally that lets you forget about scale. Once you get cross-cutting concerns like backup and data retention involved, every other product manages to somehow make scaling harder just by existing. This explains a lot about our ROSA experience, which seems to require handholding for anything but the most trivial operations.
|
# ? Jan 26, 2024 18:17 |
|
Blinkz0rz posted:I'll fight anyone that claims that there's a better production-ready queue than SQS, taking into account performance, usability, scalability, and maintenance. No, I agree, SQS is very good. EMR / Spot are also very good and are certainly defining characteristics of running in the cloud. Being able to scale down 80% during the off season also rules. There's a point at which you've thrown enough workflows into EMR/Spot/S3 that it makes sense to host it yourself, but that point is very far away from the starting line. 
Another issue is that if you have to hire a bunch of cloud touchers, you end up hiring a bunch of people who read a lot of cloud vendor marketing materials. You'll end up with a micro-account strategy that makes sense to nobody and requires dedicated admin/accounting staff to pay your bill and janitor the cost allocation tags across 20+ AWS accounts to make sure they're being applied to resources correctly. Then you need a few more security people to set up Security Hub Guard Duty Whatever The gently caress in all 20+ of your accounts, and that means S3 logs, cross-account permissions, log forwarding; probably you have some collector that ingests all the logs into a product the security team bought that runs compliance scans, and now you have tons of junk logs that only exist for a compliance check to run, so you're janitoring 20+ accounts' worth of S3 lifecycle policies, etc. 
It's not my experience that the cloud is cheaper in headcount either, is what I'm getting at; the headcount just looks different. I think the bigger truth is that some environments are just easier and simpler, due to being new companies without 15 years of Microsoft Office-sourced technical and process debt. It doesn't have that much to do with whether you bought the servers or rent them from Jeff Bezos.
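That tag-janitoring chore boils down to an audit loop. In real life the inventory would come from something like the AWS Resource Groups Tagging API via boto3; the sketch below uses hard-coded dicts (with invented account IDs, ARNs, and tag names) so that only the logic is shown:

```python
# A sketch of the cost-allocation-tag audit described above: find every
# resource, across many accounts, that is missing a required tag.
# The inventory here is hypothetical; a real version would page through
# the AWS Resource Groups Tagging API per account via boto3.

REQUIRED_TAGS = {"team", "cost-center", "environment"}

def audit(accounts):
    """Return {account_id: [(resource_arn, sorted_missing_tags), ...]}
    for every resource lacking one or more required tags."""
    findings = {}
    for account_id, resources in accounts.items():
        bad = []
        for arn, tags in resources.items():
            missing = REQUIRED_TAGS - tags.keys()
            if missing:
                bad.append((arn, sorted(missing)))
        if bad:
            findings[account_id] = bad
    return findings

# Hypothetical two-account inventory
inventory = {
    "111111111111": {
        "arn:aws:s3:::data-lake": {"team": "platform", "cost-center": "42", "environment": "prod"},
        "arn:aws:ec2:us-east-1:111111111111:instance/i-abc": {"team": "platform"},
    },
    "222222222222": {
        "arn:aws:sqs:us-east-1:222222222222:jobs": {"environment": "dev"},
    },
}

print(audit(inventory))
```

The annoying part isn't this loop; it's that with 20+ accounts you're running it (and fixing the findings) forever.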
|
# ? Jan 26, 2024 18:33 |
|
It is worth noting that on-prem is a bit of a pain in its own right, because you're having to manage that hardware somehow. If you own the datacenter, then you've got to hire staff to run it, maintain the equipment, pay for the power, run the HVAC, etc. Like, yeah, an EC2 VM isn't cheap, but people definitely forget about all the things you don't have to worry about directly when you're on a cloud provider. I think there are still plenty of reasons to do on-prem, but make sure you're thinking through the whole cost of ownership. Similarly, I agree that the hosted services like S3 / SQS / DDB (and their Azure/GCloud equivalents) are all extremely useful too, and IMO are one of the biggest arguments for a cloud provider; they often help you avoid things like "oh yeah, we need a server to host files sometimes" or "I guess we need to build up a message bus, so that's a few more servers". At certain scale levels it's a total slam dunk from a cost perspective. I was extremely skeptical of cloud offerings back when my only exposure was "lift and shift onto the cloud," but once you start thinking about the service offerings and the reasonably easy interoperation between those services, it gets a lot more interesting.
|
# ? Jan 27, 2024 01:41 |
|
There are several factors that may make for a moderate-scale on-premises renaissance. I think cloud will become the default option for most organizations, especially because its turnkey managed services and ability to let groups self-service help everyone move faster. However, as DHH's post said, a large subset of the technologies and practices we use for infrastructure management in the cloud can also be applied to on-premises hardware. I think Broadcom being so aggressive with VMware pricing is going to accelerate low-expertise groups moving to the cloud, but it will also push higher-expertise groups with bigger workloads to find some way to get to a K8s-on-bare-metal model, without paying exploitative hypervisor licensing costs. From when EC2 launched until somewhere around 2018, AWS was passing on much of the cost savings from newer servers with more cores and more performance. Around five years ago they stopped doing that as much, and now AWS list pricing for fundamental compute building blocks is becoming a (comparatively) worse deal every year. The value prop is now in the higher-level managed services, but if you need a lot of compute, take a look at modern server pricing. As a tiny no-name customer buying at very small volume, I'm able to purchase servers with 336 hardware threads and 1.5TB of RAM for under $20k. With $200k to spend, you can buy more than 3,000 vCPUs and 15TB of RAM. You can stand up a decent-sized internal private cloud that's oversized enough that teams shouldn't be stuck waiting on a long cycle of load planning / purchase / rack-and-stack. Hell, now that 100Gbit networking is commodity and cheap, there's room for hyper-converged storage to mature further. I'm really rooting for the previously mentioned Oxide Computer, a turnkey private-cloud-in-a-box. 
All of this depends on having some pretty large compute loads that aren't massively assisted by the higher-level cloud product offerings, though, which is a small niche.
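As a sanity check on the arithmetic quoted above, using only the per-server numbers from the post (nothing here is a vendor quote):

```python
# Sanity-checking the pricing claim: 336 threads / 1.5TB RAM per server
# at under $20k each, with a $200k budget. Specs and prices are the ones
# quoted in the post above, not a price list from any vendor.

SERVER_COST = 20_000          # dollars per server, per the post
THREADS_PER_SERVER = 336
RAM_TB_PER_SERVER = 1.5
BUDGET = 200_000

servers = BUDGET // SERVER_COST
threads = servers * THREADS_PER_SERVER
ram_tb = servers * RAM_TB_PER_SERVER
print(f"{servers} servers -> {threads} hardware threads, {ram_tb:.1f} TB RAM")
# → 10 servers -> 3360 hardware threads, 15.0 TB RAM
```

Which matches the "more than 3,000 vCPUs and 15TB of RAM" figure, before accounting for racks, power, and the staff to run it all.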
|
# ? Jan 27, 2024 02:38 |
|
Pick, the, right, tool, for, the, job.
|
# ? Jan 27, 2024 06:15 |
|
drunk mutt posted:Pick, the, right, tool, for, the, job. It's weird, I could have sworn you said "More unnecessary EC2 VMs".
|
# ? Jan 27, 2024 09:51 |
|
did I interpret the recent ddos thing hey ran into wrong? it sounded like they were raw dogging the internet, suffered the inevitable consequences, learned a lesson they should have already known the hard way, and *then* migrated behind cloudflare.
|
# ? Jan 27, 2024 16:03 |
|
MightyBigMinus posted:did I interpret the recent ddos thing hey ran into wrong? it sounded like they were raw dogging the internet, suffered the inevitable consequences, learned a lesson they should have already known the hard way, and *then* migrated behind cloudflare. I'm actually madder about "just use Cloudflare" becoming the default option for any internet-facing service than I am about unnecessary cloud migrations. I guess the problem is that any broadly effective DDoS prevention tool is going to involve a CDN of decent size, but it feels incredibly wrong to me that Cloudflare is becoming the internet. The Cloudflare challenges also suck; I hate how they lie to users and say "Checking if your connection is secure." I wish they'd just say "Checking if you are human." The scale of these network-level DDoSes is still something new-ish. You say they were raw dogging the internet, but it wasn't clear to me that it was their application that fell over; it sounded like the internet connections of their colo host were getting fully saturated by traffic. Then again, their whole stack is in Ruby, so maybe it was actually their applications getting overwhelmed and I misunderstood. It's funny, because I've been playing with Cloudflare's new cloud tools for a while and am optimistic about Cloudflare taking over a slice of the FaaS / PaaS business from AWS, MS, and Google, but I hate their rapidly entrenching monopoly on front-end serving for a majority of the internet. I've been hosting some small toy things that get small, consistent traffic out of my house for many years at this point, with DNS records pointing straight at my IP. 
In the last few years people have started reacting with horror at this and acting like my home IP address is a secret that needs to be protected, but I am loath to put Cloudflare in front of myself without paying for it, and I don't want to pay money for this when I already have a perfectly good gigabit fiber connection. I do appreciate that effective fine-grained abuse protection looks a lot harder on IPv6 because of how big the IP space is now; it'd be really tough to avoid over-broad blocklists.
|
# ? Jan 27, 2024 16:33 |
|
12 rats tied together posted:Another issue is if you have to hire a bunch of cloud touchers you end up hiring a bunch of people who read a lot of cloud vendor marketing materials. You'll end up with a micro-account based strategy that makes sense to nobody and requires dedicated admin/accounting staff to pay your bill and janitor the cost allocation tags across 20+ aws accounts to make sure they're being applied to resources correctly and then you need a few more security people to set up Security Hub Guard Duty Whatever The gently caress in all 20+ of your accounts and then that means s3 logs, cross account permissions, log forwarding, probably you have some collector that ingests all the logs into a product the security team bought that runs compliance scans, and now you have tons of junk logs that only exist for a compliance check to run so you're janitoring 20+ accounts worth of s3 lifecycle policies, etc. Not discounting what you're saying but wanted to point out that aws orgs and control tower can make 'do thing in a bunch of accounts' pretty quick and easy
|
# ? Jan 27, 2024 18:28 |
|
I’d probably like Cloudflare more if our security team didn’t maintain it, but it’s honestly great. I’m a big fan of Fastly too, you can do some really fun things with Varnish.
|
# ? Jan 27, 2024 19:08 |
|
Cloudflare kinda lost me when they went to the mat for loving Stormfront. No ethical consumption under capitalism all other companies suck too etc but still
|
# ? Jan 27, 2024 19:34 |
|
Resdfru posted:Not discounting what you're saying but wanted to point out that aws orgs and control tower can make 'do thing in a bunch of accounts' pretty quick and easy Micro-accounts are containers for data planes, and if you're trying to manage containers using Puppet or something, you're obviously missing out on some of the benefits of that approach. e: I'll give 12 rats that Amazon's underinvestment in Resource Access Manager is also an obnoxious complication Vulture Culture fucked around with this message at 22:10 on Jan 27, 2024 |
# ? Jan 27, 2024 22:04 |
|
yeah, the tools all exist but they're bad, is the short version. although it hasn't been my job for a few years, so maybe they're less bad now
|
# ? Jan 27, 2024 22:37 |
|
Docjowles posted:Cloudflare kinda lost me when they went to the mat for loving Stormfront. No ethical consumption under capitalism all other companies suck too etc but still I’m not saying that Cloudflare is justified or whatnot, only that keeping known sinks and lightning rods around is sometimes better than scattering and further isolating really volatile people, especially if you’re essentially running a sting operation.
|
# ? Jan 29, 2024 03:54 |
|
Didn't have "history of income tax" on my CI/CD bingo card
|
# ? Jan 29, 2024 04:17 |
|
On week 2 at the new place I found myself researching Rundeck as a way to let developers spin up new dev environments in the cloud. I searched the thread and there's zero conversation about it here, which seems odd; I would think it fits here. Unless there's some new hotness that can take some parameters a dev passes in and trigger something to happen. I doubt I'd have Rundeck do much of the environment building; it would basically serve as a front end to collect some information and pass it into whatever system we decide on to actually build something (probably something in Circle CI using some Terraform... somehow).
|
# ? Jan 30, 2024 00:41 |
|
imo whatever you think you want to use rundeck for, there's a 90% chance that a real cicd platform will be a better choice e: actually you already mentioned circleci, just use that, don't involve Yet Another Thing The Fool fucked around with this message at 01:03 on Jan 30, 2024 |
# ? Jan 30, 2024 01:00 |
Just write your own web app from scratch to frontend composable provisioning requests and display status query results. Hook it up to your oauth provider. Backend is even easier, you write your own API that proxies API calls to your cloud provider(s), make it in node or even better flask. Bing bong so simple. Make sure you don’t document anything, and they’ll never be able to fire you. Now, I told two truths and a lie about what I did at my previous job. Good luck PS make sure you give your API service account write permissions on everything it can touch to reduce the busywork of RBAC management.
|
|
# ? Jan 30, 2024 01:48 |
|
pretty sure we're coworkers actually and you still work here last i knew
|
# ? Jan 30, 2024 01:50 |
12 rats tied together posted:pretty sure we're coworkers actually and you still work here last i knew It was YOU that ratted me out to the security principal, I knew it. I’m signing you up for so many camgirl websites now
|
|
# ? Jan 30, 2024 01:52 |
|
I’m surprised Rundeck doesn’t come up, because I’ve definitely posted about it, but it must have been in other threads. I used it at a past job for its advertised usage: letting people run defined tasks/jobs on rails and view the logs and results without having to give them blanket permissions on the relevant systems. It was… ok? It suffers from being designed by people who have never designed anything user-facing before, and for a tool that is meant to make ops’ life easier it is a massive pain in the rear end to operate. You have to tweak all sorts of settings to get it to run beyond toy scale, which aren’t documented anywhere besides “how I made rundeck not totally suck rear end” blog posts. It’s also Extremely Written In Java, and if you aren’t familiar with operating Java you will have a bad time. It ended up being a critical part of our infrastructure at a place where there were thousands of cron jobs, which devs were able to claw back control and visibility of via Rundeck jobs. But I would hope there is something better these days. Also, disclaimer: the last time I used it was like 2020 lol. So maybe it is better? But I would be surprised.
|
# ? Jan 30, 2024 03:15 |
|
What's the difference between Rundeck jobs and Jenkins pipelines? Nothing. They do literally the same thing, and Jenkins pipelines are more robust and more of a standard than Rundeck anyway.
|
# ? Jan 30, 2024 04:04 |
|
I wound up building our ephemeral environments based around Crossplane and ArgoCD in EKS, with a custom powershell module to wrap all the CLI crap so folks didn't need to learn 5 different utilities. Our environments consist of container stacks, a fat Windows EC2 instance with MSSQL and our legacy platform, mysql backed by EFS, vhosts on a Cloudamqp instance, Cloudflare and ALBs for ingress, telepresence.io, etc. There's like 9 kubernetes operators going and the learning cliff was steep, but I'm super happy with how reliable it's proving to be now that it's tuned for our scale.
|
# ? Jan 30, 2024 07:40 |
|
Back in, uh, 2017 we rolled out Rundeck as a container for two power users in the biz ops group to do some crude ETL stuff. It was fine and pretty painless. Could we have 1) given them access to our Jenkins server? Sure, but our cicd was a hot mess at the time and it didn't need any more cooks in the kitchen, plus granular access controls in Jenkins weren't something we wanted to gently caress around with. 2) Rolled them their own Jenkins server? Sure, but managing one is enough of a pain in the rear end; now you want me to janitor/babysit a second one with god knows what running there? Oh hell no. Gonna have to migrate some jobs to k8s cron jobs here soon, that should be interesting
|
# ? Jan 30, 2024 08:03 |
|
I can get not wanting to add the role based permissions plugins to an established Jenkins install, but on a fresh rollout using them to control access is far from the worst part of the software. You just build out some roles and assign them to user names and move on with life.
|
# ? Jan 30, 2024 12:42 |
|
Hadlock posted:Back in uh, 2017 we rolled out rundeck as a container for two power users in the biz ops group to do some crude ETL stuff. It was fine and pretty painless Yeah I would put Jenkins (especially for non technical biz ops types) in the same bucket of "fine I guess but is this really the best we can do". Is anyone ITT ready to go on record that they are happy to be running Jenkins?
|
# ? Jan 30, 2024 16:23 |
|
Alright, I'm freshly rested so I'm ready to get philosophical. If I were to use Rundeck, it would be almost exclusively as a "front end" that provides some guide rails and then passes the actual work off to something else, probably our CI/CD. And that's a reason even looking at Rundeck gives me pause, because it seems like an awful lot of work just for a "front end." On the other hand, I have a philosophical belief that a CI/CD tool should be used to build artifacts based on code changes, rather than as a generic "task execution engine." So even using it to do something like spin up a new environment gives me a little ick. An idea that came to me as we were discussing it internally is that we'd have a repo that just stores configurations of these dev environments, which kind of gets around my general "ick" about using CI/CD for this. Basically, each branch of the repo would define a dev environment using some kind of configuration file. Changes to that branch (or a new branch) would trigger a pipeline run that would take those configuration values, use something like Terraform to spin up the relevant infrastructure, and then something else (some combination of Ansible and raw scripts?) to do extra configuration, like populating a DB instance and whatever doesn't make sense in Terraform. But I'd like some kind of guard rails around the options in the config file. That's where my hope for something like Rundeck was: that it would offer some combo of free text and maybe drop-down menus to provide guidance. And then maybe all it would do is take those values, make a branch in our "dev environment config" repo, and populate the config properly, which would then trigger a pipeline job to actually build out the environment. 
This all feels very hacky even as I write it out, and it has to be a common enough need that I would imagine the DevOps community has come up with solutions for it; maybe I'm just not googling the right words to find them.
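A minimal sketch of that guard-rails idea: validate the environment config against a small set of allowed options before any pipeline touches it. The field names and allowed values below are invented for illustration; in practice they'd come from whatever the Terraform side actually accepts:

```python
# A sketch of "guard rails" for a dev-environment config file: a pre-merge
# check that rejects unknown values before Terraform ever runs. All field
# names and allowed values here are hypothetical.

ALLOWED = {
    "instance_size": {"small", "medium", "large"},
    "db_seed": {"none", "sample", "anonymized-prod"},
}
REQUIRED_FIELDS = {"env_name", "owner", "instance_size", "db_seed"}

def validate(config):
    """Return a list of human-readable problems; an empty list means valid."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - config.keys())]
    for field, allowed in ALLOWED.items():
        value = config.get(field)
        if value is not None and value not in allowed:
            problems.append(f"{field}={value!r} not in {sorted(allowed)}")
    return problems

good = {"env_name": "pet-review", "owner": "fishmanpet",
        "instance_size": "small", "db_seed": "sample"}
bad = {"env_name": "x", "instance_size": "xxl"}

print(validate(good))   # []
print(validate(bad))    # missing owner/db_seed, bad instance_size
```

Run as a check on PRs to the config repo, this gives the guidance a drop-down would, without standing up a whole Rundeck just to be a form.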
|
# ? Jan 30, 2024 16:41 |
|
In the past I wrote a big nasty Python glue script to put everything inside a single repo and run them in order. It was gross, but it was the lowest-effort option for 60 mostly unrelated scripts that had a handful of SFTP steps. The plan was to move them over to Airflow 2, but we didn't get that far; the main driver for that project at the time was to unblock the Python 2->3 project. Cisco bought Tidal back in like 2014 and attempted to rewrite the highly optimized client in JavaScript, and it was slow as balls as part of the version 6 rollout. The v5.3 was great and we used it to run all sorts of scripts and executables. Reportedly Tidal is the scheduler for Hitachi's global banking operations, and our use case wasn't too far off from that. But it's not open source. But it's hugely stable and just works. I think for like, sub-30 scripts, I'd be fine hands-on managing k8s cron jobs using IaC and ArgoCD. Since most bizops stuff is the unholy union of SQL and Excel, I suspect Airflow 2 would be great for their use case, but I haven't used it personally
|
# ? Jan 30, 2024 16:51 |
|
FISHMANPET posted:Alright I'm freshly rested so I'm ready to get philosophical. I worked at one place and they were using Pentaho/Kitchen/Spoon and whatever else is in that ecosystem. I think that stuff was fairly state of the art in like 2011. I think the solution is to look at what state-of-the-art ETL workflow stuff is doing these days, which is why I'm leaning into the idea of something like Airflow 2, since 99% of these weird script runners are bizops-tangential or directly for bizops, so you might as well put everything under a single roof. I am absolutely not deploying Jenkins in TYOOL 2024. Absolutely not. I'll quit on the spot. Kind of curious to play around with k8s cron jobs deployed as ArgoCD jobs pointing at a Helm chart. Going to start deploying Argo jobs to k8s using Terraform next week. Argo job files that deploy a Helm chart of cron jobs are basically a Jenkinsfile, but as I'm writing this I'm getting that ick feeling. There's that uncanny valley between 3 and ~30 cron jobs where it's like, "I don't want to maintain a dedicated system, but this is gonna become a giant mess for the next guy because there's no business need to set up a mature solution today or possibly ever"
|
# ? Jan 30, 2024 17:09 |
|
I brought up Rundeck in our standup this morning and my lead had never heard of it before. I feel so old. Hadlock posted:I think the solution is to look at what state of the art ETL workflow stuff is doing these days, which is why I'm leaning into the idea of something like airflow 2, since 99% of these weird script runners are bizops tangential or directly for bizops so you might as well put everything under a single roof So this is where I keep getting hung up. I don't want weird script runners. I don't want an automation engine. I have all that. I want a front end to it, that's all. Rundeck is insane overkill for that, but nobody else seems to offer it, which itself makes me worried that I'm just missing something huge. I'm just not grokking how something like Airflow even remotely helps here, but I also don't know much about Airflow. Is anybody writing up how to use Airflow in a devops workflow that isn't ETL related?
|
# ? Jan 30, 2024 17:17 |
|
My team ended up using Tilt+kind for local development and then tilt with our existing EKS cluster if someone needs to run on a "real" cluster. It's a bit jank in that some of the Tilt config lives in the individual service repos and some changes require you to git pull those changes even though we just run the latest container images from ECR. The slick part is you can have tilt build+deploy your local changes instead of ECR, which is great if a change requires changes in multiple services to work.
|
# ? Jan 30, 2024 17:20 |
|
|
FISHMANPET posted:
We did something similar prior to moving to Terraform Enterprise. It's not the worst idea, but it's really easy to underestimate the number of dev hours that will go into building and maintaining it. Don't do separate branches, though, so that you can add validation on PRs to main. quote:But I'd like some kind of guard rails around those options in the config file. We used pykwalify to validate a yml config file in pre-merge tests.
|
# ? Jan 30, 2024 17:22 |