Mr Shiny Pants
Nov 12, 2012

Space Whale posted:

So the "branch off of master, merge all into a dev branch; merge that dev branch into master to deploy" pattern has my architect a bit antsy, since with TFS it was hell. I've also seen at least one nasty git merge, but we were having spaghetti merges that looked like a cladogram of the protists.

...why not just use that dev as your master? What would make it all go pear-shaped? I know that the project files can get a bit messy, but seriously, why?

For me personally it makes it easy to reason about the state of a project. Development is the branch everybody works in and merges into. Master is what "works". If something breaks or a merge goes awry, the master branch is still good.

It also makes it easy to keep a baseline system that works: just take the master branch and deploy it.


Mr Shiny Pants
Nov 12, 2012

Space Whale posted:

That raises another question I've gotten.

Let's say that there are breaking changes made by work being done in the API or architecture of the project, or whatever. These changes all need to be shared and the breaking bits fixed. Would pushing that to master early and then doing the rebase/merge from master onto their branches be the best way to do this?

The way I look at it, if there are breaking changes then you don't want them in master at first. What if you come across some unforeseen side effect of these breaking changes and you want to roll back? It is a lot easier to keep master as a known good, and once everything passes testing and you are sure you have delivered a stable version, fast-forward master to the development branch. You share out the development branch so people are merging in, and working on, the version that has the changes they need.

This seems reasonable to me. :)

I am not a Git guru or a development guru, so if anybody has a different perspective please share. I'd like to learn as well.
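
To make the fast-forward idea concrete, here is a minimal sketch of the flow on the command line (branch names are just the ones from this discussion):

    # everybody works in and merges into development
    git checkout development
    git merge feature/some-change     # integrate a finished feature
    # ... run the test suite against development ...

    # once development passes testing, fast-forward master to it;
    # --ff-only refuses to create a merge commit, so master only ever
    # points at a commit that was already tested on development
    git checkout master
    git merge --ff-only development
    git push origin master            # deploy from master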

Mr Shiny Pants
Nov 12, 2012
What makes me wonder in such a situation is how the code path for such an error even comes to exist.

I mean, networking is such a core OS feature that this stuff should be bulletproof at this point.

Mr Shiny Pants
Nov 12, 2012

smackfu posted:

Aye that sounds good. Since our full test suite takes 15 minutes to run, even with our pretty small team we run into that conflict issue and it does waste time.

It's kind of a bummer... the Bitbucket feature that I am piloting does the build *pre-merge*, which is kind of annoying because it still means the merged code can break the master build. Your system seems better.

What I've seen at some companies is a gated merge, which means everything has to build and pass all the unit tests before it gets merged into the master branch.

Used in conjunction with: https://trunkbaseddevelopment.com/

I don't know the pros and cons of this, but I've seen it used successfully.

Mr Shiny Pants
Nov 12, 2012
Let's say I will have some Linux machines deployed around the world with intermittent connectivity (think satellites), running Docker on some Linux variant.

What would be a good way to keep these under control and up to date?

Would it be possible to just run regular Docker on them and push updates using something like Ansible?

Mr Shiny Pants
Nov 12, 2012

minato posted:

I think it comes down to splitting the update into 2 phases:
1) Ensure the desired state description and necessary assets are on the machine.
2) Initiate a server-local process to update the desired state from those local assets.

Phase 1 can be retried until successful with no side effects, which mitigates the issue of poor or intermittent connectivity. And phase 2 is likely to succeed because all the assets are already in place and no connectivity is required.

So something like: rsync the required files or updates and update from the local files?

Is this something I need to build myself, or are there some nice utilities available? I can't be the only one dealing with something like this.
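
What I am picturing is roughly this (host and paths made up):

    #!/bin/sh
    # Phase 1: get the assets onto the box; safe to retry until it succeeds.
    # rsync only transfers what changed, so a dropped link just means rerun.
    until rsync -az --partial deploy@central:/releases/app/ /opt/app/staged/; do
        sleep 60    # link is intermittent; keep retrying
    done

    # Phase 2: apply the update purely from local files; no network needed.
    cp /opt/app/staged/docker-compose.yml /opt/app/current/docker-compose.yml
    docker-compose -f /opt/app/current/docker-compose.yml up -d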

Mr Shiny Pants
Nov 12, 2012

Vulture Culture posted:

This could really be as simple as a systemd unit that downloads a compose file and runs it, if you're capable of running your reporting out of band

Ok, I can figure this out. We have no CM tooling currently available; is there something in Salt, or something else, that can schedule around connectivity?

Like check for updates at some interval? Or wait for the network to become available?
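
For example, could something as plain as a systemd timer cover both? A hypothetical sketch of Vulture Culture's suggestion (unit names and URL invented): the timer handles the interval, and ordering after network-online.target waits for the network.

    # /etc/systemd/system/app-update.service
    [Unit]
    Description=Fetch the latest compose file and (re)start containers
    Wants=network-online.target
    After=network-online.target

    [Service]
    Type=oneshot
    ExecStart=/usr/bin/curl -fsSL -o /opt/app/docker-compose.yml https://example.com/fleet/docker-compose.yml
    ExecStart=/usr/bin/docker-compose -f /opt/app/docker-compose.yml up -d

    # /etc/systemd/system/app-update.timer
    [Unit]
    Description=Periodically check for updates

    [Timer]
    # first run shortly after boot, then hourly while the box is up;
    # Persistent=true catches up on runs missed while powered off
    OnBootSec=5min
    OnUnitActiveSec=1h
    Persistent=true

    [Install]
    WantedBy=timers.target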

Mr Shiny Pants
Nov 12, 2012

Erwin posted:

Do you have control over the connection and purposefully bring it down to limit costs, or is it a random as-available thing? As much as I wouldn't recommend Chef to anyone these days, this sounds like something Chef is more suited for than Ansible, if you wanted to use an existing CM tool. Since Chef runs as an agent in a 'pull' model, each client would hit the Chef server whenever they had connectivity to grab any new configuration. Looks like the Docker cookbook can manage containers as well, but I think I'd separate system configuration and container orchestration and update a compose file like VC said.

If you have the budget for it, I know first-hand that DC/OS runs on cruise ships with sporadic satellite connections. I have strong opinions on DC/OS and simple container orchestration wouldn't be a use case I'd pick it for, but I know it's successfully solving the problem you are trying to solve.

I will take a look at this, thanks for the suggestions, guys. The connection is brought down to limit costs, so we have control over it.
How dumb would it be to run Jenkins agents remotely and have them do the hard work? Running as agents, they do almost exactly what I want.

Mr Shiny Pants
Nov 12, 2012

Hadlock posted:

Check out CoreOS and their cloud config files

Basically you pull down a new cloud config file, then reboot the server. The server executes the cloud config file (run these containers with these arguments) and away you go. If the cloud config file fails, it boots from the previous safe config. Maybe set up a cron job to do updates at predicted connectivity periods.

This sort of follows the telecom/networking model of "new update, but fail back to the previous version if it doesn't work".

CoreOS uses the Omaha protocol to poll for updates, but that is an entirely different rabbit hole.

TL;DR CoreOS was designed from the ground up to do exactly what you plan on doing

I'll have a look, thanks.

Mr Shiny Pants
Nov 12, 2012

New Yorp New Yorp posted:

It's even worse when a company just renames their operations team to "devops" and puts a bunch of people with somewhere between "zero" and "minimal" prior development experience in charge of writing software.

Or even worse, when they do all the ceremony without changing the culture.

Mr Shiny Pants
Nov 12, 2012

NihilCredo posted:

I'm looking into a storage abstraction for a product so it can run with no code changes in anything from a piddly under-the-table VM with a plain HDD to a MS/AWS/GC environment with that vendor's blob storage. Ideally, it would also handle a "I have a poor man's datacenter, N physical machines running orchestrated containers and some network storage (either a NAS or even each one with their own HDD), please replicate my data as much as you are able without giving it to those icky cloud vendors" scenario, which I'm really hoping to avoid but cannot dismiss out of hand.

Is Ceph what I'm looking for, or is there a less overkill solution? I've seen Storidge advertised around, which seems simpler but also very unproven.

The S3 API is pretty much supported everywhere, it seems. You could take a look at Gluster.

Mr Shiny Pants
Nov 12, 2012

Nomnom Cookie posted:

Try that and your system will end up not working with real S3, unless the emulator vigorously enforces metadata inconsistency. S3 provides fewer guarantees than almost any storage system, and stale reads are common.

So if you write it to the S3 spec, it can only run better anywhere else? :)

Mr Shiny Pants
Nov 12, 2012

Walked posted:

It's been a while since I did a survey of the field for CI/CD systems; just changed jobs and get to do it again.

Any new players in the last 1-2 years worth checking out? We were on CircleCI at my last place and it was fine; don't mind using it again, but I want to be sure I'm not missing anything making waves more recently.

I've been using Drone at my work. I like it, especially when paired with Gitea.

Mr Shiny Pants
Nov 12, 2012

Walked posted:

Drone always stuck out to me as a sweet option, but I never heard of anyone else using it.

I’ll give it another peek

Me neither, but we did not have anything here yet, so I was free to choose whatever struck my fancy. And it did. :) I like to host my own stuff, so that was a big plus.

Compared to all the other stuff I saw (GitLab, Circle, and Jenkins) it feels really clean. IMHO, though.

It's lightweight. One thing that took me a while to figure out is that everything needs to be done through containers. Copy something? Run a container. Etc. etc.

Took me a day to set it up, and now it pulls repos that have a release flag set, compiles everything inside a container, and deploys to a Docker host. Pretty sweet.
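
To give an idea of the everything-is-a-container model, a minimal .drone.yml along those lines (the image names, registry, and release trigger here are made up for illustration):

    kind: pipeline
    type: docker
    name: default

    steps:
      - name: build
        image: golang:1.14           # any build image; every step is a container
        commands:
          - go build ./...
          - go test ./...

      - name: publish
        image: plugins/docker        # even publishing is just another container
        settings:
          repo: registry.example.com/myapp
          tags: latest
        when:
          event:
            - tag                    # only runs for tagged releases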


Mr Shiny Pants
Nov 12, 2012

Newf posted:

Just checking: is it a bad practice for mydomain.com/.env.production.local to happily serve up a file with, eg, admin database credentials?

edit: asking for a friend

It's fine, think of all the extra admins you'll get, free of charge.

Mr Shiny Pants
Nov 12, 2012

Boz0r posted:

What is it called in Azure DevOps when committed code gets rejected if it doesn't build and pass all tests? People from our team break our pipelines all the time and I'm sick of it.

A good thing(tm).

Mr Shiny Pants
Nov 12, 2012

Gangsta Lean posted:

SAFe loving owns. Previous job flew or bought Amtrak for engineers, engineering managers, product managers, executives, *scrum masters*, etc from remote offices all across the country, even Hawaii, to a city in the northeastern US for 2 fun-filled days of sitting in a room together discussing tickets. Huge breakfast, huge lunch like 3 hours later, then after the afternoon session was over they leave you to find your own way back to a hotel in the middle of an isolated office park with nothing within walking distance except more office buildings. All the non-US-based contractors (there were a ton, mostly in eastern Europe) were not invited, they dialed in via Zoom just like they did every work day.

Executive management floated around and sat in random sessions for a few minutes throughout the day, interjecting opinions that everyone promptly ignored when they left. At the end of the second day everyone gathered to hear each team commit their soul to some unrealistic goal, lots of head nodding all around, but failing to meet a commitment never resulted in any consequences that I ever saw.

The best part about SAFe is, when management tells you to do something counter to SAFe (it happens a lot), the company buy-in is so hard that all you have to do is point out how contradictory what they’re asking is to the SAFe philosophy. It’s an excuse for following the process instead of doing the work that actually needs to be done. The product is now the process, not the software you’re actually delivering to customers.

This. The rituals become the product instead of the actual work. This has been bugging me a lot: talking about stuff but not actually doing anything.

Mr Shiny Pants
Nov 12, 2012

NihilCredo posted:

We're using Gitlab and are quite happy with it as well.

Sometimes it's got minor bugs that have been left open for 2+ years in favor of adding more enterprise paid features, which I can't really blame them for. None of those have been show-stoppers, just stuff like the build cache not triggering and slowing down builds by a few minutes.

I might consider Gitea if I did not need a built-in CI/CD system or built-in package manager, and/or if I didn't have a beefy server to host it on. Gitlab is a massive resource hog, while Gitea runs on a Pi and feels blazing fast at all times.

Then again, Gitlab is an enterprise product with all that it entails, e.g. I've never had a single issue running a plain `gitlab backup create && apt-get upgrade` after a new release; whereas Gitea is an open-source project that isn't even dogfooding itself yet (its code is hosted on GitHub).

edit: Gitea apparently supports git mirroring (while it's a paid feature in Gitlab) so you can maybe install both with mirrored repos and get a feel for which one you like better.

Gitea and Drone CI worked pretty well at my last job. Simple to install, simple to maintain.

Mr Shiny Pants
Nov 12, 2012
I would like some guidance about the following:

Say I want to have my Ansible stuff in a Git repo and have the plays be run by a CI pipeline when I update a file. What would be a good way to structure the repo?

And how do I get it to not run everything in the repo? This seems like a pretty basic thing but I haven't found a satisfactory answer for this.

An example would be a play that updates some users on a specific server. How do I get it to run just the play I updated, without re-running everything else that may also be in the repo? Do I just change the inventory?
I have a hard time getting my head around this.

Mr Shiny Pants
Nov 12, 2012
Thanks, that is exactly what I was getting at: I don't need it trying to re-create the already existing users or re-apply some other configuration item.

I don't think, for example, the LDAP modules/scripts/etc. are smart enough not to create AD users again, even though they will probably just error out because the user already exists.

Or is that the way it should work? It doesn't "smell" right, to be honest.

I'll have a look at the Drone documentation to see if I can filter on the changed files....

Edit: My thinking is that I want one repo per "tenant", as it were, which hosts all the configuration stuff relating to that tenant (infrastructure, network, etc.).
If I need to update something or add something, I want the whole repo on my machine, change or add the necessary things, check it in, review it, merge it, and have the CI do its thing without re-running the playbooks that haven't changed in the meantime.

Is this a good approach? Or should I divvy up the repos by kind, e.g. DCs, routers, webservers?
My thinking on how I should approach this is not completely clear.
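
Something like this is what I have in mind for the CI step, assuming one playbook per concern (all names made up):

    #!/bin/sh
    # Run only the playbooks touched by this push.
    # Hypothetical layout: playbooks/users.yml, playbooks/dns.yml, ...
    for play in $(git diff --name-only HEAD~1 HEAD -- 'playbooks/*.yml'); do
        ansible-playbook -i inventory/production "$play"
    done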


Mr Shiny Pants
Nov 12, 2012
Thanks for the replies, 12 rats. I have read them and you gave me something to think about. :)

Mr Shiny Pants
Nov 12, 2012
I am looking to automate a lot of our stuff, especially the deployment of resources.

Is Terraform still the way to go? Or are there better alternatives?

I really like what I've seen so far of Terraform.

Mr Shiny Pants
Nov 12, 2012

New Yorp New Yorp posted:

Needs more context. What clouds are you using? Are you targeting Kubernetes? What's your technology stack? What are you using for CI? What automation do you already have in place? Are you the only person who will be supporting this, or will it be a team effort?

Zilch at the moment. I am not looking for anything that only works on a specific cloud; we have way too many configurations running for that to work. I am looking at Terraform, the provider model, and the huge open-source effort behind it writing providers, which gives a lot of possibilities (GCP, vSphere, libvirt, etc.). No CI at the moment, but if I can help it: probably Drone. It will be a team effort, I am just asking you guys for some insights.


Hadlock posted:

Terraform is still good, better since v0.11 or 0.12 when they sort of committed (briefly) to a 1.0 spec

Terraform is better if you're doing Career-Driven Development as it's a transferable skill and most everybody interviewing you will nod their head when those words come out of your mouth

edit: kind of wondering what happened to terraform 1.0? I guess they're afraid that if they publish a stable spec, the community will fork it and add all the good features people have been wanting for years and leave them in the dust? Terraform is like this close >.< to being truly great but has a bunch of weird gotchas related to opinionated decisions the community has zero control over

I am not exactly doing this for my career, I just happen to really like the things I've seen from it. Especially the provider stuff, and it is nice to get behind a technology that won't be dead just after we took an interest in it. :D


Vulture Culture posted:

One person on one of our teams did a Terraform configuration generator and it took them like three days to do it. This should not be difficult for a company whose Sentinel engineering team is bigger than most software companies

Could you expand on this? Edit: Generating Terraform configs from already running deployments, I guess?


Mr Shiny Pants
Nov 12, 2012

the talent deficit posted:

terraform consumes so much time and attention at every place i've been that used it that i'm convinced it's a scam to ensure full employment of programmers who don't want to program

Hmmm, that's not good. I am wondering if it would be a good fit to provision VMs and the like, and use something like Ansible for configuration.

Mr Shiny Pants
Nov 12, 2012

idgi.

fletcher posted:

For us terraform has been more of a "set it and forget it" type of experience, it's been great

Maybe we should hold a poll. :)

Mr Shiny Pants
Nov 12, 2012
The way I see it for our situation:

We are not a big company by any stretch of the imagination, but we do have a lot of "tenants" that we manage.
One of the things that I could see Terraform being very good at is the whole "we have a blueprint of how we would like a certain environment to look" thing, and having it take care of that.

These environments don't change that much, but it would be really handy to be able to copy configs of certain resources between tenants and to do the low-level security stuff on who gets to log in where.

Another thing is having the ability to quickly spin up some test environments on whichever cloud provider makes sense so we can test new technologies without needing to manually install it. (which won't be done if you need to do it by hand)

Mr Shiny Pants
Nov 12, 2012
So I started working with Terraform some more and can I just say that it is loving awesome? I forgot how jaded I've become regarding software in general, but this just made me smile.

And it just loving worked: no weird error messages, no configuration errors, no nothing. Add a provider, put in the API key, and you're off.
Pretty awesome for a change, especially seeing the stuff it does (let's not get carried away, it just calls APIs), but still, I was pleasantly surprised.
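
"Add a provider, put in the API key" really is about this much HCL (the provider here is just an arbitrary example):

    terraform {
      required_providers {
        digitalocean = {
          source = "digitalocean/digitalocean"
        }
      }
    }

    # the API key comes in as a variable, e.g. exported as TF_VAR_do_token
    variable "do_token" {
      type      = string
      sensitive = true
    }

    provider "digitalocean" {
      token = var.do_token
    }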

So I've been reading some more about it and I've come to the following setup:
I read that keeping all your state in one file leads to long planning phases, so I've decided to separate my infrastructure by update frequency.
Resource groups, subnets, and all the other stuff in one; instances and the like in the other. I might even put the network stuff in a separate file.
One thing though: I am already using a "globals" module and using the state of the "top" infrastructure file as a data source. It seems there is just no way around that?

Is this smart?
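
Concretely, the data source wiring I mean looks something like this (backend and names invented; the "top" configuration has to export the value with an output block):

    # instances/main.tf -- read the "top" infrastructure state as a data source
    data "terraform_remote_state" "infra" {
      backend = "azurerm"                     # whichever backend holds the top state
      config = {
        resource_group_name  = "tfstate-rg"   # hypothetical names
        storage_account_name = "tfstate"
        container_name       = "state"
        key                  = "infra.tfstate"
      }
    }

    locals {
      # exposed in the top configuration as: output "subnet_id" { value = ... }
      subnet_id = data.terraform_remote_state.infra.outputs.subnet_id
    }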

Mr Shiny Pants
Nov 12, 2012

Methanar posted:

Separate things by area of concern. Shared infra like VPC/subnet/network stuff should be independent.

Then do one terraform project per stack of whatever you're doing.

Tag management is another perfect example of something else that is shared and should be broken down into a module and attached to other Terraform projects. We have an org-wide set of TF modules with all the environments defined in a big struct full of key-values: what cost-accounting tags to use, what the Chef server URLs are, what the AWS region is. That sort of thing. Just one place to update, and then everything else inherits.

By concern is a better description. :)
I have one "global" module that has the region and all the "environment" variables needed for this specific provider.
I don't really want to over-engineer it; we don't use stacks as it were, but I might separate a VMware cluster from the other instances.

I do get a vibe of "over-engineering" Terraform in a lot of the stuff I read.
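
For illustration, my "global" module is basically just shared values, roughly like this (contents made up):

    # modules/globals/main.tf -- nothing but values every stack needs
    variable "environment" {
      type = string
    }

    output "region" {
      value = "westeurope"    # hypothetical
    }

    output "common_tags" {
      value = {
        environment = var.environment
        managed_by  = "terraform"
      }
    }

    # usage from any stack:
    #   module "globals" {
    #     source      = "../modules/globals"
    #     environment = "prod"
    #   }
    #   tags = module.globals.common_tags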

Mr Shiny Pants
Nov 12, 2012

Volguus posted:

I have a question: What build system should I use for my personal projects?

Background: I have a server in my basement that does not have enough VMs on it at the moment. I would really like to have a build system that would monitor some git repository, pull when changes are made to it and run a particular script and provide (or put somewhere) the resulting artifacts. This is just for personal projects. The less I'd have to gently caress around with it after setup the better.

The only build system I ever touched was Jenkins, and that was (and is) very briefly for little things. I don't have a problem with it, except that it seems to be a lot more than what I need. Is there a simpler one? I have a git repo on the network (right now I'm using gitolite), but I wouldn't mind switching to another one if there's some tool out there that can combine a remote git repo and CI and be relatively headache-free to maintain. I've heard GitHub itself has some build system as well; I've never tried it. I usually only upload to GitHub if I deem my project worthy of sharing, until then I just keep it on the local network. Though now they do have free private repositories, but still... why bother? But if their build system is worth learning, then it would be a compelling argument to just move to GitHub.

What would you do for your own little projects?

Drone is nice. I use it in combination with GitLab.

https://www.drone.io/

Mr Shiny Pants
Nov 12, 2012

Volguus posted:

I run/work in Linux. A Windows VM on my machine would have the same issues as the one running in Proxmox: it needs to be replaced every 70+ days. What I was looking for was an automated way to do that (replace that VM). The obviously easy thing is to just use some key, set it to never expire, and move on with life. But I thought I may as well explore other possibilities before going there. The scripts that I have get me a working VM capable of building stuff in around 30 minutes to an hour, but they require me to click buttons. It's not a big deal once every 70+ days.

Buy a key from SA-Mart, build your VM, snapshot it (if you can), and it should work fine. I have a couple of Windows VMs created this way and they work fine.

Mr Shiny Pants
Nov 12, 2012

Methanar posted:

In the last 48 hours I've turned off 3500 CPU cores worth of poo poo that was completely unnecessary. And I've got at least another 3000 I can turn off in the next week.

Absolutely ridiculous how wasteful and careless people are.

Since you are probably not alone in this, imagine the amount of resources just sitting around everywhere. It is staggering.


Mr Shiny Pants
Nov 12, 2012

The Iron Rose posted:

I could, but the fact that customers are banging on the drums here is a limiting factor. There are lots of approaches here with service meshes, though. There was a whole 'nother solution I briefly entertained of using canary traffic splitting to do the same thing, before I realized I was trying to fit a square peg into a round hole.

Which frankly is what’s happening in general, and there’s no way I’d want to actually implement the lua plugin.

Anyways this thread remains the best grey tech thread, I continue to learn a phenomenal amount from reading what folks post here. Never even heard of CQRS before and it’s fascinating. And 2000s era tech read replicas may be, but it’s quite new and exciting round these here parts.

E: With regards to splitting traffic between read and write services, it’s really a terrible idea and the difficulty of implementation reflects that. While a fun afternoon’s research and testing, this is not the model we’re going to use. With the requirement to split traffic across multiple services with session affinity gone, the problem set becomes much simpler. More importantly, the service becomes easier to comprehend and maintain, we reduce reliability and quality problems, and we preserve flexibility in how we design our application going forwards.

CQRS is amazing; good luck finding good examples and people who understand it well enough, though. Combined with event sourcing it is really powerful, but it will take quite a rewrite.
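
For anyone who hasn't run into it: the core idea is just that writes and reads go through separate models. A toy sketch in Python (all names invented), with an event-sourced write side and a denormalized read side:

    from dataclasses import dataclass, field
    from typing import Optional

    @dataclass
    class RenameUser:
        """A command: expresses intent to change state, returns no data."""
        user_id: int
        new_name: str

    @dataclass
    class UserService:
        # write model: append-only event log (the event-sourcing part)
        events: list = field(default_factory=list)
        # read model: a denormalized view kept up to date by projections
        names_view: dict = field(default_factory=dict)

        def handle(self, cmd: RenameUser) -> None:
            """Command side: validate, record an event, update the view."""
            if not cmd.new_name:
                raise ValueError("name must not be empty")
            self.events.append(("user_renamed", cmd.user_id, cmd.new_name))
            self.names_view[cmd.user_id] = cmd.new_name

        def get_name(self, user_id: int) -> Optional[str]:
            """Query side: read-only, never touches the event log."""
            return self.names_view.get(user_id)

    svc = UserService()
    svc.handle(RenameUser(user_id=1, new_name="shiny"))
    print(svc.get_name(1))    # -> shiny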
