Hadlock
Nov 9, 2004

Set up your Dockerfile to generate a slim code image using Alpine or whatever

docker build . -t elgrillo:latest
docker push elgrillo:latest

on servers:

docker pull elgrillo:latest
docker run -d elgrillo:latest

Git clone shallow is a great solution too with minimal setup

You could push code using rsync but pulling from server side using git or containers would be the Right Way in 2022

Super-NintendoUser
Jan 16, 2004

COWABUNGERDER COMPADRES
Soiled Meat
I have a question for you guys. I have a lot of experience in Ansible and Terraform, but I have a problem that I think doesn't fall into either of these domains, and I'll explain why.

The company I work for sells a cloud-based security platform, where basically you rent slots on their cloud fabric (they have tens of thousands of nodes all over the planet), or purchase on-prem appliances, and offload your content filtering, proxy authentication, SSL decryption, etc. to them. The primary method of configuration is a centralized cloud web GUI that pushes configs to all your specific nodes, and there is also an API that allows you to get and put configuration details. From what I can see, 100% of what you can do in the GUI is possible through the API.

So, all that said, what I'm finding is that the issues they deal with become challenging at scale because of the size of these larger customers. Also, the platform is so feature-rich that it's easy to get lost in all the clicking. At the moment I'm not assigned any specific clients, so I have a lot of time to learn and work on reducing the pain points, so that when I do get assigned a client I'll have built something I can use to help myself.

For example, there was a customer that had a POC environment, on which they had around 10,000 separate and specific domain proxy whitelist rules specifying complex allow/deny permissions for various sites, services, and groups. They moved from this POC to their final system, and basically had a guy just clicking and copying all the rules. (There is no specific "connector" tool to push settings between accounts.)

I took about an hour and wrote a little Python that hit the API, got the list of rules (which the API happily gives you as JSON), made a few formatting changes to it (due to version differences), and pushed it to the other system, and there was much rejoicing. So what I did next was to pull any service ticket that I think intelligent use of the API could solve at scale, and add methods to what has turned into a rather largish Python module that is extremely useful for troubleshooting common issues.
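
The shape of that first script was roughly this (a sketch only; the endpoint paths, auth header, and field names here are made up for illustration, not the vendor's real API):

code:
import requests

# Sketch: copy proxy rules from the POC account to the production account.
# URLs, auth scheme, and the dropped field are hypothetical placeholders.
SRC = "https://api.vendor.example/v1/accounts/poc"
DST = "https://api.vendor.example/v2/accounts/prod"
HEADERS = {"Authorization": "Bearer <token>"}

def get_rules(base_url):
    resp = requests.get(f"{base_url}/proxy-rules", headers=HEADERS)
    resp.raise_for_status()
    return resp.json()

def reformat(rule):
    # The two accounts ran different platform versions, so strip/rename
    # whatever fields the destination no longer accepts.
    rule = dict(rule)
    rule.pop("legacy_id", None)
    return rule

def push_rules(base_url, rules):
    for rule in rules:
        resp = requests.post(f"{base_url}/proxy-rules", json=rule, headers=HEADERS)
        resp.raise_for_status()

if __name__ == "__main__":
    push_rules(DST, [reformat(r) for r in get_rules(SRC)])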

So this week, a problem came in where a customer fat-fingered something and deleted a specific configuration item. Now, the cloud environment has backups, but it's an all-or-nothing restore (i.e. revert the node to the night before). I've started to figure out the typical things that people configure and I've been adding them to my pip module, with the idea of maintaining a local config library. The logical extension of that is: why do I even need the GUI, why can't I just write flat files, maintain them in git, and push them as I make changes? A non-trivial number of serious production issues would be eliminated if customers managed their accounts this way. The production outage where they deleted the access rules for their ERP would have been easy to fix.

Basically I'm sort of envisioning a Terraform provider where you can write configuration and then push it, but I think that Terraform and Ansible aren't the best way to manage persistent configuration of a platform via API. I believe that Puppet is a better solution for this, since I've heard that it's a good configuration management tool, but before I start to learn it, I'd like to get some input. I don't have any experience with it; however, I have interacted with an ops team that uses it (I used Ansible to deploy nodes, they took ownership and used Puppet to push configs via SSH and API). Would the correct concept be to write a Puppet module that interfaces with the API?

I think I can bring serious value to the company by proposing something along these lines, and they have a dev team that can easily make this happen. They have all the parts to really use configuration as code; they're just missing the engine. I'm not in DevOps any more, I'm in Sales/Customer Management (ultimately I want to be a sales engineer), but I think something like this as a product could be a huge asset and drive some revenue, so I want to get it into people's minds.

vanity slug
Jul 20, 2010

Why wouldn't Terraform be the right tool for this job? Sounds like a perfect fit to me.

The Fool
Oct 16, 2003


Terraform providers are literally an interface between the HCL and an API.

Super-NintendoUser
Jan 16, 2004

COWABUNGERDER COMPADRES
Soiled Meat

vanity slug posted:

Why wouldn't Terraform be the right tool for this job? Sounds like a perfect fit to me.

I've been under the assumption that Terraform was really good for defining infrastructure and not configuration; that was really all that made me slow down. I was already poking around at writing a Terraform module, but any Google searches for configuration management return Puppet or Chef, and I didn't want to embark on a false start.

Docjowles
Apr 9, 2009

+1 for Terraform being a good fit based on your description. Puppet/Chef etc. are fine tools for defining the configuration of a traditional server: put this file with this content here, make sure Apache is installed, ensure syslog is running. Which is why they are described as configuration management tools. You can write modules that execute arbitrary code and do whatever you want, but interfacing with remote APIs isn’t really their wheelhouse.

It is exactly Terraform’s wheelhouse, however. You describe the desired config in HCL and then your provider goes and talks to the API to make it so. Also, your infrastructure’s definition in Terraform is totally a configuration.
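
At its core the provider loop is just "read what exists, compare to what's declared, make API calls to close the gap." A toy sketch of that idea in plain Python (not actual Terraform internals; the endpoint, auth, and field names are invented):

code:
import requests

# Toy desired-state reconcile loop, to illustrate what a provider does under
# the hood. Endpoints, auth, and fields are hypothetical placeholders.
API = "https://api.vendor.example/v1/proxy-rules"
HEADERS = {"Authorization": "Bearer <token>"}

desired = {
    "allow-erp": {"action": "allow", "destination": "erp.example.com"},
    "block-social": {"action": "deny", "category": "social-media"},
}

def reconcile():
    actual = {r["name"]: r for r in requests.get(API, headers=HEADERS).json()}
    for name, want in desired.items():
        if name not in actual:
            requests.post(API, json={"name": name, **want}, headers=HEADERS)
        elif {k: actual[name].get(k) for k in want} != want:
            requests.put(f"{API}/{name}", json=want, headers=HEADERS)
    for name in actual.keys() - desired.keys():
        requests.delete(f"{API}/{name}", headers=HEADERS)

if __name__ == "__main__":
    reconcile()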

Quebec Bagnet
Apr 28, 2009

mess with the honk
you get the bonk
Lipstick Apathy

El Grillo posted:

We have a large-ish repo, and relatively small server storage space (due to cost). We only need to distribute the repo folders to our servers, we don't need to send the ~14gig /.git directory. Is there any way to do this, whilst still having it as a repo on the servers so that we can do small updates to servers without having to just redownload the whole repo every time?

The only suggestions I see on the net seem to be to do a shallow clone (git clone --depth 1), and then basically deregister the repo on the server and delete the /.git directory. But that still requires you to have enough space to get the /.git directory in the first place, and a shallow clone doesn't help us much (it only reduces that directory by about 1 gig).

I suspect the answer is 'no' but figured here of all places someone would be able to give a definitive answer.

This should only pull down the specific branch. As others said you should probably figure out if you can permanently shrink your repo size.

code:
git clone --depth 1 --single-branch --branch my_deploy_branch example.com/repo.git

Quebec Bagnet fucked around with this message at 00:09 on Aug 17, 2022

Hadlock
Nov 9, 2004

If you have a ton of images, might look into other architectural solutions, like storing images in s3 and pointing cloudflare at s3

If you have a bunch of compiled machine learning models, s3 is a good place for that too

If the shallow clone is 14gb, I would figure out why and how you can trim that down, something went wrong

12 rats tied together
Sep 7, 2006

Super-NintendoUser posted:

I took about an hour and wrote a little Python that hit the API, got the list of rules (which the API happily gives you as JSON), made a few formatting changes to it (due to version differences), and pushed it to the other system,

Basically I'm sort of envisioning a terraform provider where you can write configuration and then push it, but I think that terraform and ansible aren't the best way to manage persistent configuration of a platform via API.

In addition to what other posters have said, which is all correct, if you spend some time poking around in the Ansible source I think you'll find that it contains hundreds of thousands of lines of code that are exactly this, for a variety of persistent configurations of a variety of platforms, exclusively via API (since there's otherwise no way to do this).

You should take your Python script and turn it into an Ansible module. It's very easy to do this: you pretty much just wrap your script in a function (usually called `run_module()`), and then the Ansible task queue worker will call that function with whatever input the play currently has in scope for you.

This is probably the best place to get started: https://docs.ansible.com/ansible/latest/dev_guide/developing_modules_general.html#creating-a-module
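
Following that guide, the wrapped script ends up looking roughly like the skeleton below (the module arguments and the API call are placeholders for whatever your script actually does, not a finished module):

code:
#!/usr/bin/python
from ansible.module_utils.basic import AnsibleModule
import requests

# Sketch of the rule-pushing script wrapped as an Ansible module, following
# the structure in the developer guide above. Argument names and the endpoint
# are hypothetical.
def run_module():
    module_args = dict(
        api_url=dict(type='str', required=True),
        token=dict(type='str', required=True, no_log=True),
        rules=dict(type='list', elements='dict', required=True),
    )
    module = AnsibleModule(argument_spec=module_args, supports_check_mode=True)
    result = dict(changed=False, pushed=0)

    if module.check_mode:
        module.exit_json(**result)

    headers = {"Authorization": f"Bearer {module.params['token']}"}
    for rule in module.params['rules']:
        resp = requests.post(f"{module.params['api_url']}/proxy-rules",
                             json=rule, headers=headers)
        if resp.status_code >= 400:
            module.fail_json(msg=f"API rejected rule: {resp.text}", **result)
        result['pushed'] += 1
        result['changed'] = True

    module.exit_json(**result)


if __name__ == '__main__':
    run_module()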

Warbird
May 23, 2012

America's Favorite Dumbass

My goodness. I decided to set up a quick GitHub actions pipeline because I was tired of dealing with spinning up a Jenkins VM/Agent on Proxmox each time I wanted to build a new container and the entire affair took like 10 minutes, and most of that was my dumb rear end not realizing that there were a few syntax changes I hadn't caught. God this is nice.

El Grillo
Jan 3, 2008
Fun Shoe

Twerk from Home posted:

I've got a couple of thoughts. How big is the actual total distributable directory you're wanting to send out? Is it almost 14GB? Bigger, because .git is compressed and your uncompressed is even bigger? If you have huge files in your git history that were checked in in the past and then removed and you are confident you will no longer need them, you can clean them out easily with a tool that edits git history like The BFG Repo Cleaner: https://rtyley.github.io/bfg-repo-cleaner/

Alternatively, you could start using a flow with git archive, which will create a zip or tar of the entire contents of the repository without the .git, and then you could distribute the archive to the servers.

I'm assuming that the big size is driven by some kind of huge binary files in the repository; you could also separate those out with Git Large File Storage: https://git-lfs.github.com/, or even just by having a script to download those files over https from some server that offers them, which you could run immediately after checkout.

Does any of that sound reasonable?

Hadlock posted:

If you have a ton of images, might look into other architectural solutions, like storing images in s3 and pointing cloudflare at s3

If you have a bunch of compiled machine learning models, s3 is a good place for that too

If the shallow clone is 14gb, I would figure out why and how you can trim that down, something went wrong

Quebec Bagnet posted:

This should only pull down the specific branch. As others said you should probably figure out if you can permanently shrink your repo size.

code:
git clone --depth 1 --single-branch --branch my_deploy_branch example.com/repo.git
These are all super helpful, sorry for the delay in responding.

The actual repo files are just over 8GB.

Unfortunately the size is not due to large individual files that can be pruned or split off in some way (in fact, by necessity due to other systems we're using there is no individual file over 15mb and there are about 150k files total). This fact combined with many commits, and some of those commits making changes to thousands of files at a time, is I presume why the .git is ~14gb even when using shallow clone.

All that being said, I'm ignorant enough to not be sure yet whether the code suggestion above from Quebec Bagnet would work, but I'm pretty sure it wouldn't, because --depth 1 simply prevents it from pulling versions older than the most recent, and the most recent still contains all that commit history anyway, right?

The ideal solution might be some way to get clone to ignore all the commit history before the latest tagged release, because we essentially do not need easy access to that previous history once we've reached a new tagged release.
Maybe the ultimate solution is that we just use rclone to get the latest tagged release each time we deploy.. that would mean downloading the whole 8gigs+ each time though right? Because rclone won't diff and update individual files?
Then we could keep production servers at low drive volumes, and only need to have a larger drive to contain the whole repo on our test server which we need to rapidly deploy new commits to (i.e. needs to have the whole repo so we can simply update it rapidly at every new commit as we're approaching release and are debugging).

In any scenario, rapid deployment is pretty important so ideally we don't want to have to do a full blank slate download of 150k files (the 8GB) to all servers at every tagged release. But it might simply be that there is no software solution that allows us to just get the latest commits (properly diff'd so we're not doing a full clone every time) without also having the whole .git folder locally too.

edit:

Hadlock posted:

Set up your Dockerfile to generate a slim code image using Alpine or whatever

docker build . -t elgrillo:latest
docker push elgrillo:latest

on servers:

docker pull elgrillo:latest
docker run -d elgrillo:latest

Git clone shallow is a great solution too with minimal setup

You could push code using rsync but pulling from server side using git or containers would be the Right Way in 2022
Ah, we are not using Docker but maybe we could. I will pass this along to our admin and see what he thinks. Shallow doesn't do the job sadly (see above).
We can push using git because no need to push from the deployment side.

El Grillo fucked around with this message at 09:22 on Aug 21, 2022

beuges
Jul 4, 2005
fluffy bunny butterfly broomstick

El Grillo posted:

These are all super helpful, sorry for the delay in responding.

The actual repo files are just over 8GB.

Unfortunately the size is not due to large individual files that can be pruned or split off in some way (in fact, by necessity due to other systems we're using there is no individual file over 15mb and there are about 150k files total). This fact combined with many commits, and some of those commits making changes to thousands of files at a time, is I presume why the .git is ~14gb even when using shallow clone.

All that being said, I'm ignorant enough to not be sure yet whether the code suggestion above from Quebec Bagnet would work, but I'm pretty sure it wouldn't, because --depth 1 simply prevents it from pulling versions older than the most recent, and the most recent still contains all that commit history anyway, right?

The ideal solution might be some way to get clone to ignore all the commit history before the latest tagged release, because we essentially do not need easy access to that previous history once we've reached a new tagged release.
Maybe the ultimate solution is that we just use rclone to get the latest tagged release each time we deploy.. that would mean downloading the whole 8gigs+ each time though right? Because rclone won't diff and update individual files?
Then we could keep production servers at low drive volumes, and only need to have a larger drive to contain the whole repo on our test server which we need to rapidly deploy new commits to (i.e. needs to have the whole repo so we can simply update it rapidly at every new commit as we're approaching release and are debugging).

In any scenario, rapid deployment is pretty important so ideally we don't want to have to do a full blank slate download of 150k files (the 8GB) to all servers at every tagged release. But it might simply be that there is no software solution that allows us to just get the latest commits (properly diff'd so we're not doing a full clone every time) without also having the whole .git folder locally too.

edit:

Ah, we are not using Docker but maybe we could. I will pass this along to our admin and see what he thinks. Shallow doesn't do the job sadly (see above).
We can push using git because no need to push from the deployment side.

You said initially that the destinations all have low storage space due to cost so I’m guessing there are a lot of destinations where an extra 10gb per node will add up. In that case, what if you provision one new node with 16gb of space, clone the repo there including whatever extra history comes along, and use that to publish the resulting 8gb of actual data/code that you want to distribute to the other nodes?

Hadlock
Nov 9, 2004

El Grillo posted:

The actual repo files are just over 8GB.

Unfortunately the size is not due to large individual files that can be pruned or split off in some way (in fact, by necessity due to other systems we're using there is no individual file over 15mb and there are about 150k files total). This fact combined with many commits, and some of those commits making changes to thousands of files at a time,

I think the core problem here is your code base is poorly architected

What you're describing here sounds like what a database would be used for

Sounds like

1) developer updates code
2) code generates/updates code objects
3) these secondary objects are stored in the same git repo*
4) code + compiled objects are distributed together
5) code that consumes/uses compiled objects is also distributed with objects **

Generally, as the person distributing the code, you don't have much control over the design, but if you are brave enough and complain loudly enough that the design is so hosed even modern tooling can't bandaid it anymore, sometimes you'll pique someone's interest and they'll pull at that string and unravel the sweater of poo poo someone has been weaving for years

* Generating code should be its own repo; storing compiled objects in the repo is the first mistake
**1 Compiled objects should live in a database, or just S3; Postgres is more than happy to store 15mb binary objects all day long, it is, of course, a database
**2 Code that consumes objects doesn't need to be in the same repo as the object generators

Still sounds like you can split out the data and store it elsewhere, something like

1) CI/CD system builds/updates objects
2) objects get stored in database
3) workers pull down objects from db as needed, ad hoc

Congrats, the super complex system your really clever developer slapped together with duct tape and baling wire over 10 years is now just a plain-rear end CRUD system that is maintainable, your worker cost per unit can be cut in half, you saved the company, hooray
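
For what it's worth, step 3 can be as small as the sketch below (assuming S3 as the object store; the bucket name and key layout are made up):

code:
import boto3

# Sketch of a worker fetching a compiled object on demand instead of shipping
# it inside the git repo. Bucket name and key scheme are hypothetical.
s3 = boto3.client("s3")
BUCKET = "elgrillo-artifacts"

def fetch_object(name, version, dest_path):
    """Download one compiled object ad hoc, only when this worker needs it."""
    s3.download_file(BUCKET, f"objects/{version}/{name}", dest_path)
    return dest_path

# e.g. fetch_object("scene-cache.bin", "v1.14.2", "/tmp/scene-cache.bin")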

Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug
It's unlikely you're going to get cycles to re-architect at this late date, so the next best thing is to do some repo surgery as was suggested. In terms of work:reward you're probably looking at half a day's effort which is a pretty reasonable sell.

At least you can buy yourself some more time, as twerk says, by using bfg-repo-cleaner, git filter-repo, and git-sizer.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
There was a nice self-hosted CI/CD project I came across a few weeks ago and for the life of me I can't remember the name of it. I want to say it had something about fresh bananas in the name? I may be way off here.

What's a nice, simple, self-hosted CI/CD pipeline? I don't think I'm ready to migrate my repos to a self-hosted solution, but I have some pipelines I'd rather run at home instead of poking holes in my firewall for Bitbucket to talk to my server.

The Fool
Oct 16, 2003


I've never used Bitbucket, but neither ADO Pipelines nor GitHub Actions need firewall rules for their local agents.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

The Fool posted:

I've never used Bitbucket, but neither ADO Pipelines nor GitHub Actions need firewall rules for their local agents.

Ohh, I didn't realize this was a thing! Looks like I could self-host a runner for Bitbucket as well without having to mess with the firewall.

LochNessMonster
Feb 3, 2005

I need about three fitty


fletcher posted:

Ohh, I didn't realize this was a thing! Looks like I could self-host a runner for Bitbucket as well without having to mess with the firewall.

Don’t use Shitbucket unless you absolutely have to.

Methanar
Sep 26, 2013

by the sex ghost

LochNessMonster posted:

Don’t use Shitbucket unless you absolutely have to.

bitbucket is... fine

LochNessMonster
Feb 3, 2005

I need about three fitty


Methanar posted:

bitbucket is... fine

Easily the worst git platform out there, besides selfhosted probably.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

LochNessMonster posted:

Easily the worst git platform out there, besides selfhosted probably.

Certainly wouldn't use Bitbucket if I were starting fresh today, but I started using it ages ago and I'm too lazy to spend a few hours migrating to a different platform. Tempting to self-host with Gitea or something, maybe once I get caught up on other projects.

Hadlock
Nov 9, 2004

Had Bitbucket, GitHub, and very very briefly used Gitea; never had any issues.

I prefer GitHub because I'm more familiar with their UI, but I've never really had an issue with any of them.

Mr. Crow
May 22, 2008

Snap City mayor for life
Ya bitbucket is fine.


This is maybe late, but for the goon with the large repo problem (same :hfive:), you could use --filter=blob:none, which tells git not to download file contents (blobs) until it actually needs them. It's better than --depth because you get the full commit history, just none of the blobs. This talks about some other strategies:

https://github.blog/2020-12-21-get-up-to-speed-with-partial-clone-and-shallow-clone/

Junkiebev
Jan 18, 2002


Feel the progress.

Is Microsoft going to start hard-charging into GitHub and let AzDO wither on the vine? I'd venture that they are!

New Yorp New Yorp
Jul 18, 2003

Only in Kenya.
Pillbug

Junkiebev posted:

Is Microsoft going to start hard-charging into GitHub and let AzDO wither on the vine? I'd venture that they are!

No, they're not going to start... they've been doing it for at least a year already.

Microsoft is pushing GitHub hard. Most of the Azure DevOps dev team appears to have been moved over to GitHub. You can tell from the Azure DevOps release notes that it's basically a skeleton crew providing bugfixes and security updates, not really focusing on new feature development. That said, Azure DevOps is still great at what it does, and GitHub Actions is still essentially a fork of Azure Pipelines. There's no harm in continuing to use it if it's meeting your needs.

There is also talk of first-party migration tooling.

Methanar
Sep 26, 2013

by the sex ghost
I have just spent the last 90 minutes conclusively proving that something should not work. And yet it does.

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed
The next step is for it to magically stop working and then never work again.

12 rats tied together
Sep 7, 2006

It will have been working because of conntrack nat entries, and eventually enough things will reboot at once to flush this translation from whatever cache. Your job is to make sure it's not your problem when this happens.

LochNessMonster
Feb 3, 2005

I need about three fitty


Methanar posted:

I have just spent the last 90 minutes conclusively proving that something should not work. And yet it does.

Calling it, it’s DNS.

Wizard of the Deep
Sep 25, 2005

Another productive workday

Methanar posted:

I have just spent the last 90 minutes conclusively proving that something should not work. And yet it does.

I will gladly trade you the past week I've spent figuring out why something that should work does not.

My case is not DNS.

some kinda jackal
Feb 25, 2003

 
 

Methanar posted:

I have just spent the last 90 minutes conclusively proving that something should not work. And yet it does.

That's nothing. I spent the last 90 minutes conclusively proving that no human being should work with Oracle Cloud. And yet they're in our cloud portfolio.

Methanar
Sep 26, 2013

by the sex ghost

LochNessMonster posted:

Calling it, it’s DNS.

It's some seriously jacked up PKI where x509 client certs which belong to an expired CA are somehow still working.
The situation is more complicated than that for a bunch of stupid reasons, but as far as I can tell I ultimately have an invalid x509 client cert successfully authing which means a lot of things have been working by accident for the past 31 days.

I don't even want to touch it.

Methanar fucked around with this message at 23:27 on Aug 24, 2022

Wizard of the Deep
Sep 25, 2005

Another productive workday

Methanar posted:

It's some seriously jacked up PKI where x509 client certs which belong to an expired CA are somehow still working.
The situation is more complicated than that for a bunch of stupid reasons, but as far as I can tell I ultimately have an invalid x509 client cert successfully authing which means a lot of things have been working by accident for the past 31 days.

I don't even want to touch it.

I bet all they're doing is verifying the cert thumbprint, and ignoring literally everything else.
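
Something like the sketch below would reproduce that symptom: a pinned thumbprint keeps matching forever, while checking the validity windows on the leaf and the CA would have flagged it (file paths are hypothetical; uses the Python cryptography package):

code:
from datetime import datetime
from cryptography import x509
from cryptography.hazmat.primitives import hashes

# Sketch: thumbprint-only "validation" vs. checking validity windows.
# Certificate file paths are hypothetical placeholders.
def load(path):
    with open(path, "rb") as f:
        return x509.load_pem_x509_certificate(f.read())

client_cert = load("client.pem")
issuing_ca = load("ca.pem")

# The suspected "validation": compute the fingerprint, compare it to an
# allowlist, and stop there -- the expired CA never gets looked at.
print("thumbprint:", client_cert.fingerprint(hashes.SHA256()).hex())

# Part of what real validation also does: check the clock against the leaf
# AND its CA (plus chain building, signatures, revocation, key usage, ...).
now = datetime.utcnow()
for name, cert in [("client", client_cert), ("CA", issuing_ca)]:
    if not (cert.not_valid_before <= now <= cert.not_valid_after):
        print(f"{name} certificate is outside its validity window")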

Docjowles
Apr 9, 2009

Methanar posted:

I have just spent the last 90 minutes conclusively proving that something should not work. And yet it does.

I feel like past a certain seniority level in Ops/SRE, interviews should include a section on the works of Nietzsche, Lovecraft, and Kafka. If you can’t identify with these men, you haven’t been gazing into the abyss long enough

Zorak of Michigan
Jun 10, 2006


"I think applying for this director position demonstrates that I have the will to power. Thinking that a director has meaningful power demonstrates my Kafkaesque understanding."

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

Methanar posted:

It's some seriously jacked up PKI where x509 client certs which belong to an expired CA are somehow still working.
The situation is more complicated than that for a bunch of stupid reasons, but as far as I can tell I ultimately have an invalid x509 client cert successfully authing which means a lot of things have been working by accident for the past 31 days.

I don't even want to touch it.

If you touch it then it's your problem to fix when it stops working, so I highly recommend not touching it.

Dukes Mayo Clinic
Aug 31, 2009

Methanar posted:

It's some seriously jacked up PKI where x509 client certs which belong to an expired CA are somehow still working.
The situation is more complicated than that for a bunch of stupid reasons, but as far as I can tell I ultimately have an invalid x509 client cert successfully authing which means a lot of things have been working by accident for the past 31 days.

I don't even want to touch it.

this will somehow still be DNS. best advice: continue to not touch the poop.

LochNessMonster
Feb 3, 2005

I need about three fitty


Methanar posted:

It's some seriously jacked up PKI where x509 client certs which belong to an expired CA are somehow still working.
The situation is more complicated than that for a bunch of stupid reasons, but as far as I can tell I ultimately have an invalid x509 client cert successfully authing which means a lot of things have been working by accident for the past 31 days.

I don't even want to touch it.

Is it a DIY PKI client implementation that only checks the issued certificate (assuming that is still valid) but not the chain, or is your code running it without certificate validation entirely?

NihilCredo
Jun 6, 2011

iram omni possibili modo preme:
plus una illa te diffamabit, quam multæ virtutes commendabunt

Plorkyeran posted:

The next step is for it to magically stop working and then never work again.

Wile E. Coyote was an extended satire of software engineering.

freeasinbeer
Mar 26, 2015

by Fluffdaddy
Let's Encrypt abused Android not checking root CA expiration when their root cert expired, so I wouldn’t be shocked if that’s more widespread than people think.

Letsencrypt posted:

IdenTrust has agreed to issue a 3-year cross-sign for our ISRG Root X1 from their DST Root CA X3. The new cross-sign will be somewhat novel because it extends beyond the expiration of DST Root CA X3. This solution works because Android intentionally does not enforce the expiration dates of certificates used as trust anchors. ISRG and IdenTrust reached out to our auditors and root programs to review this plan and ensure there weren’t any compliance concerns.
