Hughlander
May 11, 2005

Space Whale posted:

So the "branch off of master, merge all into a dev branch; merge that dev branch into master to deploy" pattern has my architect a bit antsy, since with TFS it was hell. I've also seen at least one nasty git merge, but we were having spaghetti merges that looked like a cladogram of the protists.

...why not just use that dev as your master? What would make it all go pear shaped? I know that the project files can get a bit messy, but seriously, why?

I feel we're missing some context here. Usually the reason to do that is to use the named branch refs as tags representing state, i.e. the tip of master is always what's on production. Places with more environments will use tags, though. Where I am, the pattern per deployment is to make two tags, CURRENT/ENVNAME and ENVNAME/YYYYMMDDHHMM, so something like CURRENT/PRODUCTION and QA23/201502090844. The first is overwritten with each deployment but great for scripts to boot from; the latter is a history and deployment log that's useful when reproing bugs. But even then, once prod is smoke tested, master is set with a git checkout -B master.
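A minimal sketch of that tagging pattern in a deploy script, assuming the remote is called origin (the environment name and variable names are just illustrative):

code:
# Hypothetical deploy-time tagging, per the pattern above.
ENV=QA23                          # environment being deployed (illustrative)
STAMP=$(date +%Y%m%d%H%M)         # YYYYMMDDHHMM

# Movable pointer: always the commit currently running in this environment.
git tag -f "CURRENT/$ENV"
# Immutable record: one tag per deployment, doubles as a deploy log.
git tag "$ENV/$STAMP"
git push -f origin "CURRENT/$ENV" "$ENV/$STAMP"

# Once prod is smoke tested, move master to the deployed commit.
git checkout -B master
git push origin master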

Hughlander
May 11, 2005

Space Whale posted:

What I mean is after you coalesce your feature branches into dev, and you're ready to deploy, sometimes that dev branch doesn't merge smoothly into master. So, why not just overwrite master with it?

You're missing what should be a step, it seems. Long-lived branches should be constantly rebased onto the latest master. If that's done, there is no merge back into master; rather, master is a fast-forward of the long-lived branch. If you go down that route you should literally use git merge --ff-only at the end.
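Roughly, that flow looks like this (a sketch, assuming the long-lived branch is called dev and the remote is origin):

code:
# Keep the long-lived branch rebased onto the latest master.
git checkout dev
git fetch origin
git rebase origin/master        # resolve conflicts here, while they're small

# At deploy time, master becomes a fast-forward of dev.
git checkout master
git merge --ff-only dev         # refuses to do anything but a fast-forward
git push origin master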

Hughlander
May 11, 2005

Space Whale posted:

That raises another question I've gotten.

Let's say that there are breaking changes made by work being done in the API or architecture of the project, or whatever. These changes all need to be shared and the breaking bits fixed. Would pushing that to master early then doing the rebase/merge from master onto their branches be the best way to do this?

Depends on how large and crippling they are. If the person making the change fixes it in the master branch that's going to be pushed, then the only thing that really matters is whatever some other engineer is working on in their branch, which is the point of frequently rebasing from master.

However, one time we had a Java project that was removing a level of the package namespace (com.company.project.tech to com.company.tech) along with other large changes. It basically involved moving every file in the entire project down a level. What the architect doing that did was call a meeting late in the day: "Ok guys, this is going out tomorrow, everyone merge this branch in and we'll sit here quietly working on it, making sure that all of your long-lived branches are clean and pass tests." But that was literally a one-time thing, and the only example of that I can think of. I've done plenty of API rewrites as part of an old and crusty code base, and the mantra is basically, "It works on master and passes tests; if it doesn't work on your branch, why do you have such a long-lived branch?"

Hughlander
May 11, 2005

Pollyanna posted:

What options are there for centralized config options/keys? One of our projects relies on a single config file that's copied over to every new Dev machine, and when changes in the config file happen, it makes everyone else's outdated and causes problems with failing tests and poo poo. Is there a service that offers a "centralized" config file or ENV variables?

If the source requires it to build/run, and changes to it break things, you may want to consider it part of the source that needs control. Maybe by a source control system.

Hughlander
May 11, 2005

the talent deficit posted:

tell me how dumb this plan is:

1. encrypt sensitive information in my ansible playbooks locally with ansible vault
2. push to git
3. provisioning pulls git repo and unencrypts using ansible-vault and a password stored on the provisioning server
4. aws codedeploy using ansible locally to do final provisioning on a mostly preprovisioned ami running in an asg

i don't need really good security, i just want to keep passwords out of git but i don't want to run zookeeper/consul/whatever to do so if i can avoid it. i also want to avoid putting the passwords in my asg amis if possible, hence decrypting on provisioning

Have you also considered AWS KMS? http://aws.amazon.com/kms/
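A rough sketch of what that could look like with the AWS CLI, so only ciphertext lives in git (the key alias and file paths are illustrative; note that direct KMS encrypt is only meant for small payloads of a few KB):

code:
# Encrypt before committing; only the .enc file goes into git.
aws kms encrypt \
    --key-id alias/deploy-secrets \
    --plaintext fileb://group_vars/all/secrets.yml \
    --query CiphertextBlob --output text | base64 --decode > group_vars/all/secrets.yml.enc

# On the provisioning host, decrypt at deploy time; whatever runs this needs
# kms:Decrypt on the key (instance role or the provisioning server's creds).
aws kms decrypt \
    --ciphertext-blob fileb://group_vars/all/secrets.yml.enc \
    --query Plaintext --output text | base64 --decode > group_vars/all/secrets.yml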

Hughlander
May 11, 2005

Ithaqua posted:

For Git, they have pull requests + branch policies, where it will reject the pull request automatically if a build fails.

We have something like that using git's pre-receive hook. If a commit comes in and it hasn't passed tests, schedule them; if it's master and tests haven't run or passed, reject the push.
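A stripped-down sketch of that kind of hook; check_ci_status and schedule_ci_build here are hypothetical stand-ins for however you actually talk to your CI server:

code:
#!/bin/sh
# pre-receive: stdin is "<old-sha> <new-sha> <refname>" for each pushed ref.
while read old new ref; do
    if [ "$ref" = "refs/heads/master" ]; then
        # Only let commits onto master that CI has already built green.
        if ! check_ci_status "$new"; then      # hypothetical CI lookup
            echo "Rejecting push to master: $new has not passed tests." >&2
            exit 1
        fi
    else
        schedule_ci_build "$new" "$ref"        # hypothetical: kick off the tests
    fi
done
exit 0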

Hughlander
May 11, 2005

Virigoth posted:

We use a c4.2xlarge for the master. I should say it's 250 slave executors on other standalone Jenkins slave boxes. I'll look up data tomorrow and post it. I can also toss up a script that checks for jobs on master if you need an easy one to run and get a report back on a timer.

We use an 8xl for about the same load, with the JVM tuned for large heaps. We also stop it nightly for a clean backup.

Hughlander
May 11, 2005

Dreadrush posted:

Thanks for your advice. I read a blog post saying that Docker can be used for compiling your application too and not just solely concentrating on what is deployed, but I guess this is not the right way to do it.

I made that as a half joke this week. Someone was complaining that the hardest problem with open source was building from source, with all the assumptions that aren't documented, and I pitched Docker as the one true configure/autoconf.

Hughlander
May 11, 2005

smackfu posted:

Does anyone here work somewhere that forces all commits to master to go through a pull request (which has to build green before merging)? Is it good or does it just add more annoying process? Currently we just do something like "mvn install && git push" which runs all our integration tests before pushing which is pretty good at keeping the build green. But it does require discipline.

Where I am, we have a system that watches for branches named a certain way: it merges master into the branch, runs the tests, then pushes the result to master. This takes care of the problem you'll see while scaling: there could be three people doing that mvn install at the same time, and two of them will need to rerun their tests at best, or won't at worst.
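A very rough sketch of that kind of merge bot, assuming it watches branches named ready/* and that mvn install is the full test run (both are illustrative):

code:
#!/bin/sh
# Hypothetical merge bot: for each ready/* branch, merge in the latest master,
# run the tests, and only advance master if they pass.
git fetch origin
for ref in $(git for-each-ref --format='%(refname:short)' refs/remotes/origin/ready/); do
    branch=${ref#origin/}
    git checkout -B "$branch" "$ref"
    git merge --no-edit origin/master || continue   # skip branches that don't merge cleanly
    if mvn install; then
        git push origin "$branch:master"
    fi
done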

Hughlander
May 11, 2005

StabbinHobo posted:

you're doing this: https://xkcd.com/927/

just put up a datadog dashboard that includes splunk data. delete nagios, munin and graylog from your environment.

Datadog has a hard limit on the number of custom event types you can have, as well as API limits on the number of events per minute. It'd collapse almost instantly under pretty much any Graylog use case. E.g., I was on a project that just saved off every request/response to Graylog for 5 days, and we could correlate errors back to the state change that caused them days before. Datadog is great for "How has memory pressure changed over the past 6 releases?", but real-time event correlation is not its strong suit at all.

Hughlander
May 11, 2005

Punkbob posted:

I don’t like splunk or datadog. Is there something wrong with me?

Depends, what do you like that fits their forte? I use Datadog pretty heavily but don't particularly like it. It does its job well enough, though, and the price is low enough that the cost of replacing it wouldn't outweigh my dislike of it. Next time around I'd probably do Grafana and Nagios.

Likewise, I don't use Splunk but have ELK in place for one set of projects and Graylog for another. It's a checkbox that needs to be filled in the infrastructure, but I'm not really attached to the tool there.

Hughlander
May 11, 2005

Schneider Heim posted:

Has anyone tried learning DevOps using small, personal projects? If so, how do you go about them? I've been thinking of getting a DigitalOcean droplet to practice DevOps stuff and tools which should help me with my work and hobbies (been thinking of making a couple of online apps to catalog information related to them). I don't want to be vendor-locked so I want to keep things as open source as possible.

I have a small VPS that I got tired of rebuilding whenever I moved between providers. So I used Puppet to configure it for a bit, tested it locally with Vagrant, and now it's a docker-compose file with a mix of locally built containers, public registry containers, and private registry containers.
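For flavor, a minimal sketch of that kind of compose file with the three kinds of images mixed together (service names and image names are placeholders):

code:
# Hypothetical docker-compose.yml for the VPS.
cat > docker-compose.yml <<'EOF'
services:
  blog:
    build: ./blog                                   # locally built container
  db:
    image: postgres:15                              # public registry
  backup:
    image: registry.example.com/ops/backup:latest   # private registry
EOF

docker compose up -d    # build/pull and start the lot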

Hughlander
May 11, 2005

Hadlock posted:

I spent like 2 hours today reading through all the kubernetes UDP-related doco, as far as I can tell SREs are allergic to UDP

Someone on Stack Exchange noted back in 2016 that Docker's official documentation doesn't even document how to expose UDP ports when using the docker run command. That comment is still true to this day.

I found a super cool docker image that lets you run a UDP load balancer on a container with very low config:

https://hub.docker.com/r/instantlinux/udp-nginx-proxy/

Going to move it to container linux and have it boot with a launch config in an auto-scaling group of 1. Lambda will update the cloud.config file with the new backends and then nuke the node(s) in the autoscaling group... haven't figured out how to attach our singular ENI to a singular autoscaled node yet.

So far no issues. Waiting on my coworker to get done building our bind container and will put things through the wringer the next couple of weeks, this crazy DNS system will be the lynch pin for our database DR system... should be interesting.

Not sure if it's still true, but when I last looked it wasn't possible to run OpenVPN over UDP in Docker, just TCP, which had other issues. But that was a long time ago.
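For what it's worth, docker run can publish UDP ports with the /udp suffix these days, even if the docs bury it; a sketch (the image name is a placeholder):

code:
# UDP ports have to be published explicitly with /udp; the default is TCP.
docker run -d \
    --cap-add=NET_ADMIN \
    -p 1194:1194/udp \
    some-openvpn-image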

Hughlander
May 11, 2005

Cancelbot posted:

Have any of you done a sort of "terms and conditions" for things like cloud adoption? Our internal teams are pushing for more autonomy and as such are wanting to get their own AWS accounts and manage their infrastructure, which in theory sounds great. But when they need to VPC peer into another teams account or connect down to our datacentre something has to be in place to ensure they're not going hog wild with spend, not create blatant security risks etc.

I'm trying to avoid a heavily prescriptive top-down approach to policy as that slows everybody down, but management want to be seen to have a handle on things, or at least make sense of it all. I've started work on a set of tools that descend from our root account and ensure simple things are covered; do teams have a budget, are resources tagged, etc. but not sure where to go from here in terms of making this all fit together cohesively.

We have a few sets of things like that. Security has an IAM account with basically god-level read-only access to the accounts, and runs real-time monitoring of changes. There's a cloud committee that discusses how to play well together, but it's at a massive scale: multiple loosely related corporations under one ownership, touching every cloud provider out there.
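A rough sketch of the read-only audit piece using AWS's managed policies; the role name and the trust-policy file (which would let the security account assume the role) are illustrative:

code:
# In each team account: a role the security account can assume,
# limited to AWS's managed audit/read-only policies.
aws iam create-role \
    --role-name security-audit \
    --assume-role-policy-document file://trust-security-account.json

aws iam attach-role-policy --role-name security-audit \
    --policy-arn arn:aws:iam::aws:policy/SecurityAudit
aws iam attach-role-policy --role-name security-audit \
    --policy-arn arn:aws:iam::aws:policy/ReadOnlyAccess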

Hughlander
May 11, 2005

StabbinHobo posted:

out of curiosity, what even still has you monkey loving vps configs at all? is there a technical limitation of the many PaaS vendors, a pricing issue, or just a cultural/old-ways thing?

My answer: I want Docker running on ZFS with snapshots.
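Which is roughly this: put /var/lib/docker on its own dataset, point Docker's storage driver at ZFS, and snapshot it. A sketch, assuming a pool named tank:

code:
# Docker state on its own ZFS dataset, using the zfs storage driver.
zfs create -o mountpoint=/var/lib/docker tank/docker

cat > /etc/docker/daemon.json <<'EOF'
{ "storage-driver": "zfs" }
EOF
systemctl restart docker

# Cheap point-in-time snapshot of the whole Docker state.
zfs snapshot tank/docker@nightly-$(date +%Y%m%d)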

Hughlander
May 11, 2005

jaegerx posted:

What are y’all using for docker garbage collection? Just docker prune?

spotify/docker-gc
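If that's the tool I think it is, usage is basically a single run with the Docker socket mounted (a sketch; check the image's own docs for the current flags):

code:
# One-shot cleanup of exited containers and unused images.
docker run --rm \
    -v /var/run/docker.sock:/var/run/docker.sock \
    spotify/docker-gc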

Hughlander
May 11, 2005

Vulture Culture posted:

SQS is a good fit for things like notification processing or deferred task queuing, but it's an extremely poor fit for generalized messaging between application components. Send latency is good but not great, but receive latency might be in the 1s+ range under normal operating conditions. Throughput-wise, with 25 consumers/producers you might hit 80,000 transactions per second. Compare to Kafka, which in this configuration would easily be able to hit 10M+ messages per second.

At that scale, aren't you looking at Kinesis, not SQS?

Hughlander
May 11, 2005

Vulture Culture posted:

Possibly. Kinesis is really intended as an ingest mechanism for S3 or RedShift, with hooks to invoke Lambda along the way. Its performance characteristics are really well-suited to event sourcing and data ingest, but it's fairly high-latency for a real-time messaging system.

It's 2019, and with modern service mesh approaches, it's not clear to me that this kind of architecture is terribly valuable to most people anymore. It's probably most important to folks who need robust real-time signaling for things like WebRTC at scale, which is somewhat in conflict with the durability guarantees provided by a system like Kafka.

I was going to write something about how I'm the dumbass streaming realtime video through Kafka, but it turns out Amazon does this as a product on Kinesis now.

Ouch. My last Kafka experience was GDPR anonymization with up to 6-hour latency on the queues.

Hughlander
May 11, 2005

necrobobsledder posted:

Anyone else going to be at Hashiconf next week in Seattle? I have other plans afterward besides happy hour but I can certainly hang out with some of you fine folks at lunch instead of awkwardly hanging out alone.

Thing I miss from being a block from the convention center... random lunches with goons. I'm down in Renton now, about 15 mi to the south, and wish I could have a drink with ya.

Hughlander
May 11, 2005

Hadlock posted:

:rant:

Yesterday my coworker proposed moving one of the major dev environments databases from an old postgres handbuilt thing to Amazon RDS. We're on PG 9.5 in prod. He suggested just upgrading to RDS pg 11 straight away. We spent the next 20 minutes in a meeting with 15 engineers + release manager telling him no, even though pg 11 is technically compatible, it's not a good idea to jump straight to 11 immediately. He finally relented and we all agreed RDS was fine, but only use pg 9.6, as it's a safe known version for our monolith.

Today in another meeting with 30+ engineers in it, he announces that he converted the major dev environment to RDS pg 11 anyways.

So we again, ask him what version production is on (9.5), how can we be certain what we're testing on pg 11 is going to catch problems with our ORM between pg 9.5 and 11, exact same conversation as not even 18 hours before. Finally the software architect gets involved in the conversation tells him no and he agrees again not to convert any more major environments to pg 11 and stick to pg 9.6.

So by now all environments are on pg 11?

Hughlander
May 11, 2005

SeaborneClink posted:

I'm seriously FURIOUS at them for unilaterally coming in, forcibly taking over the Postgres chart, pissing off a few long time contribs in the process and breaking it.

Oh yeah then once they started doing things their way they deprecated helm/postgres and won't approve any PRs into it anymore, and now want you to use their bitnami/postgres chart.

gently caress Bitnami that poo poo is community hostile.

tl;dr they took over a chart and told everyone using it to go gently caress themselves.

Not knowing much about Helm: is it something you can just fork, and direct everyone to the maintained version?

Hughlander
May 11, 2005

Gyshall posted:

Any of you goons working with Apache Kafka via Confluent Cloud? We're looking for the best way to repeatably deploy Kafka along side our GKE clusters. Confluent's terminology is all over the place, and it seems like they have multiple products, so I'm not sure of the best/most effective way to deploy this stuff.

Yes, but not something I own. If there are specific questions, I can ask the right people.

Hughlander
May 11, 2005

Hadlock posted:

Nuke your etcd cluster and get back to me

On prem k8s is possible, but it's not fun

At least if you're going to do it, do it with a private cloud setup. A company I worked for had a few tens of thousands of servers in DCs around the world that were either being migrated, or whose footprint was being migrated, to private cloud, for groups to then run k8s on top of.

Hughlander
May 11, 2005

Soricidus posted:

switching to the atlassian cloud sounds like a great option. how’s its performance across an airgap?

It's getting better all the time https://arxiv.org/abs/2004.06195

Hughlander
May 11, 2005

I thought all the cool kids were moving to GitHub's registry now anyway.

Hughlander
May 11, 2005

Hadn't seen Orka before; that looks neat. Previously we'd split between on-prem Minis and MacStadium Minis.

Another place was large enough that it went to Apple and got them to agree to let us run Hackintosh VMs, as long as we had the same amount of physical hardware. That was cool because we'd use a Jenkins slave on demand: a job comes in, spin up the Hackintosh, run the job, kill the VM.

Hughlander
May 11, 2005

my homie dhall posted:

something about the name "data dog" rubs me the wrong way and I can't get over it

Is it because it's noun noun instead of adjective noun like god intended?

Hughlander
May 11, 2005

Hadlock posted:

Not to make light of your situation, but I never regret installing linters on my text editors. It's not always possible to install linters on headless servers you don't control, but it's certainly saved me hundreds of hours of debug

Any place I've been at that doesn't already have it, one of the first things I do / push for is a pre-receive hook: if a file ends in .json and isn't in a dir path called test, it passes jslint or the push is rejected.

At the last place it fired about twice a week, from non-engineering people wanting to ignore the tools and hand-edit JSON. No amount of sitting with them to improve the tools led to any change in behavior.
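For reference, that hook is roughly the following (a sketch; the jslint invocation and the test-directory convention are whatever your repo actually standardizes on):

code:
#!/bin/sh
# pre-receive: reject pushes where a changed .json file outside a test dir fails jslint.
while read old new ref; do
    for f in $(git diff --name-only --diff-filter=ACM "$old" "$new" -- '*.json'); do
        case "$f" in */test/*|test/*) continue ;; esac   # ignore test fixtures
        tmp=$(mktemp)
        git show "$new:$f" > "$tmp"                      # lint the pushed blob, not a working tree
        if ! jslint "$tmp" >/dev/null 2>&1; then
            echo "Rejected: $f does not pass jslint." >&2
            rm -f "$tmp"
            exit 1
        fi
        rm -f "$tmp"
    done
done
exit 0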
