the talent deficit
Dec 20, 2003

Bhodi posted:

An issue I keep coming across is an elephants-all-the-way-down problem: when you use puppet/chef/ansible/whatever, you then have to have an associated prod/dev/test/whatever for all of your management code and servers too.

For example, I built a jenkins test suite that pulls a branch from git and runs a bunch of tests on our cloud environment, including creating VMs on a bunch of different vlans with configs using the tool we distribute to users. But now I need to be able to reproduce jenkins itself in both prod and dev, so I have a separate repo for the jenkins configs. And I need a program to import/export those, so I wrapped ansible around that and have some ansible tasks to pull/push configs to the various jenkins servers. But wait, the jenkins configs are subtly different because, for example, prod jenkins needs to pull from the prod branch and dev from dev, so now I have to munge them through a tool to dynamically generate the jenkins configs.

It's ugly, and now I have 3 repos to manage and try to keep in sync, each with its own versions and release process. It's messy, but it's the best I could come up with. My sister group dealing with our openstack silo has it three or four times as bad.

All these cloud products enable people to easily do continuous integration on them, but not the app itself.

I have one project at work that has provisioning scripts for its very own jenkins instance. It's smart enough to realize when it can reuse one that's already built, but sometimes I really do have to build the whole environment. Config management is a special kind of hell.


the talent deficit
Dec 20, 2003

i want to get all state off a jenkins server because our provisioning is in a state of quantum uncertainty. ideally, you should be able to run a simple bootstrap script and have a provisioning server ready and waiting for you inside our vpc. however, i want to retain build logs and numbers. i looked at thinBackup but it seems pretty heavyweight for what i want. i thought about writing a plugin that writes logs to s3 and job numbers/status to postgres, but i already manage like 15 postgres dbs and i don't want to add any more. is an ebs block store device mounted at /var/lib/jenkins/jobs a terrible, terrible idea or do you think i can get away with it? i'll just add a line to the script that shuts down any running provisioning server and detaches the block device before starting a new one and attaching it

anyways, bad plan?
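
for what it's worth, the swap step is only a few api calls. a rough boto3 sketch, assuming the volume id, instance ids, and device name are already known (every name below is a placeholder):

code:
# rough sketch of the volume swap; every id and the device name is made up
import boto3

ec2 = boto3.client("ec2")

VOLUME_ID = "vol-0123456789abcdef0"   # hypothetical: the jobs volume
OLD_INSTANCE = "i-0aaaaaaaaaaaaaaaa"  # hypothetical: current provisioning server
NEW_INSTANCE = "i-0bbbbbbbbbbbbbbbb"  # hypothetical: freshly bootstrapped one
DEVICE = "/dev/xvdf"                  # bootstrap mounts this at /var/lib/jenkins/jobs

# stop the old server so jenkins isn't writing to the volume mid-detach
ec2.stop_instances(InstanceIds=[OLD_INSTANCE])
ec2.get_waiter("instance_stopped").wait(InstanceIds=[OLD_INSTANCE])

# detach and wait until the volume is actually free
ec2.detach_volume(VolumeId=VOLUME_ID, InstanceId=OLD_INSTANCE)
ec2.get_waiter("volume_available").wait(VolumeIds=[VOLUME_ID])

# attach to the new server; its bootstrap script is assumed to mount DEVICE
# before starting jenkins
ec2.attach_volume(VolumeId=VOLUME_ID, InstanceId=NEW_INSTANCE, Device=DEVICE)
ec2.get_waiter("volume_in_use").wait(VolumeIds=[VOLUME_ID])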

the talent deficit
Dec 20, 2003

Bhodi posted:

What are you conceivably going to do with these logs? Do you really need full console logs and build numbers? Why?

That seems like a lot of effort for way too much data that's almost certainly worthless. What about generating / uploading a (junit) test report instead?

Or, do some groovy parsing and peel off what you actually need...

I get the log hoarder mentality, but unless you really do have the capability and manpower to go back and do heavy analysis with correlation to networking, storage, or whatever, with a feedback loop to actually drive change, it's kind of wasted. If your provisioning system really is that much of a mess, most likely you're going to get a shrug and a "well, it works now," so you might want to refocus your effort.

Also, build numbers aren't really useful in and of themselves, which is why I suggested a test report, so you can tie it to whatever id is actually meaningful to you - git id, tag, date, or whatever.

If you go down the road of trying to capture the exact state of the jobs dir, the first time you need to reset the build ids or clear the logs it's going to be a mess. You say you never need to do that, but there are reasons you might: anything from running out of inodes to re-engineering your jobs to a future version of Jenkins changing the format. You'd be painting yourself into a corner with that method, never mind the extremely convoluted solution.

you have made some really good points, particularly that no one is ever actually going to look at the logs and that we should be using meaningful ids instead of whatever garbage jenkins generates. i'm already enforcing that artifacts that need to be retained have to be saved somewhere other than the provisioning vm disk, so i guess i could do the same with logs/reports. worst case i can always add a post build step to the template job that moves logs to elasticsearch or something
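
that post build step could be a one-file python script using boto3; a minimal sketch, where the bucket and key layout are made up (JOB_NAME and GIT_COMMIT are the usual jenkins-provided env vars):

code:
# rough sketch: push a finished build's console log to s3, keyed by a
# meaningful id (the git sha) instead of the jenkins build number
import os
import boto3

s3 = boto3.client("s3")

def ship_log(job_name: str, git_sha: str, log_path: str) -> None:
    key = f"jenkins-logs/{job_name}/{git_sha}/console.log"
    s3.upload_file(log_path, "my-build-logs-bucket", key)  # hypothetical bucket

if __name__ == "__main__":
    ship_log(
        job_name=os.environ["JOB_NAME"],
        git_sha=os.environ["GIT_COMMIT"],
        log_path=os.environ.get("LOG_FILE", "console.log"),
    )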

the talent deficit
Dec 20, 2003

TheresNoThyme posted:

Thanks for the input, that's the way I am leaning as well and it's nice to have input from someone already down the road on Docker. I had not heard of Mesos, will have to check that out.

It's a bit annoying because I just know I'm going to be back to "well it works on my local!" for the inevitable early slew of puppet script problems. I guess I just need to accept baby steps here and get the transparency docker provides, then worry about the other stuff later.

we run mesos in production and it's basically zero help. the stuff on mesos requires just as much hand-holding as the stuff we deploy via jenkins/ansible

the talent deficit
Dec 20, 2003

tell me how dumb this plan is:

1. encrypt sensitive information in my ansible playbooks locally with ansible vault
2. push to git
3. provisioning pulls the git repo and decrypts using ansible-vault and a password stored on the provisioning server
4. aws codedeploy runs ansible locally to do final provisioning on a mostly preprovisioned ami running in an asg

i don't need really good security, i just want to keep passwords out of git, but i don't want to run zookeeper/consul/whatever to do so if i can avoid it. i also want to avoid baking the passwords into my asg amis if possible, hence decrypting at provisioning time
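
step 3 is basically one subprocess call if you keep the vault password in a root-only file on the provisioning box. a sketch (both paths are made up):

code:
# sketch of the provisioning-side decrypt (step 3), assuming the
# ansible-vault cli is installed; paths below are hypothetical
import subprocess

PASSWORD_FILE = "/etc/provisioning/vault-pass"  # hypothetical, chmod 0600
REPO_DIR = "/opt/playbooks"                     # hypothetical git checkout

def decrypt(vault_file: str) -> None:
    # ansible-vault decrypts in place when given a password file
    subprocess.run(
        ["ansible-vault", "decrypt", "--vault-password-file", PASSWORD_FILE, vault_file],
        cwd=REPO_DIR,
        check=True,
    )

decrypt("group_vars/all/secrets.yml")  # hypothetical vaulted file

(ansible-playbook also accepts --vault-password-file directly, which skips writing the plaintext to disk at all)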

the talent deficit fucked around with this message at 20:37 on Jul 12, 2015

the talent deficit
Dec 20, 2003

yes, and also vault, but kms makes impossible some things that are dumb yet culturally ingrained at our company, and vault is another service i'd have to manage. my main concern for now is removing user accounts and passwords from plaintext in git

the talent deficit
Dec 20, 2003

Boz0r posted:

What are people's thoughts on TeamCity vs Jenkins?

awful vs horrid

chasing good build servers is a waste of time: there isn't one and you should stick with what you have

the talent deficit
Dec 20, 2003

Boz0r posted:

What I have is four developers trying to do weird patch jobs for each deploy. I think anything would be better than that.

if deploy is your problem (and not building) i would encourage you to look elsewhere. aws codedeploy is pretty good if you are on aws

the talent deficit
Dec 20, 2003

revmoo posted:

I'm quite happy with my deployment methodology and I'm not interested in changing it. I would definitely like to explore Docker for infrastructure management but I couldn't imagine using it to deploy code.

how do you envision docker helping with infrastructure management?

we use docker at work for two things:

we have docker images for each of our build environments that our jenkins master runs jobs on. that lets us easily do things like build some services against openjdk7 and some against openjdk8, and also provision things like databases needed for testing

we also deploy python and ruby services to aws ecs with docker. we haven't moved anything jvm- or beam-based to ecs/docker because those platforms have better ways to build self-contained images

i have no idea how we would use docker for infrastructure management (we do all that with cloudformation and a pile of python scripts)

the talent deficit
Dec 20, 2003

revmoo posted:

I need a new server node spun up. I provision the VM and deploy a docker image to it. Then I deploy our app to the node.

Is this not the intended use case for Docker?

no, not really? what is in the docker image in this case? a service like fluentd or postgres? or runtime components your app relies on, like openjdk or openssl? in the former case you could do that, but unless you need to isolate the service for some reason you are not really gaining anything. if it's the latter, your app needs to also run in the container to access those components

docker doesn't replace something like chef or ansible, it's more like building a vm image

the talent deficit
Dec 20, 2003

Votlook posted:

Does anyone have experience doing blue-green deployments in versioned infrastructures? At work we are using AWS CloudFormation templates to manage our infrastructure. We're mostly happy with this, but updating CloudFormation stacks is a bit of a black box at times, so we want to move to blue-green deployments. I get the impression that blue-green deployments don't really fit in CloudFormation. Does anyone have experience with this, or is CloudFormation the wrong tool for the job?

blue green is kinda bad, but you can do it with cfn. separate your resources into blue and green sets, update whichever set is the standby on one pass, then flip the load balancer/dns/config on a second pass when you're ready to cut over. optionally do a third pass where you clean up the previously live resources
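
the flip pass can be a one-parameter stack update if your template keys the lb/dns resources off an "active color" parameter. a rough boto3 sketch (the stack and parameter names are made up):

code:
# rough sketch of the second ("flip") pass, assuming the template exposes
# a hypothetical ActiveColor parameter that the lb/dns resources reference
import boto3

cfn = boto3.client("cloudformation")

def flip(stack_name: str, new_color: str) -> None:
    cfn.update_stack(
        StackName=stack_name,
        UsePreviousTemplate=True,  # only the parameter changes on this pass
        Parameters=[
            {"ParameterKey": "ActiveColor", "ParameterValue": new_color},
            # any other parameters would need {"UsePreviousValue": True} entries
        ],
        Capabilities=["CAPABILITY_IAM"],
    )
    cfn.get_waiter("stack_update_complete").wait(StackName=stack_name)

flip("my-app-stack", "green")  # hypothetical stack name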

the talent deficit fucked around with this message at 00:14 on Oct 19, 2016

the talent deficit
Dec 20, 2003

i think blue/green encourages/enables some really harmful practices, like treating your standby environment as a staging/integration environment and relaxing requirements on api compatibility. in the small (for a particular subsystem like a database or an application group) blue/green can be okay, but if you can do blue/green in the small you can probably just do gradual replacement, where multiple versions are deployed simultaneously without impacting users. basically, if you have a healthy blue/green procedure you don't need it, and if you need it you probably have a hard time deploying regularly

the talent deficit
Dec 20, 2003

docker is fine for compiling, but you should have your container produce whatever artifact you need and then use that artifact in separate containers (or just run it directly). you shouldn't try to compose your build container with your run container. that's the worst of all worlds
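
a minimal sketch of that split, driving the docker cli from python (the image names, dockerfile name, and make target are all placeholders):

code:
# build in one container, package the artifact into a separate slim image
import os
import subprocess

def sh(*args: str) -> None:
    subprocess.run(args, check=True)

# 1. run the build image; the source tree is bind-mounted in and the
#    artifact lands in ./dist on the host
sh("docker", "run", "--rm",
   "-v", f"{os.getcwd()}:/src", "-w", "/src",
   "my-build-image:latest",  # hypothetical image holding the toolchain
   "make", "dist")

# 2. bake only the artifact into a runtime image; Dockerfile.runtime
#    just COPYs dist/ onto a minimal base
sh("docker", "build", "-f", "Dockerfile.runtime", "-t", "my-app:latest", ".")

(docker 17.05+ multi-stage builds get you a similar separation inside a single dockerfile, if you'd rather not script it)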

the talent deficit
Dec 20, 2003

do people complaining about waiting on integration tests not do code review? we don't merge anything in less than 12 hours (unless it's a critical fix) because all prs have to go through extensive code review. that always takes longer than running integration tests

the talent deficit
Dec 20, 2003

Vulture Culture posted:

Unsolicited opinion: if a code review takes 12 hours, your change batches are probably too big. Most of the code reviews I submit can be completed in a minute or two (this is obviously not true for enormous refactors)

we encourage all developers to at least read the commit log for each pr and raise questions/objections if they have them, even if they are not an assigned reviewer. we tend to leave them open until the next morning just to give people an option to review

the talent deficit
Dec 20, 2003

ansible is fine as a way to fill in some variables in templates and install some apt packages or Ruby gems or whatever. it's a total nightmare once you move beyond that

the talent deficit
Dec 20, 2003

Hadlock posted:

We'll see how long this lasts, but for the moment, after some good proof-of-concept work on containerizing two of our core products, my boss is giving me carte blanche to move his company's systems out of the dark ages of managed VMs and into the light of kubernetes on AWS.

Trying to compile a "2018-era devops starter kit", looks something like this, what would you add/delete/modify?

Hosting: AWS
Orchestration: Kubernetes
Reverse proxy: Traefik
Monitoring/Alerting: Prometheus/Grafana
Log management: Graylog
Build system: Jenkins
Secret management: Vault
Source control: Github
Release management: Coreroller

if you're going to use aws i'd probably use ALB instead of traefik and cloudwatch logs -> es instead of graylog. you might want to replace jenkins with codebuild too
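
the cloudwatch logs -> es wiring is just a subscription filter per log group, pointed at the forwarder lambda aws generates for you. rough boto3 sketch (every arn and name below is a placeholder):

code:
# rough sketch: subscribe a log group to the forwarder lambda that writes
# to your elasticsearch domain (the console's "stream to amazon
# elasticsearch service" wizard generates that lambda)
import boto3

logs = boto3.client("logs")

logs.put_subscription_filter(
    logGroupName="/my-app/production",  # hypothetical log group
    filterName="ship-to-es",
    filterPattern="",                   # empty pattern = forward everything
    destinationArn="arn:aws:lambda:us-east-1:123456789012:function:LogsToElasticsearch",
)
# note: the lambda also needs a resource policy allowing logs.amazonaws.com
# to invoke it (lambda add-permission); the console wizard sets that up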

the talent deficit
Dec 20, 2003

i just wrapped up a big consulting project on a failed migration from self-hosted to k8s-on-aws. i think they probably would have succeeded if they'd done self-hosted k8s (or openshift), or just a straight self-hosted-to-aws move, but where they ended up was a huge mess

i'm reserving judgement on aws eks until it's actually ga, but i think right now k8s on aws is a mistake unless you also need to support self-hosted k8s or you are starting from something easily ported to k8s (like all your applications already running in docker in production). almost every legacy project is going to have a hard enough time moving to aws rds/aws ec2/aws ecs/... without also throwing docker and k8s into the mix

the talent deficit
Dec 20, 2003

Virigoth posted:

Do you have a debrief of what went wrong? There is so much kubernetes running on AWS that they made a service specifically to help more people use it there. Our TAMs were happy they pushed the kubernetes service out so fast, because enterprise kubernetes questions were eating up their time.

i don't think k8s is bad or anything, i just think it's a big change from running your own data center. if they'd set up a k8s cluster onsite and moved what was easy to move, they would have built valuable k8s experience that would have helped with the hard-to-move parts. similarly, if they'd moved to aws using managed services like rds/elasticache/emr where appropriate and simple autoscale groups for their application tier, i think they still would have had issues but a much better chance of success. there was no chance they were going to pull off the whole data-center-to-k8s-on-aws migration in one jump tho

the talent deficit
Dec 20, 2003

Ploft-shell crab posted:

the idea that everyone on a team should be involved in all three of writing, releasing, and deploying code is a bad one

it's not that everyone is responsible for everything, it's that there are no walls or organizational barriers between ops and dev. you don't personally need to do the deploys and write the code, you just need to be on a team that can write code, deploy it, and maintain it

the talent deficit
Dec 20, 2003

itskage posted:

What's the current hot poo poo for web e2e tests?

We're still using selenium. Is that still relevant? I'm trying to evaluate this before we start writing a ton of new ones for a new project.

puppeteer is good if you can live with chrome only

the talent deficit
Dec 20, 2003

goatsestretchgoals posted:

Traditional DBA checking in. What is the current hot pattern for ephemeral database processes with long lived storage?

still presto, if i understand your question

the talent deficit
Dec 20, 2003

terraform consumes so much time and attention at every place i've been that used it that i'm convinced it's a scam to ensure full employment of programmers who don't want to program

the talent deficit
Dec 20, 2003

12 rats tied together posted:

This is something I could post about for hours, but I've already done that a bunch of times ITT so I'll just try to summarize my take on this for you: Terraform is a fine tool for simple workloads, it's especially nice as a "high floor" tool where it's impossible for you to be under a certain level of productivity and still count as "using Terraform".

It gets worse the more you rely on it, and especially as the complexity of your deployments gets higher. If you're using it as a feature team, for a volunteer/personal project, or a small infrastructure deployment as part of a PaaS consumer team (ex: dba, stream processing team, etc), it is good enough to basically be an emerging best practice.

If you're on an infrastructure engineering team providing that PaaS abstraction to other feature teams, it's a really bad tool and you shouldn't use it, you'll be able to come up with something way better yourselves.

this is basically where i land. if you can do it in an afternoon terraform is fine (but also most things are going to be fine and it comes down mostly to taste and experience). if you are writing terraform to enable other teams to write more terraform you end up with awful messes


the talent deficit
Dec 20, 2003

that list of questions is a lot of trivia and not a lot of questions focusing on experience. i'd ask broader questions that let the candidate flex their knowledge of k8s rather than questions that reveal how many boxes they can check

for instance, if you want to ensure your hire has solid knowledge of k8s networking don't ask them what a vxlan is, ask them about a k8s networking problem they experienced and how they solved it
