kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
I dunno if there's a better thread to ask this in, but does anyone have any experience with Selenium for automated browser testing? We've got a whole lot of old Selenium RC tests and for some reason haven't updated our Selenium Server or test runners since 2.22. Nothing uses WebDriver as far as I can tell. We'd like to upgrade the whole thing so we can test against current versions of FF and Chrome but also keep testing against IE8.

I guess I'd also like to know what (if anything) is better than Selenium for this purpose.
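For reference, my rough understanding is that the upgrade path means rewriting the RC tests against the WebDriver API, which in Python looks something like this (untested sketch; the URL and selectors are made up):

code:
# rough sketch of a WebDriver-style test in python; url/selectors are made up
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()  # or webdriver.Chrome() / webdriver.Ie()
try:
    driver.get("https://example.com/login")
    driver.find_element(By.NAME, "username").send_keys("testuser")
    driver.find_element(By.NAME, "password").send_keys("hunter2")
    driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()
    assert "Dashboard" in driver.title
finally:
    driver.quit()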

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Cancelbot posted:

Our TeamCity agents are now described in Packer! No more unicorn build agents :woop:

On the downside it takes over 20 minutes for a Windows AMI to boot in AWS if it's been sysprepped (2 reboots!). Even with provisioned IOPS :(

that sounds excruciating. why are you sysprepping your agents?

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Paul MaudDib posted:

I'm trying to set up a simple docker setup to learn (container my home fileserver's services).

Just to be sure I'm understanding this right: you create a data volume and that's an overlay on the base state of the container that's persisted between runs of the app container? And you can then update the app container independently for point releases without nuking your data?

What's the advantage of calling it a "data volume" vs just mounting a host dir in and calling it a day? More ability to scale when you move to a swarm/compose-type system?

With a swarm/compose/whatever system, how do you handle the failure of something stateful like a database server where "spin up another" isn't an option? Replication across multiple nodes? One master instance? (Postgres's ability to unfuck itself from the WAL should be handy here)

Is there a way for container apps to programmatically signal that they think a service is offline and vote for a reboot of a dead service container in a swarm-type system?

For swarm-type setups, what is the best way to accomplish logging for the swarm in a centralized application for debugging/health monitoring?

The answer to most of these questions is why Kubernetes exists. You can add cluster-level services for things like health, logging, etc. I'm not super familiar with what's available for swarm, but you'd probably be looking at deploying instances of your logging service in containers alongside your app containers and having those feed data to another service (like Prometheus or ELK or w/e).

Technically, mapping a local directory is just another type of data volume. The docker volume stuff has drivers for various other bits and bobs, like mounting shared storage directly into your containers. Mostly, creating a data volume allows you to somewhat easily share that volume with multiple separate containers.
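If it helps, this is roughly what the difference looks like through the docker Python SDK (untested sketch; the image and paths are just examples):

code:
# untested sketch using the docker python SDK (docker-py); image/paths are examples
import docker

client = docker.from_env()

# named data volume: survives container removal, can be attached to several containers
client.volumes.create(name="pgdata")
client.containers.run(
    "postgres:9.6",
    name="db",
    detach=True,
    volumes={"pgdata": {"bind": "/var/lib/postgresql/data", "mode": "rw"}},
)

# bind mount: same mechanism, but the source is a host directory instead of a named volume
client.containers.run(
    "postgres:9.6",
    name="db-bind",
    detach=True,
    volumes={"/srv/postgres": {"bind": "/var/lib/postgresql/data", "mode": "rw"}},
)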

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Paul MaudDib posted:

This seems to be the shortest way from A to B for small-scale setups in my (very novice) opinion. Have a postgres application container which points to a data directory which is mounted in from the host (eg /srv/postgres/). Unless there's pitfalls here with scalability or something? I'm not sure it's a huge practical gain versus just running it as a bare-metal service but at least this way you bring it inside the Docker system and can apply your logging/health monitoring layer or w/e over the top.

But I can definitely see an argument that scale-on-demand database instantiation is probably more trouble than it's worth for small-scale applications and you are probably better off just having one DB per docker swarm or whatever.

I'll take a look at Kubernetes too.

Production DBs generally shouldn't be containerized, although like you said, containers are great for small-scale dev/test stuff.

For prod, I'd have your DB clusters be outside of docker, but you can do stuff like proxy DB calls through a service container (and collect your logging/health stuff there) and then scale that independently for whatever reason. Can be useful for multi-tenancy.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

EssOEss posted:

What I have found is that container shutdowns sometimes are not clean - it just kills the process and some data loss can occur due to incomplete writes. Maybe related to high system load, maybe just random luck. Have not investigated in depth.

It's even worse with Windows containers where a graceful shutdown is flat-out not supported yet (though supposedly coming in RS3).

Other than that, I have not encountered any issues. To those saying don't do it - why?

Your containers should handle SIGTERM gracefully, because that's what the daemon sends on shutdown; after the grace period it follows up with a SIGKILL, which can't be caught.
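In Python-land that's just a signal handler; minimal sketch (the cleanup is obviously app-specific):

code:
# minimal sketch: docker stop sends SIGTERM, then SIGKILL after the grace period
# (10s by default). SIGKILL can't be caught, so do all cleanup on SIGTERM.
import signal
import sys

def shutdown(signum, frame):
    # hypothetical cleanup hook: flush buffers, close DB connections, etc.
    print("caught SIGTERM, shutting down cleanly")
    sys.exit(0)

signal.signal(signal.SIGTERM, shutdown)

# ...main loop of the app goes here...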

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Dren posted:

Where will devs get the environments from? Do they have to build it themselves?

You could either have devs build it themselves, host an internal docker registry (if it's somehow private/confidential), push the base image there and have them pull it once, or push the base image to docker hub and do the same.
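The build-once/pull-everywhere flow is pretty mechanical, e.g. via the docker Python SDK (sketch; assumes you've already done a docker login, and the registry name is made up):

code:
# sketch using the docker python SDK; the registry/tag names are made up
import docker

client = docker.from_env()

# CI builds the base image once and pushes it to an internal registry (or docker hub)
client.images.build(path=".", tag="registry.internal.example/base-image:1.0")
client.images.push("registry.internal.example/base-image", tag="1.0")

# devs just pull the prebuilt base image
client.images.pull("registry.internal.example/base-image", tag="1.0")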

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
If you're already on AWS, you can use ECR registries for pretty cheap.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

necrobobsledder posted:

I didn't want to have even a chance of the container being accessible on the public Internet, and since ECR doesn't support provisioning inside a VPC last I saw, that was a no-go for me. The EC2 instance + EBS probably costs more than what ECR would cost us, but with a $280k+ / mo AWS bill from gross mismanagement (110 RDS instances idling 95% of the time, rawr) I'm not being paid to care about cost efficiency anymore.

ECR requires authentication, though.
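The auth is just a short-lived token you pull from the ECR API and feed to docker login; rough sketch with boto3 (account id and region are placeholders):

code:
# sketch of ECR auth from python (boto3 + docker SDK); account id/region are placeholders
import base64

import boto3
import docker

ecr = boto3.client("ecr", region_name="us-east-1")
auth = ecr.get_authorization_token()["authorizationData"][0]
user, password = base64.b64decode(auth["authorizationToken"]).decode().split(":")

client = docker.from_env()
client.login(username=user, password=password, registry=auth["proxyEndpoint"])
client.images.pull("123456789012.dkr.ecr.us-east-1.amazonaws.com/base-image", tag="1.0")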

Also Jesus Christ I'd give my left nut to have management that gave so few shits about AWS spend

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Bhodi posted:

This is such a weird take because all I hear management talk about is the AWS bill every month

My problem is that they just don't want to spend money, period. Our main CI server is seven years old and rapidly failing, but when I ask for money to get a new server, it gets rebuffed. When I say that we're going to extend the useful life by offloading build agents to EC2, I get complained at that we're spending too much on AWS. We run a lot of testing in the cloud because when we ran it locally, we'd lose end to end test runs because there was too much load on our VMware cluster and all of the runs would fail, causing lost days and missing milestones. Put it in the cloud, now they bitch about spend.

Sometimes you just can't please anyone except yourself.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

EssOEss posted:

Yeah, I do not want to limit it to one agent.

For the sake of simplicity, you can imagine my build process uploading http://example.com/latestversion.exe. If two happen in parallel, the last one finishing wins and there's no way to know that the one that actually wrote it there was from the most recent checkin.

Serializing the builds would be the easiest way to eliminate such issues.

This seems like a weird thing you're doing anyway, but why not just have your build process emit versioned artifacts then have another job that marks them 'latest'? I know with TeamCity or Jenkins it should be pretty trivial to figure out which is more recent based on the source revision and name them appropriately.
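The "mark latest" job can be dumb as rocks, something like this (untested sketch; the artifact naming and paths are made up):

code:
# untested sketch of a "promote latest" job: pick the artifact from the newest
# source revision and copy it over latestversion.exe. naming/paths are made up.
import re
import shutil
from pathlib import Path

ARTIFACT_DIR = Path("/srv/artifacts")  # e.g. myapp-r1234.exe, myapp-r1240.exe
LATEST = ARTIFACT_DIR / "latestversion.exe"

def revision(path: Path) -> int:
    match = re.search(r"-r(\d+)\.exe$", path.name)
    return int(match.group(1)) if match else -1

newest = max(ARTIFACT_DIR.glob("myapp-r*.exe"), key=revision)
shutil.copy2(newest, LATEST)
print(f"promoted {newest.name} to latest")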

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Mr. Crow posted:

Ok so you are being argumentative over a one-off anecdote of apparently dubious quality and then arriving at the same conclusion. :waycool:



I'm genuinely surprised so many of y'all dislike ansible and call it lacking; when was the last time you used it? Everyone else I've talked to, on and offline, has loved it, myself included. I feel it has chef and puppet beat in almost all cases: usability, readability, getting new people or servers up and running, etc.

What specifically is it lacking?

we use ansible and it's pretty great. it's even good at dealing with windows servers now.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Bhodi posted:

I find ansible great for one-offs and initial configuration but completely hopeless when it comes to consistency and conformity remediation (did someone edit a file where they shouldn't? revert it). It's not great at working through bastion hosts, and the static nature of the hosts file is at odds with the fast-moving and mercurial nature of cloud VMs. Including files depending on variables is its token nod to modularization, but puppet has an entire dependency tree with resolution. They're mostly different tools and I don't feel like they overlap all that much, which is why every company using ansible is really ansible plus something else.

why do you have servers where this is a problem? cattle, not pets.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Warbird posted:

I always feel like I have a handle on what I do until I read this thread.


So, we’re tasked with automating the install and config of a software stack. Problem is, it appears the msi switches don’t exist for some of the settings that are being tweaked. I’ve got a workaround via a Chocolatey package and AutoIt, but this doesn’t seem to be the “right” way to do things. Any suggestions?

What on earth settings are you "tweaking"?

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
tbh you'd probably be better off with a python script and a cron job and an http server (if you wanna be fancy, put it all in docker containers)
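e.g. the whole thing can literally be a cron entry plus the stdlib http server; sketch below, with made-up paths and a stand-in job script:

code:
# sketch: cron regenerates some output on a schedule, this serves it. paths are made up.
# crontab entry:  */15 * * * * /usr/bin/python3 /opt/jobs/do_the_thing.py
import os
from http.server import HTTPServer, SimpleHTTPRequestHandler

os.chdir("/srv/output")  # wherever the cron job writes its results
HTTPServer(("0.0.0.0", 8080), SimpleHTTPRequestHandler).serve_forever()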

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
My 0.02 is that if you’ve got everyone on-board with putting everything into a particular CI tool (like TC) then go for it; you can use build chains, snapshot dependencies, etc. to break your build up into logical chunks so that things can be run in isolation/iterated on.

That said, this is as much a people problem as it is a technical one. If people are used to being able to do things in increments, even if you give them a way to in TC, they’re just gonna click ‘run’ on whatever and let god sort out the details.

I don’t really have an answer for you outside of that. The biggest thing I think helps is making sure that devs understand how to use TC really well.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
TeamCity is very reliable and works well for us. Depends on what you’re building at the end of the day though.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
if you need windows stuff (and can do cloud), appveyor is pretty decent and it has cloud or on-prem options.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
Most cloud CI services are free for public repos.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

necrobobsledder posted:

Trying to get a rough idea of what’s expected stress / responsibilities compared to others that have broader experience than myself.

Is it normal for companies to hire “devops” engineers as hero engineers who are expected to take completely garbage, stateful, poorly documented, unautomated legacy (5 - 15 years old) software and have exactly one engineer out of 8 - 30 engineers take over most of infrastructure ownership, deployments, release management, and deliver a CI/CD pipeline in less than half a year while being on-call? I’ve talked to dozens of companies (large, small, b2c, enterprise - the full gamut) in several non-tech hubs for years and all but 3 companies seem to want / need exactly this (in veiled or not-so-veiled terms) while paying maybe 20% more for said engineer(s). It’s getting super old being deployment dave when I spend 30% of my time documenting and making deployments push-button easy for others and getting stuck with marching orders like Dockerizing super stateful, brittle software intended to be pushed into a K8S cluster.

yes

ask for a raise

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
i dealt with that exact same problem; you can't import that pfx in a non-interactive fashion in any way, so your only option is to create a new keyfile (the extension MS uses for this is snk, i believe), change the path in the csproj to point to the snk file, then commit that to vcs.

ed: you can always host the snk file on s3 or the azure equivalent and then download it as part of the build if you're unable to check it into source control.
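The download step is a couple of lines of boto3, roughly like this (bucket, key, and paths are placeholders):

code:
# rough pre-build step: pull the .snk out of s3 before msbuild runs.
# bucket/key/paths are placeholders; the azure blob SDK version is similar.
import boto3

s3 = boto3.client("s3")
s3.download_file("my-build-secrets", "signing/MyApp.snk", r"C:\build\MyApp.snk")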

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Hadlock posted:

If Prometheus/Grafana is the open source monitoring solution

What is the log management equivalent these days

Bonus points if there's already a helm chart for it

Looked at distributed tracing? OpenTracing and Jaeger?

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Vulture Culture posted:

If your big problem around distributed tracing is context propagation (it's ours for sure), consider OpenCensus instead of trying to deal with OpenTracing directly

Can you elaborate on this? I'm only starting to get into distributed tracing and I'd be interested to hear about your experiences with it.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
I've used ProGet which was pretty OK. Nice if you've got .NET stuff in the mix.

But yeah, depending on your size I'd probably point you to the SaaS versions of anything you seriously want to use unless you have some real compelling reason to run the server in-house.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Stringent posted:

*Macs Mini

thanks

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

necrobobsledder posted:

...If there was no database connection made within 30 seconds, the code presumed you were a developer and granted root view ACLs for the session. Nobody in management knew such a mode even existed until they were on a bridge and a developer said "oh yeah, that's the mode we use all the time for local work, X put it in 4 years ago." The resulting contract breach and lawsuit has sunk the company (on top of the other bad things going on, granted).


what the gently caress

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
i don't know why people would go to reinvent to learn things, it's a vendor conference :thunk:

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
don't blame me, i'm going to kubecon

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
last time i messed with them i found all of the azure management SDK and APIs to be absolutely terrible and generally slower than using the UI or powershell for some reason.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
honestly, just use k8s.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
you probably don’t need to know statefulsets and all the vagaries of the podspec to get started with k8s imo. like, no poo poo it does a lot of things, it’s an object db that someone built a container orchestration platform on. but most applications aren’t really that complicated, and learning how to decompose them into containers for k8s imo makes more sense than trying to use swarm at all.

that’s just my 0.02, ymmv.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Hadlock posted:

So we had that golden turning-point moment: the VP of engineering and CTO have been playing with my prototypes for a couple of months and finally decided to convert the entire stack over to k8s, switch our dev/qa systems (well, half of qa was on k8s already) over to AWS in the cloud/k8s, and ditch our third-rate bare metal hosting provider.

The problem is that they want to get rid of config completely; every "stack" gets its own namespace and uses hard-coded dns/user/pass... And then in production, we're going to use a different method of supplying dns and credentials.

This goes pretty much against the whole idea of "same container in dev, same in qa and prod"... The VP of engineering has no experience with ops and wants to simplify things to improve deployment speed/reliability in dev, and I can't seem to convince him that we should use the same dns/cred mechanisms in dev as we do in prod.

Thoughts?

use terraform

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Warbird posted:

So my new workplace is pretty nice and I got a pay raise from the last position. Everything's great except that Github is blocked on the network for christ knows what reason, as is the specific tool I was hired to work with. I've been advised to do research on my personal laptop and just email myself the code snippets I'm interested in.

Fukkin what. This is still a net improvement, but what are we doing here people?

what on earth? quit.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Rocko Bonaparte posted:

No, they're standalone applications. Some of them have GUI automation around them and I want to make sure it all still works. It's definitely unusual, so I can't even get mad that it keeps coming up. I couldn't use Docker in one hop because of that. I don't know if I could instead, say, boot up a container and run Virtualbox from there. Maybe? I don't think that gives me much there.

I think we have some kind of access to Azure and the question is what are the magic words to pass along through various IT requests to get a fighting chance of somebody instantly recognizing what I'm trying to do here. I was thinking something like "dynamic node requests" or something.

you need azure virtual machines (if you're using teamcity or jenkins or w/e there's probably a way to have it automagically create a vm for you when a build starts), and presumably this is a windows application so you'll need to perform some fuckery to have it automatically log in to an interactive session. at that point you'd use whatever your test automation bullshit is to run the tests.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Rocko Bonaparte posted:

The OSes I'm using for testing aren't necessarily the best for a server deployment so I don't think I could just make the VM nodes themselves run the OSes. Rather, I imagine I would bring up some robust, server OS and then subvirtualize a VM for each OS/Python permutation to do what I need to do. I know this metavirtualization thing makes things more complicated.

yeah no. you’re not gonna be able to do cute metavirtualization poo poo on public cloud, and it’s going to be a pain in the dick on private cloud. You can make your own windows images using non-server skus if you care, but I doubt it actually matters much and if it does matter then you’re well into the territory of needing to spend $$$ on a test lab and human beings

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
IMO depending on the size of your team/org running your own metrics infra is going to be pain and sadness

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
i have a lot of opinions about monitoring nee observability but since i work for a vendor they're obviously biased. here's my 0.02 in general though -

datadog is extremely expensive for what you actually get out of it.
signalfx is probably going to get bad post-splunk acquisition.
most of the big monitoring/apm companies are more interested in selling skus so they'll come in cheap then get you as you scale, but a lot of newer stuff scales better.

the best thing you can do is build around oss instrumentation (opentelemetry if you can wait a bit, opencensus/opentracing today if you can't) because it'll let you transition from self-run stuff to saas solutions. if you don't have anything, start instrumenting your crap and pipe traces to jaeger/metrics to prometheus with like a week's retention and see how much better it makes daily ops life. ultimately the real power of the saas tools is that they can do a lot of stuff that you can't or won't build yourself - figuring out what's important automatically, helping you define SLOs/SLIs, etc.
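and to be clear, "start instrumenting your crap" is like ten lines with the prometheus python client; sketch below (metric names are made up):

code:
# sketch of minimal app instrumentation with prometheus_client; metric names are made up
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("myapp_requests_total", "total requests handled")
LATENCY = Histogram("myapp_request_seconds", "request latency in seconds")

def handle_request():
    with LATENCY.time():
        time.sleep(random.random() / 10)  # stand-in for real work
    REQUESTS.inc()

if __name__ == "__main__":
    start_http_server(8000)  # prometheus scrapes http://localhost:8000/metrics
    while True:
        handle_request()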

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

12 rats tied together posted:

One thing I want to do that I'm having trouble finding mention of in marketing materials is something akin to complex event processing. Basically I don't really care about average CPU utilization across a cluster of compute nodes, but while operating this cluster of compute nodes, and the application running on them, we've noticed a number of events that occur throughout the application lifecycle.

It would be really sick if I could ring my phone when we see one type of event happen and then we don't see any followup events of a different type inside the next 24 hours, for example.

I can’t think of anyone doing this (hell, I’d be vaguely surprised if google et al. were doing this)
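the logic itself is trivial if you've got the events somewhere queryable, it's just that nobody packages it; totally hypothetical sketch, where fetch-the-events and page_me are stand-ins for your event store and pager:

code:
# totally hypothetical sketch: alert if an event of type A happened but no event of
# type B followed within 24h. the event source and page_me() are stand-ins, not real APIs.
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)

def page_me(msg):
    print("PAGE:", msg)  # stand-in for whatever actually rings your phone

def check(events):
    """events: iterable of (timestamp: datetime, event_type: str), all UTC."""
    events = list(events)
    now = datetime.utcnow()
    for ts, kind in events:
        if kind != "A" or now <= ts + WINDOW:
            continue  # not an A event, or its 24h window hasn't closed yet
        followed = any(k == "B" and ts < t <= ts + WINDOW for t, k in events)
        if not followed:
            page_me(f"event A at {ts} had no B within 24h")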

Cancelbot posted:

On Datadog/APM in general we're currently trialling AppDynamics and Dynatrace as a thing to replace NewRelic; does anyone have opinions on these?

So far AppDynamics hooks you in with its "AI" baselines and fancy service map, but a lot of the tech seems old and creaky, like having to run the JVM to monitor Windows/.NET hosts, and there are weird phantom alerts where we got woken at 2am. Dynatrace seems like a significantly more complete product and its frontend integration is stupidly good.

What also seems to work in Dynatrace's favour is its per-hour model vs AppDynamics' per-host, minimum-one-year model. I can roll the licensing in with our AWS bills for pay-per-use, but I need to see about getting the incentives applied to our account. The only thing I can't seem to find is an Insights-equivalent product.

I’m a bit more up on dynatrace than appd but a lot of it really depends on what you’re trying to do.

I’m curious if anyone itt has looked at Honeycomb/LightStep/Omnition.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison

Blinkz0rz posted:

my team is currently having an absolutely awful time with our logging solution and is considering moving to honeycomb. i wasn't with the team when they did the demo but from what i heard it was great tech that was just a little too pricey to consider switching to

My understanding is that honeycomb is pretty inexpensive, but I don’t know their pricing model.

kitten emergency
Jan 13, 2008

get meow this wack-ass crystal prison
on one hand security theater is intrusive and doesn’t help, on the other, a bunch of contractors and consultants and so forth got a lot of lucrative contracts out of the deal so who can say if it’s bad or not

I’ll leave it to the reader to discover the applicability of this metaphor to working in software
