fletcher posted:I've been using Chef for managing my personal server (leased dedicated server) which I use for a few simple things: I may have spoken too soon...looks like the Nomad/Docker support on Windows is pretty crappy right now. The usual experience: running into an issue doing something very basic and finding a post on their forums describing the same issue, with no activity for a while. I came across an open ticket for running Linux containers in Docker on Windows: https://github.com/hashicorp/nomad/issues/2633. drat!
|
# ? Dec 15, 2020 05:04 |
|
Every salt install/shop I’ve run into has been a poo poo show of complicated and broken. I’m not sure if it’s the tool or just the feature set that appeals to places that have made it dysfunctional. The last place had an acquisition that had invested in it heavily and added a bunch of automation on top, and it was one of the most complicated pieces of poo poo I’ve ever seen, and I run kubernetes! We basically gave up and wrote ansible playbooks as an emergency stopgap while we migrated them to K8s.
|
# ? Dec 15, 2020 14:51 |
|
I have an interview on Tuesday that has ELK as good to have, is it feasible to learn enough about it to not sound like an idiot discussing it if I start studying it today? I'm not going to try to sound like an expert, I just want to know enough that I can carry on a conversation. Failing that, what about Puppet or Octopus, those are also on the "nice to have" list. I only know Terraform as far as deployment tools at this point.
|
# ? Dec 18, 2020 21:13 |
|
22 Eargesplitten posted:I have an interview on Tuesday that has ELK as good to have, is it feasible to learn enough about it to not sound like an idiot discussing it if I start studying it today? I'm not going to try to sound like an expert, I just want to know enough that I can carry on a conversation. Play around with it and pick up as much as you can. In general, if an interview says something is nice to have and you don't know it well, just mention that you're not intimately familiar with it, don't try to pass yourself off as an expert, and you'll be fine. Also, the lines can be kind of blurry, but Puppet, Octopus, and Terraform are all designed to solve different problems; don't lump them together as "deployment tools".
|
# ? Dec 18, 2020 21:31 |
|
22 Eargesplitten posted:I have an interview on Tuesday that has ELK as good to have, is it feasible to learn enough about it to not sound like an idiot discussing it if I start studying it today? I'm not going to try to sound like an expert, I just want to know enough that I can carry on a conversation. do a tutorial or two, say "you've played around with it once or twice," mention something casually offhand in the interview about not forgetting inter-node encryption, and you'll be fine. with luck they'll think you're being modest.

for puppet, remember that it uses an agent and will enforce config changes to prevent drift, which is different from, e.g., ansible. octopus does CI/CD; if you can talk intelligently about what CI/CD is for and what problems it's trying to solve, who cares about the tooling?

Tooling in general is maybe the least important thing about devops/sre. Focus on the concepts and the problems you're trying to solve. When you're interviewing, speak about those concepts. Don't lose sight of the forest for the trees. And if you haven't yet, read the Google SRE handbook.

e: also, it's a remote interview - have a set of prewritten notes with a few lines about each technology in their stack. If you're good at public speaking, you can even write out full sentences. In one of my recent interviews, I basically quoted one of methanar's slack rants about kubernetes and got an offer.

The Iron Rose fucked around with this message at 21:45 on Dec 18, 2020 |
# ? Dec 18, 2020 21:43 |
|
22 Eargesplitten posted:I have an interview on Tuesday that has ELK as good to have, is it feasible to learn enough about it to not sound like an idiot discussing it if I start studying it today? I'm not going to try to sound like an expert, I just want to know enough that I can carry on a conversation. I wouldn't deep dive, but you can become pretty conversant just by reading about it. The basics are all pretty simple, especially if you're just talking about the basic Elastic stack (ELK is the old name).

You've got the elasticsearch service, which is a very versatile search platform. You can use it for a bunch of stuff that isn't logs. We use it as a generic search platform for a whole bunch of data across multiple apps.

Then you've got logstash and beats, which are both used to feed data into elasticsearch. Logstash is a tool that will pull data from one source and push it into elasticsearch. It's all manual and everything has to be defined by the user. It's good for custom logs/files, for places where you want to transform data before pushing it to elasticsearch, or where you only want to import specific messages. You have a lot of control over how and where the data gets ingested. I don't have a ton of experience with beats, but I know it can support a bunch of different types of data out of the box, which makes some of the configuration easier. It can be used for stuff like system and network metrics in addition to log files. So it handles a lot of things that don't make sense for logstash.

Kibana is the UI on top of it all. It provides a nice interface for building queries. There's also some administrative stuff you can do. There are other tools like curator, which is for index management. It's been a while since I've had to work with Kibana, but I know they've integrated more index management stuff into it. So I don't know if curator still has a place, but it used to be the goto tool for managing indexes.
There's more obviously, but it doesn't take much to be conversant in how the stack works. Especially if they're just using it for logging.
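As a sketch of the logstash part described above (paths, index name, and the grok pattern are assumptions, not anyone's real config), a pipeline that tails a custom log, transforms it, and pushes it into elasticsearch looks roughly like:

```
input {
  file {
    path => "/var/log/myapp/*.log"    # hypothetical custom log files
  }
}
filter {
  grok {
    # transform step: parse raw lines into structured fields before indexing
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "myapp-%{+YYYY.MM.dd}"   # daily indices
  }
}
```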
|
# ? Dec 18, 2020 21:45 |
|
The Iron Rose posted:basically quoted one of methanar's slack rants about kubernetes and got an offer. This is the 2nd time I've been told this, almost verbatim.
|
# ? Dec 18, 2020 22:24 |
|
Curator isn’t used anymore since ILM (index lifecycle management) has been introduced. What would be useful to know is how elasticsearch stores its data, so read up on indices, primary/replica shards, and shard allocation. Another thing that’s fairly simple to look into and makes it look like you know a bit is the Elasticsearch node types: master, data, coordinator, machine learning, ingest. Also look at how hot/warm/cold storage works. Also look into the features X-Pack provides. These are mostly enterprise features. All of them require a license, but there are different license tiers, and the basic one is free, which unlocks a few security features. Other features require more expensive licensing tiers. You can completely ignore the SIEM, Endpoint Protection and ML modules for your job interview. If you have any questions feel free to send me a DM.
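As a sketch of what ILM does in place of curator (the policy name and thresholds here are made up), a policy that rolls indices over and then deletes them after 30 days can be created from the Kibana dev tools console:

```
PUT _ilm/policy/myapp-logs
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "50gb", "max_age": "1d" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}
```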
|
# ? Dec 18, 2020 22:27 |
|
LochNessMonster posted:Curator isn’t used anymore as ILM (index lifecycle management) had been introduced. That’s actually good to know. Managing indexes and shards was the most painful part of my experience when I had to manage an elastic cluster a few years ago. It’s nice to know they made that part better.
|
# ? Dec 18, 2020 22:43 |
|
Alright, thanks for all the advice. I'll go through tutorials on everything so I can say I played around with it a bit. Would it be a bad thing to say that the reason I haven't done it in my current job is that we have one senior admin dedicated to monitoring and another to Puppet and Jenkins (or whatever the thing we use aside from Puppet is, I can look it up)? I mean those things are true, but it's also because I got bait and switched into being a NOC tech with a SysAdmin title and NOC pay. But I'm not saying that part to a potential employer.
|
# ? Dec 18, 2020 23:11 |
|
deedee megadoodoo posted:That’s actually good to know. Managing indexes and shards was the most painful part of my experience when I had to manage an elastic cluster a few years ago. It’s nice to know they made that part better. Yeah, in the early days this was about the only tooling you had, together with elasticsearch-head. The management part has become a lot better, and with 7.10 it seems they are finally addressing the Watcher shitshow.
|
# ? Dec 18, 2020 23:22 |
|
22 Eargesplitten posted:Alright, thanks for all the advice. I'll go through tutorials on everything so I can say I played around with it a bit. Would it be a bad thing to say that the reason I haven't done it in my current job is that we have one senior admin dedicated to monitoring and another to Puppet and Jenkins (or whatever the thing we use aside from Puppet is, I can look it up)? I mean those things are true, but it's also because I got bait and switched into being a NOC tech with a SysAdmin title and NOC pay. But I'm not saying that part to a potential employer. Why would you mention that you haven’t done it in your current job? Much less go into that much detail. “I’ve only had the opportunity to use [x] a little bit, now here’s all I know about it in a sentence that sounds smart.”
|
# ? Dec 19, 2020 04:30 |
|
Does anyone have recommendations for managing builds/releases for a large number of public SDKs/plugins/integrations? I've been put in charge of coordinating and managing builds/releases for a growing number of language SDKs and plugins (like a Terraform provider) that we make available to customers.

The releases will be done manually for now. Someone will run a script on the main branch to tag/add changelog/bump version/push the tag. My team wants to centralize everything with an internal tool, though. The version bumping and build steps for each repository would be defined in a central, internal repository, and each SDK/plugin/integration we offer would have a "hook" script to manage building and version-bumping releases. The tool would read from this internal repo of config files, generate changelogs, and call the hook script against the target repository.

I was thinking it would be easier to either: a) drop a config file in each repo and have a tool read that, or b) let each repo implement its own build/release strategy so it can use tools built for each language (npm scripts, goreleaser, etc).

After looking around at other company-maintained SDKs it seems like having each repo define its own build/release process is most common. Am I overlooking a best practice here if I think the distributed approach is the way to go?
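For the per-repo hook approach, the shared tooling can stay tiny if the hooks agree on an interface. As a sketch (function name and the tag/push convention are made up, not anything from this thread), the version-bump half is easy to standardize in plain shell:

```shell
# next_version: compute the next semver from the current one and a bump type.
# A per-repo release hook could call this, then tag and push, e.g.:
#   git tag "v$(next_version "$current" patch)" && git push origin --tags
next_version() {
  current="$1"; bump="$2"
  major=${current%%.*}
  rest=${current#*.}
  minor=${rest%%.*}
  patch=${rest#*.}
  case "$bump" in
    major) echo "$((major + 1)).0.0" ;;
    minor) echo "$major.$((minor + 1)).0" ;;
    patch) echo "$major.$minor.$((patch + 1))" ;;
    *)     echo "usage: next_version <semver> <major|minor|patch>" >&2; return 1 ;;
  esac
}

next_version 1.4.2 minor   # prints 1.5.0
```

The language-specific build step (npm scripts, goreleaser, etc.) can then live next to the hook in each repo, which matches what most company-maintained SDKs seem to do.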
|
# ? Dec 19, 2020 06:20 |
|
The Iron Rose posted:Why would you mention that you haven’t done it in your current job? Much less go into that much detail. See, that's why I asked; I'm bad at figuring out how stuff comes across. As far as Puppet and Octopus go, after doing a little bit of reading up, would it be accurate to say that Puppet is more suited for saying "I want this piece of equipment/list of equipment to look like this" without a set order, while Octopus is what you would use when you need to say "Do x, then y, then z, and then if a is true, b, else c" all in order? For example, if you're starting up a web-based application you would start the database, then the application server(s), then the web front end services - that would be a case for Octopus - while imaging or patching half a dozen (or a few hundred) application servers would be suited to Puppet?
|
# ? Dec 20, 2020 05:09 |
|
If you have time, and there's a guy in your company who does Puppet, could you get away with asking for 15 minutes of his time to talk about it? Then you could say "I haven't used Puppet, that's another group, but I think it's a neat idea and I've spent time with those guys because I wanted to learn about it." Nobody has to know that you only did so for the interview, and it makes it clear that you're a guy with initiative and drive to learn.

I'm the Puppet evangelist in my company and we use it as an intermediate step in a server build. We start with a base image, Puppet manages configuration items in the OS or standard infrastructure tools, and then it gets handed to an application team. It's not intrinsically a task runner, though now there's a separate Puppet Tasks tool that does what it says on the tin. With the enterprise version of Puppet, you can get deeper into server provisioning.

I think there's a lot of overlap in this space. Ansible and Puppet have a lot of overlap but there are edge cases where you'd pick one over the other. Puppet (the company) is busy trying to turn Puppet (the tool) into a CI/CD system, but based on what I've done with the free Puppet, if I was using it with Octopus, the idea would be that Puppet configured the server so that it had everything needed for Octopus to deploy to it, and Octopus deployed the application.
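To make the desired-state idea concrete, a minimal Puppet manifest sketch (the resource choice is illustrative, not a real config from anyone's shop): you declare the end state, and the agent converges the node toward it on every run, in no particular order unless dependencies are declared:

```puppet
package { 'nginx':
  ensure => installed,
}

service { 'nginx':
  ensure  => running,
  enable  => true,
  require => Package['nginx'],  # explicit ordering only where it matters
}
```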
|
# ? Dec 20, 2020 06:02 |
|
Thanks, that ansible-puppet comparison is helpful. I might be able to get a chance to talk to the guy for a few minutes. The team is understaffed and he's a semi-BOFH, but it seems like it's less misanthropy and more frustration with how much of a clusterfuck this company is. In trying to find the cause of an outage he found mongoDB credentials stored in plaintext and an external gitlab reference (we do not have a company gitlab). The outage itself was caused by an application having hardcoded DNS addresses, which made it fail when we decommed those servers years later, although it took hours to figure that out because, being an MSP, we eschew proper documentation in favor of more billable work. I'm potentially up for a promotion at this company onto the Linux admin team as well, but it probably wouldn't pay as much as this one, and it would be more junior Linux admin rather than full system admin with a hint of DevOps responsibilities.
|
# ? Dec 20, 2020 07:25 |
I'm using env_file in my docker-compose.yml and in my Dockerfile I have "RUN whatever.sh" and the contents of whatever.sh relies on those environment variables being set, but they do not seem to be set at that point. What am I misunderstanding here?
|
|
# ? Jan 1, 2021 22:18 |
|
fletcher posted:I'm using env_file in my docker-compose.yml and in my Dockerfile I have "RUN whatever.sh" and the contents of whatever.sh relies on those environment variables being set, but they do not seem to be set at that point. What am I misunderstanding here? Perhaps you're supposed to set them at build time with --build-arg?
|
# ? Jan 1, 2021 23:24 |
fletcher posted:I'm using env_file in my docker-compose.yml and in my Dockerfile I have "RUN whatever.sh" and the contents of whatever.sh relies on those environment variables being set, but they do not seem to be set at that point. What am I misunderstanding here? I think my issue was due to a misunderstanding between RUN and CMD, this helped clear it up: https://goinbigdata.com/docker-run-vs-cmd-vs-entrypoint/ My script belonged in CMD, it didn't make sense to bake what it was doing into a layer.
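The distinction can be sketched in a minimal Dockerfile (file names are made up): RUN executes while the image is being built, before compose has any say, so env_file values aren't visible there; CMD executes when the container starts, after compose has injected the environment:

```dockerfile
FROM alpine:3
COPY whatever.sh /usr/local/bin/whatever.sh
RUN chmod +x /usr/local/bin/whatever.sh   # build time: env_file vars are NOT set here
CMD ["/usr/local/bin/whatever.sh"]        # container start: env_file vars ARE set here
```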
|
|
# ? Jan 1, 2021 23:28 |
Alright, so for my migration away from Chef I've now containerized all the apps I want to run, and now I've got two docker compose files. Since this is just for personal/hobby stuff I was gonna just use docker compose to run it in "production" as well, to keep things simple.

Now I've got multiple domains I want to host on the same machine through https/443, so I started looking at Traefik. Not really liking what I'm seeing so far with it though. Instead, I was thinking of just running nginx in another container as my reverse proxy for the other apps/domains I want to run. Since it's two separate docker compose files, I need this nginx container to know what ip/hostname to proxy_pass the requests to. Seems I can do this either with static ips for my containers and then just hard code them in the nginx config, or use something like https://github.com/mageddo/dns-proxy-server to be able to reference them by name. Any other suggestions for how to solve this?
|
|
# ? Jan 13, 2021 20:44 |
|
A compose file creates a network for you implicitly, but you can also define that network yourself, and then containers from multiple files can be assigned to that same network. This will allow them to reach each other by their name, as if you had them all in the same file. This approach will work up until you have multiple hosts, at which point you're looking at some sort of multi-host networking solution like swarm, kubernetes, whatever. You could also just put them all in the same file, too.
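A sketch of that setup (network and service names are assumptions): create the network once with `docker network create web`, then have each compose file join it as external:

```yaml
# in the proxy's docker-compose.yml; the app stacks' files declare the same network
services:
  proxy:
    image: nginx:alpine
    ports:
      - "443:443"
    networks:
      - web
networks:
  web:
    external: true   # join the pre-created network instead of creating a new one
```

With every stack on that network, the nginx config can proxy_pass to the other containers by their compose service names, no static IPs needed.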
|
# ? Jan 13, 2021 20:49 |
|
What's the accepted best practice for least-privilege IAM roles when it comes to terraform? Am I supposed to laboriously give it CRUD permissions on only exactly the API calls it needs on a per-infra basis? It seems like a nightmare.
|
# ? Jan 13, 2021 21:46 |
|
IMO: Terraform should run with basically full admin permissions from a user, ideally using the "assume_role" directive of the aws provider block to assume an admin role in a particular AWS account. I don't think it is a good idea to trigger Terraform runs from automation (as part of build pipelines or whatever).
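As a sketch of that assume_role setup (account ID and role name are made up):

```hcl
provider "aws" {
  region = "us-west-2"

  # run as your own user, but assume an admin role scoped to one target account
  assume_role {
    role_arn     = "arn:aws:iam::123456789012:role/terraform-admin"
    session_name = "terraform"
  }
}
```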
|
# ? Jan 13, 2021 21:52 |
|
If you must do least privilege, then you can "easily" calculate the required perms in the same way people do it for SELinux: Run Terraform once with full admin privs, then scrape CloudTrail logs with Athena to get a unique list of all the API calls it made, and throw all of them into a custom role that Terraform can use.
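A sketch of that scrape (this assumes a `cloudtrail_logs` table already defined in Athena per the AWS docs, and the role name in the filter is hypothetical):

```sql
-- unique list of API calls Terraform made during the full-admin run
SELECT DISTINCT eventsource, eventname
FROM cloudtrail_logs
WHERE useridentity.arn LIKE '%terraform-admin%'
ORDER BY eventsource, eventname;
```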
|
# ? Jan 13, 2021 22:35 |
|
12 rats tied together posted:I don't think it is a good idea to trigger Terraform runs from automation (as part of build pipelines or whatever). We do, but in order to actually apply you have to log in to TFE and click a button
|
# ? Jan 13, 2021 22:45 |
|
GCP will, after a week or two, let you know what privileges your custom roles have that they're not using, which you can then filter through their role documentation and modify. We started out doing least privilege, but eventually got IT security to sign off on God/Root privileges and velocity went up by 900%.
|
# ? Jan 13, 2021 22:47 |
|
Happiness Commando posted:What's the accepted best practice for least privilege IAM roles when it comes to terraform? Am I supposed to laboriously give it Crud permissions on only exactly the API calls it needs on a per-infra basis? It seems like a nightmare.
|
# ? Jan 13, 2021 23:06 |
|
Hadlock posted:We started our doing least privilege, but eventually got it security to sign off on God/Root privileges and velocity went up by 900% This is especially true for developer accounts where devs are experimenting a lot. But I think it's still wise to put some guardrails on (via SCP) to restrict dev activities to specific regions and AWS services, to reduce the blast radius if God/Root creds get leaked.
|
# ? Jan 13, 2021 23:10 |
|
Yeah, the way we set it up, we had the god project with the god role, and it generated sub projects, with their own project-specific god role. So even if you hacked that role in that project, your blast radius was limited to that project. And we did a separate project per environment. Dev/nonprod environments lived in one subfolder, and prod/pii environments lived in a separate subfolder that further limited blast radius. All that multi-level project terraform lived in one repo.

Then at the project level, those ran terraform that lived in a separate project-level-specific repo. Developers could modify stuff in the dev environments, but the prod stuff has custom terraform modules with version numbers, and you needed an approved PR to change that stuff.

TL;DR: developers had god mode in lower environments (at a per-project level), and it was all stateless containers, so if they hosed it up we would just re-roll their GCP project from code. Developers could check in terraform code that would be deployed to prod, but beyond submitting PRs they couldn't footgun the whole company without approval from someone on the ops team.

Projects are actually really cheap to set up and provide awesome security firewalls; I recommend splitting up your stuff into as many logical projects as possible. You can link their networks fairly easily.

Hadlock fucked around with this message at 23:56 on Jan 13, 2021 |
# ? Jan 13, 2021 23:52 |
|
I have a team that needs some significant, IO-intensive disk scratch space. https://kubernetes.io/docs/concepts/storage/volumes/#emptydir Doing this properly with sizeable tmpfs directories requires k8s 1.20 alpha features which I do not have. So instead of that, I'm thinking of creating a tmpfs as part of the container's entrypoint.
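A sketch of that entrypoint idea (size and mount point are assumptions, and the container needs CAP_SYS_ADMIN or privileged mode for the mount to succeed):

```sh
#!/bin/sh
# mount a tmpfs for scratch space, then hand off to the real workload
mount -t tmpfs -o size=10g tmpfs /scratch
exec "$@"
```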
|
# ? Jan 14, 2021 21:19 |
|
if that's the actual implementation you should just give them 10gb of ram and have them write whatever they're writing to the heap instead of to disk, otherwise, that seems fine to me
|
# ? Jan 14, 2021 21:23 |
|
12 rats tied together posted:if that's the actual implementation you should just give them 10gb of ram and have them write whatever they're writing to the heap instead of to disk, otherwise, that seems fine to me That would be best but I've heard several insistences that they definitely need proper on-disk posix semantics. Thanks for the +1
|
# ? Jan 14, 2021 21:31 |
|
Methanar posted:I have a team that needs some significant, io intensive disk scratch space. Why not just use /dev/shm ?
|
# ? Jan 14, 2021 21:46 |
|
PCjr sidecar posted:Why not just use /dev/shm ? Looks like that's the same thing. The default size of /dev/shm is 64M inside of a container, so I'd need to remount that anyway, probably using the same mechanism of entrypoint overrides.
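In kubernetes terms, the same thing can be expressed as a memory-backed emptyDir mounted over /dev/shm (container and image names below are made up; enforcing a size limit on memory-backed emptyDirs is the 1.20 alpha feature in question):

```yaml
spec:
  volumes:
    - name: shm
      emptyDir:
        medium: Memory      # tmpfs backed by node memory
  containers:
    - name: worker
      image: worker:latest
      volumeMounts:
        - name: shm
          mountPath: /dev/shm
```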
|
# ? Jan 14, 2021 22:01 |
12 rats tied together posted:A compose file creates a network for you implicitly, but you can also define that network yourself, and then containers from multiple files can be assigned to that same network. This will allow them to reach each other by their name, as if you had them all in the same file. Thanks for the tips! The pre-existing network was exactly what I needed.
|
|
# ? Jan 15, 2021 10:02 |
|
Hadlock posted:Yeah the way we set it up, we had the god project with the god role, and it generated sub projects, with their own project specific God role. So even if you hacked that role in that project your blast radius was limited to that project. And we did a separate project per environment. For GCP I 100% agree; for AWS, despite Control Tower's promises, I think this is a nightmare.
|
# ? Jan 15, 2021 12:02 |
|
freeasinbeer posted:For GCP I 100% agree; for AWS, despite Control Tower's promises, I think this is a nightmare. Wrangling a large number of AWS accounts does kind of suck, but what's the alternative? The only thing worse is dumping all your poo poo into one account and trading the huge pile of accounts for a huge pile of laser-focused IAM policies and VPCs.
|
# ? Jan 15, 2021 17:09 |
|
There's not really an alternative unless you work in an org that is comfortable allowing your team to be permanent stewards of the entirety of AWS and you always have time to sit there and analyze least-workable-privilege for every 3rd party software suite someone buys that runs in AWS and also needs iam:CreateUser for some reason. Control Tower is pretty bad but Organizations is fine and it's not especially difficult to write some automation for account turn-on procedures: creation of IAM roles and trusts, standard billing alerts, etc.
|
# ? Jan 15, 2021 17:12 |
|
I've been looking into Disposable Cloud Environments as one possible solution, or partial solution, for some of these issues, for AWS accounts at least: accounts that are time- and/or budget-limited for interactive and CI/CD use.
|
# ? Jan 15, 2021 17:16 |
12 rats tied together posted:There's not really an alternative unless you work in an org that is comfortable allowing your team to be permanent stewards of the entirety of AWS and you always have time to sit there and analyze least-workable-privilege for every 3rd party software suite someone buys that runs in AWS and also needs iam:CreateUser for some reason. So I’m fine with breaking stuff up into tiers, or segmentation. I’m dealing with another part of the infra org that wants microaccounts for everything and it’s driving me mad. My cloudfront distros now sit in another account from my runtimes, and both sides are segregated by tiers. Or it wants ECR to live in its own account. So in any one incident I might have to juggle 3-5 accounts.
|
# ? Jan 15, 2021 18:15 |