|
Last I could tell from the outside, half the CloudFormation blog posts and engineers appeared to be based out of India, which makes me wonder if it's strategically important enough for AWS to put higher-profile engineers on it. Doubtful you'd see that happen to IAM, in contrast.
|
# ? Feb 22, 2020 06:32 |
|
|
# ? May 21, 2024 15:48 |
|
I put up an ops manager job listing on the jobs thread. If you do Linux + containers, and have not terrible opinions, hit me up. Full time remote.
|
# ? Feb 22, 2020 06:40 |
|
Hadlock posted:I put up an ops manager job listing on the jobs thread. If you do Linux + containers, and have not terrible opinions, hit me up. Full time remote. quote:Looking for someone with actual management experience, not just "well I've been here longest and now I'm senior because reasons" lead devops engineer. Don't doxx me bro.
|
# ? Feb 22, 2020 16:14 |
|
Hadlock posted:I put up an ops manager job listing on the jobs thread. If you do Linux + containers, and have not terrible opinions, hit me up. Full time remote. Hadlock posted:Pay: low to mid 100s Lmao, try $180k+ if you want a decent candidate
|
# ? Feb 22, 2020 16:27 |
|
I make at the top end of what you’re offering and I’m just a remote “"well I've been here longest and now I'm senior because reasons" lead devops engineer”
|
# ? Feb 22, 2020 19:18 |
|
So I've got an AWS/boto3/Rancher conundrum I'm hoping y'all can help out with. I've been trying to put together a Python script to do the following:

1. List all IAM access keys that are marked "active" and are more than 90 days old
2. Create a new key for each such IAM user, then invalidate the old one
3. For each key being invalidated, check to see if it's being used as an environment variable in a pod in Rancher
4. If it *is* being used, replace said environment variable with the newly-created key (we have an internal tool to do this that I'll just invoke)

The long and short of it is I need to automate the rotation of API keys in AWS to prepare us for SOC2 and other audits coming down the pipe. I can query the keys and create new ones / invalidate old ones just fine, but my code at this point is approaching Frankenstein status and I'm hoping somebody has found a more elegant solution. The main issues I'm trying to solve are automating this tedious nonsense and mitigating the issue where we have no idea where an API key is actually active in our system, which makes blowing things up way too easy.

EDIT: I can post this in the Python thread if that would get more results. I'm a little rusty because honestly I've only been asked to write Terraform for most of my time at this new job. Necronomicon fucked around with this message at 22:10 on Feb 25, 2020 |
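A minimal boto3 sketch of the listing/rotation half of this (steps 1 and 2 — the Rancher env-var swap is left to the internal tool). Function and constant names here are mine, not from the thread:

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(days=90)

def is_stale(create_date, now=None):
    """True if an access key is past the 90-day rotation window."""
    now = now or datetime.now(timezone.utc)
    return now - create_date > MAX_AGE

def rotate_stale_keys():
    import boto3  # deferred import so is_stale() stays testable without AWS deps
    iam = boto3.client("iam")
    for page in iam.get_paginator("list_users").paginate():
        for user in page["Users"]:
            name = user["UserName"]
            for key in iam.list_access_keys(UserName=name)["AccessKeyMetadata"]:
                if key["Status"] != "Active" or not is_stale(key["CreateDate"]):
                    continue
                # NB: IAM allows at most two keys per user, so create_access_key
                # fails if the user already has two; handle that case first.
                new = iam.create_access_key(UserName=name)["AccessKey"]
                # ...hand `new` to the Rancher env-var swap tool here,
                # *before* deactivating the old key...
                iam.update_access_key(
                    UserName=name, AccessKeyId=key["AccessKeyId"], Status="Inactive"
                )
```

Deactivating (rather than deleting) the old key gives you a rollback path if the swap tool missed a consumer.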
# ? Feb 25, 2020 22:05 |
|
Necronomicon posted:So I've got an AWS/boto3/Rancher conundrum I'm hoping y'all can help out with. I've been trying to put together a python script to do the following:
|
# ? Feb 25, 2020 23:48 |
|
Necronomicon posted:So I've got an AWS/boto3/Rancher conundrum I'm hoping y'all can help out with. I've been trying to put together a python script to do the following: Use IAM roles instead of users and keys
|
# ? Feb 25, 2020 23:57 |
|
kiam is crap and kube2iam is worse
|
# ? Feb 26, 2020 00:05 |
|
Nomnom Cookie posted:kiam is crap and kube2iam is worse Use EKS and the OIDC provider https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html
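With IRSA, pods get an IAM role via an annotation on their ServiceAccount instead of node-level creds; a minimal sketch (account ID, role, and names are hypothetical):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app                 # hypothetical
  namespace: default
  annotations:
    # the role's trust policy must reference the cluster's OIDC provider
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/my-app-role
```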
|
# ? Feb 26, 2020 00:08 |
|
Necronomicon posted:So I've got an AWS/boto3/Rancher conundrum I'm hoping y'all can help out with. I've been trying to put together a python script to do the following: Firstly, I don't think you absolutely need an automatic rotation process for SOC2, so you might be able to avoid this problem altogether by working with your compliance team to reword whatever control this is supposed to satisfy.

If you do need to do this, any kind of "for each IAM key" script is going to be a gross monster, especially if you're also checking it against some other arbitrarily discovered list. A good way to approach this problem (actually, all problems) from a compliance standpoint is to see if you can satisfy the control from another perspective. For example, define a single source of truth/source of authorization for your API keys, and audit that. Because there is no way to put API keys into Rancher except by using $something, we only need to audit the operations of $something. And then, in $something, you use some kind of standard git workflow with required approvals and checklists.

Since you probably have to do this anyway, you can tackle two birds with one stone, and honestly in my experience auditors like it better when you have well-defined single-tool approaches to things anyway. It's easier to handwave (and hide behind) "this is satisfied by the same control as $other_process, you already saw proof of the control in action and our verification that the control is not lapsing" than it is to introduce another script, another reporting process, more controls around the script (who can run it? where does it log when it runs? etc), generally speaking anyway.
|
# ? Feb 26, 2020 00:17 |
|
Blinkz0rz posted:Use EKS and the OIDC provider Migrating to EKS is our perpetual "good to do, but other stuff is higher priority" project, so we use api keys.
|
# ? Feb 26, 2020 00:20 |
|
Nomnom Cookie posted:Migrating to EKS is our perpetual good to do but other stuff is higher priority project, so we use api keys. In that case, while kube2iam isn't great, it's heaps better than static creds.
|
# ? Feb 26, 2020 00:25 |
|
Nomnom Cookie posted:kiam is crap and kube2iam is worse kiam is... okay. kube2iam is trash. I really dislike these kinds of things that abuse iptables to MITM traffic. nodeLocalDNS is another bad offender for this, but you basically need it after your clusters start growing because DNS is a huge bottleneck. There is a new thing coming up soon that will allow pods to directly assume IAM credentials without any middleman or weirdo host role thing, but I forget everything about it. Methanar fucked around with this message at 02:43 on Feb 26, 2020 |
# ? Feb 26, 2020 02:39 |
|
That thing is called "just using Amazon ECS" I believe
|
# ? Feb 26, 2020 03:20 |
|
Blinkz0rz posted:In that case, while kube2iam isn't great, it's heaps better than static creds. It’s not though! It’s racy and abandoned. A thing that works is always better than a thing that doesn’t work.
|
# ? Feb 26, 2020 09:31 |
Hi all, wondering if someone can lend a hand with some Terraform stuff. I have a module that creates a CloudWatch Log Group, and I'm in the process of writing another module that needs to reference the name of that log group; I'm wondering the best way to do it, as the log group is created first and will be persistent across other CW metric filters & alarms. The code is code:
|
|
# ? Feb 26, 2020 11:29 |
|
Nomnom Cookie posted:It’s not though! It’s racy and abandoned. A thing that works is always better than a thing that doesn’t work. Static creds are probably one of the most dangerous things to have floating in your environment. At least tell me you're using policy conditions to bind them to specific ec2 instances rather than usable by anyone from anywhere.
|
# ? Feb 26, 2020 12:46 |
|
Blinkz0rz posted:Static creds are probably one of the most dangerous things to have floating in your environment. At least tell me you're using policy conditions to bind them to specific ec2 instances rather than usable by anyone from anywhere. You’re not wrong, but if forced to choose between spraying api keys everywhere or deploying kube2iam I’d take the spraying. Also, and this is key, I don’t give a poo poo about security. We have a security guy. If he spends all his time on compliance and doesn’t have any left to spend on actually securing things, that’s not my problem. I got actual useful work to do.
|
# ? Feb 26, 2020 14:20 |
|
Nomnom Cookie posted:You’re not wrong, but if forced to choose between spraying api keys everywhere or deploying kube2iam I’d take the Yikes
|
# ? Feb 26, 2020 14:40 |
Nomnom Cookie posted:You’re not wrong, but if forced to choose between spraying api keys everywhere or deploying kube2iam I’d take the Security is everyone's problem, my dude. As a security engineer, this post gave me a loving aneurysm.
|
|
# ? Feb 26, 2020 14:55 |
|
Nomnom Cookie posted:You’re not wrong, but if forced to choose between spraying api keys everywhere or deploying kube2iam I’d take the This will end well, I'm sure
|
# ? Feb 26, 2020 15:10 |
"I don't lock my doors because the police should be doing their job catching criminals"
|
|
# ? Feb 26, 2020 15:25 |
|
Nomnom Cookie posted:You’re not wrong, but if forced to choose between spraying api keys everywhere or deploying kube2iam I’d take the Which one of my company's developers are you? Because this is loving all of them.

My old boss basically opened up a couple AWS accounts and gave everyone *:* in them in the name of speed, and it went as well as you'd expect. He was also vehemently anti-cloud and I kind of suspect he sabotaged this on purpose so it would fail and everyone would come crawling back to his data center fiefdom.

When he left we started over in a new set of accounts and have been rebuilding services in them via Terraform, converting them to use roles/instance profiles whenever possible (which is almost always). But it's very much pulling teeth explaining to devs why the 1000-day-old API keys they've committed to git in plain text with god-mode access for everything are wildly unacceptable.

Security is everyone's job but most people do NOOOOOOT give a single poo poo and will do exactly what cookie monster over there said. Your security and ops folks should be finding ways to make the secure path also be the easy and default path. Because otherwise people will 100% do the most slapdash, YOLO poo poo possible to close their current tickets.
|
# ? Feb 26, 2020 15:27 |
|
Nomnom Cookie posted:You’re not wrong, but if forced to choose between spraying api keys everywhere or deploying kube2iam I’d take the Look at this scrub who hasn't discovered secrets management. "Not my problem". Gtfo
|
# ? Feb 26, 2020 18:38 |
|
Sometimes you need to spray keys. OP's post doesn't inspire me with a ton of confidence that the keys are actually a hard requirement this time, but it can be done safely and responsibly, and it is indeed sometimes a hard requirement. IMHO implementing things safely and in line with your sec/compliance goals is the only hard thing about AWS. You can hire juniors (and you should, if your organization allows) to read documentation and create ec2 instances and security groups.
|
# ? Feb 26, 2020 19:01 |
|
I’d lean towards spraying keys for critical things in k8s if they are gonna be stored in secrets. Now that has its own set of issues, but so do kube2iam and kiam, both of which require a fair bit more setup and are finicky. I’d also punt to the security team if they wanted to do something better.
|
# ? Feb 26, 2020 19:04 |
Asking your security team for a better option is a good idea. Doing whatever you want because "the security team needs to do their job" is a crock of poo poo.
|
|
# ? Feb 26, 2020 19:31 |
|
CyberPingu posted:Hi all, wondering if someone can lend a hand with some terraform stuff. You'll need to get the output of the log group name from the module. The syntax is something like module.<module-name>.<output-name> — https://www.terraform.io/docs/configuration/modules.html#accessing-module-output-values

If you're in an order-of-operations thing, you cannot use the depends_on attribute on modules yet, for some reason. Then you can get cute with either null_resource/local-exec timers and sleeps or something else. I'd suggest making an aws-cli call to get the log group name and not gently caress around with timers, but this is def where terraform starts to fall on its rear end. Twlight fucked around with this message at 22:50 on Feb 26, 2020 |
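Concretely, the `module.<module-name>.<output-name>` wiring looks something like this (module, resource, and attribute names are made up for illustration):

```hcl
# modules/log_group/outputs.tf — expose the name from the log-group module
output "name" {
  value = aws_cloudwatch_log_group.this.name
}

# root module — consume the output in a metric filter
module "log_group" {
  source = "./modules/log_group"
}

resource "aws_cloudwatch_log_metric_filter" "errors" {
  name           = "errors"
  pattern        = "ERROR"
  log_group_name = module.log_group.name # module.<module-name>.<output-name>

  metric_transformation {
    name      = "ErrorCount"
    namespace = "example"
    value     = "1"
  }
}
```

Referencing the output this way also gives Terraform the implicit dependency, so the log group is created before the filter.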
# ? Feb 26, 2020 22:48 |
|
Vulture Culture posted:First, ask yourself why you're reinventing Vault I recently spent a lot of time talking with my team about how to talk down a coworker who wanted to write up a reverse proxy in Go, because he was just reinventing nginx, and here I am doing the same poo poo by accident. I've been really jazzed about Hashicorp stuff so far, so I'll check out Vault for sure.

12 rats tied together posted:If you do need to do this, any kind of "for each IAM key," script is going to be a gross monster, especially if you're also checking it against some other arbitrarily discovered list. Oh goodness yes - I am not a Python developer, nor will I ever be, and this script and workflow are getting really chunky and gross. The other arbitrarily discovered list is the output from a custom-built in-house CLI tool. Really the more I talk about this the worse the whole thing is starting to smell.
|
# ? Feb 26, 2020 22:53 |
|
CyberPingu posted:Security is everyones problem my dude. As a security engineer this post have me a loving aneurysm. Get my boss to put it in my OKRs. This place is a shithole full of bad practices, security-wise, one security guy isn’t enough to keep up with the compliance reports let alone accomplish anything, and “this thing you care about a lot and all the devs want...actually I’m gonna need another month because the Internet says our secrets management is bad” is not a career enhancing move. I’m not getting paid to do it, no one is asking me to do it, no one gives a poo poo if I do it. So no, until one of those things changes, security isn’t my problem.
|
# ? Feb 26, 2020 23:02 |
|
Do note that when IAM key management was my choice at a previous job, I chose kube2iam and suffered for it. Deploying anything significant to prod was delayed for months while we dealt with pods periodically coming up hosed and getting bad credentials handed to them, even across restarts. At the time the guys at this job came up with the key spraying, kiam didn’t exist, so I’m not going to blame them for avoiding the kube2iam dumpster fire.
Nomnom Cookie fucked around with this message at 23:16 on Feb 26, 2020 |
# ? Feb 26, 2020 23:12 |
|
I've heard bad things about kube2iam but we ran it in prod for a year and a half in 5 regions with only one transient issue I can remember, which went away when we did a rolling restart. Not to say it's not a dumpster fire, but when faced with that or key spray I'll gladly take the dumpster fire any day of the week.
|
# ? Feb 26, 2020 23:37 |
|
We routinely had to delete pods up to a dozen times to get one that kube2iam would hand valid creds to. This was barely sustainable with about a hundred pods on a dozen nodes, deploys every two weeks, and low turnover otherwise. At my current job we have a few hundred nodes and constant pod churn from deploys and CI builds. If kube2iam flaked out as much here as it did at my last job, the whole business would seize up in a few hours. Vault is coming. I’ll be happy when it gets here (IAM users everywhere isn’t even the worst of it, ofc) but the timeline is decided by my boss’s boss and he’s not in a hurry, against the recommendation of myself and everyone on my team. I am nearly-explicitly being paid to not care about security, and in the end I care more about my paycheck than the security of my employer’s low-integrity systems.
|
# ? Feb 26, 2020 23:49 |
|
kube2iam has a cool race condition bug that breaks it with flannel as your CNI. Until a non-host-network pod is scheduled on a node, the tunnel interface doesn't exist, and kube2iam will stay broken until it's restarted once the interface is actually present.
|
# ? Feb 26, 2020 23:57 |
|
We were using VPC CNI, but maybe that issue affects kube2iam on VPC CNI as well. It would fit with what I recall of the problems we had.

Here’s some CI content for the CI thread: big push lately to make our CI builds faster, which...yeah, they’re dog slow. I got them down to about 25 minutes worst case last year, and build times have crept up to 45-60 minutes since then. A little poking at build scans shows the problem is twofold: dev is writing tests that take several minutes to run, mainly due to timeouts and sleeps, and something is causing Gradle to never consider unit tests up to date, so every build that runs tests runs all the tests. We are dealing with the problem by purchasing Gradle Enterprise and “improving caching”.
|
# ? Feb 27, 2020 00:09 |
|
Twlight posted:you'll need to get the output of the log group name from the module. syntax is something like module.<module-name>.<output-name> You can depend on the entire submodule tho.
|
# ? Feb 27, 2020 00:34 |
|
freeasinbeer posted:You can depend on the entire sub module tho. for sure, I had forgotten this!
|
# ? Feb 27, 2020 00:44 |
Twlight posted:you'll need to get the output of the log group name from the module. syntax is something like module.<module-name>.<output-name> Yeah I went down the output and dependency route. Cheers.
|
|
# ? Feb 27, 2020 10:00 |
|
|
Nomnom Cookie posted:Get my boss to put it in my OKRs. This place is a shithole full of bad practices, security-wise, one security guy isn’t enough to keep up with the compliance reports let alone accomplish anything, and “this thing you care about a lot and all the devs want...actually I’m gonna need another month because the Internet says our secrets management is bad” is not a career enhancing move. That's still a super bad posture to take imo. But I guess I'm just super glad I don't work at the place you do.
|
|
# ? Feb 27, 2020 10:02 |