necrobobsledder
Mar 21, 2005
Lay down your soul to the gods rock 'n roll
Nap Ghost
Last I could tell from the outside, half the CloudFormation blog posts and engineers appeared to be based out of India, which makes me wonder whether it's strategically important enough for AWS to put more high-profile engineers on it. Doubtful you'd see that happen to IAM, in contrast.

Hadlock
Nov 9, 2004

I put up an ops manager job listing on the jobs thread. If you do Linux + containers, and have not terrible opinions, hit me up. Full time remote.

PBS
Sep 21, 2015

Hadlock posted:

I put up an ops manager job listing on the jobs thread. If you do Linux + containers, and have not terrible opinions, hit me up. Full time remote.

quote:

Looking for someone with actual management experience, not just "well I've been here longest and now I'm senior because reasons" lead devops engineer.

Don't doxx me bro.

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Hadlock posted:

I put up an ops manager job listing on the jobs thread. If you do Linux + containers, and have not terrible opinions, hit me up. Full time remote.

Hadlock posted:

Pay: low to mid 100s

Looking for someone with actual management experience

Lmao, try $180k+ if you want a decent candidate

Nomnom Cookie
Aug 30, 2009



I make at the top end of what you’re offering and I’m just a remote "well I've been here longest and now I'm senior because reasons" lead devops engineer

Necronomicon
Jan 18, 2004

So I've got an AWS/boto3/Rancher conundrum I'm hoping y'all can help out with. I've been trying to put together a python script to do the following:

1. List all IAM access keys that are marked "active" and are more than 90 days old
2. Create a new key for that IAM user, then invalidate the old one
3. For each key being invalidated, check to see if it's being used as an environment variable in a pod in Rancher.
4. If it *is* being used, replace said environment variable with the newly-created key (we have an internal tool to do this that I'll just invoke)

The long and short of it is I need to automate the rotation of API keys in AWS to prepare us for SOC2 and other audits coming down the pipe. I can query the keys and create new ones / invalidate old ones just fine, but my code at this point is approaching Frankenstein status and I'm hoping somebody has found a more elegant solution. The main issues I'm trying to solve are automating this tedious nonsense and mitigating the issue where we have no idea where an API key is actually active in our system, which makes blowing things up way too easy.

EDIT: I can post this in the Python thread if that would get more results. I'm a little rusty because honestly I've only been asked to write Terraform for most of the time at this new job.

Necronomicon fucked around with this message at 22:10 on Feb 25, 2020

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Necronomicon posted:

So I've got an AWS/boto3/Rancher conundrum I'm hoping y'all can help out with. I've been trying to put together a python script to do the following:

1. List all IAM access keys that are marked "active" and are more than 90 days old
2. Create a new key for that IAM user, then invalidate the old one
3. For each key being invalidated, check to see if it's being used as an environment variable in a pod in Rancher.
4. If it *is* being used, replace said environment variable with the newly-created key (we have an internal tool to do this that I'll just invoke)

The long and short of it is I need to automate the rotation of API keys in AWS to prepare us for SOC2 and other audits coming down the pipe. I can query the keys and create new ones / invalidate old ones just fine, but my code at this point is approaching Frankenstein status and I'm hoping somebody has found a more elegant solution. The main issues I'm trying to solve are automating this tedious nonsense and mitigating the issue where we have no idea where an API key is actually active in our system, which makes blowing things up way too easy.

EDIT: I can post this in the Python thread if that would get more results. I'm a little rusty because honestly I've only been asked to write Terraform for most of the time at this new job.
First, ask yourself why you're reinventing Vault and not even moving to ephemeral keys in the process

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Necronomicon posted:

So I've got an AWS/boto3/Rancher conundrum I'm hoping y'all can help out with. I've been trying to put together a python script to do the following:

1. List all IAM access keys that are marked "active" and are more than 90 days old
2. Create a new key for that IAM user, then invalidate the old one
3. For each key being invalidated, check to see if it's being used as an environment variable in a pod in Rancher.
4. If it *is* being used, replace said environment variable with the newly-created key (we have an internal tool to do this that I'll just invoke)

The long and short of it is I need to automate the rotation of API keys in AWS to prepare us for SOC2 and other audits coming down the pipe. I can query the keys and create new ones / invalidate old ones just fine, but my code at this point is approaching Frankenstein status and I'm hoping somebody has found a more elegant solution. The main issues I'm trying to solve are automating this tedious nonsense and mitigating the issue where we have no idea where an API key is actually active in our system, which makes blowing things up way too easy.

EDIT: I can post this in the Python thread if that would get more results. I'm a little rusty because honestly I've only been asked to write Terraform for most of the time at this new job.

Use IAM roles instead of users and keys

Nomnom Cookie
Aug 30, 2009



kiam is crap and kube2iam is worse

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Nomnom Cookie posted:

kiam is crap and kube2iam is worse

Use EKS and the OIDC provider


https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html
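
For reference, a minimal Terraform sketch of the role side of IAM roles for service accounts, as described in the linked guide. It assumes the cluster's OIDC provider is already registered in IAM. Every variable name and the role name are hypothetical placeholders, not anything from this thread.

```hcl
# Hypothetical inputs: the ARN of the IAM OIDC provider registered for the
# cluster, and the issuer URL with the leading "https://" removed.
variable "oidc_provider_arn" {
  type = string
}

variable "oidc_issuer_host" {
  type = string
}

# Hypothetical namespace and service account the pod runs under.
variable "namespace" {
  type = string
}

variable "service_account" {
  type = string
}

# Trust policy: only tokens issued for this exact service account may
# assume the role through the cluster's OIDC provider.
data "aws_iam_policy_document" "irsa_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [var.oidc_provider_arn]
    }

    condition {
      test     = "StringEquals"
      variable = "${var.oidc_issuer_host}:sub"
      values   = ["system:serviceaccount:${var.namespace}:${var.service_account}"]
    }
  }
}

resource "aws_iam_role" "pod" {
  name               = "example-pod-role" # placeholder name
  assume_role_policy = data.aws_iam_policy_document.irsa_trust.json
}
```

The Kubernetes service account then carries an `eks.amazonaws.com/role-arn` annotation pointing at this role, and permissions are attached to the role as usual.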

12 rats tied together
Sep 7, 2006

Necronomicon posted:

So I've got an AWS/boto3/Rancher conundrum I'm hoping y'all can help out with. I've been trying to put together a python script to do the following:

1. List all IAM access keys that are marked "active" and are more than 90 days old
2. Create a new key for that IAM user, then invalidate the old one
3. For each key being invalidated, check to see if it's being used as an environment variable in a pod in Rancher.
4. If it *is* being used, replace said environment variable with the newly-created key (we have an internal tool to do this that I'll just invoke)

The long and short of it is I need to automate the rotation of API keys in AWS to prepare us for SOC2 and other audits coming down the pipe. I can query the keys and create new ones / invalidate old ones just fine, but my code at this point is approaching Frankenstein status and I'm hoping somebody has found a more elegant solution. The main issues I'm trying to solve are automating this tedious nonsense and mitigating the issue where we have no idea where an API key is actually active in our system, which makes blowing things up way too easy.

EDIT: I can post this in the Python thread if that would get more results. I'm a little rusty because honestly I've only been asked to write Terraform for most of the time at this new job.

Firstly, I don't think you absolutely need an automatic rotation process for SOC2, so you might be able to avoid this problem altogether by working with your compliance team to reword whatever control this is supposed to satisfy.

If you do need to do this, any kind of "for each IAM key," script is going to be a gross monster, especially if you're also checking it against some other arbitrarily discovered list. A good way to approach this problem (actually, all problems) from a compliance standpoint is to see if you can satisfy the control from another perspective. For example, define a single source of truth/source of authorization for your API keys, and audit that. Because there is no way to put API keys into rancher except for by using $something, we only need to audit the operations of $something.

And then, in something, you use some kind of standard git workflow with required approvals and checklists. Since you probably have to do this anyway, you can tackle two birds with one stone, and honestly in my experience auditors like it better when you have well defined single-tool approaches to things anyway. It's easier to handwave (and hide behind) "this is satisfied by the same control as $other_process, you already saw proof of the control in action and our verification that the control is not lapsing" than it is to introduce another script, another reporting process, more controls around the script (who can run it? where does it log when it runs? etc), generally speaking anyway.

Nomnom Cookie
Aug 30, 2009




Migrating to EKS is our perpetual "good to do, but other stuff is higher priority" project, so we use API keys.

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Nomnom Cookie posted:

Migrating to EKS is our perpetual "good to do, but other stuff is higher priority" project, so we use API keys.

In that case, while kube2iam isn't great, it's heaps better than static creds.

Methanar
Sep 26, 2013

by the sex ghost

Nomnom Cookie posted:

kiam is crap and kube2iam is worse

kiam is... okay
kube2iam is trash

I really dislike these kinds of things that abuse IPtables to mitm traffic. nodeLocalDNS is another bad offender for this but you basically need it after your clusters start growing because DNS is a huge bottleneck.

There is a new thing coming up soon that will allow pods to directly assume EC2 credentials without any middleman or weirdo host role thing, but I forget everything about it.

Methanar fucked around with this message at 02:43 on Feb 26, 2020

12 rats tied together
Sep 7, 2006

That thing is called "just using Amazon ECS" I believe :shobon:

Nomnom Cookie
Aug 30, 2009



Blinkz0rz posted:

In that case, while kube2iam isn't great, it's heaps better than static creds.

It’s not though! It’s racy and abandoned. A thing that works is always better than a thing that doesn’t work.

CyberPingu
Sep 15, 2013


If you're not striving to improve, you'll end up going backwards.
Hi all, wondering if someone can lend a hand with some terraform stuff.

I have a module that creates a CloudWatch Log Group, and I'm in the process of writing another module that needs to reference the name of that log group. I'm wondering about the best way to do it, as the log group is created first and will be persistent throughout other CW metric filters & alarms.

The code is

code:
resource "aws_cloudwatch_log_metric_filter" "rootEvent" {
  name           = "Root_Account_Login"
  pattern        = <<EOF
{$.userIdentity.type="Root" && $.userIdentity.invokedBy NOT EXISTS && $.eventType !="AwsServiceEvent"}
EOF
  log_group_name = "cloudtrail" # <<<<< THIS HERE NEEDS TO BE GRABBED FROM THE OTHER MODULE

  metric_transformation {
    name      = "${var.event_name}"
    namespace = "${var.name_space}"
    value     = "1"
  }
}

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Nomnom Cookie posted:

It’s not though! It’s racy and abandoned. A thing that works is always better than a thing that doesn’t work.

Static creds are probably one of the most dangerous things to have floating in your environment. At least tell me you're using policy conditions to bind them to specific ec2 instances rather than usable by anyone from anywhere.
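
One way to approximate the binding described above is a deny statement keyed on source IP, attached to the IAM user that owns the static key. This is only a sketch under assumptions: `allowed_cidrs` and `iam_user_name` are hypothetical variables, and it pins the key to known egress addresses (for example NAT gateway IPs) rather than to individual instances.

```hcl
# Hypothetical inputs: the egress CIDRs the key may be used from, and the
# IAM user that owns the static access key.
variable "allowed_cidrs" {
  type = list(string)
}

variable "iam_user_name" {
  type = string
}

data "aws_iam_policy_document" "restrict_source" {
  statement {
    sid       = "DenyOutsideKnownEgress"
    effect    = "Deny"
    actions   = ["*"]
    resources = ["*"]

    # Deny any request whose source address is outside the allowed ranges.
    condition {
      test     = "NotIpAddress"
      variable = "aws:SourceIp"
      values   = var.allowed_cidrs
    }

    # Do not apply the deny to calls an AWS service makes on the user's behalf.
    condition {
      test     = "Bool"
      variable = "aws:ViaAWSService"
      values   = ["false"]
    }
  }
}

resource "aws_iam_user_policy" "restrict_source" {
  name   = "restrict-source-ip" # placeholder name
  user   = var.iam_user_name
  policy = data.aws_iam_policy_document.restrict_source.json
}
```

Requests that arrive through a VPC endpoint do not carry `aws:SourceIp`, so a setup that relies on endpoints would need a different condition key such as `aws:SourceVpc`.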

Nomnom Cookie
Aug 30, 2009



Blinkz0rz posted:

Static creds are probably one of the most dangerous things to have floating in your environment. At least tell me you're using policy conditions to bind them to specific ec2 instances rather than usable by anyone from anywhere.

You’re not wrong, but if forced to choose between spraying api keys everywhere or deploying kube2iam I’d take the spraying. Also, and this is key, I don’t give a poo poo about security. We have a security guy. If he spends all his time on compliance and doesn’t have any left to spend on actually securing things, that’s not my problem. I got actual useful work to do.

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

Nomnom Cookie posted:

You’re not wrong, but if forced to choose between spraying api keys everywhere or deploying kube2iam I’d take the spraying. Also, and this is key, I don’t give a poo poo about security. We have a security guy. If he spends all his time on compliance and doesn’t have any left to spend on actually securing things, that’s not my problem. I got actual useful work to do.

Yikes

CyberPingu
Sep 15, 2013


If you're not striving to improve, you'll end up going backwards.

Nomnom Cookie posted:

You’re not wrong, but if forced to choose between spraying api keys everywhere or deploying kube2iam I’d take the spraying. Also, and this is key, I don’t give a poo poo about security. We have a security guy. If he spends all his time on compliance and doesn’t have any left to spend on actually securing things, that’s not my problem. I got actual useful work to do.

Security is everyone's problem my dude. As a security engineer this post gave me a loving aneurysm.

trem_two
Oct 22, 2002

it is better if you keep saying I'm fat, as I will continue to score goals
Fun Shoe

Nomnom Cookie posted:

You’re not wrong, but if forced to choose between spraying api keys everywhere or deploying kube2iam I’d take the spraying. Also, and this is key, I don’t give a poo poo about security. We have a security guy. If he spends all his time on compliance and doesn’t have any left to spend on actually securing things, that’s not my problem. I got actual useful work to do.

This will end well, I'm sure

CyberPingu
Sep 15, 2013


If you're not striving to improve, you'll end up going backwards.
"I don't lock my doors because the police should be doing their job catching criminals"

Docjowles
Apr 9, 2009

Nomnom Cookie posted:

You’re not wrong, but if forced to choose between spraying api keys everywhere or deploying kube2iam I’d take the spraying. Also, and this is key, I don’t give a poo poo about security. We have a security guy. If he spends all his time on compliance and doesn’t have any left to spend on actually securing things, that’s not my problem. I got actual useful work to do.

Which one of my company's developers are you? Because this is loving all of them :argh:

My old boss basically opened up a couple AWS accounts and gave everyone *:* in them in the name of speed and it went as well as you'd expect. He was also vehemently anti-cloud and I kind of suspect he sabotaged this on purpose so it would fail and everyone would come crawling back to his data center fiefdom :tinfoil:

When he left we started over in a new set of accounts and have been rebuilding services in them via Terraform, converting them to use roles/instance profiles whenever possible (which is almost always). But it's very much pulling teeth explaining to devs why the 1000 day old API keys they've committed to git in plain text with god mode access for everything are wildly unacceptable.

Security is everyone's job but most people do NOOOOOOT give a single poo poo and will do exactly what cookie monster over there said. Your security and ops folks should be finding ways to make the secure path also be the easy and default path. Because otherwise people will 100% do the most slapdash, YOLO poo poo possible to close their current tickets.
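
For anyone following along, the role/instance profile conversion mentioned above looks roughly like this in Terraform. It is a sketch only: the `example-app` names, the instance type, and the `ami_id` variable are placeholders, and the permissions the application needs would still have to be attached to the role.

```hcl
# Hypothetical input: the AMI to launch.
variable "ami_id" {
  type = string
}

# Trust policy letting the EC2 service assume the role on the instance's behalf.
data "aws_iam_policy_document" "ec2_trust" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "app" {
  name               = "example-app" # placeholder name
  assume_role_policy = data.aws_iam_policy_document.ec2_trust.json
}

# The instance profile is the wrapper that attaches the role to an instance.
resource "aws_iam_instance_profile" "app" {
  name = "example-app"
  role = aws_iam_role.app.name
}

resource "aws_instance" "app" {
  ami                  = var.ami_id
  instance_type        = "t3.micro"
  iam_instance_profile = aws_iam_instance_profile.app.name
}
```

The AWS SDKs on the instance pick up short-lived credentials from the instance metadata service automatically, so there is no key to commit to git in the first place.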

Gyshall
Feb 24, 2009

Had a couple of drinks.
Saw a couple of things.

Nomnom Cookie posted:

You’re not wrong, but if forced to choose between spraying api keys everywhere or deploying kube2iam I’d take the spraying. Also, and this is key, I don’t give a poo poo about security. We have a security guy. If he spends all his time on compliance and doesn’t have any left to spend on actually securing things, that’s not my problem. I got actual useful work to do.

Look at this scrub who hasn't discovered secrets management. "Not my problem". Gtfo

12 rats tied together
Sep 7, 2006

Sometimes you need to spray keys. OP's post doesn't inspire me with a ton of confidence that the keys are actually a hard requirement this time, but it can be done safely and responsibly, and it is indeed sometimes a hard requirement. :shobon:

IMHO implementing things safely and in line with your sec/compliance goals is the only hard thing about AWS. You can hire juniors (and you should, if your organization allows) to read documentation and create ec2 instances and security groups.

freeasinbeer
Mar 26, 2015

by Fluffdaddy
I’d lean towards spraying keys for critical things in k8s if they are gonna be stored in secrets. Now that has its own set of issues, but so do kube2iam and kiam, both of which require a fair bit more setup and are finicky.

I’d also punt to the security team as well if they wanted to do something better.

CyberPingu
Sep 15, 2013


If you're not striving to improve, you'll end up going backwards.
Asking your security team for a better option is a good idea. Doing whatever you want because "the security team needs to do their job" is a crock of poo poo.

Twlight
Feb 18, 2005

I brag about getting free drinks from my boss to make myself feel superior
Fun Shoe

CyberPingu posted:

Hi all, wondering if someone can lend a hand with some terraform stuff.

I have a module that creates a CloudWatch Log Group, and I'm in the process of writing another module that needs to reference the name of that log group. I'm wondering about the best way to do it, as the log group is created first and will be persistent throughout other CW metric filters & alarms.

The code is

code:
resource "aws_cloudwatch_log_metric_filter" "rootEvent" {
  name           = "Root_Account_Login"
  pattern        = <<EOF
{$.userIdentity.type="Root" && $.userIdentity.invokedBy NOT EXISTS && $.eventType !="AwsServiceEvent"}
EOF
  log_group_name = "cloudtrail" # <<<<< THIS HERE NEEDS TO BE GRABBED FROM THE OTHER MODULE

  metric_transformation {
    name      = "${var.event_name}"
    namespace = "${var.name_space}"
    value     = "1"
  }
}


you'll need to get the output of the log group name from the module. syntax is something like module.<module-name>.<output-name>

https://www.terraform.io/docs/configuration/modules.html#accessing-module-output-values

if you're in an order of operations thing, you cannot use the depends_on attribute on modules yet, for some reason. Then you can get cute with either null-exec timers/sleeps or something else. I'd suggest making an aws-cli call to get the log group name and not gently caress around with timers. but this is def where terraform starts to fall on its rear end.
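
A minimal sketch of the module-output approach, assuming first-class expression syntax (Terraform 0.12 or later). The module names, paths, and the `log_group_name` output and variable are hypothetical; only the `module.<module-name>.<output-name>` reference form comes from the docs linked above.

```hcl
# --- modules/log_group/main.tf (hypothetical module that owns the log group)
resource "aws_cloudwatch_log_group" "cloudtrail" {
  name = "cloudtrail"
}

# Expose the name so callers can pass it on to other modules.
output "log_group_name" {
  value = aws_cloudwatch_log_group.cloudtrail.name
}

# --- modules/metric_filters/variables.tf (hypothetical consumer module)
variable "log_group_name" {
  type = string
}

# --- root module: wire one module's output into the other's input
module "log_group" {
  source = "./modules/log_group"
}

module "metric_filters" {
  source         = "./modules/metric_filters"
  log_group_name = module.log_group.log_group_name
}
```

Inside the consumer module the metric filter would then set `log_group_name = var.log_group_name`. Because the value is derived from the log group resource, Terraform orders the log group before the filter without any explicit `depends_on`.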

Twlight fucked around with this message at 22:50 on Feb 26, 2020

Necronomicon
Jan 18, 2004

Vulture Culture posted:

First, ask yourself why you're reinventing Vault

I recently spent a lot of time talking with my team about how to talk down a coworker who wanted to write up a reverse proxy in Go, because he was just reinventing nginx, and here I am doing the same poo poo by accident. I've been really jazzed about Hashicorp stuff so far, so I'll check out Vault for sure.

12 rats tied together posted:

If you do need to do this, any kind of "for each IAM key," script is going to be a gross monster, especially if you're also checking it against some other arbitrarily discovered list.

Oh goodness yes - I am not a Python developer, nor will I ever be, and this script and workflow is getting really chunky and gross. The other arbitrarily discovered list is the output from a custom-built in-house CLI tool. Really the more I talk about this the worse the whole thing is starting to smell.

Nomnom Cookie
Aug 30, 2009



CyberPingu posted:

Security is everyone's problem my dude. As a security engineer this post gave me a loving aneurysm.

Get my boss to put it in my OKRs. This place is a shithole full of bad practices, security-wise, one security guy isn’t enough to keep up with the compliance reports let alone accomplish anything, and “this thing you care about a lot and all the devs want...actually I’m gonna need another month because the Internet says our secrets management is bad” is not a career enhancing move.

I’m not getting paid to do it, no one is asking me to do it, no one gives a poo poo if I do it. So no, until one of those things changes, security isn’t my problem.

Nomnom Cookie
Aug 30, 2009



Do note that when IAM key management was my choice at a previous job, I chose kube2iam and suffered for it. Deploying anything significant to prod was delayed for months while we dealt with pods periodically coming up hosed and getting bad credentials handed to them, even across restarts. At the time the guys at this job came up with the key spraying, kiam didn’t exist, so I’m not going to blame them for avoiding the kube2iam dumpster fire.

Nomnom Cookie fucked around with this message at 23:16 on Feb 26, 2020

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS
I've heard bad things about kube2iam, but we ran it in prod for a year and a half in 5 regions with only one transient issue I can remember, which went away when we did a rolling restart.

Not to say it's not a dumpster fire but when faced with that or key spray I'll gladly take the dumpster fire any day of the week.

Nomnom Cookie
Aug 30, 2009



We routinely had to delete pods up to a dozen times to get one that kube2iam would hand valid creds to. This was barely sustainable with about a hundred pods on a dozen nodes, deploys every two weeks, and low turnover otherwise. At my current job we have a few hundred nodes and constant pod churn from deploys and CI builds. If kube2iam flaked out as much here as it did at my last job, the whole business would seize up in a few hours.

Vault is coming. I’ll be happy when it gets here (IAM users everywhere isn’t even the worst of it, ofc) but the timeline is decided by my boss’s boss and he’s not in a hurry, against the recommendation of myself and everyone on my team. I am nearly-explicitly being paid to not care about security, and in the end I care more about my paycheck than the security of my employer’s low-integrity systems.

Methanar
Sep 26, 2013

by the sex ghost
kube2iam has a cool race condition bug that breaks it with flannel as your CNI. Until a non-host-network pod is scheduled on a node, the tunnel interface doesn't exist, and kube2iam will be broken until it's restarted once the interface is actually present.

Nomnom Cookie
Aug 30, 2009



We were using VPC CNI, but maybe that issue affects kube2iam on VPC CNI as well. It would fit with what I recall of the problems we had.

Here’s some CI content for the CI thread: big push lately to make our CI builds faster, which...yeah, they’re dog slow. I got them down to about 25 minutes worst case last year and build times have crept up to 45-60 minutes since then. A little poking at build scans shows the problem is twofold: dev is writing tests that take several minutes to run, mainly due to timeouts and sleep, and something is causing gradle to never consider unit tests up to date, so every build that runs tests runs all the tests. We are dealing with the problem by purchasing Gradle Enterprise and “improving caching”.

freeasinbeer
Mar 26, 2015

by Fluffdaddy

Twlight posted:

you'll need to get the output of the log group name from the module. syntax is something like module.<module-name>.<output-name>

https://www.terraform.io/docs/configuration/modules.html#accessing-module-output-values

if you're in an order of operations thing, you cannot use the depends_on attribute on modules yet, for some reason. Then you can get cute with either null-exec timers/sleeps or something else. I'd suggest making an aws-cli call to get the log group name and not gently caress around with timers. but this is def where terraform starts to fall on its rear end.

You can depend on the entire sub module tho.

Twlight
Feb 18, 2005

I brag about getting free drinks from my boss to make myself feel superior
Fun Shoe

freeasinbeer posted:

You can depend on the entire sub module tho.

for sure, I had forgotten this!

CyberPingu
Sep 15, 2013


If you're not striving to improve, you'll end up going backwards.

Twlight posted:

you'll need to get the output of the log group name from the module. syntax is something like module.<module-name>.<output-name>

https://www.terraform.io/docs/configuration/modules.html#accessing-module-output-values

if you're in an order of operations thing, you cannot use the depends_on attribute on modules yet, for some reason. Then you can get cute with either null-exec timers/sleeps or something else. I'd suggest making an aws-cli call to get the log group name and not gently caress around with timers. but this is def where terraform starts to fall on its rear end.

Yeah I went down the output and dependency route. Cheers.

CyberPingu
Sep 15, 2013


If you're not striving to improve, you'll end up going backwards.

Nomnom Cookie posted:

Get my boss to put it in my OKRs. This place is a shithole full of bad practices, security-wise, one security guy isn’t enough to keep up with the compliance reports let alone accomplish anything, and “this thing you care about a lot and all the devs want...actually I’m gonna need another month because the Internet says our secrets management is bad” is not a career enhancing move.

I’m not getting paid to do it, no one is asking me to do it, no one gives a poo poo if I do it. So no, until one of those things changes, security isn’t my problem.

That's still a super bad posture to take imo. But I guess I'm just super glad I don't work at the place you do.
