Matt Zerella
Oct 7, 2002

Norris'es are back baby. It's good again. Awoouu (fox Howl)
Yeah we've been kicking that around too. My other idea is to use terraform.

Ideally this would be done serverless with lambda and boto.

Anyway, thanks for the input everyone.

I'm gonna go yell at R&D to get with the drat program. Or yell at someone else to stop calling our stuff cloud ready.


crazypenguin
Mar 9, 2005
nothing witty here, move along
Maybe look at AWS CDK. It's basically a tool to generate CloudFormation.

It seems to me like you're going for ASGs because "that's the thing that spins up a lot of servers at once" but you can just write a loop in CDK and make whatever you want.
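
For a flavor of that, here is a minimal CDK-in-Python sketch of the "just write a loop" idea (v1-style imports; the stack name, instance count, sizing, and placeholder account/region are all made up for illustration):

code:
from aws_cdk import core
from aws_cdk import aws_ec2 as ec2


class WorkerStack(core.Stack):
    """Hypothetical stack that stamps out N standalone worker instances."""

    def __init__(self, scope, construct_id, **kwargs):
        super().__init__(scope, construct_id, **kwargs)

        vpc = ec2.Vpc.from_lookup(self, "Vpc", is_default=True)

        # No ASG needed: just loop and declare however many instances you want.
        for i in range(4):
            ec2.Instance(
                self, f"Worker{i}",
                vpc=vpc,
                instance_type=ec2.InstanceType("t3.medium"),
                machine_image=ec2.MachineImage.latest_amazon_linux(),
            )


app = core.App()
# Placeholder account/region; Vpc.from_lookup needs a concrete env.
WorkerStack(app, "workers",
            env=core.Environment(account="123456789012", region="us-east-1"))
app.synth()
Running cdk synth against something like this just spits out CloudFormation, which is the whole point.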

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

Matt Zerella posted:

The idea is to spin this up, do a bunch of processing and tear it down. The Linux to Windows link is due to a service that only runs on Windows and consumes data from the Linux machine and sends it back.

Neither side can be load balanced.

Yes. This sucks. Our software is definitely a square peg for the cloud's round hole, but we are trying to work around this.

We are just in the brainstorming phase of things right now. The idea is to fire a command to the ASG to spin up instances, which will consume an SQS queue to generate documents (tens of thousands); then, somehow, when the job is done it will fire a command to spin the ASG down to zero.

If this is a batch processing job, that makes it much easier and you don't need an ASG for this at all. A simple fixed-infrastructure design will do.

Your workflow looks something like this:

- A workload gets dumped into SQS
- A Lambda is triggered that spins up and tags a preset number of Linux boxes and the same number of Windows boxes
- Each Windows box finds its Linux "mate" via tagging and creates a secure connection
- The Linux boxes drain the queue and shove workloads to their Windows mates for processing
- Upon completion of the queue, instances shut down and terminate

Note that termination is optional. Since this appears to be a recurring task, you could just as easily save yourself some baking time by shutting them down but preserving the EC2 instances until they are powered up by the next batch.
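
A rough boto3 sketch of that Lambda step, purely to make the shape concrete: the pair count, instance type, AMI placeholders, and the batch-role/batch-mate tag keys are all invented here, not anything AWS-defined.

code:
import boto3

ec2 = boto3.client("ec2")

PAIRS = 4  # preset number of Linux/Windows pairs for the batch


def handler(event, context):
    for i in range(PAIRS):
        for os_name, ami in (("linux", "ami-LINUX-PLACEHOLDER"),
                             ("windows", "ami-WINDOWS-PLACEHOLDER")):
            ec2.run_instances(
                ImageId=ami,
                InstanceType="m5.large",
                MinCount=1,
                MaxCount=1,
                TagSpecifications=[{
                    "ResourceType": "instance",
                    "Tags": [
                        {"Key": "batch-role", "Value": os_name},
                        # Each Windows box later finds its Linux "mate" by
                        # filtering instances on this shared pair id.
                        {"Key": "batch-mate", "Value": f"pair-{i}"},
                    ],
                }],
            )
The teardown side would just describe instances by the batch-mate tag and terminate (or stop) them once the queue is drained.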

Matt Zerella
Oct 7, 2002

Norris'es are back baby. It's good again. Awoouu (fox Howl)

Agrikk posted:

If this is a batch processing job, that makes it much easier and you don't need an ASG for this at all. A simple fixed-infrastructure design will do.

Your workflow looks something like this:

- A workload gets dumped into SQS
- A Lambda is triggered that spins up and tags a preset number of Linux boxes and the same number of Windows boxes
- Each Windows box finds its Linux "mate" via tagging and creates a secure connection
- The Linux boxes drain the queue and shove workloads to their Windows mates for processing
- Upon completion of the queue, instances shut down and terminate

Note that termination is optional. Since this appears to be a recurring task, you could just as easily save yourself some baking time by shutting them down but preserving the EC2 instances until they are powered up by the next batch.

The idea is that part of the trigger process will spin up multiple pairs based on the time needed.

We currently have one node pair testing batch consumption to get a general feel for where we need resources (memory, disk I/O, CPU) so we can give a general "if you need it this fast you need this many nodes" estimate. Though I guess we could set a ceiling and just manually spin up pairs.

And yeah, shutting down could work too since we would be doing this on a weekly basis.

Thanks for this. And thank you to everyone else who's chimed in.

freeasinbeer
Mar 26, 2015

by Fluffdaddy
I can also think of lovely ways to do this with Mesos, Nomad, or Kubernetes if you want to get even weirder.

Nomnom Cookie
Aug 30, 2009



Yeah if mangling your infrastructure is in scope, you just need two StatefulSets and a CronJob. Once your mixed-OS kube cluster is up and running.

Cerberus911
Dec 26, 2005
Guarding the damned since '05
What’s the general opinion on XRay?

I'm looking at tracing and was leaning towards OpenTracing-compatible tools and vendors, but XRay is very attractive price-wise. Anyone have experience with it and want to tell me why it's a bad idea?

Nomnom Cookie
Aug 30, 2009



The devs at work are in love with tracing everything all the time. They're completely dependent on being able to poke at Jaeger to see what happened when poo poo goes wrong. So we ingest about a billion traces per day. Total spend is maybe 5% what xray would cost at that scale. If you're calculating the price assuming sampling, I recommend don't. Trace everything, put it all in Jaeger or whatever, enjoy life again.

If you're going to have low volume even without sampling, take a look at https://github.com/opentracing-contrib/java-xray-tracer. I haven't used it but it's probably fine, whatever, and then you're not tied to XRay.
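
If it helps, a minimal Python sketch of the "sample everything" setup with the jaeger-client library (service name and agent host are placeholders, and this assumes the usual jaeger-agent/collector deployment):

code:
from jaeger_client import Config

config = Config(
    config={
        "sampler": {"type": "const", "param": 1},  # keep 100% of traces
        "local_agent": {"reporting_host": "jaeger-agent.example.internal"},
        "logging": True,
    },
    service_name="my-service",
    validate=True,
)
tracer = config.initialize_tracer()

with tracer.start_span("do-work") as span:
    span.set_tag("example", True)

tracer.close()  # flush any buffered spans before exit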

Docjowles
Apr 9, 2009


I feel better about my harebrained design now that a TAM stole my sweet idea came up with the same thing :v:

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

Docjowles posted:

I feel better about my harebrained design now that a TAM stole my sweet idea came up with the same thing :v:

Careful, though. My idea came in full vacation mode while I was waiting for my buffalo wings to arrive. I reserve the right to make fun of my own idea when I get back to caring start work mode again.

Cancelbot
Nov 22, 2006

Canceling spam since 1928

I'm at work, and the design is good. It's not been compromised by buffalo wings hunger.

Cerberus911
Dec 26, 2005
Guarding the damned since '05

Nomnom Cookie posted:

The devs at work are in love with tracing everything all the time. They're completely dependent on being able to poke at Jaeger to see what happened when poo poo goes wrong. So we ingest about a billion traces per day. Total spend is maybe 5% what xray would cost at that scale. If you're calculating the price assuming sampling, I recommend don't. Trace everything, put it all in Jaeger or whatever, enjoy life again.

If you're going to have low volume even without sampling, take a look at https://github.com/opentracing-contrib/java-xray-tracer. I haven't used it but it's probably fine, whatever, and then you're not tied to XRay.

Thanks, that's good info.

We're going to start with Jaeger anyway so we get some idea of the maintenance effort of doing it ourselves. The bigger concern is the maintenance required for the ES or Cassandra backend.

Nomnom Cookie
Aug 30, 2009



Cerberus911 posted:

Thanks, that's good info.

We're going to start with Jaeger anyway so we get some idea of the maintenance effort of doing it ourselves. The bigger concern is the maintenance required for the ES or Cassandra backend.

Our experience with the C* backend was poor and I don't recommend it. Switching from C* to ES for span storage cut the storage CPU/memory usage approximately in half, cut collector CPU by about 90%, and fixed persistent congestion in the collectors' queues for our production Jaeger instance. Recent ES versions can do recovery in a reasonable way and clean up old indexes automatically, so in the few months since switching we've had zero problems after dialing in cluster and index settings. I can DM you the settings we ended up with (do recommend if you haven't used ES before, the out of the box experience with Jaeger is not good).

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

Nomnom Cookie posted:

Our experience with the C* backend was poor and I don't recommend it. Switching from C* to ES for span storage cut the storage CPU/memory usage approximately in half, cut collector CPU by about 90%, and fixed persistent congestion in the collectors' queues for our production Jaeger instance. Recent ES versions can do recovery in a reasonable way and clean up old indexes automatically, so in the few months since switching we've had zero problems after dialing in cluster and index settings. I can DM you the settings we ended up with (do recommend if you haven't used ES before, the out of the box experience with Jaeger is not good).

Can you send me the ES settings as well?

I'm always curious to know more about ES vs. C* activity.

Cerberus911
Dec 26, 2005
Guarding the damned since '05

Nomnom Cookie posted:

I can DM you the settings we ended up with (do recommend if you haven't used ES before, the out of the box experience with Jaeger is not good).

Thanks, would really appreciate that!

necrobobsledder
Mar 21, 2005
Lay down your soul to the gods rock 'n roll
Nap Ghost
I’m an ES stack contributor and would appreciate the settings to make the docs for tracing not suck

CyberPingu
Sep 15, 2013


If you're not striving to improve, you'll end up going backwards.
Is this the place to ask about how to do something particular with Terraform?

vanity slug
Jul 20, 2010

CyberPingu posted:

Is this the place to ask about how to do something particular with Terraform?

Sure why not

Docjowles
Apr 9, 2009

There's also a lot of people with STRONG OPINIONS Terraform experience in the CI/CD thread if you don't get the answers you need here. It's kind of the de facto containers and infrastructure-as-code thread.

CyberPingu
Sep 15, 2013


If you're not striving to improve, you'll end up going backwards.
Ah OK, might be better asking there, but I'll pop it here too.

We build our infrastructure as IaC, using terragrunt/Terraform. I've just finished building a module that creates CloudTrail and the associated logging for it. If I wanted to use a log group that gets created by that module when running terragrunt, how would I go about that? Would it need a dependency?

Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug
Is there a way to nest customer and aws managed IAM policies?

We have a ton of default permissions that we need to add to every machine role to enable SSM, CloudWatch, various other security logging and monitoring tools that write to S3, stuff like that, and I wanted to create a single customer-managed "default" policy that can then be attached to an instance profile / machine role for our various instances. I could create a combined JSON of all of the access needed, but that doesn't let us leverage AWS managed policies and it'd be static (and much harder to read). I didn't see a way to embed the ARN of AWS managed policies, but it's possible I missed it. The alternative seems to be to attach maybe 6-7 different AWS managed policies directly to every single role we have, plus our custom ones, which seems a little cumbersome - maybe deliberately so, since extensive use of machine roles isn't exactly best practice?

Could probably ask our TAM as well, but I don't really know the line between ProServe and support.

Bhodi fucked around with this message at 00:16 on Feb 26, 2020

12 rats tied together
Sep 7, 2006

We attach tons of customer managed and aws managed policies to stuff all the time. It's better IMO because it decouples the creation of the policy from the application of the policy to a principal. For auditing, for example, you can have controls around the contents of the policies (especially, restricting updates to them) and you can have separate controls and permissions around attaching them. To verify that machines are meeting compliance, you just list the attached policies and compare. If you do any work with complex cross-account permissions you'll probably end up doing stuff like this anyway with permissions boundaries, so the possibility of code re-use is pretty high.

Embedding policies is something we also do, but we do it with jinja2. It sucks in exactly the way you describe and I found the list of attached, managed policies to be a lot easier to work with, but it's very possible to just render a policy document from some other source.
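
To illustrate the decoupling, a small boto3 sketch that attaches a couple of AWS managed policies plus one customer managed "default" policy to a machine role, then audits by listing what's attached (the role name and the customer policy name/account are hypothetical):

code:
import boto3

iam = boto3.client("iam")

ROLE = "my-machine-role"  # hypothetical role name

POLICY_ARNS = [
    # AWS managed policies
    "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore",
    "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy",
    # customer managed "default" policy (placeholder account id and name)
    "arn:aws:iam::123456789012:policy/org-default-machine-policy",
]

for arn in POLICY_ARNS:
    iam.attach_role_policy(RoleName=ROLE, PolicyArn=arn)

# The compliance check is then just "list what's attached and compare".
attached = iam.list_attached_role_policies(RoleName=ROLE)["AttachedPolicies"]
print(sorted(p["PolicyArn"] for p in attached))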

CyberPingu posted:

Ah OK, might be better asking there, but I'll pop it here too.

We build our infrastructure as IaC, using terragrunt/Terraform. I've just finished building a module that creates CloudTrail and the associated logging for it. If I wanted to use a log group that gets created by that module when running terragrunt, how would I go about that? Would it need a dependency?

Just to double check, are you familiar with terraform outputs and how you consume them from modules? I don't use terragrunt, but I took a quick look at the docs, and it seems like consuming module outputs is basically just normal terraform.

Docjowles
Apr 9, 2009

terragrunt is just a wrapper around terraform that gives you some convenience features and syntactic sugar.

And yeah, I think module outputs are the solution to the original question. Make the ID of the resource you need an output, and then refer to that output as a parameter in the resource that needs to use it.

CyberPingu
Sep 15, 2013


If you're not striving to improve, you'll end up going backwards.
Ah yeah, that would make sense. I thought about outputs but got a bit lost trying to tie it all together.

deedee megadoodoo
Sep 28, 2000
Two roads diverged in a wood, and I, I took the one to Flavortown, and that has made all the difference.


If you are consuming Terraform output from some other application via the state file, I like using SSM parameters instead. We have some Terraform stuff that ultimately gets used by Python and bash scripts and it's much easier for me to get it that way. We also do the opposite and use Terraform to consume SSM parameters from outside apps.
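
A small sketch of both directions in Python with boto3; the parameter names here are hypothetical, and on the Terraform side these would correspond to aws_ssm_parameter resources or data sources.

code:
import boto3

ssm = boto3.client("ssm")

# Outside app publishes a value for Terraform to pick up later.
ssm.put_parameter(
    Name="/myapp/deploy/artifact_version",
    Value="1.4.2",
    Type="String",
    Overwrite=True,
)

# And the reverse: a script reads a value Terraform exported to SSM
# instead of poking at the state file directly.
vpc_id = ssm.get_parameter(Name="/myapp/network/vpc_id")["Parameter"]["Value"]
print(vpc_id)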

CyberPingu
Sep 15, 2013


If you're not striving to improve, you'll end up going backwards.

deedee megadoodoo posted:

If you are consuming Terraform output from some other application via the state file, I like using SSM parameters instead. We have some Terraform stuff that ultimately gets used by Python and bash scripts and it's much easier for me to get it that way. We also do the opposite and use Terraform to consume SSM parameters from outside apps.

I found a way to do it via an output that is then fed into a dependency in the .hcl file.

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

Bhodi posted:

Could probably ask our TAM as well, but I don't really know the line between ProServe and support.

Not responding directly to your question but more generally:

Always open a support case. You pay for support, why not use it?

Open a low severity general information case and select the web option.

Paste the body of your post into the case, click send and lean back, smug in a job well done.

In 1-2 days you will have a nicely formatted and annotated response with an answer to your question.

If you aren’t sure if you should open a support case, always err on the side of opening one. Please give us a chance to help you. We know a thing or two because we’ve seen a thing or two.


Hell, I work here and I open support cases when I don't know something. Cloud Support Engineers are pretty good, yo.

fluppet
Feb 10, 2009
Always reach out to your TAM; you never know what fun toys are tucked away behind an NDA.

Thanks Ants
May 21, 2004

#essereFerrari


I'm trying to help our development team structure their AWS setup a bit better, and have been reading up on AWS Organizations and SSO since it started supporting SAML and SCIM from external directories rather than having to run a managed AD or an AD connector.

Currently everything happens in one AWS account and the dev team handle their own account creation/revocation which isn't going to be an option as the team grows. From what I've read the way to go is:

- Enable AWS Organizations and enable SSO in the master account, link to Azure AD and assign roles to AD groups
- Create a new AWS account for each purpose (dev, test, prod, anything being developed for a third party so the account can just be passed over to them if required)
- Don't use IAM users any more; use temporary IAM role credentials with the CLI tools

Is it recommended to create a new AWS account to use for the master account role in Organizations, and then invite the current AWS account as a member? I vaguely remember reading this somewhere but I can't find any reference to it in the AWS docs now.

Will this all end in tears or is Organizations w/SSO a mature offering now? I know I am going to get a load of poo poo from the dev team the first time they're trying to follow some AWS docs and there's a disclaimer about it not working for SSO users (think the amount of Google stuff that doesn't work with G Suite accounts). If I need to hold off to have a smoother implementation experience then I can do that, it's already going to be enough of a struggle to stop this team doing whatever the gently caress they want but it's achievable if the end result is positive.

Docjowles
Apr 9, 2009

Sorry for the terse phone reply, but yes, all of what you posted is a good idea and the services are solid. Including making a new account to serve as the organizational master that does nothing except authentication and billing. AWS promotes this pattern all over the place in white papers, re:Invent talks, the guidance we got from our account team when starting out, etc.
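
For reference, the invite step can be scripted as well; a hedged boto3 sketch run from the new master/management account (the member account id is a placeholder):

code:
import boto3

org = boto3.client("organizations")

# One-time setup in the brand-new master account.
org.create_organization(FeatureSet="ALL")

# Invite the existing account (placeholder id) in as a member; it then
# accepts the invitation from its own console or via the API.
org.invite_account_to_organization(
    Target={"Id": "111122223333", "Type": "ACCOUNT"},
    Notes="Joining the new organization as a member account",
)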

putin is a cunt
Apr 5, 2007

BOY DO I SURE ENJOY TRASH. THERE'S NOTHING MORE I LOVE THAN TO SIT DOWN IN FRONT OF THE BIG SCREEN AND EAT A BIIIIG STEAMY BOWL OF SHIT. WARNER BROS CAN COME OVER TO MY HOUSE AND ASSFUCK MY MOM WHILE I WATCH AND I WOULD CERTIFY IT FRESH, NO QUESTION
Really simple question for the experts here I'm sure, but I have an ECS cluster and the underlying EC2 instance (just one at the moment, it's an in-development project) is using an AMI that now has a more recent version available. I know how to change the AMI and so on, but what is the easiest way to quickly locate the AMI ID for the latest version of this AMI? Searching for the AMI brings up a bunch of different ones and there is no way to sort by version, or even creation date.

Skier
Apr 24, 2003

Fuck yeah.
Fan of Britches

a hot gujju bhabhi posted:

Really simple question for the experts here I'm sure, but I have an ECS cluster and the underlying EC2 instance (just one at the moment, it's an in-development project) is using an AMI that now has a more recent version available. I know how to change the AMI and so on, but what is the easiest way to quickly locate the AMI ID for the latest version of this AMI? Searching for the AMI brings up a bunch of different ones and there is no way to sort by version, or even creation date.

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-ami-versions.html says

code:
aws ssm get-parameters --names /aws/service/ecs/optimized-ami/amazon-linux-2/recommended
can get you what you need. Here's what I see with some weird double encoding:

code:
"{\"schema_version\":1,\"image_name\":\"amzn2-ami-ecs-hvm-2.0.20200218-x86_64-ebs\",
\"image_id\":\"ami-0c0415cdff14e2a4a\",\"os\":\"Amazon Linux 2\",
\"ecs_runtime_version\":\"Docker version 18.09.9-ce\",\"ecs_agent_version\":\"1.37.0\"}"

putin is a cunt
Apr 5, 2007

BOY DO I SURE ENJOY TRASH. THERE'S NOTHING MORE I LOVE THAN TO SIT DOWN IN FRONT OF THE BIG SCREEN AND EAT A BIIIIG STEAMY BOWL OF SHIT. WARNER BROS CAN COME OVER TO MY HOUSE AND ASSFUCK MY MOM WHILE I WATCH AND I WOULD CERTIFY IT FRESH, NO QUESTION

Skier posted:

https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-ami-versions.html says

code:
aws ssm get-parameters --names /aws/service/ecs/optimized-ami/amazon-linux-2/recommended
can get you what you need. Here's what I see with some weird double encoding:

code:
"{\"schema_version\":1,\"image_name\":\"amzn2-ami-ecs-hvm-2.0.20200218-x86_64-ebs\",
\"image_id\":\"ami-0c0415cdff14e2a4a\",\"os\":\"Amazon Linux 2\",
\"ecs_runtime_version\":\"Docker version 18.09.9-ce\",\"ecs_agent_version\":\"1.37.0\"}"

Oh nice, thanks!

PierreTheMime
Dec 9, 2004

Hero of hormagaunts everywhere!
Buglord
What's a good way to automatically compress an output file from a Glue Python script? The files could run rather large (~100GB) and I'd prefer to stream the resulting .csv or .parquet or whatever straight into a .gz file. I've written Java code to take files the other way, extracting data in-memory without using permanent compute space, but right now streaming a zip and chunking it into a multipart upload to S3 properly is hurting my brain, so I figured I'd ask here. Obviously a Python solution would be preferable so I can just ball it into the Glue job, but at this point I'd take any example someone had and work from there.

Edit:
Apparently using S3DistCp against an EMR cluster is recommended. I looked into the backend code, and it essentially just uses temporary disk space to compress the file(s) and then does a standard multipart upload. This works well enough, but my dream of having the whole thing done in memory to avoid the need for dedicated space continues.

PierreTheMime fucked around with this message at 14:41 on Mar 10, 2020

fluppet
Feb 10, 2009

Agrikk posted:

What’s the position?

Things are moving a bit slowly post-New Year. I recommend keeping after it and bugging the HR people for updates. Eagerness is a good thing here.

Persistence does seem to be the key, as I've now got an interview for a DevOps consultant role lined up.

necrobobsledder
Mar 21, 2005
Lay down your soul to the gods rock 'n roll
Nap Ghost

PierreTheMime posted:

Edit:
Apparently using S3DistCp against an EMR cluster is recommended. I looked into the backend code, and it essentially just uses temporary disk space to compress the file(s) and then does a standard multipart upload. This works well enough, but my dream of having the whole thing done in memory to avoid the need for dedicated space continues.
You're forced to use algorithms that operate statelessly on chunks of data at a time, such as gzip, Snappy, or LZO. You can also try zstandard for giggles on your dataset.
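
One possible way to get the all-in-memory version, sketched with the smart_open library (pip install smart_open[s3]): it infers gzip from the .gz extension and streams a multipart upload to S3 as you write, so nothing lands on local disk. The bucket, key, and toy rows are placeholders, and I haven't tried this inside a Glue job specifically.

code:
import csv
from smart_open import open as s3_open

# Rows are gzip-compressed and multipart-uploaded as they are written;
# credentials come from the default boto3 session.
with s3_open("s3://my-bucket/exports/output.csv.gz", "w") as fh:
    writer = csv.writer(fh)
    writer.writerow(["id", "value"])
    for i in range(1000):
        writer.writerow([i, i * i])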

Pile Of Garbage
May 28, 2007



Did some Googling on this earlier today but couldn't find an answer: is it possible to delete AWS reserved tags from resources? I've got a bunch of EC2 instances that were spun up from CFN templates that have since been deleted; however, the instances all still have the AWS reserved tags for CFN on them. If you try to delete the tags via the console, it just spits an error about how you can't delete AWS reserved tags.

Umbreon
May 21, 2011
Is cloudacademy a good resource for breaking into AWS cloud stuff? Or is it acloudguru all the way?

Nomnom Cookie
Aug 30, 2009



Pile Of Garbage posted:

Did some Googling on this earlier today but couldn't find an answer: is it possible to delete AWS reserved tags from resources? I've got a bunch of EC2 instances that were spun up from CFN templates that have since been deleted; however, the instances all still have the AWS reserved tags for CFN on them. If you try to delete the tags via the console, it just spits an error about how you can't delete AWS reserved tags.

I'm not an AWS certified professional certified community hero

but I think the error message answered your question


Arzakon
Nov 24, 2002

"I hereby retire from Mafia"
Please turbo me if you catch me in a game.

Nomnom Cookie posted:

I'm not an AWS certified professional certified community hero

but I think the error message answered your question

Send me your contact details because I can't nominate a community hero with just an SA username
