Hughmoris
Apr 21, 2007
Let's go to the abyss!

The Iron Rose posted:

for the love of god do not use AWS’ managed NAT gateway service. It’s insanely expensive and you can do the same thing for a fraction of the price by running your own NAT instances.

Thanks for the tip, I'll go investigate NAT instances.


luminalflux
May 27, 2005



The issue I have with "you can just..." is that yeah, sure, I can go create my own NAT instances. I'd also have to keep them up to date and patched. I'd also have to deal with failovers or potential outages if the instance suddenly goes away. All this I don't have to deal with by using NAT gateways

Corey Quinn keeps beating this drum and I'm convinced he's never had to run NAT instances in production.

The Iron Rose
May 12, 2012

:minnie: Cat Army :minnie:

luminalflux posted:

The issue I have with "you can just..." is that yeah, sure, I can go create my own NAT instances. I'd also have to keep them up to date and patched. I'd also have to deal with failovers or potential outages if the instance suddenly goes away. All this I don't have to deal with by using NAT gateways

Corey Quinn keeps beating this drum and I'm convinced he's never had to run NAT instances in production.

spot instances and ASGs pulling from latest with ILBs solve these problems nicely. Just because you’re running your own hardware doesn’t mean it can’t be ephemeral also

luminalflux
May 27, 2005



When you say "ILB", do you mean running an NLB in front of the NAT instances?

nullfunction
Jan 24, 2005

Nap Ghost
There's something to be said for a managed solution that you don't have to screw with ever, and if you aren't paying for it, then yeah, it's probably the way to go!

I went the ASG -> NAT appliance route for what I'm working on, because my bandwidth needs are far below the 5Gbps floor of a NAT Gateway, and I spend ~$5/AZ/mo on the appliances, as opposed to $30+/AZ/mo for the gateway. If I need to scale past 5Gbps it's simple enough to drop a NAT Gateway in my CFN templates and swap out, but I don't see that happening for a long time (or maybe ever for this application).

I'm using the free version of this appliance and it's been very needs-suiting so far.

The Iron Rose
May 12, 2012

:minnie: Cat Army :minnie:

luminalflux posted:

When you say "ILB", do you mean running an NLB in front of the NAT instances?

Yes, but I’ve done more digging and apparently this isn’t possible, so it’s back to reassigning a pool of ENIs annoyingly. Which I would personally do with eventbridge listening to startup events for the relevant instances and using a lambda to assign the ENI with a static IP. Insanely quick and you’ll attach within < 10s of startup.
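A minimal sketch of the Lambda side of that, for anyone following along. Assumptions, none of which come from the posts above: the EventBridge rule delivers standard "EC2 Instance State-change Notification" events, both the NAT instances and the pre-created ENIs carry a made-up `Role=nat` tag, and `ec2` is a boto3 EC2 client passed in by the caller. It ignores pagination, retries, and the source/dest check setting, so treat it as a starting point rather than something to deploy.

```python
ROLE_TAG_KEY = "Role"    # hypothetical tag key
ROLE_TAG_VALUE = "nat"   # hypothetical tag value


def has_role_tag(resource):
    """True if an EC2 API resource dict carries the (made-up) Role=nat tag."""
    return any(
        tag["Key"] == ROLE_TAG_KEY and tag["Value"] == ROLE_TAG_VALUE
        for tag in resource.get("Tags", [])
    )


def pick_free_eni(ec2, availability_zone):
    """Return the ID of an unattached, tagged ENI in the AZ, or None."""
    resp = ec2.describe_network_interfaces(
        Filters=[
            {"Name": f"tag:{ROLE_TAG_KEY}", "Values": [ROLE_TAG_VALUE]},
            {"Name": "status", "Values": ["available"]},  # not attached to anything
            {"Name": "availability-zone", "Values": [availability_zone]},
        ]
    )
    enis = resp["NetworkInterfaces"]
    return enis[0]["NetworkInterfaceId"] if enis else None


def handle_state_change(ec2, event, device_index=1):
    """Attach a pooled ENI to a tagged instance that just entered 'running'.

    Returns the attached ENI ID, or None if nothing was done.
    """
    detail = event["detail"]
    if detail["state"] != "running":
        return None

    instance_id = detail["instance-id"]
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    instance = reservations[0]["Instances"][0]
    if not has_role_tag(instance):
        return None  # some other instance started; leave it alone

    eni_id = pick_free_eni(ec2, instance["Placement"]["AvailabilityZone"])
    if eni_id is None:
        return None  # pool exhausted in this AZ; a real handler should alert

    ec2.attach_network_interface(
        NetworkInterfaceId=eni_id,
        InstanceId=instance_id,
        DeviceIndex=device_index,  # index 0 is the primary interface
    )
    return eni_id
```

In an actual Lambda the entry point would be something like `def lambda_handler(event, context): return handle_state_change(boto3.client("ec2"), event)`.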

The Iron Rose fucked around with this message at 19:28 on Jun 3, 2022

luminalflux
May 27, 2005



The Iron Rose posted:

Yes, but I’ve done more digging and apparently this isn’t possible, so it’s back to reassigning a pool of ENIs annoyingly. Which I would personally do with eventbridge listening to startup events for the relevant instances and using a lambda to assign the ENI with a static IP. Insanely quick and you’ll attach within < 10s of startup.

That sounds like a huge Rube Goldberg machine to build for saving $0.045 per GB, and reading that makes me feel like something will break and leave me with no egress traffic for too long. Sure, it's not nothing, but it's easier to go down the route of private pricing / EDP and negotiate this down.

luminalflux fucked around with this message at 19:36 on Jun 3, 2022

The Iron Rose
May 12, 2012

:minnie: Cat Army :minnie:

luminalflux posted:

That sounds like a huge Rube Goldberg machine to build for saving $0.045 per GB, and reading that makes me feel like something will break and leave me with no egress traffic for too long. Sure, it's not nothing, but it's easier to go down the route of private pricing / EDP and negotiate this down.

Eventbridge uses cloudwatch events under the hood and has 99.99% availability vs NAT gateway’s 99.9%. It’s definitely more work, and a more complex architecture, but saving on the 4.5c/hour and the $0.045/GB egress adds up very quickly at scale.

Or just manage the EC2 instance like you would any other service that needs a full VM (which we all have unfortunately).

At the end of the day if it’s not your budget, it’s not your budget. But it can make a big difference if you’re serving a lot of content. There are other ways to alleviate that (CDNs!), but it’s an easy win that can save lots of money with about 20 lines of Python.

The Iron Rose fucked around with this message at 20:37 on Jun 3, 2022

nullfunction
Jan 24, 2005

Nap Ghost
A single managed NAT gateway would add 50% to my current monthly AWS bill, and I'd need three to cover the AZs for the region I'm currently deployed in. The $0.045/GB bandwidth cost is negligible compared to the fixed cost of just having it turned on for my use case.

I'm paying for it, and it's not currently making any money, but if either of those things were different, a NAT gateway would be on the table. So far, the appliance solution has worked well at a fraction of the price, but if my requirements change, it starts generating revenue, or maintenance becomes a problem, I'll be looking at NAT gateways as they are certainly a simpler solution.

My Rube Goldberg approach was to allocate EIPs for the NAT appliances (which aren't spot, but are t3a.micro and very cheap and easy to swap to a more powerful type if needed), then use a tag on the portion of private subnets' routing tables that need outbound access. The presence of the tag causes a Lambda function to look up the inside interface on the NAT appliance in the appropriate AZ and add it as the default route in that routing table, at which point the tag is modified and it's not considered for future runs. I didn't see a way to have EventBridge get events from subnet creation, so I opted for a scheduled rule poking the Lambda regularly to pick up new subnets based on their tags.
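Roughly what a scheduled function like that could look like, as a sketch only. The tag names (`NatRoute`, `Role=nat-inside`) and function names are invented for illustration, `ec2` is assumed to be a boto3 EC2 client handed in by the caller, and the AZ is inferred from the first subnet associated with each route table, which is a simplification.

```python
PENDING_TAG = {"Key": "NatRoute", "Value": "pending"}    # hypothetical marker tag
DONE_TAG = {"Key": "NatRoute", "Value": "configured"}    # written after success


def find_nat_eni(ec2, availability_zone):
    """Return the inside ENI of the NAT appliance in this AZ, or None."""
    resp = ec2.describe_network_interfaces(
        Filters=[
            {"Name": "tag:Role", "Values": ["nat-inside"]},  # hypothetical tag
            {"Name": "availability-zone", "Values": [availability_zone]},
        ]
    )
    enis = resp["NetworkInterfaces"]
    return enis[0]["NetworkInterfaceId"] if enis else None


def route_table_az(ec2, route_table):
    """Infer a route table's AZ from its first associated subnet, or None."""
    subnet_ids = [
        assoc["SubnetId"]
        for assoc in route_table.get("Associations", [])
        if "SubnetId" in assoc
    ]
    if not subnet_ids:
        return None
    subnets = ec2.describe_subnets(SubnetIds=subnet_ids[:1])["Subnets"]
    return subnets[0]["AvailabilityZone"]


def configure_pending_route_tables(ec2):
    """Add a default route via the NAT ENI to every route table tagged pending.

    Returns the IDs of the route tables that were configured on this run.
    """
    resp = ec2.describe_route_tables(
        Filters=[{"Name": "tag:" + PENDING_TAG["Key"], "Values": [PENDING_TAG["Value"]]}]
    )
    configured = []
    for table in resp["RouteTables"]:
        az = route_table_az(ec2, table)
        eni_id = find_nat_eni(ec2, az) if az else None
        if eni_id is None:
            continue  # leave the tag alone so the next scheduled run retries

        # Fails if a default route already exists; replace_route would be
        # the call to use in that case.
        ec2.create_route(
            RouteTableId=table["RouteTableId"],
            DestinationCidrBlock="0.0.0.0/0",
            NetworkInterfaceId=eni_id,
        )
        # Flip the tag so this table is skipped on future runs.
        ec2.create_tags(Resources=[table["RouteTableId"]], Tags=[DONE_TAG])
        configured.append(table["RouteTableId"])
    return configured
```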

The whole setup costs me $10/month for the entire region vs the $90 for the gateway option, pre-bandwidth. The most obnoxious part is that I have to throw a `until curl google.com; do sleep 1; done;` into the UserData of instances if I'm recreating a whole subnet via Cloudformation so that they wait until they have outbound access to do their real work. I've decided I can live with that for $80.

My bandwidth needs are currently measured in Kbps, and should top out in the low Mbps (niche userbase, slowly changing data, heavy use of caching and CDNs). If I'm going to barely utilize a resource, I'm going to barely utilize the cheaper option. It's nice to have options!

12 rats tied together
Sep 7, 2006

It's fair to say "do the math to find out if nat gateway makes sense for you"; I wouldn't go further than that in any direction for making any firm rules. If I spend more than 20 minutes per month thinking about the NAT instances in my employer's aws environment, I'll have wiped out all of the savings we might have seen from it just in labor, never mind what would happen if we actually lost a NAT instance and had to deal with the production impact, all of the monitoring we'd need for it, the extra config management and orchestration, and, as noted above, the extra impact the custom setup now has on downstream resources in this network.

Being able to simply declare "these instances go out through a nat device" in a yaml file and be charged $0.045/hr and $0.045/GB for it to just work, just scale automatically in the background, and to just get graceful failover for free is an extremely compelling pitch for a service to make.
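"Do the math" can be as small as the sketch below. The $0.045/hr and $0.045/GB figures are the ones quoted in this thread; the 730-hour month, the labor rate, and the function names are assumptions for illustration. It deliberately ignores ordinary data transfer out charges, which apply either way.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours


def nat_gateway_monthly(azs, gb_processed, hourly=0.045, per_gb=0.045):
    """Monthly NAT gateway cost: one gateway per AZ plus per-GB processing."""
    return azs * hourly * HOURS_PER_MONTH + gb_processed * per_gb


def nat_instance_monthly(azs, instance_cost_per_az, labor_hours=0.0, labor_rate=0.0):
    """Monthly self-managed NAT cost: instances plus whatever time you spend on them."""
    return azs * instance_cost_per_az + labor_hours * labor_rate


if __name__ == "__main__":
    # Hobby project: 3 AZs, almost no traffic, $5/AZ appliances, labor not counted.
    print(nat_gateway_monthly(3, 1), nat_instance_monthly(3, 5.0))

    # Employer: same setup, but 20 minutes of attention a month at a
    # hypothetical $150/hour loaded rate.
    print(nat_gateway_monthly(3, 1), nat_instance_monthly(3, 5.0, labor_hours=1 / 3, labor_rate=150))
```

The point of keeping labor as an explicit parameter is that it is usually the term that decides the answer.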

In the event that you find yourself having to janitor this setup for whatever reason, I have done it with keepalived and ENI juggling in the past, and I did not find it to be especially cumbersome. If you need to update route tables when an instance fails, that's not really suitable for a production workload at an employer, IMO.

e: I would also add that the answer to this type of thing at scale is usually "avoid using NAT, especially source NAT, at all costs".

12 rats tied together fucked around with this message at 21:35 on Jun 3, 2022

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

Hughmoris posted:

All that old data would be sweet. I'm going to try and get the foundation of this idea stood up first and I'll take you up on that offer if I get far enough along.

Here:

https://bits.rainwalk.net/EarthquakeData.zip (185mb zipped)

Hughmoris
Apr 21, 2007
Let's go to the abyss!

El dorado! This is great, thanks.

MightyBigMinus
Jan 26, 2020

amazing how trying to get a serverless function to talk to a standard database turns into a multi-day full-thread-page derail into networking crap

just open your rds to the internet. who cares. its earthquake data.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

MightyBigMinus posted:

amazing how trying to get a serverless function to talk to a standard database turns into a multi-day full-thread-page derail into networking crap

just open your rds to the internet. who cares. its earthquake data.

To be fair to everyone else, I'm a dum-dum with this stuff.

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

MightyBigMinus posted:

amazing how trying to get a serverless function to talk to a standard database turns into a multi-day full-thread-page derail into networking crap

just open your rds to the internet. who cares. its earthquake data.

Hardest part for me is always permissions stuff. Networking comes easy for me but assumed roles? Ugh. Stuff of nightmares.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Does anyone have recommended forums/blogs for AWS Big Data stuff? I can't find anywhere that things like Glue, EMR, Redshift etc... are really talked about.

Vanadium
Jan 8, 2005

can you skip all the nat gateway stuff by using public subnets and using security groups to keep people from talking to you?

vanity slug
Jul 20, 2010

Yeah pretty much.

Hughlander
May 11, 2005

With google's change to google apps for domains where you need to pay $$$ I have some 15 year old domains that I have gmail accounts for routing to other gmail accounts that I now need to get rid of. So I plan on following this blog post https://aws.amazon.com/blogs/messaging-and-targeting/forward-incoming-email-to-an-external-destination/ about being able to:

- Set up route 53 mx records
- Use SES to save incoming mail to an S3 bucket
- Use a lambda function to trigger on file writing to S3
- To resend outgoing mail via SES to the permanent email address

And set it up for about 5 domains. (IE anything sent to *@hughlander.com goes to hughlander@gmail.com)

Since there's a reasonable number of domains, I figure also to go do that with some infrastructure as code and make it repeatable, maybe get my own blog post or at least a github link out of it. So my question is, what's the appropriate infrastructure as code system for this? I've used puppet and ansible in the past and neither seem appropriate. Since the tech is all AWS specific Cloud Formation sounds like a possibility, though I have some interest in learning terraform but not sure how terraform would work with R53, SES, S3, Lambda.

Anything I'm missing / Any thoughts?
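One thing worth knowing about the Lambda step in the list above: SES will only send from a verified identity, so a forwarder can't re-send the stored message with the original From header. Below is a sketch of one way to handle that with the standard library, rewriting From to an address on the verified domain and preserving the real sender in Reply-To. This is not necessarily how the function in the linked blog post does it, and `rewrite_for_forwarding` and the example addresses are made up.

```python
from email import message_from_bytes, policy
from email.utils import formataddr, parseaddr

# Headers that would be wrong, or fail validation, once the mail is re-sent.
STRIP_HEADERS = ("Return-Path", "Sender", "DKIM-Signature")


def rewrite_for_forwarding(raw_message, forward_from):
    """Return the raw message with From moved to a verified address.

    raw_message:  bytes of the original email (e.g. the S3 object body)
    forward_from: an address on a domain verified in SES
    """
    msg = message_from_bytes(raw_message, policy=policy.default)
    original_from = str(msg.get("From", ""))
    display_name, original_addr = parseaddr(original_from)

    for name in STRIP_HEADERS:
        del msg[name]  # deleting a missing header is a no-op

    # Keep replies going to the real sender unless they asked for something else.
    if original_from and "Reply-To" not in msg:
        msg["Reply-To"] = original_from

    # Show the original sender's name, but send from the verified address.
    del msg["From"]
    msg["From"] = formataddr((display_name or original_addr, forward_from))
    return msg.as_bytes()
```

The result would then go to a boto3 SES client, roughly `ses.send_raw_email(Source=forward_from, Destinations=[your_gmail], RawMessage={"Data": rewritten})`; the To header can stay as the original recipient because delivery follows `Destinations`.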

luminalflux
May 27, 2005



Vanadium posted:

can you skip all the nat gateway stuff by using public subnets and using security groups to keep people from talking to you?

Absolutely. However the fun times come when you have third parties that insist that you only access them from certain IPs for security reasons. This is unfortunately somewhat common in the payment provider space.

xpander
Sep 2, 2004

Hughlander posted:

With google's change to google apps for domains where you need to pay $$$ I have some 15 year old domains that I have gmail accounts for routing to other gmail accounts that I now need to get rid of. So I plan on following this blog post https://aws.amazon.com/blogs/messaging-and-targeting/forward-incoming-email-to-an-external-destination/ about being able to:

- Set up route 53 mx records
- Use SES to save incoming mail to an S3 bucket
- Use a lambda function to trigger on file writing to S3
- To resend outgoing mail via SES to the permanent email address

And set it up for about 5 domains. (IE anything sent to *@hughlander.com goes to hughlander@gmail.com)

Since there's a reasonable number of domains, I figure also to go do that with some infrastructure as code and make it repeatable, maybe get my own blog post or at least a github link out of it. So my question is, what's the appropriate infrastructure as code system for this? I've used puppet and ansible in the past and neither seem appropriate. Since the tech is all AWS specific Cloud Formation sounds like a possibility, though I have some interest in learning terraform but not sure how terraform would work with R53, SES, S3, Lambda.

Anything I'm missing / Any thoughts?

I'm a big fan of the CDK - https://aws.amazon.com/cdk/. This lets you use a high-level programming language of choice to generate the necessary CloudFormation templates. You could also check out CDKTF if you want to use terraform under the hood: https://www.terraform.io/cdktf.

12 rats tied together
Sep 7, 2006

I find pulumi to be the best of the "CDK" style products by a fair margin -- it uses terraform under the hood so you don't need an explicit "synth" step, but it's actually a competently developed set of libraries with some great documentation, unlike the CDKTF.

I definitely recommend checking out something in this area though. Please help bring the days of manually typing HCL or YAML closer to an end.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
If I'm committing full-hog to AWS, and big data stuff in particular, I wonder if I should add another language to my repertoire. Right now I hack things together with Python but it seems that lots of resources are either using Node or Java.

Extremely broad question but you think I can get by learning the AWS data tools with just Python?

Docjowles
Apr 9, 2009

Yeah the python boto3 library is excellent. I don’t think you’re hamstringing yourself at all using python to interface with AWS.

luminalflux
May 27, 2005



Especially if you're using something like PySpark to do the data manipulation, there's no real reason to leave python when dealing with AWS unless you do a lot of Kinesis consumption stuff.

12 rats tied together
Sep 7, 2006

if you want to do "big data" in general, python and java are the languages for that, in that they're different enough from each other that you would benefit from learning both on purpose, and wouldn't necessarily be able to pick up one easily because you know the other

it's hard to recommend getting really into node.js for data stuff but i don't pay a ton of attention to it or its ecosystem, i could be wrong

kalel
Jun 19, 2012

hello thread. as part of my job, I'm trying to learn more about AWS. is there a thread-recommended up-to-date "AWS for dummies"-esque resource that is more succinct and high-level than Amazon's documentation?

cage-free egghead
Mar 8, 2004

kalel posted:

hello thread. as part of my job, I'm trying to learn more about AWS. is there a thread-recommended up-to-date "AWS for dummies"-esque resource that is more succinct and high-level than Amazon's documentation?

Learning material for the Cloud Practitioner exam would probably be good for that.

Hed
Mar 31, 2004

Fun Shoe
I'd like to run a corporate Django site on Fargate, does AWS have anything like Azure App Proxy?

I like the idea of having the Django app sitting behind a scalable LB that authenticates people against a directory (Azure AD in this case) and passes that info back. I've done this in the past in places that had PKI where the nginx -> Django backend would authenticate the user and pass back headers that the Django app used for authentication/authorization.

Should I just look into the AWS LB sets more or is there something else to do this?

JehovahsWetness
Dec 9, 2005

bang that shit retarded

Hed posted:

I'd like to run a corporate Django site on Fargate, does AWS have anything like Azure App Proxy?

I like the idea of having the Django app sitting behind a scalable LB that authenticates people against a directory (Azure AD in this case) and passes that info back. I've done this in the past in places that had PKI where the nginx -> Django backend would authenticate the user and pass back headers that the Django app used for authentication/authorization.

Should I just look into the AWS LB sets more or is there something else to do this?

ALBs can do OIDC "directly" or other options (SAML, LDAP, etc) by bouncing through Cognito: https://docs.aws.amazon.com/elasticloadbalancing/latest/application/listener-authenticate-users.html. It also signs the resulting headers so you can validate it on the app side to ensure the request actually passed through the ALB auth flow.
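For the app-side validation, the first half looks something like the sketch below: read the unverified JWT header from `x-amzn-oidc-data`, confirm the `signer` is your own ALB, and work out where that region's public key lives. The function names are invented here, and this deliberately stops short of the signature check itself, which needs a JWT library (the AWS page linked above shows ES256 verification with one). Nothing in the token should be trusted until that check passes.

```python
import base64
import json


def decode_jwt_header(encoded_jwt):
    """Decode (without verifying) the first segment of a JWT into a dict."""
    segment = encoded_jwt.split(".")[0]
    segment += "=" * (-len(segment) % 4)  # tolerate missing padding
    return json.loads(base64.urlsafe_b64decode(segment))


def public_key_url(encoded_jwt, region, expected_alb_arn):
    """Return the URL of the ALB public key that should verify this token.

    Raises ValueError if the token claims to come from a different load
    balancer than the one this app sits behind.
    """
    header = decode_jwt_header(encoded_jwt)
    if header.get("signer") != expected_alb_arn:
        raise ValueError("token was not signed by the expected load balancer")
    return "https://public-keys.auth.elb.{}.amazonaws.com/{}".format(region, header["kid"])
```

In Django this would sit in a middleware or authentication backend that reads `request.headers["x-amzn-oidc-data"]`, fetches the PEM from that URL, verifies the signature, and only then maps the claims to a user.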

Hed
Mar 31, 2004

Fun Shoe
Thanks! That link is exactly what I needed, and answers my follow on question about authenticating the headers. :)

Hed
Mar 31, 2004

Fun Shoe
This risks being a bigger SAML question, but what are my options for programmatic access?
If I put an API on the Django website like django-rest-framework, do I make people log in via SAML and get tokens to feed into their API client?

Scrapez
Feb 27, 2004

Reading above about using lambda to manage ENI attachments and adding default routes got me wondering if I could leverage something like that to better accomplish what I'm doing with a combination of userdata and shell scripts on instance creation.

I need specific private static IPs on instances that correspond to services running on said instance. What I've done is to create ENIs with certain tags corresponding to the service. When an instance is created, the userdata nfs mounts an efs and then executes the appropriate shell script.

The shell script uses the cli and metadata to load necessary data into variables. Then attaches an appropriate eni that it has found. It then configures routes on the instance to only use the ENI NIC as we don't want any traffic through the built in.

So is there a best practice for accomplishing this? I'm using cloudformation to create all the infrastructure but obviously the scripts have to be managed outside of cloudformation.

Scrapez fucked around with this message at 04:06 on Jun 10, 2022

The Iron Rose
May 12, 2012

:minnie: Cat Army :minnie:
This is perhaps a silly question, but is this not something you can do with a reverse proxy, ingress controller, or service-service authentication via an API gateway?

I’m instinctively leery of running a service oriented architecture on a fleet of full VMs, though obviously it’s perfectly possible. The need for reserved IPs (for IP based allowlisting) is also a bit of a red flag.

If you really need to stick with VMs, eventbridge is your friend if you can detect the relevant CRUD event on a given resource. From there you can trigger lambdas to allocate/release/apply your ENI/SG changes, add queueing with SQS, have fun.
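For the "detect the relevant event" part, the rule side is just an event pattern. A small sketch, assuming you care about EC2 instance state changes; `state_change_pattern` is a made-up helper name. As far as I know these events only carry the instance ID and state, not tags, so any "is this one of my instances" filtering has to happen in the Lambda.

```python
import json


def state_change_pattern(states=("running",)):
    """Build an EventBridge event pattern (as JSON) for EC2 state changes."""
    pattern = {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance State-change Notification"],
        "detail": {"state": list(states)},
    }
    return json.dumps(pattern)
```

That string is what goes in the `EventPattern` argument of `put_rule` on a boto3 `events` client (or the equivalent property in a CloudFormation rule), with the Lambda added as the rule's target.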

If you can swing it though I’d really try to get away from running full VMs, even if they’re all ephemeral spot instances.

Scrapez
Feb 27, 2004

The Iron Rose posted:

This is perhaps a silly question, but is this not something you can do with a reverse proxy, ingress controller, or service-service authentication via an API gateway?

I’m instinctively leery of running a service oriented architecture on a fleet of full VMs, though obviously it’s perfectly possible. The need for reserved IPs (for IP based allowlisting) is also a bit of a red flag.

If you really need to stick with VMs, eventbridge is your friend if you can detect the relevant CRUD event on a given resource. From there you can trigger lambdas to allocate/release/apply your ENI/SG changes, add queueing with SQS, have fun.

If you can swing it though I’d really try to get away from running full VMs, even if they’re all ephemeral spot instances.

Unfortunately, the software running on these VMs was built with bare metal hosting in mind so VMs are the only real solution. I've not done a lot with eventbridge but will start digging into that to see what I can automate through native AWS services. The way I'm doing it with shell scripts seems very 1995ish.

22 Eargesplitten
Oct 10, 2010



I've got a static site on a private S3 bucket that is being presented by a cloudfront distribution. The problem I'm facing is that going to https://www.contoso.com/foo is giving an error, but going to https://www.contoso.com/foo/index.html is working. There doesn't seem to be a redirect function in S3 for private buckets, just public. I tried doing a Cloudfront function with redirect using some code from AWS's examples but it's not working. Would I need to make it a viewer request function or viewer response function? Is there anything else I would need to add aside from the redirect function? This is basically the first cloudfront work I've done aside from adding some response headers for security.

This is the snippet I used and just updated the redirect to what I need.
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/example-function-redirect-url.html

12 rats tied together
Sep 7, 2006

For that problem specifically you might consider adjusting the distribution's default root object to be index.html, which would save you some code and the complexity of running it.

Just-In-Timeberlake
Aug 18, 2003

22 Eargesplitten posted:

I've got a static site on a private S3 bucket that is being presented by a cloudfront distribution. The problem I'm facing is that going to https://www.contoso.com/foo is giving an error, but going to https://www.contoso.com/foo/index.html is working. There doesn't seem to be a redirect function in S3 for private buckets, just public. I tried doing a Cloudfront function with redirect using some code from AWS's examples but it's not working. Would I need to make it a viewer request function or viewer response function? Is there anything else I would need to add aside from the redirect function? This is basically the first cloudfront work I've done aside from adding some response headers for security.

This is the snippet I used and just updated the redirect to what I need.
https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/example-function-redirect-url.html

Go to CloudFront > Functions

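Create a new function and associate it with the distribution's behavior as a viewer request function (it rewrites the request URI before CloudFront looks anything up):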

code:
function handler(event) {
    var request = event.request;
    var uri = request.uri;
    
    // Check whether the URI is missing a file name.
    if (uri.endsWith('/')) {
        request.uri += 'index.html';
    } 
    // Check whether the URI is missing a file extension.
    else if (!uri.includes('.')) {
        request.uri += '/index.html';
    }

    return request;
}

kalel
Jun 19, 2012

Out of curiosity, would there ever be a reason to use an EC2 with a database image instead of RDS? I have some microservices inside fargate tasks which connect to a MySQL RDS (or at least, I'm trying to and am currently in the process of debugging). My understanding is that the typical industry standard way to manage a database is through RDS for scalability and convenience, but is there ever a motivation not to do that?


Just-In-Timeberlake
Aug 18, 2003

kalel posted:

Out of curiosity, would there ever be a reason to use an EC2 with a database image instead of RDS? I have some microservices inside fargate tasks which connect to a MySQL RDS (or at least, I'm trying to and am currently in the process of debugging). My understanding is that the typical industry standard way to manage a database is through RDS for scalability and convenience, but is there ever a motivation not to do that?

If you wanted complete control and access to all the features of a DB?

I know with an RDS MSSQL instance you can't run CLR assemblies on any version after 2016, and there's a bunch of other things you can't do/access due to security concerns.
