vanity slug
Jul 20, 2010

22 Eargesplitten posted:

Is there any viable reason to be using MS SQL on an EC2 instance rather than in an RDS instance? My suspicion is it's due to a lift and shift from on-prem to AWS and wanting to just copy everything over from the old on-prem DB setup. I know they were fussed about the price of MSSQL on RDS compared to on Azure but I'm not sure if a license for an on-prem version would transfer over to EC2 and save them the subscription cost.

Then again, if they want HA, they would probably need two licenses for the two different EC2 instances and have to deal with cross-region replication of data between the two which sounds like a pain.

We ran our MSSQL databases on EC2. At the time AOAGs were not available on RDS, and we needed the extra control over the storage that we couldn't get from RDS (basic stuff like running tempdb on ephemeral storage for the performance improvements, running databases on their own disks with their own IOPS allocation, things like that).

freeasinbeer
Mar 26, 2015

by Fluffdaddy
MS SQL on RDS has a bunch of limitations and is kinda a pain to use. I’ve been at 2 places that had it, and I’d honestly say it might not be worth it to run it in RDS.

Biggest pain points:

DBA is gonna be pissy because they are locked out of tuning most things; which is actually a pro in the end, but they will all complain

Doing replication to a secondary database for things like BI or multi region is a PITA; you can’t touch any of MS SQL’s replication so you need to run DMS, which barely supports MS SQL, or use a janky third party tool that costs $10k a month.

Stuff will failover; for some reason I’ve found apps that use MS SQL to be abnormally bad at reconnecting.

It’s sooooo expensive and doesn’t have support for things like x1 instances which would save a ton because of lower core counts.

Also you can’t BYOL
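
On the failover point, here is a minimal sketch of the reconnect logic those apps tend to be missing. `connect` is a hypothetical callable standing in for whatever your driver uses to open a connection, and the attempt count and delays are assumptions, not tuned values.

```python
import random
import time


def connect_with_retry(connect, attempts=5, base_delay=0.5, max_delay=10.0,
                       sleep=time.sleep):
    """Call connect() until it succeeds, backing off exponentially between tries.

    connect: hypothetical zero-argument callable that returns a connection
             or raises an exception (for example while a failover is in progress).
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return connect()
        except Exception as exc:  # narrow this to your driver's error types
            last_error = exc
            if attempt == attempts - 1:
                break
            # Exponential backoff with jitter so clients don't reconnect in lockstep.
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay * random.uniform(0.5, 1.0))
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

The `sleep` parameter is only there so the delay can be stubbed out in tests.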

Docjowles
Apr 9, 2009

freeasinbeer posted:

use a janky third party tool that costs $10k a month.

This is really an excellent database_admin.txt summary. Third party database tools have to have one of the worst price:quality ratios of code in existence. They must be written by the same people who develop applications for banks or healthcare that look like MS-DOS ASCII art and cost 8 figures.

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.
From a licensing perspective MSSQL on EC2 can be a lot cheaper: if I recall correctly (it may have changed), an org is permitted to run a database instance license-free if it is in a dev environment, which means SQL on EC2 is cheaper than RDS SQL there.

Also, there are some specific use cases in which a highly customized EC2 storage architecture will run a workload faster than RDS (things like multiple mdf/ndf files spread across multiple EBS volumes to speed up I/O, etc).

But on the downside managing patches and downtime on EC2 has an associated cost and RDS is more resilient and just works* whereas EC2 requires a much more pets-not-cattle approach.






*mostly

luminalflux
May 27, 2005



22 Eargesplitten posted:

Is there any viable reason to be using MS SQL on an EC2 instance rather than in an RDS instance?

In our case: s/MS SQL/MySQL/ and yes, absolutely. We decided to shift from RDS MySQL to MySQL on EC2 due to a combination of things:

* RDS auto-upgraded us to a version with a deadlock bug in it (5.7.22) and the version with the bugfix (5.7.25) wasn't available. A downgrade was not easily possible.
* RDS would go into crash recovery on every failover due to issues we didn't realize until a lot later
* RDS wasn't giving us enough IOPS. On EC2, we went from EBS (network block storage) to striping over multiple EBS volumes to striping over multiple local NVMe devices on i3 instances.
* RDS didn't give us some tuneable parameters or access to files on disk for advanced analytics/monitoring
* RDS charges a pretty steep premium over EC2.

Sometimes the AWS managed service works great and you don't have to care about it. Sometimes the AWS service doesn't and you just run MySQL/Redis/Kafka yourself instead of RDS / ElastiCache / MSK.

Enshoku
Jun 1, 2013
I am consistently surprised at how expensive every other database option aside from DynamoDB is on AWS. It feels like Amazon is trying to drive people to use it over RDBMS options on price alone. Not that I'm complaining though, it's generally faster for me to slap together some single table database design than to actually have to think about relationships, it generally being cheaper is just a nice side benefit.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Any advice on getting my foot in the door for an entry-level / junior AWS job? I'm slowly working my way through a SysOps Administrator course but would love to land a gig where I can actually work with AWS, even at a Tier 1 level. A LinkedIn search for "AWS Help Desk" turns up next to nothing. What type of job titles should I be looking for?

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.
Cloud Support Associate
Cloud Support Engineer

These are the helpdesk positions you are thinking of. A year or two there and you’ll be prepared to move into other roles.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Agrikk posted:

Cloud Support Associate
Cloud Support Engineer

These are the helpdesk positions you are thinking of. A year or two there and you’ll be prepared to move into other roles.

I figured I was missing something. I'm getting many more results with those two, thanks!

Arzakon
Nov 24, 2002

"I hereby retire from Mafia"
Please turbo me if you catch me in a game.
Look for Associate Solution Architect or Associate Professional Services Consultant as well. AWS Tech U is the program my org uses when we want to hire lots of entry level talent. Good place to keep an eye out for opportunities if you don't have any IT work history but can demonstrate some depth in a few areas. These listings typically start as 1 year paid internships that hire directly into full time associate level roles.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Arzakon posted:

Look for Associate Solution Architect or Associate Professional Services Consultant as well. AWS Tech U is the program my org uses when we want to hire lots of entry level talent. Good place to keep an eye out for opportunities if you don't have any IT work history but can demonstrate some depth in a few areas. These listings typically start as 1 year paid internships that hire directly into full time associate level roles.

Thanks for this.

Was anyone affected by the (from what I've heard) us-east-1 EC2 stumble this morning?

Hughmoris
Apr 21, 2007
Let's go to the abyss!
I believe we have some AWS people floating about. Anyone in the vicinity of this New World launch? If anyone could have nailed an MMO infrastructure launch I figured it would be AWS but queue times have been awful all day.

Disregard, not appropriate to ask in here.

Hughmoris fucked around with this message at 04:31 on Sep 29, 2021

FamDav
Mar 29, 2008
you’re not going to get any details on another customer

Walked
Apr 14, 2003

Anyone run into Redshift snapshots not capturing views? AWS support says it's not expected behavior, but we've had some scheduling conflicts getting a screenshare going. Curious if anyone has run into this while I try to nail down scheduling.

Just-In-Timeberlake
Aug 18, 2003
Hoping someone in here knows a bit about Application Load Balancer and Lambda functions.

I'm trying to access a Lambda .netcore function via ALB vs the API Gateway (API gateway is the way it's currently accessed, and works) because API Gateway has a max timeout of 30 seconds, and there are times it will take longer, so ALB seems the way to get around this.

Here's what I've done:

1. Changed the .netcore project's Lambda entry point to use ApplicationLoadBalancerFunction instead of APIGatewayHttpApiV2ProxyFunction.
2. Created an application load balancer using the same VPC and subnets the Lambda currently uses (3 AZs). The target group points to the Lambda in question.
3. Changed the DNS entry to point to the ALB.

The target group has a health check url that points to the root of the Lambda and just returns an "ok" 200 message. Looking at the dashboard, the target group shows as healthy, with 1 healthy host, and 0 unhealthy hosts. So, the health check is working as far as I can tell.

The problem is I can't access it via any method (HTTP, HTTPS + Postman, web browser, etc), it just times out.

For the purpose of troubleshooting, I've set the security group the ALB uses to allow all traffic, to no effect.

I'm obviously missing something, but I don't know what it is. Any insight appreciated.

CarForumPoster
Jun 26, 2013

⚡POWER⚡

Just-In-Timeberlake posted:

I'm trying to access a Lambda .netcore function via ALB vs the API Gateway (API gateway is the way it's currently accessed, and works) because API Gateway has a max timeout of 30 seconds, and there are times it will take longer, so ALB seems the way to get around this.

I'm not super knowledgeable about AWS but why would you use ALB to get around the Lambda function time exceeding the API Gateway timeout?

When I've had this problem, instead of returning what I want from my lambda function I return an ID that can be queried for its completed result, which I store somewhere.
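
A minimal sketch of that pattern, assuming an in-memory dict as a stand-in for wherever the result really gets stored (a dict won't survive across Lambda invocations, so in practice this would be a table or bucket). The function names are hypothetical.

```python
import uuid

# In-memory stand-in for the real result store (assumption for illustration).
_JOBS = {}


def submit_job(payload):
    """Record a pending job and return its ID immediately."""
    job_id = str(uuid.uuid4())
    _JOBS[job_id] = {"status": "pending", "payload": payload, "result": None}
    return job_id


def complete_job(job_id, result):
    """Called by the slow worker once it has finished."""
    _JOBS[job_id].update(status="done", result=result)


def get_job(job_id):
    """What the polling endpoint would return for a given ID."""
    job = _JOBS.get(job_id)
    if job is None:
        return {"status": "unknown"}
    return {"status": job["status"], "result": job["result"]}
```

The first request returns the ID well inside the 30 second limit, and the client polls `get_job` until the status flips to "done".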

Vanadium
Jan 8, 2005

Does the ALB have a public IP address or is it internal? Do the subnets have internet gateways or nat gateways and stuff like that?

Did you configure a listener on the ALB or only a target group?

Just-In-Timeberlake
Aug 18, 2003

Vanadium posted:

Does the ALB have a public IP address or is it internal? Do the subnets have internet gateways or nat gateways and stuff like that?

Did you configure a listener on the ALB or only a target group?

The ALB is configured to use a VPC we have set up for this Lambda that gives it an outgoing static IP address (for restricting access to resources via source IP). In Route 53 I've got the dns entry pointing to the ALB instance. The VPC is configured correctly (Internet gateway + NAT gateway, 3 public facing subnets, 3 private subnets, routing table entries), as this routes traffic just fine when using API Gateway, but not the ALB.

I have HTTP and HTTPS listeners configured on the ALB, forwarded to the Lambda target group

CarForumPoster posted:

I'm not super knowledgeable about AWS but why would you use ALB to get around the Lambda function time exceeding the API Gateway timeout?

When I've had this problem, instead of returning what I want from my lambda function I return an ID that can be queried for its completed result, which I store somewhere.


Mainly because if I can get this working it's a lot less work than refactoring a bunch of code.

Just-In-Timeberlake fucked around with this message at 12:24 on Oct 21, 2021

Bitcoin
Sep 12, 2021

Just-In-Timeberlake posted:

The problem is I can't access it via any method (HTTP, HTTPS + Postman, web browser, etc), it just times out.
For the purpose of troubleshooting, I've set the security group the ALB uses to allow all traffic, to no effect.

Can you reach it from the same VPC? Have you accidentally created an internal ALB which won't be reachable outside no matter what your security groups say because it's not routable?

Just-In-Timeberlake
Aug 18, 2003

Bitcoin posted:

Can you reach it from the same VPC? Have you accidentally created an internal ALB which won't be reachable outside no matter what your security groups say because it's not routable?

I'm like 99% certain it's set up right, the wizard pretty much makes sure you've got everything set correctly when creating an ALB. I'm the farthest thing from an AWS expert, so I'm not sure how to test connecting from within the same VPC.

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.
Stand up an instance within the VPC and curl the ALB’s DNS from it.
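
If curl isn't handy on the test instance, roughly the same check can be sketched in Python with only the standard library. The URL you pass in is whatever DNS name the ALB shows in the console; the 5 second timeout is an assumption.

```python
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Return a short description of what happened when fetching url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # Something answered, just not with a 2xx.
        return f"HTTP {exc.code}"
    except OSError as exc:
        # No answer at all: DNS, routing, security group, or a timeout.
        # (URLError and socket timeouts are both OSError subclasses.)
        return f"unreachable: {exc}"
```

Any "HTTP ..." result from inside the VPC while the outside world times out points at routing or the ALB scheme rather than the Lambda.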

ledge
Jun 10, 2003

Have you run the reachability analyzer? That worked for me when I was having trouble with network nonsense earlier today.

CarForumPoster
Jun 26, 2013

⚡POWER⚡

ledge posted:

Have you run the reachability analyzer? That worked for me when I was having trouble with network nonsense earlier today.

Not the OP but I didn't know about this, thanks for posting it!

Scrapez
Feb 27, 2004

Does AWS provide an Amazon Linux 2 kernel 5.10 AMI somewhere? The only ones I can find for kernel 5.10 are community AMIs.

The quick start Amazon Linux 2 AMI is kernel 4.14.

crazypenguin
Mar 9, 2005
nothing witty here, move along
Yeah. It’s in amazon-linux-extras

I forget what version exactly but definitely a newer 5.x

Pile Of Garbage
May 28, 2007



crazypenguin posted:

Yeah. It’s in amazon-linux-extras

I forget what version exactly but definitely a newer 5.x

kernel-ng iirc

astral
Apr 26, 2004

https://aws.amazon.com/blogs/aws/aws-free-tier-data-transfer-expansion-100-gb-from-regions-and-1-tb-from-amazon-cloudfront-per-month/
https://aws.amazon.com/about-aws/whats-new/2021/11/aws-price-reduction-data-transfers-internet/

Amazon's feeling some pressure from Cloudflare.

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.
Or going after Cloudflare.

Scrapez
Feb 27, 2004

Is there a way to spin up an EC2 instance with only an ENI? No built in NIC?

We need static IPs and the ability for an instance that has died and been replaced with autoscaling to get that same static IP back.

Currently accomplishing that by a user-data script that the instance runs to discover some info about itself (region, AZ, purpose of instance) and attaches the appropriate ENI to itself.

The issue is that you then have two NICs. The built in and the ENI. We never use the built in for anything so it's pointless for it to be there and causes issues with some of the software we are running as it tries to default to the built in even when it's down.
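
For reference, the self-attach step looks roughly like the sketch below, not the actual script. It assumes `ec2` is a boto3 EC2 client and that the ENIs carry a hypothetical `Purpose` tag; the instance ID and AZ would come from instance metadata.

```python
def attach_tagged_eni(ec2, instance_id, az, purpose, device_index=1):
    """Find a free ENI tagged for this AZ and purpose, and attach it.

    ec2: a boto3 EC2 client, e.g. boto3.client("ec2", region_name=...).
    The "Purpose" tag name is hypothetical; use whatever tagging scheme you have.
    """
    resp = ec2.describe_network_interfaces(
        Filters=[
            {"Name": "availability-zone", "Values": [az]},
            {"Name": "tag:Purpose", "Values": [purpose]},
            {"Name": "status", "Values": ["available"]},  # not attached anywhere
        ]
    )
    interfaces = resp["NetworkInterfaces"]
    if not interfaces:
        raise RuntimeError(f"no free ENI for {purpose} in {az}")
    eni_id = interfaces[0]["NetworkInterfaceId"]
    ec2.attach_network_interface(
        NetworkInterfaceId=eni_id,
        InstanceId=instance_id,
        DeviceIndex=device_index,  # index 0 is the built-in primary interface
    )
    return eni_id
```

Because the ENI goes on at device index 1, the built-in interface at index 0 is still there, which is exactly the two-NIC situation described above.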

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

Scrapez posted:

Is there a way to spin up an EC2 instance with only an ENI? No built in NIC?

We need static IPs and the ability for an instance that has died and been replaced with autoscaling to get that same static IP back.

Currently accomplishing that by a user-data script that the instance runs to discover some info about itself (region, AZ, purpose of instance) and attaches the appropriate ENI to itself.

The issue is that you then have two NICs. The built in and the ENI. We never use the built in for anything so it's pointless for it to be there and causes issues with some of the software we are running as it tries to default to the built in even when it's down.

Why not stick these instances behind a NAT gateway so they all have the same public IP?

ledge
Jun 10, 2003

fletcher posted:

Why not stick these instances behind a NAT gateway so they all have the same public IP?

Or a load balancer.

12 rats tied together
Sep 7, 2006

I'll take a guess:
- all the load balancers use RR dns that you don't get to pick, so the IP will change out from under you over time and you will always have at least 2 IPs*
- NAT gateway only works for egress traffic since it's dynamic and not static NAT*

*- I think

OP I think we both know this already but the best answer would be to use software that isn't terrible garbage and will respect your configured route table. Since that's probably out of your control, if I had to deal with this, I'd probably bake something into cloud-init that disabled the problematic interface on startup.

crazypenguin
Mar 9, 2005
nothing witty here, move along
If you’re having the instance assign itself an ENI on startup you could instead just have it assign itself an elastic IP, no? Never tried this, so dunno about gotchas, but seems like it’d work. The closest I did was have instances update their DNS on startup

(I assume this is one of those “ASGs of size exactly 1” situations right?)

Scrapez
Feb 27, 2004

crazypenguin posted:

If you’re having the instance assign itself an ENI on startup you could instead just have it assign itself an elastic IP, no? Never tried this, so dunno about gotchas, but seems like it’d work. The closest I did was have instances update their DNS on startup

(I assume this is one of those “ASGs of size exactly 1” situations right?)

TL;DR:
I wasn't very clear earlier that it is the private IP I need to be static, not the public (though I need an elastic as well)

I had a think about this and crazypenguin is exactly right. The solution, it seems, is to set up an ASG/launch config with user-data that assigns the built in NIC a specific static IP based on the Region/AZ/function of the instance and a size of 1. I'll have to do this for 2 regions, 3 AZs per region, and 2 types of instances (12 total ASGs) but it should work.


The software in question is an IVR and it is distributed across instances that perform different functions. There are 3 instances in a Zookeeper quorum that are the "brains" that keep track of calls, queries to backend database, etc. There are 3 instances that are used for telephony that answer the calls and then communicate with the 3 Zookeeper brain instances. That's one of the reasons we need static private IPs. The other reason is that we use SRV records and UDP to route calls. Load balancers don't offer UDP health checks so it's difficult for them to know the state of the telephony instances. These calls come into us via VPN so they're routing to the private IP space.

I was able to configure the instances to use eth0 (the ENI) for everything and leave ens5 (the built-in) down. I just had to update the ifcfg-ens5 ONBOOT=no so that ens5 stays down after a reboot and update the default route to eth0. I just don't like having an unused NIC sitting there all the time. If it somehow were to get enabled, then our traffic defaults out that route and causes problems. So, seems the size 1 ASG method with static private IPs is the answer. The only gotcha I can think of there is how long the AWS DNS servers cache that IP entry. Say I have an instance with IP 192.168.0.5 crash, ASG launches a new instance to take its place and configures it with the 192.168.0.5 IP. Does DNS still have some sort of cached route that tries to send traffic to the old crashed instance that's no longer there? That's the advantage of the ENI, it gets detached and attached to a new instance and the route stays the same.

luminalflux
May 27, 2005



Zookeeper doesn't need static IPs - you can put your current Zookeeper instances in a Route53 record, once the client connects to the first instance in the round-robin DNS it will find the other active instances. We do this for our ZK clusters and we roll them every so often.

Does the IVR software also have something you can hit over TCP to see if it's available? If so, you can use an NLB that forwards traffic over UDP but healthchecks over TCP.

The default VPC DNS resolver respects the TTL you set for DNS records - you'll notice that ELB/NLB/ALBs have 60s TTLs or lower on their records. Your OS and client software, however, might not.
Also if they're registering themselves in ZK, do you have enough control over the client for it to query ZK for which instances are active?

ledge
Jun 10, 2003

Scrapez posted:


The software in question is an IVR and...

So the answer here is rebuild the IVR in Connect :) Then you can forget about maintaining servers and networks ever again.

necrobobsledder
Mar 21, 2005
Lay down your soul to the gods rock 'n roll
Nap Ghost
We use ZK at scale here and don’t use IPs and ensure a lot of availability within a region to ensure quorum even with a loss of 2 AZs simultaneously. Last I used static ZK IPs was for a demo maybe around 2014 out of sheer laziness. It’s not evil to use DNS within a region or something. I can understand being allergic to DNS for a number of cases but ZK doesn’t strike me as one of them.

Scrapez
Feb 27, 2004

Probably didn't explain well. The telephony instances require communication with the zookeeper "brain" instances on static IPs. It would be possible to do this with DNS but then we would have to dynamically update Route53 with the updated IP if one of the zookeeper instances died and respun with a new IP. I have successfully done this with CloudWatch and a Lambda function previously so it might be doable.

The reason for the need for static IPs on the telephony instances is that we have other companies that connect to our IVR via VPN. Their crappy PBX and switch software often can't handle using a FQDN and if they can, they often cache the entry in perpetuity. This forces us to accommodate them by keeping the same static IPs for the telephony instances. We could solve this with a SIP Proxy but my company has been too cheap to purchase one thus far.

All good info that makes sense so thanks for the responses. At minimum, I could stop using static IPs for the "brain" instances.

luminalflux
May 27, 2005



Scrapez posted:

Probably didn't explain well. The telephony instances require communication with the zookeeper "brain" instances on static IPs. It would be possible to do this with DNS but then we would have to dynamically update Route53 with the updated IP if one of the zookeeper instances died and respun with a new IP. I have successfully done this with CloudWatch and a Lambda function previously so it might be doable.

Yep, this isn't too hard and ZK nodes don't get replaced too often, and even if they did the round-robin record will have enough live ones for the client to discover the rest of the cluster. Especially if you have more than 5 you shouldn't lose quorum even if you lose a whole AZ. You can also look in to placement groups to ensure that your ZK nodes aren't all in the same rack / hypervisor. As you said, you can build something with EventBridge (née CloudWatch Events) and Lambda, or just update the record when they get replaced. In general, relying on static IPs is an antipattern, especially inside a public cloud provider.
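
The record update itself is small. A rough sketch, assuming `route53` is a boto3 Route 53 client; the record name, hosted zone ID and the 60 second TTL are placeholders, and how you collect the live IPs is up to you.

```python
def build_upsert(record_name, ips, ttl=60):
    """Build a Route 53 ChangeBatch that points record_name at the given IPs."""
    return {
        "Comment": "refresh ZK round-robin record",
        "Changes": [
            {
                "Action": "UPSERT",  # create the record or overwrite it in place
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "TTL": ttl,
                    "ResourceRecords": [{"Value": ip} for ip in sorted(ips)],
                },
            }
        ],
    }


def update_record(route53, hosted_zone_id, record_name, ips):
    """route53: a boto3 Route 53 client, e.g. boto3.client("route53")."""
    return route53.change_resource_record_sets(
        HostedZoneId=hosted_zone_id,
        ChangeBatch=build_upsert(record_name, ips),
    )
```

A Lambda triggered on instance launch/terminate events would gather the current ZK IPs and call `update_record` with them.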

quote:

The reason for the need for static IPs on the telephony instances is that we have other companies that connect to our IVR via VPN. Their crappy PBX and switch software often can't handle using a FQDN and if they can, they often cache the entry in perpetuity. This forces us to accommodate them by keeping the same static IPs for the telephony instances. We could solve this with a SIP Proxy but my company has been too cheap to purchase one thus far.

Yeah, in that case I would look in to using an NLB exposing them over UDP and healthchecking the instances over TCP. I've done this where I've made a simple python service that returns the output of "systemctl status crappy-telephony-product.service" over HTTP which the LB uses to check health of something that has no HTTP or TCP endpoint.
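
Something along these lines, as a sketch rather than the exact service. The unit name is taken from the command above, port 8080 is an assumption, and the `run` parameter exists only so the systemctl call can be stubbed out.

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumption: the unit name from the post; swap in the real one.
UNIT = "crappy-telephony-product.service"


def check_unit(unit=UNIT, run=subprocess.run):
    """Return (http_status, body) based on `systemctl status <unit>`."""
    proc = run(["systemctl", "status", unit], capture_output=True, text=True)
    # systemctl status exits 0 when the unit is active, non-zero otherwise.
    status = 200 if proc.returncode == 0 else 503
    return status, proc.stdout or proc.stderr


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = check_unit()
        payload = body.encode()
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Blocking; point the load balancer health check at this port."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Keep in mind this only tells the LB that systemd thinks the unit is running; it doesn't prove the product is actually answering on UDP.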

luminalflux fucked around with this message at 06:40 on Dec 3, 2021

Scrapez
Feb 27, 2004

luminalflux posted:

Yeah, in that case I would look in to using an NLB exposing them over UDP and healthchecking the instances over TCP. I've done this where I've made a simple python service that returns the output of "systemctl status crappy-telephony-product.service" over HTTP which the LB uses to check health of something that has no HTTP or TCP endpoint.

We had this setup and one time the IVR software inexplicably stopped listening on UDP but was still responding to TCP. So our health checks were passing and we were dropping all calls. It was a one time, one-off event but based on that a VP decided we should never do that again. :rolleyes:

I just need to make another run at explaining how we need a SIP Proxy. Hell, Kamailio is free and awesome. But again management decided it's "opensource" and we can't trust it to run all of our traffic through. Meanwhile, we are running CentOS 7 on all the instances. :doh:

Anyway, now I'm just ranting. I do appreciate the input.
