Amazon Web Services - Cloud Giant Hits Hard - The Something Awful Forums


vanity slug: Jul 20, 2010

22 Eargesplitten posted:

Is there any viable reason to be using MS SQL on an EC2 instance rather than in an RDS instance? My suspicion is it's due to a lift and shift from on-prem to AWS and wanting to just copy everything over from the old on-prem DB setup. I know they were fussed about the price of MSSQL on RDS compared to on Azure but I'm not sure if a license for an on-prem version would transfer over to EC2 and save them the subscription cost.

Then again, if they want HA, they would probably need two licenses for the two different EC2 instances and have to deal with cross-region replication of data between the two which sounds like a pain.

We ran our MSSQL databases on EC2. At the time AOAGs were not available on RDS, and we needed the extra control over the storage that we couldn't get from RDS (basic stuff like running tempdb on ephemeral storage for the performance improvements, running databases on their own disks with their own IOPS allocation, things like that).

# ? Sep 21, 2021 09:16

freeasinbeer: Mar 26, 2015; by Fluffdaddy

MS SQL on RDS has a bunch of limitations and is kinda a pain to use. I’ve been at 2 places that had it, and I’d honestly say it might not be worth it to run it in RDS.

Biggest pain points:

DBA is gonna be pissy because they are locked out of tuning most things; which is actually a pro in the end, but they will all complain

Doing replication to a secondary database for things like BI or multi region is a PITA; you can’t touch any of MS SQL’s replication so you need to run DMS, which barely supports MS SQL, or use a janky third party tool that costs $10k a month.

Stuff will failover; for some reason I’ve found apps that use MS SQL to be abnormally bad at reconnecting.

It’s sooooo expensive and doesn’t have support for things like x1 instances which would save a ton because of lower core counts.

Also you can’t BYOL

# ? Sep 21, 2021 14:30

Docjowles: Apr 9, 2009

freeasinbeer posted:

use a janky third party tool that costs $10k a month.

This is really an excellent database_admin.txt summary. Third party database tools have to have one of the worst price:quality ratios of code in existence. They must be written by the same people who develop applications for banks or healthcare that look like MS-DOS ASCII art and cost 8 figures.

# ? Sep 21, 2021 16:28

Agrikk: Oct 17, 2003; Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

From a licensing perspective MSSQL on EC2 can be a lot cheaper: if I recall correctly (maybe it’s changed), an org is permitted to run a database instance license-free if it is in a dev environment, meaning that SQL on EC2 is cheaper than RDS SQL.

Also, there are some specific use cases in which a highly customized EC2 storage architecture will run a workload faster than RDS (things like multiple mdf/ndf files spread across multiple EBS volumes to speed up I/O, etc).

But on the downside managing patches and downtime on EC2 has an associated cost and RDS is more resilient and just works* whereas EC2 requires a much more pets-not-cattle approach.

*mostly

# ? Sep 21, 2021 20:12

luminalflux: May 27, 2005

22 Eargesplitten posted:

Is there any viable reason to be using MS SQL on an EC2 instance rather than in an RDS instance?

In our case: s/MS SQL/MySQL/ and yes, absolutely. We decided to shift from RDS MySQL to MySQL on EC2 due to a combination of things:

* RDS auto-upgraded us to a version with a deadlock bug in it (5.7.22) and the version with the bugfix (5.7.25) wasn't available. A downgrade was not easily possible.
* RDS would go into crash recovery on every failover due to issues we didn't realize until a lot later
* RDS wasn't giving us enough IOPS. On EC2, we went from EBS (network block storage) to striping over multiple EBS volumes to striping over multiple local NVMe devices on i3 instances.
* RDS didn't give us some tuneable parameters or access to files on disk for advanced analytics/monitoring
* RDS charges a pretty hefty premium over EC2.

Sometimes the AWS managed service works great and you don't have to care about it. Sometimes the AWS service doesn't and you just run MySQL/Redis/Kafka instead of RDS / ElastiCache / MSK.

# ? Sep 21, 2021 21:34

Enshoku: Jun 1, 2013

I am consistently surprised at how expensive every other database option aside from DynamoDB is on AWS. It feels like Amazon is trying to drive people to use it over RDBMS options on price alone. Not that I'm complaining though, it's generally faster for me to slap together some single table database design than to actually have to think about relationships, it generally being cheaper is just a nice side benefit.

# ? Sep 24, 2021 15:44

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Any advice on getting my foot in the door for an entry-level / junior AWS job? I'm slowly working my way through a SysOps Administrator course but would love to land a gig where I can actually work with AWS, even at a Tier 1 level. A LinkedIn search for "AWS Help Desk" turns up next to nothing. What type of job titles should I be looking for?

# ? Sep 26, 2021 03:31

Agrikk: Oct 17, 2003; Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

Cloud Support Associate
Cloud Support Engineer

These are the helpdesk positions you are thinking of. A year or two there and you’ll be prepared to move into other roles.

# ? Sep 26, 2021 03:49

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Agrikk posted:

Cloud Support Associate
Cloud Support Engineer

These are the helpdesk positions you are thinking of. A year or two there and you’ll be prepared to move into other roles.

I figured I was missing something. I'm getting many more results with those two, thanks!

# ? Sep 26, 2021 04:06

Arzakon: Nov 24, 2002; "I hereby retire from Mafia"
Please turbo me if you catch me in a game.

Look for Associate Solution Architect or Associate Professional Services Consultant as well. AWS Tech U is the program my org uses when we want to hire lots of entry level talent. Good place to keep an eye out for opportunities if you don't have any IT work history but can demonstrate some depth in a few areas. These listings typically start as 1 year paid internships that hire directly into full time associate level roles.

# ? Sep 26, 2021 06:03

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Arzakon posted:

Look for Associate Solution Architect or Associate Professional Services Consultant as well. AWS Tech U is the program my org uses when we want to hire lots of entry level talent. Good place to keep an eye out for opportunities if you don't have any IT work history but can demonstrate some depth in a few areas. These listings typically start as 1 year paid internships that hire directly into full time associate level roles.

Thanks for this.

Was anyone affected by the (from what I've heard) us-east-1 EC2 stumble this morning?

# ? Sep 27, 2021 19:37

Hughmoris: Apr 21, 2007; Let's go to the abyss!

I believe we have some AWS people floating about. Anyone in the vicinity of this New World launch? If anyone could have nailed an MMO infrastructure launch I figured it would be AWS but queue times have been awful all day.

Disregard, not appropriate to ask in here.

Hughmoris fucked around with this message at 04:31 on Sep 29, 2021

# ? Sep 29, 2021 02:42

FamDav: Mar 29, 2008

you’re not going to get any details on another customer

# ? Sep 29, 2021 04:28

Walked: Apr 14, 2003

Anyone run into Redshift snapshots not capturing views? AWS support says it's not expected behavior, but we've had some scheduling conflicts getting a screenshare going. Curious if anyone has run into this while I try to nail down scheduling.

# ? Oct 1, 2021 14:45

Just-In-Timeberlake: Aug 18, 2003

Hoping someone in here knows a bit about Application Load Balancer and Lambda functions.

I'm trying to access a Lambda .netcore function via ALB vs the API Gateway (API gateway is the way it's currently accessed, and works) because API Gateway has a max timeout of 30 seconds, and there are times it will take longer, so ALB seems the way to get around this.

Here's what I've done:

1. Changed the .netcore project's Lambda entry point to use ApplicationLoadBalancerFunction instead of APIGatewayHttpApiV2ProxyFunction.
2. Created an application load balancer using the same VPC and subnets the Lambda currently uses (3 AZs). The target group points to the Lambda in question.
3. Changed the DNS entry to point to the ALB.

The target group has a health check url that points to the root of the Lambda and just returns an "ok" 200 message. Looking at the dashboard, the target group shows as healthy, with 1 healthy host, and 0 unhealthy hosts. So, the health check is working as far as I can tell.

The problem is I can't access it via any method (HTTP, HTTPS + Postman, web browser, etc), it just times out.

For the purpose of troubleshooting, I've set the security group the ALB uses to allow all traffic, to no effect.

I'm obviously missing something, but I don't know what it is. Any insight appreciated.

# ? Oct 20, 2021 16:48

CarForumPoster: Jun 26, 2013; ⚡POWER⚡

Just-In-Timeberlake posted:

I'm trying to access a Lambda .netcore function via ALB vs the API Gateway (API gateway is the way it's currently accessed, and works) because API Gateway has a max timeout of 30 seconds, and there are times it will take longer, so ALB seems the way to get around this.

I'm not super knowledgeable about AWS but why would you use ALB to get around the Lambda function time exceeding the API Gateway timeout?

When I've had this problem, instead of returning what I want from my lambda function I return an ID that can be queried for its completed result, which I store somewhere.
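A minimal sketch of that "return an ID, poll for the result" pattern in Python. It assumes an in-memory dict stands in for wherever the results get stored, and the function names (`submit`, `run_job`, `poll`) are made up for illustration, not taken from any AWS API:

```python
import uuid

# Hypothetical in-memory store. A real deployment would need something durable
# that is shared between invocations (a database table, an object store, etc.).
_JOBS = {}


def submit(payload):
    """Register a job and return immediately with an ID the caller can poll."""
    job_id = str(uuid.uuid4())
    _JOBS[job_id] = {"status": "pending", "payload": payload, "result": None}
    return {"statusCode": 202, "job_id": job_id}


def run_job(job_id, work):
    """Do the slow part out of band and store the outcome under the job ID."""
    job = _JOBS[job_id]
    job["result"] = work(job["payload"])
    job["status"] = "done"


def poll(job_id):
    """Cheap lookup that returns quickly no matter how long the job takes."""
    job = _JOBS.get(job_id)
    if job is None:
        return {"statusCode": 404, "status": "unknown"}
    return {"statusCode": 200, "status": job["status"], "result": job["result"]}
```

The caller hits `submit` once, then calls `poll` until the status flips to "done", so no single request has to outlive the gateway timeout.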

# ? Oct 20, 2021 18:05

Vanadium: Jan 8, 2005

Does the ALB have a public IP address or is it internal? Do the subnets have internet gateways or nat gateways and stuff like that?

Did you configure a listener on the ALB or only a target group?

# ? Oct 21, 2021 09:19

Just-In-Timeberlake: Aug 18, 2003

Vanadium posted:

Does the ALB have a public IP address or is it internal? Do the subnets have internet gateways or nat gateways and stuff like that?

Did you configure a listener on the ALB or only a target group?

The ALB is configured to use a VPC we have set up for this Lambda that gives it an outgoing static IP address (for restricting access to resources via source IP). In Route 53 I've got the dns entry pointing to the ALB instance. The VPC is configured correctly (Internet gateway + NAT gateway, 3 public facing subnets, 3 private subnets, routing table entries), as this routes traffic just fine when using API Gateway, but not the ALB.

I have HTTP and HTTPS listeners configured on the ALB, forwarded to the Lambda target group

CarForumPoster posted:

I'm not super knowledgeable about AWS but why would you use ALB to get around the Lambda function time exceeding the API Gateway timeout?

When I've had this problem, instead of returning what I want from my lambda function I return an ID that can be queried for its completed result, which I store somewhere.

Mainly because if I can get this working it's a lot less work than refactoring a bunch of code.

Just-In-Timeberlake fucked around with this message at 12:24 on Oct 21, 2021

# ? Oct 21, 2021 12:15

Bitcoin: Sep 12, 2021

Just-In-Timeberlake posted:

The problem is I can't access it via any method (HTTP, HTTPS + Postman, web browser, etc), it just times out.
For the purpose of troubleshooting, I've set the security group the ALB uses to allow all traffic, to no effect.

Can you reach it from the same VPC? Have you accidentally created an internal ALB which won't be reachable outside no matter what your security groups say because it's not routable?

# ? Oct 28, 2021 01:22

Just-In-Timeberlake: Aug 18, 2003

Bitcoin posted:

Can you reach it from the same VPC? Have you accidentally created an internal ALB which won't be reachable outside no matter what your security groups say because it's not routable?

I'm like 99% certain it's set up right, the wizard pretty much makes sure you've got everything set correctly when creating an ALB. I'm the farthest thing from an AWS expert, so I'm not sure how to test connecting from within the same VPC.

# ? Oct 28, 2021 22:46

Agrikk: Oct 17, 2003; Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

Stand up an instance within the VPC and curl the ALB’s DNS from it.
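If curl isn't handy, roughly the same check can be sketched with Python's standard library. The `probe` helper below is a made-up name, and the host you pass it would be whatever DNS name the ALB shows in the console. The useful part is telling a timeout apart from a refusal, since a timeout usually points at routing or a security group rather than the app:

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify a TCP connect attempt as 'open', 'refused', 'timeout' or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # Nothing answered at all: packets are probably being dropped.
        return "timeout"
    except ConnectionRefusedError:
        # Something answered and said no: the path works, the listener doesn't.
        return "refused"
    except OSError:
        # DNS failure, unreachable network, and so on.
        return "error"
```

Run it once from inside the VPC and once from outside; if only the inside attempt comes back "open", the load balancer is likely not reachable from the internet.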

# ? Oct 29, 2021 08:45

ledge: Jun 10, 2003

Have you run the reachability analyzer? That worked for me when I was having trouble with network nonsense earlier today.

# ? Oct 29, 2021 12:18

CarForumPoster: Jun 26, 2013; ⚡POWER⚡

ledge posted:

Have you run the reachability analyzer? That worked for me when I was having trouble with network nonsense earlier today.

Not the OP but I didn't know about this, thanks for posting it!

# ? Oct 29, 2021 13:57

Scrapez: Feb 27, 2004

Does AWS provide an Amazon Linux 2 kernel 5.10 AMI somewhere? The only ones I can find for kernel 5.10 are community AMIs.

The quick start Amazon Linux 2 AMI is kernel 4.14.

# ? Nov 17, 2021 18:56

crazypenguin: Mar 9, 2005; nothing witty here, move along

Yeah. It’s in amazon-linux-extras

I forget what version exactly but definitely a newer 5.x

# ? Nov 17, 2021 19:09

Pile Of Garbage: May 28, 2007

crazypenguin posted:

Yeah. It’s in amazon-linux-extras

I forget what version exactly but definitely a newer 5.x

kernel-ng iirc

# ? Nov 18, 2021 11:29

astral: Apr 26, 2004

https://aws.amazon.com/blogs/aws/aws-free-tier-data-transfer-expansion-100-gb-from-regions-and-1-tb-from-amazon-cloudfront-per-month/
https://aws.amazon.com/about-aws/whats-new/2021/11/aws-price-reduction-data-transfers-internet/

Amazon's feeling some pressure from Cloudflare.

# ? Nov 28, 2021 09:24

Agrikk: Oct 17, 2003; Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

Or going after Cloudflare.

# ? Nov 30, 2021 21:15

Scrapez: Feb 27, 2004

Is there a way to spin up an EC2 instance with only an ENI? No built in NIC?

We need static IPs and the ability for an instance that has died and been replaced with autoscaling to get that same static IP back.

Currently accomplishing that by a user-data script that the instance runs to discover some info about itself (region, AZ, purpose of instance) and attaches the appropriate ENI to itself.

The issue is that you then have two NICs: the built-in and the ENI. We never use the built-in for anything so it's pointless for it to be there and causes issues with some of the software we are running as it tries to default to the built-in even when it's down.

# ? Dec 1, 2021 23:54

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

Scrapez posted:

Is there a way to spin up an EC2 instance with only an ENI? No built in NIC?

We need static IPs and the ability for an instance that has died and been replaced with autoscaling to get that same static IP back.

Currently accomplishing that by a user-data script that the instance runs to discover some info about itself (region, AZ, purpose of instance) and attaches the appropriate ENI to itself.

The issue is that you then have two NICs: the built-in and the ENI. We never use the built-in for anything so it's pointless for it to be there and causes issues with some of the software we are running as it tries to default to the built-in even when it's down.

Why not stick these instances behind a NAT gateway so they all have the same public IP?

# ? Dec 2, 2021 00:39

ledge: Jun 10, 2003

fletcher posted:

Why not stick these instances behind a NAT gateway so they all have the same public IP?

Or a load balancer.

# ? Dec 2, 2021 01:13

12 rats tied together: Sep 7, 2006

I'll take a guess:
- all the load balancers use RR DNS that you don't get to pick, so the IP will change out from under you over time and you will always have at least 2 IPs*
- NAT gateway only works for egress traffic since it's dynamic and not static NAT*

*- I think

OP I think we both know this already but the best answer would be to use software that isn't terrible garbage and will respect your configured route table. Since that's probably out of your control, if I had to deal with this, I'd probably bake something into cloud-init that disables the problematic interface on startup.

# ? Dec 2, 2021 01:33

crazypenguin: Mar 9, 2005; nothing witty here, move along

If you’re having the instance assign itself an ENI on startup you could instead just have it assign itself an elastic IP, no? Never tried this, so dunno about gotchas, but seems like it’d work. The closest I did was have instances update their DNS on startup

(I assume this is one of those “ASGs of size exactly 1” situations right?)

# ? Dec 2, 2021 02:32

Scrapez: Feb 27, 2004

crazypenguin posted:

If you’re having the instance assign itself an ENI on startup you could instead just have it assign itself an elastic IP, no? Never tried this, so dunno about gotchas, but seems like it’d work. The closest I did was have instances update their DNS on startup

(I assume this is one of those “ASGs of size exactly 1” situations right?)

TL;DR:
I wasn't very clear earlier that it is the private IP I need to be static, not the public (though I need an elastic as well)

I had a think about this and crazypenguin is exactly right. The solution, it seems, is to set up an ASG/launch config of size 1 with user-data that assigns the built-in NIC a specific static IP based on the Region/AZ/function of the instance. I'll have to do this for 2 regions, 3 AZs per region, and 2 types of instances (12 total ASGs) but it should work.
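The lookup half of that plan could be sketched like this in Python. Everything in the table is a placeholder (the addresses, the role names, the zones), `pick_private_ip` is a hypothetical helper, and it assumes standard AZ names where the region is the AZ minus its trailing letter. How the address actually gets applied to the instance is left out:

```python
# Hypothetical table: (region, availability zone, role) -> private IP.
# Every address and role name below is a placeholder, not a real config.
STATIC_IPS = {
    ("us-east-1", "us-east-1a", "telephony"): "192.168.0.5",
    ("us-east-1", "us-east-1a", "brain"): "192.168.0.6",
    ("us-east-1", "us-east-1b", "telephony"): "192.168.1.5",
    ("us-east-1", "us-east-1b", "brain"): "192.168.1.6",
}


def region_from_az(az):
    """Strip the trailing zone letter: 'us-east-1a' -> 'us-east-1'."""
    return az[:-1]


def pick_private_ip(az, role, table=STATIC_IPS):
    """Return the reserved IP for this slot, or raise so the boot fails loudly."""
    key = (region_from_az(az), az, role)
    try:
        return table[key]
    except KeyError:
        raise LookupError(f"no static IP reserved for {key}") from None
```

Failing hard on an unknown slot seems safer than falling back to a dynamic address, since a box that comes up on the wrong IP would look healthy while nothing can reach it.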

The software in question is an IVR and it is distributed across instances that perform different functions. There are 3 instances in a Zookeeper quorum that are the "brains" that keep track of calls, queries to backend database, etc. There are 3 instances that are used for telephony that answer the calls and then communicate with the 3 Zookeeper brain instances. That's one of the reasons we need static private IPs. The other reason is that we use SRV records and UDP to route calls. Load balancers don't offer UDP health checks so it's difficult for them to know the state of the telephony instances. These calls come into us via VPN so they're routing to the private IP space.

I was able to configure the instances to use eth0 (the ENI) for everything and leave ens5 (the built-in) down. I just had to set ONBOOT=no in ifcfg-ens5 so that ens5 stays down after a reboot, and update the default route to eth0. I just don't like having an unused NIC sitting there all the time. If it somehow were to get enabled, then our traffic defaults out that route and causes problems.

So, seems the size 1 ASG method with static private IPs is the answer. The only gotcha I can think of there is how long the AWS DNS servers cache that IP entry. Say I have an instance with IP 192.168.0.5 crash, ASG launches a new instance to take its place and configures it with the 192.168.0.5 IP. Does DNS still have some sort of cached route that tries to send traffic to the old crashed instance that's no longer there? That's the advantage of the ENI, it gets detached and attached to a new instance and the route stays the same.

# ? Dec 2, 2021 06:05

luminalflux: May 27, 2005

Zookeeper doesn't need static IPs - you can put your current Zookeeper instances in a Route53 record, once the client connects to the first instance in the round-robin DNS it will find the other active instances. We do this for our ZK clusters and we roll them every so often.

Does the IVR software also have something you can hit over TCP to see if it's available? If so, you can use an NLB that forwards traffic over UDP but healthchecks over TCP.

The default VPC DNS resolver respects the TTL you set for DNS records - you'll notice that ELB/NLB/ALBs have 60s TTLs or lower on their records. Your OS and client software, however, might not.
Also if they're registering themselves in ZK, do you have enough control over the client for it to query ZK for which instances are active?

# ? Dec 2, 2021 07:05

ledge: Jun 10, 2003

Scrapez posted:

The software in question is an IVR and...

So the answer here is rebuild the IVR in Connect

Then you can forget about maintaining servers and networks ever again.

# ? Dec 2, 2021 09:23

necrobobsledder: Mar 21, 2005; Lay down your soul to the gods rock 'n roll; Nap Ghost

We use ZK at scale here and don’t use IPs and ensure a lot of availability within a region to ensure quorum even with a loss of 2 AZs simultaneously. Last I used static ZK IPs was for a demo maybe around 2014 out of sheer laziness. It’s not evil to use DNS within a region or something. I can understand being allergic to DNS for a number of cases but ZK doesn’t strike me as one of them.

# ? Dec 2, 2021 20:10

Scrapez: Feb 27, 2004

Probably didn't explain well. The telephony instances require communication with the zookeeper "brain" instances on static IPs. It would be possible to do this with DNS but then we would have to dynamically update Route53 with the updated IP if one of the zookeeper instances died and respun with a new IP. I have successfully done this with Cloudwatch and a Lambda function previously so it might be doable.

The reason for the need for static IPs on the telephony instances is that we have other companies that connect to our IVR via VPN. Their crappy PBX and switch software often can't handle using a FQDN and if they can, they often cache the entry in perpetuity. This forces us to accommodate them by keeping the same static IPs for the telephony instances. We could solve this with a SIP Proxy but my company has been too cheap to purchase one thus far.

All good info that makes sense so thanks for the responses. At minimum, I could stop using static IPs for the "brain" instances.

# ? Dec 2, 2021 22:05

luminalflux: May 27, 2005

Scrapez posted:

Probably didn't explain well. The telephony instances require communication with the zookeeper "brain" instances on static IPs. It would be possible to do this with DNS but then we would have to dynamically update Route53 with the updated IP if one of the zookeeper instances died and respun with a new IP. I have successfully done this with Cloudwatch and a Lambda function previously so it might be doable.

Yep, this isn't too hard and ZK nodes don't get replaced too often, and even if they did the round-robin record will have enough live ones for the client to discover the rest of the cluster. Especially if you have more than 5 you should have no trouble keeping quorum even if you lose a whole AZ. You can also look into placement groups to ensure that your ZK nodes aren't all in the same rack / hypervisor. As you said, you can build something with EventBridge (née CloudWatch Events) and Lambda, or just update the record when they get replaced. In general, relying on static IPs is an antipattern, especially inside a public cloud provider.
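For the "update the record" part, here's a rough Python sketch of building the Route 53 change batch. The record name in the usage note is made up, and `build_upsert` is a hypothetical helper; with boto3 the resulting dict would be passed as `ChangeBatch` to the Route 53 client's `change_resource_record_sets` call along with your `HostedZoneId`:

```python
def build_upsert(record_name, ips, ttl=60):
    """Build a Route 53 change batch that replaces an A record's values."""
    if not ips:
        # An empty record set would take the whole cluster out of DNS.
        raise ValueError("refusing to publish an empty record set")
    return {
        "Comment": "sync ZK nodes",
        "Changes": [
            {
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "TTL": ttl,
                    # Sorted so repeated runs produce identical batches.
                    "ResourceRecords": [{"Value": ip} for ip in sorted(ips)],
                },
            }
        ],
    }
```

Something like `build_upsert("zk.internal.example.com.", live_ips)` would run from the Lambda each time a node is replaced, where `live_ips` is however you enumerate the current instances.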

quote:

The reason for the need for static IPs on the telephony instances is that we have other companies that connect to our IVR via VPN. Their crappy PBX and switch software often can't handle using a FQDN and if they can, they often cache the entry in perpetuity. This forces us to accommodate them by keeping the same static IPs for the telephony instances. We could solve this with a SIP Proxy but my company has been too cheap to purchase one thus far.

Yeah, in that case I would look into using an NLB exposing them over UDP and healthchecking the instances over TCP. I've done this where I've made a simple Python service that returns the output of "systemctl status crappy-telephony-product.service" over HTTP which the LB uses to check health of something that has no HTTP or TCP endpoint.
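A rough sketch of that kind of shim using only the Python standard library. Note the swap: instead of returning the text of `systemctl status`, this version checks the exit code of `systemctl is-active --quiet` and maps it to 200 or 503, which is easier for a load balancer to interpret. The port and the response bodies are arbitrary choices for the sketch:

```python
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

UNIT = "crappy-telephony-product.service"  # unit name as given in the post


def unit_is_active(unit, run=subprocess.run):
    """True when `systemctl is-active --quiet` exits 0 for the unit."""
    return run(["systemctl", "is-active", "--quiet", unit]).returncode == 0


def health_status(unit, run=subprocess.run):
    """Map unit state to (HTTP status code, body) for the load balancer."""
    return (200, b"ok\n") if unit_is_active(unit, run) else (503, b"down\n")


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        code, body = health_status(UNIT)
        self.send_response(code)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Blocking call; port 8080 is an arbitrary choice for this sketch."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Calling `serve()` starts the listener. Keep in mind this only tells the LB that systemd thinks the unit is running, not that the product is actually answering on UDP.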

luminalflux fucked around with this message at 06:40 on Dec 3, 2021

# ? Dec 2, 2021 22:12

Scrapez: Feb 27, 2004

luminalflux posted:

Yeah, in that case I would look in to using an NLB exposing them over UDP and healthchecking the instances over TCP. I've done this where I've made a simple python service that returns the output of "systemctl crappy-telephony-product.service" over HTTP which the LB uses to check health of something that has no HTTP or TCP endpoint.

We had this setup and one time the IVR software inexplicably stopped listening on UDP but was still responding to TCP. So our health checks were passing and we were dropping all calls. It was a one time, one-off event but based on that a VP decided we should never do that again. :rolleyes:

I just need to make another run at explaining how we need a SIP Proxy. Hell Kamailio is free and awesome. But again management decided it's "opensource" and we can't trust it to run all of our traffic through. Meanwhile, we are running CentOS 7 on all the instances. :doh:

Anyway, now I'm just ranting. I do appreciate the input.

# ? Dec 3, 2021 03:40
