Hughmoris
Apr 21, 2007
Let's go to the abyss!
This is a rookie DE/ETL question but I'm trying to wrap my head around the AWS tools I should be using...

As an exercise, I want to first download twelve gzipped CSV files from this LEGO database. Then I want to move that data into a new AWS RDS for MySQL instance using the given schema, with the end result that I can write queries against that db.

https://rebrickable.com/downloads/

What is a "modern AWS" way to do this? Azure Data Factory has "low code" pipelines that make it relatively simple, but I'm not sure how to go about it with AWS.

*Here is the Azure Data Factory project that I'm trying to reproduce using AWS tools: https://www.cathrinewilhelmsen.net/series/beginners-guide-azure-data-factory/page/2/

Happiness Commando
Feb 1, 2002
$$ joy at gunpoint $$

Put the CSVs in S3, use a Glue crawler to read the CSVs and output them to RDS, I think.

Alternatively use Athena to query the tabular data in S3 directly.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Happiness Commando posted:

Put the CSVs in S3, use a Glue crawler to read the CSVs and output them to RDS, I think.

Alternatively use Athena to query the tabular data in S3 directly.

Thanks for the ideas. I'm guessing I can use AWS Lambda and write a Python function that gets/copies the CSV files from the website and places them onto S3, then roll from there.
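
Something like this minimal Lambda sketch is what I'm picturing (the bucket name and download URLs below are placeholders I made up, not the real Rebrickable paths):
code:
# Rough sketch only -- bucket name and file list are placeholders,
# check the actual download URLs on the Rebrickable downloads page.
import urllib.request
import boto3

s3 = boto3.client("s3")

BUCKET = "my-lego-raw-bucket"  # hypothetical bucket
BASE_URL = "https://cdn.rebrickable.com/media/downloads/"  # placeholder base URL
FILES = ["themes.csv.gz", "sets.csv.gz"]  # ...and the other ten gzipped CSVs

def handler(event, context):
    for name in FILES:
        with urllib.request.urlopen(BASE_URL + name) as resp:
            # stream the download straight into S3 without touching local disk
            s3.upload_fileobj(resp, BUCKET, f"raw/{name}")
    return {"copied": len(FILES)}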

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.
Create a lambda function to pull the files into S3

Then either

Point Athena at the bucket

Or

Data Pipeline (or your own ETL script on a t3.micro instance) to load the CSV from S3 to RDS

And:

A lambda function to turn on/off the EC2 instance when not processing the CSV
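
The on/off Lambda is tiny, roughly this (the event shape and instance ID are just placeholders):
code:
# Minimal start/stop sketch -- instance ID and event shape are made up,
# wire it to an EventBridge schedule or whatever trigger you like.
import boto3

ec2 = boto3.client("ec2")

def handler(event, context):
    instance_id = event["instance_id"]     # hypothetical event field
    action = event.get("action", "stop")   # "start" or "stop"
    if action == "start":
        ec2.start_instances(InstanceIds=[instance_id])
    else:
        ec2.stop_instances(InstanceIds=[instance_id])
    return {"instance": instance_id, "action": action}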

BaseballPCHiker
Jan 16, 2006

Oh man speaking of Glue/Athena/etc.

How cool is that new CloudTrail Lake service! For my poo poo show of an org that will be a huge benefit. If I could only convince them to pay for it now....

EDIT: And while I'm at it. All the EKS alerts for GuardDuty are huge! Seriously nice work by that team and I hope more are in the pipeline.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Agrikk posted:

Create a lambda function to pull the files into S3

Then either

Point Athena at the bucket

Or

Data Pipeline (or your own ETL script on a t3.micro instance) to load the CSV from S3 to RDS

And:

A lambda function to turn on/off the EC2 instance when not processing the CSV

Exactly what I was looking for, thanks!

BaseballPCHiker posted:

Oh man speaking of Glue/Athena/etc.

How cool is that new CloudTrail Lake service! For my poo poo show of an org that will be a huge benefit. If I could only convince them to pay for it now....

EDIT: And while I'm at it. All the EKS alerts for GuardDuty are huge! Seriously nice work by that team and I hope more are in the pipeline.

The amount of services/tools available is staggering to me. Just looking at Data Engineering, there is a shitload of services/tools to wrap your head around. I can imagine Security is even more so.

We live in cool times.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
My new employer has deep pockets and is willing to pay for good training. Are there any well-respected AWS training courses/companies that I should check out? They'd need to be data-focused courses.

I'm thinking something like what SANS does for cyber training.

ledge
Jun 10, 2003

Hughmoris posted:

My new employer has deep pockets and is willing to pay for good training. Are there any well-respected AWS training courses/companies that I should check out? They'd need to be data-focused courses.

I'm thinking something like what SANS does for cyber training.

These are the AWS courses for data; you just need to find a training provider (they're linked from the specific courses, I think). All the providers follow the same course material if you go through this.

https://www.aws.training/LearningLibrary?query=&filters=Domain%3A107%20Language%3A1&from=0&size=15&sort=_score

I've used Bespoke Training in Australia who were good.

JHVH-1
Jun 28, 2002

Agrikk posted:

Create a lambda function to pull the files into S3

Then either

Point Athena at the bucket

Or

Data Pipeline (or your own ETL script on a t3.micro instance) to load the CSV from S3 to RDS

And:

A lambda function to turn on/off the EC2 instance when not processing the CSV

If you can run it from ECS, they have tasks that just run and exit. I'm using it like that to do the reverse: dump a database to an S3 location so there's always a file with the latest data for a 3rd party to analyze. It just took creating a Dockerfile and setting up the task.
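
Kicking one of those run-and-exit tasks off is basically a single API call; rough sketch with made-up names:
code:
# Sketch only -- cluster, task definition, and subnet are placeholders.
import boto3

ecs = boto3.client("ecs")

def run_dump_task():
    ecs.run_task(
        cluster="etl-cluster",              # hypothetical cluster
        taskDefinition="db-dump-to-s3:3",   # hypothetical task definition
        launchType="FARGATE",
        networkConfiguration={
            "awsvpcConfiguration": {
                "subnets": ["subnet-0123456789abcdef0"],  # placeholder subnet
                "assignPublicIp": "DISABLED",
            }
        },
    )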

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

I always forget about containers. It's like I grew up in a servers-as-pets world and then skipped straight to serverless-as-Lambda.




For those looking for free and interesting datasets to build workloads from, here's a list of the data sources I use:

United States Geological Survey earthquake catalog - get a running list of all the reported and detected earthquakes worldwide
https://earthquake.usgs.gov/fdsnws/event/1/

Visual Crossing Weather data - there's a free tier for pulling weather data from locations around the world
https://www.visualcrossing.com/weather-api

also WeatherUnderground
http://api.wunderground.com/api

Geyser eruption times for all the geysers in Yellowstone National Park. This one is exceptionally fun to use with Machine Learning: try to create an AI/ML model that will predict eruption times!
https://www.geysertimes.org/api/v5/docs/index.php

Lego database:
https://www.kaggle.com/rtatman/lego-database

Eve Online market database. Pull every market order in Eve online in near real time
https://esi.evetech.net/ui/

couple that with the Eve Online static data export to build nifty web apps
https://developers.eveonline.com/resource

Folding@home - almost forgot my favorite for Big Data stuff:
https://apps.foldingathome.org/daily_user_summary.txt.bz2
https://apps.foldingathome.org/daily_team_summary.txt.bz2

The F@H user data is a fun one because there's something like two million rows in a single file. Pull it every time it refreshes (like every 90 minutes or so) and you can have a billion records in a table after about four months. (My Users data table has 5.7 billion rows and is about 900 gigs in size), which is really helpful for learning to manage and manipulate data at scale.
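
If you want to hoard that file, a scheduled Lambda along these lines does it (bucket name is a placeholder):
code:
# Sketch of a periodic pull of the daily user summary into S3 -- the bucket
# name is made up; schedule it roughly every 90 minutes with EventBridge.
import datetime
import urllib.request
import boto3

s3 = boto3.client("s3")
URL = "https://apps.foldingathome.org/daily_user_summary.txt.bz2"
BUCKET = "my-fah-landing-bucket"  # hypothetical bucket

def handler(event, context):
    stamp = datetime.datetime.utcnow().strftime("%Y%m%dT%H%M%S")
    with urllib.request.urlopen(URL) as resp:
        s3.upload_fileobj(resp, BUCKET, f"fah/users/{stamp}.txt.bz2")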

Agrikk fucked around with this message at 02:57 on Feb 4, 2022

LtDan
May 1, 2004


Whoa that is good stuff, thanks!

Thanks Ants
May 21, 2004

#essereFerrari


The TfL data sets can be interesting to work with as well:

https://tfl.gov.uk/info-for/open-data-users/our-open-data?intcmp=3671

Scrapez
Feb 27, 2004

Goal: Have an autoscaling group that launches 4 instances with ENIs as the primary and only network interface.
Reason: We have purchased carrier IP space and do not want to use 2 IPs for every instance when we only need one.

Having a hard time coming up with a way to do this. I can create an autoscaling group, launch instances with a dynamic IP on eth0 and then use user-data to attach an ENI as eth1. But that uses up two of our IPs per instance.
I can create a launch template and define an ENI in it. I can then launch a single instance with the launch template and it comes up with the ENI as the only network interface. I don't see a way to do this with autoscaling, though.
I'd be fine creating 4 autoscaling groups with a min/max of 1 instance but when I try to create an autoscaling group with a launch template that has an ENI definition in it, it fails with "Incompatible launch template: Network interface ID cannot be specified as console support to use an existing network interface with Auto Scaling is not available."

Essentially, what I need is a way to have a pool of 4 ENIs and tell autoscaling to use that pool when launching an instance. Does such a thing exist in AWS currently?

Scrapez fucked around with this message at 20:14 on Feb 7, 2022

12 rats tied together
Sep 7, 2006

The autoscaling group resource has a field for specifying a launch template, the same kind that you would use for an EC2 instance. This is distinct from and mutually exclusive with the usual "ASG Launch Configuration" config item.

I don't have great access to the ASG web interface at the moment but this setting should be hiding in there somewhere.
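
Via the API it's just the LaunchTemplate field on the group, something roughly like this (all the names are placeholders):
code:
# Rough boto3 sketch -- group, template, and subnet names are made up.
import boto3

autoscaling = boto3.client("autoscaling")

autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="carrier-ip-group-1",   # hypothetical group name
    LaunchTemplate={
        "LaunchTemplateName": "eni-template",    # hypothetical launch template
        "Version": "$Latest",
    },
    MinSize=1,
    MaxSize=1,
    DesiredCapacity=1,
    VPCZoneIdentifier="subnet-0123456789abcdef0",  # placeholder subnet
)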

cage-free egghead
Mar 8, 2004
I am completely new to this but currently studying for the SOA-C02 so forgive me if I'm far off base here but I wanna take a stab at it.

In this case, would attaching the ASG to a VPC and a gateway work?

Hed
Mar 31, 2004

Fun Shoe
I'm dumping files to S3, and on a schedule I need to take all the new ones, convert them to a custom Avro format, and produce one Avro file for every N input files.

I have a Python function to bundle them up and convert; is there a more elegant way to do this than s3 sync to a machine with a real filesystem and pushing it back? I've used Kinesis Firehose on ingest but don't see anything that could accomplish what I want.

CarForumPoster
Jun 26, 2013

⚡POWER⚡

Hed posted:

I'm dumping files to S3, and on a schedule I need to take all the new ones, convert them to a custom Avro format, and produce one Avro file for every N input files.

I have a Python function to bundle them up and convert; is there a more elegant way to do this than s3 sync to a machine with a real filesystem and pushing it back? I've used Kinesis Firehose on ingest but don't see anything that could accomplish what I want.

You could schedule an AWS Lambda function (e.g. with a cron or rate expression) to do it if the 15-minute timeout isn't an issue in your application.

It sounds like you don't want to trigger on each new file in the S3 bucket, but if you did, AWS lets you trigger a Lambda on a new file in S3 p easily.

ledge
Jun 10, 2003

CarForumPoster posted:

You could schedule an AWS Lambda function (e.g. with a cron or rate expression) to do it if the 15-minute timeout isn't an issue in your application.

It sounds like you don't want to trigger on each new file in the S3 bucket, but if you did, AWS lets you trigger a Lambda on a new file in S3 p easily.

Yeah, Lambda on a schedule would be the way to go. Just set up a rule in EventBridge to call the Lambda.

Edit: It will be easier to do this file by file if you trigger the Lambda for each created file in the S3 bucket, as that way you get passed the details of the item (bucket ARN and object key) when the Lambda is triggered. If you run it on a schedule you'll have to call the S3 API to list the objects and iterate through them. But if the Avro grouping is a requirement, that isn't an option.
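
For reference, the per-file trigger variant looks roughly like this (the Avro conversion itself is elided):
code:
# Minimal sketch of an S3-triggered handler -- the event hands you the
# bucket and key for each new object, no listing needed.
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        # ...convert `body` to Avro here and write the result back to S3...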

ledge fucked around with this message at 05:08 on Feb 18, 2022

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Does anyone have any experience with, or heard anything about, working an AWS DoD gig?

ClearanceJobs has a ton of openings for AWS gigs that look interesting.

lazerwolf
Dec 22, 2009

Orange and Black
I have a React app I am looking to host on AWS. I have a few constraints:
I can’t use S3 to host because the bucket must be completely private.
Access to the app is only intranet or company VPN.

Basically all public facing solutions are out.

I was exploring cloud front serving the private S3 files and putting a WAF on top limiting IP ranges.

Is there a better more sustainable solution?

Ideally I’d like to template this with Terraform so I can spin up the same stack for the next series of web apps.

Happiness Commando
Feb 1, 2002
$$ joy at gunpoint $$

You don't need a WAF to limit ingress IP - you can do that with regular security groups. It's a fine thing to add if you want, though

whats for dinner
Sep 25, 2006

IT TURN OUT METAL FOR DINNER!

Happiness Commando posted:

You don't need a WAF to limit ingress IP - you can do that with regular security groups. It's a fine thing to add if you want, though

If he's going via CloudFront he will need it, because you can't attach security groups to a CloudFront distribution.


lazerwolf posted:

I was exploring cloud front serving the private S3 files and putting a WAF on top limiting IP ranges.

We use basically this setup on our end, all in Terraform as well. But because we all work remote and none of us have static IPs it requires us to route requests for CloudFront's public IP addresses through our VPN. So, if that's something you have to worry about you might get a lot more traffic on your VPN than you bargained for.

Woodsy Owl
Oct 27, 2004

lazerwolf posted:

I have a React app I am looking to host on AWS. I have a few constraints:
I can’t use S3 to host because the bucket must be completely private.
Access to the app is only intranet or company VPN.

Basically all public facing solutions are out.

I was exploring cloud front serving the private S3 files and putting a WAF on top limiting IP ranges.

Is there a better more sustainable solution?

Ideally I’d like to template this with Terraform so I can spin up the same stack for the next series of web apps.

S3 proxy integration with a private API gateway (requires VPC endpoint, and I presume you can already tunnel in to your VPC) could achieve what you want here if you want to be fully serverless.

Otherwise you could just use nginx on an EC2 instance or if you need some redundancy you could do a service on Fargate.

crazypenguin
Mar 9, 2005
nothing witty here, move along
Maybe an S3 Access Point? I haven’t tried it so there could be quirks, but looks feasible.

Edit: Maybe you don't even need the access point. I had an S3 bucket with just a resource policy allowing access based on "aws:sourceVpc". Might work if the VPCs have S3 gateway endpoints and you just need to allow by VPC and not some general IP thing.
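
A sketch of what that kind of resource policy can look like (bucket name and VPC ID are made up, and it assumes the S3 gateway endpoint is in place):
code:
# Sketch only -- bucket name and VPC ID are placeholders.
import json
import boto3

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowOnlyFromOurVpc",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-private-app-bucket/*",  # hypothetical bucket
        "Condition": {"StringEquals": {"aws:sourceVpc": "vpc-0123456789abcdef0"}},  # placeholder VPC
    }],
}

boto3.client("s3").put_bucket_policy(
    Bucket="my-private-app-bucket", Policy=json.dumps(policy)
)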

crazypenguin fucked around with this message at 23:59 on Mar 8, 2022

nullfunction
Jan 24, 2005

Nap Ghost
AWS newbie here. For giggles, I went about setting up a GitHub Action that pushes changes to a Lambda function on commit to a certain branch, following this tutorial from AWS: https://aws.amazon.com/blogs/compute/using-github-actions-to-deploy-serverless-applications/

After a lot of struggling with permissions, I've gotten it to work, but I have a question regarding prereq #3, "An AWS account with permissions to create the necessary resources."

I wanted to grant as little privilege as possible to the user I have associated with this workflow, but there doesn't seem to be any information on what should be granted. I'm sure some of this will depend on what exactly ends up in the CloudFormation template, but I was able to figure some of it out from the error messages along the way and have tightened down policies for:

- Putting objects into just the S3 bucket that contains the template and deployment file
- Granting GET/POST/PATCH on the API Gateway to point it to my Lambda

The errors I was getting out of the SAM CLI weren't always specific, so the easiest way for me to make progress was to apply full access for CloudFormation, Lambda, and IAM on that user, which I know is the wrong thing to do. I'm not sure how to drill down to just the permissions needed to run the deployment and nothing else based on the error messages, so I thought to set up a CloudTrail event log and filter down to the deployment user once I got everything working, and then work backward from those logs to define a policy, which I could then apply to other users that correspond to different repos on the Github side.

There's a better way to do this... right?


Just found the thing where you can generate a policy from CloudTrail events. I knew it had to exist somewhere.

nullfunction fucked around with this message at 01:09 on Mar 9, 2022

Happiness Commando
Feb 1, 2002
$$ joy at gunpoint $$

I just got asked about Aurora multi-region multi-master, which doesn't exist. Now I get to have a whole bunch of meetings to determine business requirements and figure out what architecture will actually suffice

Woodsy Owl
Oct 27, 2004

Happiness Commando posted:

I just got asked about Aurora multi-region multi-master, which doesn't exist. Now I get to have a whole bunch of meetings to determine business requirements and figure out what architecture will actually suffice

You can do master-master replication across regions between Aurora MySQL if you're willing to have an EC2 instance in each region to fix egress and set auto_increment_increment and auto_increment_offset appropriately

Plank Walker
Aug 11, 2005
I'm working on migrating some services from being manually provisioned via the AWS console to using CDK instead. The application architecture is a web-facing service running on ECS to put jobs into an SQS queue and a backend service running on ECS to retrieve jobs from the queue and process them. So far, I'm implementing this in 3 tiers of stacks, 1 top level stack for resources shared company-wide across multiple applications (VPC, an S3 scratch bucket, etc), 1 application level "shared" stack that sets up the SQS queues, permissions, and ECR repositories for both aspects of the application code, and finally a stack each for the web API and backend processing ECS deployments.

The stack to deploy ECS requires a task definition that points to the image in ECR, so when the application code changes, we create and tag a new Docker image and push it to ECR. But afterwards, what is the "correct" way to update the running tasks? Should the ECS task definition be updated via running cdk deploy, or via the aws ecs update-service CLI command? We had a consultant help set this up initially, but they left us with a deployment stage that uses both methods, which seems like overkill; plus, deploying via the ECS stack resets the number of desired instances, so I feel like going CLI-only for application version updates is the correct way.

Regardless of the deployment method, I've found that I also need to store the latest version tag in SSM so that if we do update anything in the CDK stack (things like instance type, scaling parameters, etc.), the task definition can find the correct latest version. But I guess my main question is: how close is this setup to "standard", and is it supposed to feel this convoluted?

Woodsy Owl
Oct 27, 2004

Plank Walker posted:

Should the ECS task definition be updated via running cdk deploy

Yes, you can do it this way. CDK will build a new image and push it to the bootstrap ECR repository (or your specified repository), then publish a new task definition with the new image and start an ECS deployment of your service.

Are you using any of the aws-cdk/ecs-patterns constructs?
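
For context, the ecs-patterns constructs wrap most of that wiring; a rough CDK v2 Python sketch (stack name, cluster wiring, and image path are all made up):
code:
# Sketch only -- names and paths are placeholders, not your actual stack.
from aws_cdk import Stack, aws_ecs as ecs, aws_ecs_patterns as ecs_patterns
from constructs import Construct

class WebApiStack(Stack):
    def __init__(self, scope: Construct, id: str, *, cluster: ecs.ICluster, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        # from_asset builds the Docker image and pushes it to the bootstrap ECR repo
        ecs_patterns.ApplicationLoadBalancedFargateService(
            self, "WebApi",
            cluster=cluster,
            cpu=256,
            memory_limit_mib=512,
            desired_count=2,
            task_image_options=ecs_patterns.ApplicationLoadBalancedTaskImageOptions(
                image=ecs.ContainerImage.from_asset("../web-api"),  # hypothetical path
            ),
        )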

Scrapez
Feb 27, 2004

Edit: I found the answer in the AWS Mappings documentation: "You can't include parameters, pseudo parameters, or intrinsic functions in the Mappings section."

So, does someone have a suggestion on how I would import the VPC ID for the region I'm executing the security group CloudFormation template in? I don't want to hardcode the VPC value into the template, because then I'd need a different template for each region. I'd like to have only a single network template and security group template that I can execute in multiple regions.

Cloudformation Outputs, Imports, Mappings question:

I have a CloudFormation template that builds the network pieces (VPC, subnets, etc). I've executed this in two different regions, and it has an Outputs section that outputs various resource values. For instance, it outputs the VPC ID:
code:
        "VPCID": {
            "Value": {
                "Ref": "VPC"
            },
            "Description": "VPC ID",
            "Export": {
                "Name": {
                    "Fn::Sub": "${AWS::StackName}-VPCID"
                }
            }
        }
I have a separate cloudformation template that will create security groups and rules and I'm trying to import the appropriate VPCID for the region I'm executing the security group template in.

I've tried to define a mapping that pulls in the appropriate value of the VPC ID like so:
code:
   "Region" : {
      "us-east-2" : {
		"regionVPC" : {"Fn::ImportValue" : {"Fn::Sub" : "${Production-OH-Network}-VPCID"}},
However, when I execute the template, I receive the error message: "Template format error: Every Mappings attribute must be a String or a List."

Once it does the sub and imports the value, it should be a single string so I'm a bit stumped as to why it doesn't like that. Anyone have an idea?

Scrapez fucked around with this message at 21:19 on Mar 28, 2022

12 rats tied together
Sep 7, 2006

I don't think that you can use any of the cfn intrinsic functions in Mappings, but it's a little hard to say if that is the exact issue here because I'm not super clear on where the Mappings key starts in your second example.

Without any other context, in this scenario, I would recommend two things:

1- If you can avoid prefixing your "production vpc stack" with a per-region name, you can just import it directly. You can just call it production, export it as production-VpcId and then instead of using Mappings, just Fn::ImportValue production-VpcId. Since the stack must exist in only a single region, the region is implied (and available elsewhere in the API), and you don't need the mapping.

2- Since you can't change the VpcId of a security group without deleting it, I would embed the security groups in the VPC stack and just use !Ref. In my experience it's a good idea to avoid introducing cross-stack references alongside a "replacement" update behavior, if you can.

Comedy option 3: If you use ansible for this, the "template_parameters" field is recursively parsed, so you can pass arbitrarily complex maps to cloudformation with it.

Comedy option 4: If I'm misunderstanding what "Production-OH-Network" means, and you do have this kind of double-dynamic relationship where any given VPC consumer stack needs to consume an output from a stack that you, for some reason, can't know the name of, I would probably use a nested stack instead and then pass the input params through AWS::CloudFormation::Stack.

Scrapez
Feb 27, 2004

12 rats tied together posted:

I don't think that you can use any of the cfn intrinsic functions in Mappings, but it's a little hard to say if that is the exact issue here because I'm not super clear on where the Mappings key starts in your second example.

Without any other context, in this scenario, I would recommend two things:

1- If you can avoid prefixing your "production vpc stack" with a per-region name, you can just import it directly. You can just call it production, export it as production-VpcId and then instead of using Mappings, just Fn::ImportValue production-VpcId. Since the stack must exist in only a single region, the region is implied (and available elsewhere in the API), and you don't need the mapping.

2- Since you can't change the VpcId of a security group without deleting it, I would embed the security groups in the VPC stack and just use !Ref. In my experience it's a good idea to avoid introducing cross-stack references alongside a "replacement" update behavior, if you can.

Comedy option 3: If you use ansible for this, the "template_parameters" field is recursively parsed, so you can pass arbitrarily complex maps to cloudformation with it.

Comedy option 4: If I'm misunderstanding what "Production-OH-Network" means, and you do have this kind of double-dynamic relationship where any given VPC consumer stack needs to consume an output from a stack that you, for some reason, can't know the name of, I would probably use a nested stack instead and then pass the input params through AWS::CloudFormation::Stack.

I appreciate the response greatly. Option 1 is clearly what I need to do. I was severely overthinking things. If I just export the value with a non-regional specific name then I can import it with the security group template. As you said, I'll only be executing a particular stack in a single region so that will work fine.

To your point in option 2, I'd wanted to make the security group template separate simply due to the number of security groups, ingress rules, and egress rules contained within. It's quite bulky and ends up making the networking template quite large as a result. I just thought maintenance might be easier with smaller, separate templates. I'm pretty new to CloudFormation, though. Is it more common to have one larger template file than to break pieces out into their own? Thanks again for the help.

12 rats tied together
Sep 7, 2006

If it makes the template cumbersome to read and understand, you're absolutely right to split it up like this. Security Groups are a super overloaded concept in AWS so what I generally prefer to see is that you make a distinction between "network" SGs and "membership" SGs.

Membership SGs are for when you have something like "the chat service" which is comprised of a bunch of other AWS crap. The chat service SG, which contains every applicable member of the chat service, lives in the chat service template just for convenience. You mostly use this SG for its members, for example, a load balancer config where you need to allow traffic to every member of the chat service.

Network SGs are for when you have something like "allow inbound traffic from the office". It's not tied to a particular service, so it doesn't have a service stack to live in, your options are basically to have a Network SG stack or to embed it somewhere that logically relates to things in AWS that have network connectivity to things not in AWS. I usually end up deciding that the vpc stack is the best place and I throw them all in there, but I rarely have more than like 5 of these "Network SGs" so it is not especially cumbersome.

If I had 50, I would absolutely put them in their own stack, and that stack would probably also be a good place for network ACLs to live if I had any.

Scrapez
Feb 27, 2004

12 rats tied together posted:

If it makes the template cumbersome to read and understand, you're absolutely right to split it up like this. Security Groups are a super overloaded concept in AWS so what I generally prefer to see is that you make a distinction between "network" SGs and "membership" SGs.

Membership SGs are for when you have something like "the chat service" which is comprised of a bunch of other AWS crap. The chat service SG, which contains every applicable member of the chat service, lives in the chat service template just for convenience. You mostly use this SG for its members, for example, a load balancer config where you need to allow traffic to every member of the chat service.

Network SGs are for when you have something like "allow inbound traffic from the office". It's not tied to a particular service, so it doesn't have a service stack to live in, your options are basically to have a Network SG stack or to embed it somewhere that logically relates to things in AWS that have network connectivity to things not in AWS. I usually end up deciding that the vpc stack is the best place and I throw them all in there, but I rarely have more than like 5 of these "Network SGs" so it is not especially cumbersome.

If I had 50, I would absolutely put them in their own stack, and that stack would probably also be a good place for network ACLs to live if I had any.

Appreciate that suggestion. We do have quite a lot of different SGs and rules within them. Redistributing them into separate templates based on the service they apply to would definitely make more sense and help manage them.

Scrapez
Feb 27, 2004

Is there any way to do multi-region with Aurora Serverless? We have a database that has very low utilization with occasional small spikes so it's perfect for serverless but I need to have it be multi-regional.

Happiness Commando
Feb 1, 2002
$$ joy at gunpoint $$

Scrapez posted:

Is there any way to do multi-region with Aurora Serverless? We have a database that has very low utilization with occasional small spikes so it's perfect for serverless but I need to have it be multi-regional.



They claim Serverless v2 is compatible with Aurora Global Database, which is multi-region. MySQL only, and it's in preview.


https://aws.amazon.com/rds/aurora/serverless/ posted:


Aurora Serverless v2 (Preview) supports all manner of database workloads, from development and test environments, websites, and applications that have infrequent, intermittent, or unpredictable workloads to the most demanding, business critical applications that require high scale and high availability. It supports the full breadth of Aurora features, including Global Database, Multi-AZ deployments, and read replicas. Aurora Serverless v2 (Preview) is currently available in preview for Aurora with MySQL compatibility only.

Cheston
Jul 17, 2012

(he's got a good thing going)
I'm trying to understand cloud pricing so I'm not such a mook. Data transfer out of us-east-1 costs $0.09 per GB. Cheaper regions cost $0.05 per GB. Backblaze charges $0.01 per GB. Both services claim eleven nines of durability. Why such a big price difference?

ledge
Jun 10, 2003

Cheston posted:

I'm trying to understand cloud pricing so I'm not such a mook. Data transfer out of us-east-1 costs $0.09 per GB. Cheaper regions cost $0.05 per GB. Backblaze charges $0.01 per GB. Both services claim eleven nines of durability. Why such a big price difference?

Because they can? :shrug:

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

Cheston posted:

I'm trying to understand cloud pricing so I'm not such a mook. Data transfer out of us-east-1 costs $0.09 per GB. Cheaper regions cost $0.05 per GB. Backblaze charges $0.01 per GB. Both services claim eleven nines of durability. Why such a big price difference?

Yup what the other guy says. AWS charges a premium because they are the market leader and can. It's way more expensive than the competition, but also way more complete in terms of all the services available in AWS.

Just-In-Timeberlake
Aug 18, 2003
If all you need is cheap storage, use Backblaze.

Amazon charges the premium because of all the poo poo that works together with that storage.
