putin is a cunt
Apr 5, 2007

BOY DO I SURE ENJOY TRASH. THERE'S NOTHING MORE I LOVE THAN TO SIT DOWN IN FRONT OF THE BIG SCREEN AND EAT A BIIIIG STEAMY BOWL OF SHIT. WARNER BROS CAN COME OVER TO MY HOUSE AND ASSFUCK MY MOM WHILE I WATCH AND I WOULD CERTIFY IT FRESH, NO QUESTION
I reckon there'll be someone here who will say "you loving idiot, obviously this is how you do it" because this seems like a pretty common scenario.

With ECS, if you have a pipeline set up to deploy, I keep hitting an issue where the new task won't start if it lands on the same underlying instance, because it uses the same host port. How am I supposed to work around this? I did some reading that suggests using 0 as the host port so it'll use the ephemeral range, but if you do that, how do you tell the load balancer to use the right port on a new deploy?


Cancelbot
Nov 22, 2006

Canceling spam since 1928

You link your ECS task/service to an ALB: https://aws.amazon.com/premiumsupport/knowledge-center/dynamic-port-mapping-ecs/

This article solves your exact issue: https://medium.com/@mohitshrestha02/understanding-dynamic-port-mapping-in-amazon-ecs-with-application-load-balancer-bf705ee0ca8e
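To make the dynamic port mapping concrete: in the task definition you set hostPort to 0 (or omit it) in bridge network mode, and ECS picks a port from the ephemeral range on each placement; when the service is attached to an ALB target group, ECS registers the chosen port for you on every deploy. A minimal sketch (container name and image are made up):

```json
{
  "containerDefinitions": [
    {
      "name": "web",
      "image": "myrepo/web:latest",
      "portMappings": [
        {
          "containerPort": 8080,
          "hostPort": 0
        }
      ]
    }
  ]
}
```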

Cancelbot fucked around with this message at 10:07 on Jan 31, 2020

JHVH-1
Jun 28, 2002
Started writing some stuff up in CDK and plan on making my ECS stacks in it. It's kinda weird that it generates CloudFormation, but you get the benefits of having your stacks deployed that way without having to write hundreds of lines of YAML or JSON and template or map out everything.
It's been kinda confusing to debug with it being so new and functions changing, but there's a good discussion area on this Gitter site someone pointed me to. There was someone from AWS on there answering questions.

I wrote up something yesterday that I plan on using to create cloudfront distros for some sites so I don't have to do it by hand.

putin is a cunt
Apr 5, 2007


Precisely what I needed, this is why I love you guys! Thanks mate

unpacked robinhood
Feb 18, 2013

by Fluffdaddy
Why do the S3-related operations I'm trying to run from my Lambda time out silently?

I have an existing bit of code that runs OK, but it quietly stops if I add this snippet:

Python code:
import boto3
sss = boto3.client('s3')
r = sss.list_buckets() # execution stops here
# similar calls that are explicitly allowed by policy time out too
If I create another lambda with the same role and policy and only include those three lines, it runs correctly.
Tangent question: I have a dependency layer for extra libraries; should it include boto3? (So far it seems like this isn't necessary, but I can't find clear answers online.)

Context is I'm doing python dev with no aws experience.

e: it seems disabling the VPC setting on the lambda lets it talk to S3? (but now it can't message the other components)


In my situation we had endpoints set up, according to the AWS guy.
I understand the fix was to edit the security group the Lambda belongs to and add an explicit outbound rule to the buckets... prefix ?

unpacked robinhood fucked around with this message at 12:06 on Feb 5, 2020

Cancelbot
Nov 22, 2006


S3 is internet-facing; when you have a Lambda that's part of a VPC it'll lose internet access unless its subnet routes out through a NAT gateway (Lambda ENIs never get public IPs, so an internet gateway alone isn't enough) ($$: https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/)

A better (read: cheaper) solution is an S3 Gateway endpoint which gives you S3 DNS & access within a private VPC:
https://docs.aws.amazon.com/vpc/latest/userguide/vpce-gateway.html
https://docs.aws.amazon.com/vpc/latest/userguide/vpce-gateway.html#create-gateway-endpoint
and
https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html
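For completeness, creating the gateway endpoint is basically a one-liner; the IDs and region below are placeholders:

```shell
# Gateway is the default endpoint type for S3; this adds the S3 prefix-list
# route to the given route table so the private subnet can reach S3.
aws ec2 create-vpc-endpoint \
  --vpc-id vpc-0123456789abcdef0 \
  --service-name com.amazonaws.eu-west-1.s3 \
  --route-table-ids rtb-0123456789abcdef0
```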

Cancelbot fucked around with this message at 20:31 on Feb 4, 2020

Scrapez
Feb 27, 2004

Is it possible to setup an S3 bucket to host a static website but not allow public access to it? I would want to be able to access it only from machines in a specific VPC via an S3 Endpoint.

Thanks Ants
May 21, 2004

#essereFerrari


Does this help you?

https://aws.amazon.com/premiumsupport/knowledge-center/accessible-restricted-s3-website/

The Fool
Oct 16, 2003


Scrapez posted:

Is it possible to setup an S3 bucket to host a static website but not allow public access to it? I would want to be able to access it only from machines in a specific VPC via an S3 Endpoint.

I know this is possible in Azure with blob storage.

putin is a cunt
Apr 5, 2007


The Fool posted:

I know this is possible in Azure with blob storage.

Ew

putin is a cunt
Apr 5, 2007

Can we please have a rule against suggesting self harm?

Cancelbot
Nov 22, 2006


Scrapez posted:

Is it possible to setup an S3 bucket to host a static website but not allow public access to it? I would want to be able to access it only from machines in a specific VPC via an S3 Endpoint.

S3 website with a bucket policy to allow access only to a specific IP or IP range: https://aws.amazon.com/premiumsupport/knowledge-center/block-s3-traffic-vpc-ip/

The DNS will be publicly resolvable, but they'll get an error trying to reach the content if they're not in the IP/VPC range.

Scrapez
Feb 27, 2004



Cancelbot posted:

S3 website with a bucket policy to allow access only to a specific IP or IP range: https://aws.amazon.com/premiumsupport/knowledge-center/block-s3-traffic-vpc-ip/

The DNS will be public-resolvable but they'll get an error trying to reach the content if they're not in the IP/VPC range.

Thank you! That appears to be what I need.

Scrapez
Feb 27, 2004

The above worked for connecting to the site through the S3 endpoint. However, I'm seeing something odd: when I try to curl a file in the bucket from an EC2 instance in the VPC that has a route to the S3 endpoint, I get Access Denied unless I specify the -o flag and a local filename to write to.

This fails:
code:
curl https://mybucket.s3.amazonaws.com/Welcome.wav
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>791E66E21A5AE06F</RequestId><HostId>+swglb3b3VOFqmxOzaaJSVL9noQ+y8BwuqlxUql36pFVoy6VLYzgwD+FUmE1QW3XRRQ9Q=</HostId></Error>
This works:
code:
curl https://mybucket.s3.amazonaws.com/Welcome.wav -o Welcome.wav
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   243    0   243    0     0   1166      0 --:--:-- --:--:-- --:--:--  1168
The bucket policy is as follows:
code:
{
    "Version": "2012-10-17",
    "Id": "VPCe and SourceIP",
    "Statement": [
        {
            "Sid": "VPCe and SourceIP",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::mybucket",
                "arn:aws:s3:::mybucket/*"
            ],
            "Condition": {
                "StringNotLike": {
                    "aws:sourceVpce": "vpce-074e12dab9d12e4eb"
                },
                "NotIpAddress": {
                    "aws:SourceIp": "216.16.128.242/32"
                }
            }
        }
    ]
}
Any ideas?

Scrapez fucked around with this message at 20:20 on Feb 6, 2020

Docjowles
Apr 9, 2009

Have you checked that the output file contains the correct content, and isn't just that Access Denied XML? :v:

Scrapez
Feb 27, 2004

Docjowles posted:

Have you checked that the output file contains the correct content, and isn't just that Access Denied XML? :v:

:negative:

It definitely does contain that content. Thank you!

Docjowles
Apr 9, 2009

I'm glad it was just that because any other option would be worrisome, lol

You probably want the private IP of the instance in your condition, not public (if that's the real IP you posted), if you are accessing it over a VPC endpoint. I think that should do the trick.

12 rats tied together
Sep 7, 2006

The "Deny + StringNotLike" is really giving me a headache. I would suggest turning it into an "Allow + StringLike" if you can.
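As a rough sketch of that allow-side version (bucket name and endpoint ID copied from the policy posted earlier; using StringEquals since there's no wildcard): requests from the VPC endpoint get anonymous GetObject, and everything else is implicitly denied rather than explicitly denied.

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowGetFromVPCe",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::mybucket/*",
            "Condition": {
                "StringEquals": {
                    "aws:sourceVpce": "vpce-074e12dab9d12e4eb"
                }
            }
        }
    ]
}
```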

Scrapez
Feb 27, 2004

Docjowles posted:

I'm glad it was just that because any other option would be worrisome, lol

You probably want the private IP of the instance in your condition, not public (if that's the real IP you posted), if you are accessing it over a VPC endpoint. I think that should do the trick.

That IP is just my PC. The instance is in a VPC/subnet that is routed to it via the VPC endpoint. My actual issue is that retrieving the file via HTTP gets a 403, but pulling it down via s3 cp works fine:
code:
curl https://mybucket.s3.amazonaws.com/Welcome.wav
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>36447CC5D79808FD</RequestId><HostId>6v6naApWr55vwgne7ZnO8b28QOgOAvv4qJ3pZnS7ADrjDUnBd8v958U95t8=</HostId></Error>
code:
aws s3 cp s3://mybucket/Welcome.wav Welcome.wav
download: s3://mybucket/Welcome.wav to ./Welcome.wav

Pile Of Garbage
May 28, 2007



edit: nvm

Pile Of Garbage fucked around with this message at 02:16 on Feb 7, 2020

putin is a cunt
Apr 5, 2007


Scrapez posted:

That IP is just my PC. The instance is in a VPC/subnet that is routed to it via the VPC Endpoint. My actual issue is that retrieving the file via http protocol gets a 403 but pulling it down via s3 cp works fine:
code:
curl https://mybucket.s3.amazonaws.com/Welcome.wav
<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>AccessDenied</Code><Message>Access Denied</Message><RequestId>36447CC5D79808FD</RequestId><HostId>6v6naApWr55vwgne7ZnO8b28QOgOAvv4qJ3pZnS7ADrjDUnBd8v958U95t8=</HostId></Error>
code:
aws s3 cp s3://mybucket/Welcome.wav Welcome.wav
download: s3://mybucket/Welcome.wav to ./Welcome.wav

I'm no expert, but I think using the CLI from within the EC2 instance would use the instance role, whereas a plain HTTP request wouldn't have any role-granted permissions attached. Check the permissions in your bucket policy, in particular s3:GetObject.

Edit: nvm, I somehow missed that you posted the policy already

putin is a cunt fucked around with this message at 04:09 on Feb 7, 2020

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
Anybody try to do anything meaningful with Aurora Serverless yet? Wondering what sort of unknown horrors I'm about to encounter...

JHVH-1
Jun 28, 2002

fletcher posted:

Anybody try to do anything meaningful with Aurora Serverless yet? Wondering what sort of unknown horrors I'm about to encounter...

I used it for a couple Moodle server instances, and a couple functions the installer used to create the tables weren't quite supported. I had to do an install on MySQL and then import it to get going. No problems afterwards.

Haven’t yet moved any production workloads to it but that might happen as we migrate to more serverless and containers.

JehovahsWetness
Dec 9, 2005

bang that shit retarded
I use Aurora Serverless for a couple of temporary, rarely-used reporting instances. The only problem we've had is timeouts on scale-from-zero / resume operations. Most clients have a default timeout that's shorter than Aurora's spin-up time. Make sure client timeouts are set to 30s and it's usually fine.

We do also run into an occasional resume error from the server when it takes too long ("Database was unable to resume within timeout period"), so you may need to bake in a connection retry in your client. Pooled connections would probably "just work", but we're using SQLAlchemy engines in one-shot ETLs.
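The retry bit is simple enough to sketch; this is a generic wrapper, not what we actually run, and the error-string match is just the message Aurora returned for us:

```python
import time

def with_resume_retry(fn, attempts=3, delay=5.0, sleep=time.sleep):
    """Call fn(); retry while Aurora Serverless is still resuming.

    Retries only on errors that look like the resume timeout; anything
    else is re-raised immediately.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:
            if "unable to resume" not in str(exc).lower() or attempt == attempts:
                raise
            sleep(delay)

# Hypothetical usage with a connect() that fails twice while the DB wakes up:
calls = {"n": 0}
def connect():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("Database was unable to resume within timeout period")
    return "connection"

print(with_resume_retry(connect, attempts=5, delay=0, sleep=lambda s: None))  # → connection
```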

fletcher
Jun 27, 2003

Cool, thanks for the feedback! For our use case we probably won't have scale to zero enabled, at least in production. Good to know though, may come in handy for internal dev/qa instances.

PierreTheMime
Dec 9, 2004

Hero of hormagaunts everywhere!
Buglord
What's the least painful way to implement OAuth for API Gateway calls? There seem to be a number of different options and I'd welcome insight to avoid headaches.

Edit: I got a Cognito user pool set up and working, but only using client credentials. I'm a bit confused as to how to validate and use a specific user. I added my own email address as a verified SES identity and then sent a verification email to myself, but there's no link to use. I'm gathering I need to use an SDK to do this portion, or can I do it with a REST call?

PierreTheMime fucked around with this message at 21:30 on Feb 10, 2020

Ramrod Hotshot
May 30, 2003

Anybody know what Machine Learning on AWS does?

Our databases at work are hosted on AWS, and the boss showed us the backend of it the other day, mentioning all the extras like machine learning that neither he nor anyone else at the company has any idea how to use.

I ask because I've got a data entry problem that I'd like to automate a solution to somehow. I've got two lists of names of companies. Some of them may match and others won't, but a lot of them may have a partial match. For example, one list might have something like Acme Holdings Inc. while the other, which is inexplicably riddled with numbers and random characters, might have 12949 ACME INC##. I want something to find matches on what I, or any human, would recognize as the key word in the company's name ("Acme").

I know a little bit of Python but not much. I don't really want to spend weeks writing a crazy program for this either. Is there some kind of app, or hell, even a functionality in Excel that could do this? AWS Machine Learning maybe?

Thanks Ants
May 21, 2004



Is this any use?

https://www.microsoft.com/en-us/download/details.aspx?id=15011

Ramrod Hotshot
May 30, 2003


This could be good, thanks!

CarForumPoster
Jun 26, 2013

⚡POWER⚡

Ramrod Hotshot posted:

Our databases at work are hosted on AWS,

I've got two lists of names of companies.
a lot of them may have a partial match.
For example: Acme Holdings Inc. vs 12949 ACME INC##.

I know a little bit of coding in Python but not much.

How big is your list? If it's on the order of ~100K rows, fuzzy string scoring might be a better solution, i.e. score the similarity of one string to another. The fuzzywuzzy Python package does that trivially, in a couple lines of code, and then you can rank the pairs by similarity.

To get an idea of how that works:

https://datascience.stackexchange.com/questions/12575/similarity-between-two-words posted:

Informally, the Levenshtein distance between two words is the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into the other.


NLTK (also Python) has some (IMO harder to use and understand) tools that do very similar things, such as edit_distance.

Machine learning isn't magic and requires input data to model on. Fuzzywuzzy could help generate that input data if you need to do this on 100,000,000 rows and need something more robust (and perhaps a predictive model)

TLDR: try fuzzywuzzy first, loading your DB into memory using pandas.

FWIW I have an extremely similar problem, where two different web-scraper data sources will report slight variations of the same name, e.g. "SOMETHING AWFUL INC" and "Something Awful", and I use fuzzywuzzy with a minimum score to resolve that. Your two example strings are very different by Levenshtein distance, and as someone who has used crowdsourced data labeling for this problem, I'd argue that most humans would NOT classify those as similar names.
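If you want a feel for the scoring without installing anything, fuzzywuzzy's simple ratio is essentially stdlib difflib under the hood; a quick sketch (the case-folding is my own tweak, not part of either library):

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> int:
    """Score 0-100, roughly comparable to fuzzywuzzy's fuzz.ratio."""
    return int(round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()))

print(similarity("SOMETHING AWFUL INC", "Something Awful"))  # high: same name
print(similarity("Acme Holdings Inc.", "12949 ACME INC##"))  # much lower
```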


EDIT:
If you want to know whether any substring (e.g. "Acme") is located within any other string, that's actually pretty easy and fast. You could split the string using " " as a separator, then check whether each of the parts shows up in any of the company-name columns.

Something like (pulled out of my rear end):
code:
import pandas as pd

df = pd.read_sql("your query", conn)  # conn: your existing DB connection
s = df["company_names_to_check_against"]  # pandas Series to check for matches
for index, row in df.iterrows():
    for substring in row["company"].split(" "):
        if s.isin([substring]).any():  # does any name match this substring?
            pass  # TODO: do something with that fact
        else:
            pass  # do something else
You could also check with fuzzywuzzy first, then check if there's a partial string match, and do something if it both meets a threshold and has a partial string match.

CarForumPoster fucked around with this message at 17:08 on Feb 11, 2020

YanniRotten
Apr 3, 2010

We're so pretty,
oh so pretty
I built out a metrics collector for a personal project; does this sound sane? The bucket brigade is API Gateway -> Lambda -> CloudWatch custom metric.

I'm just counting events (each live client submits its own count every minute or so to the API and resets its counter). Being able to graph the event count without any particular persistence is all I want.

Probably cheaper to plop an actual tiny app onto Elastic Beanstalk or something if I get too many clients, but as is it seems OK for several clients without costing a ton.

12 rats tied together
Sep 7, 2006

That sounds fine to me. An alternative would be distributing credentials or otherwise allowing direct access to cloudwatch.put_metric_data() in the relevant AWS account, which IMO would be preferable if all of the clients were things that you controlled.

It may seem like a lot of moving parts, but API Gateway to Lambda is standard enough that it won't be an ongoing support nightmare or cause other people who may work in your AWS account too much annoyance.

crazypenguin
Mar 9, 2005
nothing witty here, move along
If you're using PutMetricData, you might want to look into Embedded Metric Format with logs instead. IIRC it can be cheaper from Lambdas, because Lambdas can't batch metric calls or something like that.

That might do away with the cost problem.
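To show what that means in practice: an EMF record is just a JSON log line that CloudWatch Logs parses into a metric, so the Lambda only has to print. A sketch (namespace, dimension, and metric names are made up for the example):

```python
import json
import time

def emf_event_count(count: int, namespace: str = "MyApp") -> str:
    """Build one Embedded Metric Format log line for a simple counter."""
    return json.dumps({
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # milliseconds since epoch
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [["Service"]],
                "Metrics": [{"Name": "EventCount", "Unit": "Count"}],
            }],
        },
        "Service": "collector",
        "EventCount": count,
    })

# In a Lambda, printing this line to stdout is enough; CloudWatch Logs
# extracts the metric with no PutMetricData API call.
print(emf_event_count(42))
```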

Matt Zerella
Oct 7, 2002

Norris'es are back baby. It's good again. Awoouu (fox Howl)
Maybe I'm a dummy, or my google skills suck.

I'm trying to figure out a theoretical AWS problem.

Let's say I want to have 2 ASGs, no load balancer in front of them.

They'll be doing bulk processing.

In one ASG will be Linux servers

On another, Windows servers.

The windows servers will be running a service which connects to only 1 of the ASG linux servers. The ASGs will always match in number of nodes, 1 to 1.

Has anyone figured out a way to get the index number of the ASG instance, so I can generate something like:

ASG1 (Linux)
Name: ENV-LINUX-1
Name: ENV-LINUX-2

ASG2 (Windows)
Name: ENV-WIN-1
Name: ENV-WIN-2

Then I can pull those values in to set hostname, generate certificates, etc in the userdata and use the predictive naming to pair the Windows and Linux server 1 to 1?

Nomnom Cookie
Aug 30, 2009



Matt Zerella posted:

Maybe I'm a dummy, or my google skills suck.

I'm trying to figure out a theoretical AWS problem.

Lets say I want to have 2 ASGs, no load balancer in front of them.

They'll be doing bulk processing.

In one ASG will be Linux servers

On another, Windows servers.

The windows servers will be running a service which connects to only 1 of the ASG linux servers. The ASGs will always match in number of nodes, 1 to 1.

Has anyone figured out a way to get index number of the ASG instance so I can generate something like.

ASG1 (Linux)
Name: ENV-LINUX-1
Name: ENV-LINUX-2

ASG2 (Windows)
Name: ENV-WIN-1
Name: ENV-WIN-2

Then I can pull those values in to set hostname, generate certificates, etc in the userdata and use the predictive naming to pair the Windows and Linux server 1 to 1?

ASG nodes don't have identity like that. If the number of instances you need is fixed, you could do one ASG per instance and set the hostname from an ASG tag.

Matt Zerella
Oct 7, 2002


Nomnom Cookie posted:

ASG nodes don't have identity like that. If the number of instances you need is fixed, you could do one ASG per instance and set the hostname from an ASG tag.

I'm legit shocked I can't set the name tag in user data like this. All I need is an instance number!!!

One thing we're considering is some kind of queue or DB where each instance consumes a value (which then disappears), and we can just set them that way?

Methanar
Sep 26, 2013

by the sex ghost
Maybe you could use the ami launch index for that
curl -s http://169.254.169.254/latest/meta-data/ami-launch-index

Comedy answer: use metal instances and use virtualbox to run windows on each metal instance.
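To flesh that out a little: the launch index is zero-based and only unique within a single launch request, so an ASG that scales up later can hand out duplicates, but for a spin-up-all-at-once batch job it might be enough. A sketch (the metadata fetch only works on an actual EC2 instance; the naming scheme is from the post above):

```python
import urllib.request

def fetch_launch_index() -> int:
    # Instance metadata (IMDSv1 shown for brevity); only reachable on EC2.
    url = "http://169.254.169.254/latest/meta-data/ami-launch-index"
    with urllib.request.urlopen(url, timeout=2) as resp:
        return int(resp.read().decode())

def node_name(prefix: str, launch_index: int) -> str:
    """ENV-LINUX-1, ENV-LINUX-2, ... (launch index is zero-based)."""
    return f"{prefix}-{launch_index + 1}"

# On the instance you'd do: node_name("ENV-LINUX", fetch_launch_index())
print(node_name("ENV-WIN", 0))  # → ENV-WIN-1
```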

Methanar fucked around with this message at 01:05 on Feb 19, 2020

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

Matt Zerella posted:

Maybe I'm a dummy, or my google skills suck.

I'm trying to figure out a theoretical AWS problem.

Lets say I want to have 2 ASGs, no load balancer in front of them.

They'll be doing bulk processing.

In one ASG will be Linux servers

On another, Windows servers.

The windows servers will be running a service which connects to only 1 of the ASG linux servers. The ASGs will always match in number of nodes, 1 to 1.

Has anyone figured out a way to get index number of the ASG instance so I can generate something like.

ASG1 (Linux)
Name: ENV-LINUX-1
Name: ENV-LINUX-2

ASG2 (Windows)
Name: ENV-WIN-1
Name: ENV-WIN-2

Then I can pull those values in to set hostname, generate certificates, etc in the userdata and use the predictive naming to pair the Windows and Linux server 1 to 1?

What is the workload? This is an odd architecture that resembles some kind of grid compute, but putting nodes into an ASG and then requiring them to connect to a specific partner has me curious.

Linking one server to a partner locks an architecture into a static configuration. Instead you'd build a stateless configuration that sends a completed work request from column A into a queue, to be pulled down by the next available server in column B.

Agrikk fucked around with this message at 01:15 on Feb 19, 2020

Matt Zerella
Oct 7, 2002


Agrikk posted:

What is the workload? This is an odd architecture that resembles some kind of grid compute but putting nodes into an ASG and the requiring them to connect to a specific partner has me curious.

Linking one server to a partner locks an architecture into a static configuration. Instead you’d build a stateless configuration which sends a completed work request from column A into a queue that would be pulled down into the next available server in column b.

The idea is to spin this up, do a bunch of processing, and tear it down. The Linux-to-Windows link is due to a service that only runs on Windows and consumes data from the Linux machine and sends it back.

Neither side can be load balanced.

Yes, this sucks. Our software is definitely a square peg for the cloud's round hole, but we are trying to work around it.

We are just in the brainstorming phase right now. The idea is to fire a command at the ASG to spin up instances, which will consume an SQS queue to generate documents (tens of thousands), then somehow when the job is done it will fire a command to spin the ASG down to zero.


Docjowles
Apr 9, 2009

This is probably dumb but the first idea I had (without totally rethinking the process or adding more scaffolding) was something like:

Ditch the ASGs, they aren't adding anything here. Whatever kicks off these jobs (Jenkins or w/e) uses the EC2 API directly to boot up the proper number of instances for the job and gives each Windows/Linux pair a matching tag. Each server can query the EC2 API to find its mate based on the tags. Also configure them to terminate on instance shutdown.

App does its thing. When done, have the last step of the job shut down the operating system. It'll go down and delete itself.
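The pairing half of that is just a dictionary lookup once you've pulled the tags down. A sketch, assuming each pair was launched with a shared PairId tag (the tag name and the describe_instances step are hypothetical; only the matching logic is shown):

```python
def pair_by_tag(linux_instances, windows_instances):
    """Match instances that share a PairId tag value.

    Each input is a list of (instance_id, pair_id) tuples, as you'd
    collect from ec2.describe_instances() filtered on the tag.
    """
    windows_by_pair = {pair_id: iid for iid, pair_id in windows_instances}
    return {
        iid: windows_by_pair[pair_id]
        for iid, pair_id in linux_instances
        if pair_id in windows_by_pair
    }

pairs = pair_by_tag(
    [("i-lin1", "1"), ("i-lin2", "2")],
    [("i-win1", "1"), ("i-win2", "2")],
)
print(pairs)  # → {'i-lin1': 'i-win1', 'i-lin2': 'i-win2'}
```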
