|
I reckon there'll be someone here who will say "you loving idiot, obviously this is how you do it" because this seems like a pretty common scenario. With ECS, if you have a pipeline set up to deploy, I keep hitting an issue where the new task won't start if it lands on the same underlying instance, because it uses the same host port. How am I supposed to work around this? I did some reading that suggests using 0 as the host port so it'll use the ephemeral range, but if you do that, how do you tell the load balancer which port to use on a new deploy?
|
# ? Jan 31, 2020 09:36 |
|
|
|
You link your ECS task/service to an ALB: https://aws.amazon.com/premiumsupport/knowledge-center/dynamic-port-mapping-ecs/ This article solves your exact issue: https://medium.com/@mohitshrestha02/understanding-dynamic-port-mapping-in-amazon-ecs-with-application-load-balancer-bf705ee0ca8e Cancelbot fucked around with this message at 10:07 on Jan 31, 2020 |
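For anyone else hitting this: dynamic port mapping just means setting hostPort to 0 in a bridge-mode container definition. A minimal sketch (container name and image are placeholders, not from the thread):

```json
{
  "name": "web",
  "image": "example/web:latest",
  "portMappings": [
    { "containerPort": 8080, "hostPort": 0, "protocol": "tcp" }
  ]
}
```

With hostPort 0 the host side comes from the ephemeral range, and because the ECS service is attached to an ALB target group, ECS registers each new task's actual host port with the target group on deploy, so the load balancer learns the right port automatically.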
# ? Jan 31, 2020 10:03 |
|
Started writing some stuff up in CDK and plan on making my ECS stacks in it. It's kinda weird that it generates CloudFormation, but you get the benefits of having your stacks deployed that way without having to write hundreds of lines of YAML or JSON and template or map out everything. It's been kinda confusing to debug with it being so new and functions changing, but there's a good discussion area on a Gitter site someone pointed me to. There was someone from AWS on there answering questions. I wrote up something yesterday that I plan on using to create CloudFront distros for some sites so I don't have to do it by hand.
|
# ? Feb 1, 2020 15:17 |
|
Cancelbot posted:You link your ECS task/service to an ALB: https://aws.amazon.com/premiumsupport/knowledge-center/dynamic-port-mapping-ecs/ Precisely what I needed, this is why I love you guys! Thanks mate
|
# ? Feb 2, 2020 00:39 |
|
Why do the S3-related operations I'm trying to run from my lambda time out silently? I have an existing bit of code that runs OK, but it quietly stops if I add that snippet: Python code:
Tangent question: I have a dependency layer for extra libraries; should it include boto3? (So far it seems like this isn't necessary, but I can't find clear answers online.) Context: I'm doing Python dev with no AWS experience. e: it seems disabling the VPC setting on the lambda lets it talk with S3? (But now it can't message the other components.) In my situation we had endpoints set up, according to the AWS guy. I understand the fix was to edit the security group the lambda belongs to, and add an explicit outbound rule to the buckets... prefix? unpacked robinhood fucked around with this message at 12:06 on Feb 5, 2020 |
# ? Feb 4, 2020 15:57 |
|
S3 is internet-facing; when you have a lambda that's part of a VPC, it'll lose internet access unless its subnet is public, i.e. has an Internet & NAT gateway ($$: https://aws.amazon.com/premiumsupport/knowledge-center/internet-access-lambda-function/). A better (read: cheaper) solution is an S3 Gateway endpoint, which gives you S3 DNS & access within a private VPC: https://docs.aws.amazon.com/vpc/latest/userguide/vpce-gateway.html#create-gateway-endpoint and https://docs.aws.amazon.com/vpc/latest/userguide/vpc-endpoints-s3.html Cancelbot fucked around with this message at 20:31 on Feb 4, 2020 |
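If you're managing the VPC in CloudFormation, the gateway endpoint is a single resource. A sketch (the logical names and route table reference are placeholders):

```yaml
S3GatewayEndpoint:
  Type: AWS::EC2::VPCEndpoint
  Properties:
    VpcId: !Ref MyVpc
    ServiceName: !Sub "com.amazonaws.${AWS::Region}.s3"
    VpcEndpointType: Gateway
    RouteTableIds:
      - !Ref PrivateRouteTable
```

Gateway endpoints for S3 have no hourly charge, which is where the "cheaper than a NAT gateway" part comes from.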
# ? Feb 4, 2020 20:27 |
|
Is it possible to set up an S3 bucket to host a static website but not allow public access to it? I would want to be able to access it only from machines in a specific VPC via an S3 endpoint.
|
# ? Feb 5, 2020 21:29 |
|
Does this help you? https://aws.amazon.com/premiumsupport/knowledge-center/accessible-restricted-s3-website/
|
# ? Feb 5, 2020 23:01 |
|
Scrapez posted:Is it possible to setup an S3 bucket to host a static website but not allow public access to it? I would want to be able to access it only from machines in a specific VPC via an S3 Endpoint. I know this is possible in Azure with blob storage.
|
# ? Feb 5, 2020 23:02 |
|
The Fool posted:I know this is possible in Azure with blob storage. Ew
|
# ? Feb 6, 2020 10:12 |
|
Can we please have a rule against suggesting self harm?
|
# ? Feb 6, 2020 10:14 |
|
Scrapez posted:Is it possible to setup an S3 bucket to host a static website but not allow public access to it? I would want to be able to access it only from machines in a specific VPC via an S3 Endpoint. S3 website with a bucket policy to allow access only to a specific IP or IP range: https://aws.amazon.com/premiumsupport/knowledge-center/block-s3-traffic-vpc-ip/ The DNS will be publicly resolvable, but they'll get an error trying to reach the content if they're not in the IP/VPC range.
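As a sketch, the allow-based form of such a policy might look like this (bucket name and endpoint ID are placeholders; the KB article above shows a deny-based variant):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AllowGetFromVpcEndpoint",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*",
      "Condition": {
        "StringEquals": { "aws:SourceVpce": "vpce-0abc1234def567890" }
      }
    }
  ]
}
```

The `aws:SourceVpce` condition key matches the VPC endpoint the request came through, so anything not routed via that endpoint gets a 403.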
|
# ? Feb 6, 2020 14:29 |
|
Thanks Ants posted:Does this help you? Cancelbot posted:S3 website with a bucket policy to allow access only to a specific IP or IP range: https://aws.amazon.com/premiumsupport/knowledge-center/block-s3-traffic-vpc-ip/ Thank you! That appears to be what I need.
|
# ? Feb 6, 2020 16:21 |
|
The above worked for connecting to the site with the S3 endpoint. However, I'm seeing something odd: when I try to curl a file within the bucket from an EC2 instance in the VPC that has a route to the S3 endpoint, I receive Access Denied unless I specify the -o flag and a local filename to write to. This fails: code:
code:
code:
Scrapez fucked around with this message at 20:20 on Feb 6, 2020 |
# ? Feb 6, 2020 20:13 |
|
Have you checked that the output file contains the correct content, and isn't just that Access Denied XML?
|
# ? Feb 6, 2020 20:38 |
|
Docjowles posted:Have you checked that the output file contains the correct content, and isn't just that Access Denied XML? It definitely does contain that content. Thank you!
|
# ? Feb 6, 2020 20:50 |
|
I'm glad it was just that because any other option would be worrisome, lol You probably want the private IP of the instance in your condition, not public (if that's the real IP you posted), if you are accessing it over a VPC endpoint. I think that should do the trick.
|
# ? Feb 6, 2020 20:59 |
|
The "deny notlike" is super giving me a headache. I would suggest turning it into an "Allow, StringLike" if you can.
|
# ? Feb 6, 2020 21:14 |
|
Docjowles posted:I'm glad it was just that because any other option would be worrisome, lol That IP is just my PC. The instance is in a VPC/subnet that is routed to it via the VPC Endpoint. My actual issue is that retrieving the file via the http protocol gets a 403, but pulling it down via s3 cp works fine: code:
code:
|
# ? Feb 6, 2020 21:14 |
|
edit: nvm
Pile Of Garbage fucked around with this message at 02:16 on Feb 7, 2020 |
# ? Feb 7, 2020 02:14 |
|
Scrapez posted:That IP is just my PC. The instance is in a VPC/subnet that is routed to it via the VPC Endpoint. My actual issue is that retrieving the file via http protocol gets a 403 but pulling it down via s3 cp works fine: I'm no expert, but I think using the CLI within the EC2 instance would use the instance role, as opposed to HTTP, which wouldn't have any associated role-granted permissions. Check the permissions policies on your bucket, in particular s3:GetObject. Edit: nvm, I somehow missed that you posted the policy already putin is a cunt fucked around with this message at 04:09 on Feb 7, 2020 |
# ? Feb 7, 2020 04:06 |
Anybody try to do anything meaningful with Aurora Serverless yet? Wondering what sort of unknown horrors I'm about to encounter...
|
|
# ? Feb 8, 2020 00:45 |
|
fletcher posted:Anybody try to do anything meaningful with Aurora Serverless yet? Wondering what sort of unknown horrors I'm about to encounter... I used it for a couple of Moodle server instances, and a couple of functions the installer uses to create the tables weren't quite supported. I had to do an install on MySQL and then import it to get going. No problems afterwards. Haven't yet moved any production workloads to it, but that might happen as we migrate to more serverless and containers.
|
# ? Feb 8, 2020 00:59 |
|
I use Aurora Serverless for a couple of temporary, rarely-used reporting instances. The only problem we've had is timeouts on scale-from-zero / resume operations. Most clients have a default timeout that's shorter than Aurora's resume spin-up time. Make sure client timeouts are set to 30s and it's usually fine. We do also run into an occasional resume error from the server when it takes too long: "Database was unable to resume within timeout period", so you may need to bake in a connection retry in your client. Pooled connections would probably "just work", but we're using SQLAlchemy engines in one-shot ETLs.
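A retry wrapper along those lines, as a generic sketch (the exception types and delays are assumptions, not what SQLAlchemy actually raises for a resume timeout):

```python
import time


def with_resume_retry(connect, attempts=3, delay=5.0, retryable=(Exception,)):
    """Call connect(), retrying on errors such as Aurora's
    'Database was unable to resume within timeout period'."""
    last_exc = None
    for attempt in range(attempts):
        try:
            return connect()
        except retryable as exc:
            last_exc = exc
            if attempt < attempts - 1:
                # Give the cluster time to finish resuming before retrying.
                time.sleep(delay)
    raise last_exc
```

You'd wrap whatever opens the connection, e.g. `with_resume_retry(engine.connect)`, and narrow `retryable` to your driver's operational-error class rather than catching everything.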
|
# ? Feb 8, 2020 02:06 |
Cool, thanks for the feedback! For our use case we probably won't have scale to zero enabled, at least in production. Good to know though, may come in handy for internal dev/qa instances.
|
|
# ? Feb 8, 2020 03:08 |
|
What's the least painful way to implement OAuth for API Gateway calls? There seem to be a number of different options and I'd welcome insight to avoid headaches. Edit: I got a Cognito user pool set up and working, but only using client credentials. I'm a bit confused as to how to validate and use a specific user. I added my own email address as a verified SES account and then sent a verification email to myself, but there's no link to use. I'm gathering I need to use an SDK to do this portion, or can I do it with a REST call? PierreTheMime fucked around with this message at 21:30 on Feb 10, 2020 |
# ? Feb 10, 2020 18:00 |
|
Anybody know what Machine Learning on AWS does? Our databases at work are hosted on AWS, and the boss showed us the backend of it the other day, mentioning all the extras like machine learning that neither he nor anyone else at the company has any idea how to use. I ask because I've got a data entry problem that I'd like to automate a solution to somehow. I've got two lists of names of companies. Some of them may match and others won't. But a lot of them may have a partial match. For example, one list might have something like: Acme Holdings Inc. while the other one, which is inexplicably riddled with numbers and random characters, might have 12949 ACME INC##. I want something to find any matches on what I, or any human, would recognize as the key word in the name of the company ("Acme"). I know a little bit of coding in Python but not much. I don't really want to spend weeks writing a crazy program for this either. Is there some kind of app, or hell, even a functionality in Excel that could do this? AWS Machine Learning maybe?
|
# ? Feb 11, 2020 04:38 |
|
Is this any use? https://www.microsoft.com/en-us/download/details.aspx?id=15011
|
# ? Feb 11, 2020 10:42 |
|
This could be good, thanks!
|
# ? Feb 11, 2020 16:33 |
|
Ramrod Hotshot posted:Our databases at work are hosted on AWS, How big is your list? If it's on the order of ~100K rows, fuzzy string scoring might be a better solution, i.e. score the similarity of one string to another string. The fuzzywuzzy python package does that trivially, in a couple lines of code, and then you can rank candidates by similarity. To get an idea of how that works: https://datascience.stackexchange.com/questions/12575/similarity-between-two-words posted:Informally, the Levenshtein distance between two words is the minimum number of single-character edits (i.e. insertions, deletions or substitutions) required to change one word into the other. NLTK (also python) has some (IMO harder to use and understand) tools that do very similar things, such as edit_distance. Machine learning isn't magic and requires input data to model on. Fuzzywuzzy could help generate that input data if you need to do this on 100,000,000 rows and need something more robust (and perhaps a predictive model). TLDR: try fuzzywuzzy first, loading your DB into memory using pandas. FWIW I have an extremely similar problem, where two different web scraper data sources will report slight variations of the same name, e.g. "SOMETHING AWFUL INC" and "Something Awful", and I use fuzzywuzzy with a minimum score to resolve that. Your two example strings are very different by Levenshtein distance, and as someone who has used crowdsourced data labeling for this problem, I'd argue that most humans would NOT classify those as similar names. EDIT: If you want to know whether any substring (e.g. "Acme") is located within any other string, that's actually pretty easy and fast. You could split the string using " " as a separator, then check whether each of the parts is contained in any of the other list's values. Something like (pulled out of my rear end) code:
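A stdlib-only sketch of both ideas above, with difflib standing in for fuzzywuzzy (the company names and the 0.6 threshold are made up for illustration):

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(name, candidates, min_score=0.6):
    """Return the highest-scoring candidate above min_score, or None."""
    score, match = max((similarity(name, c), c) for c in candidates)
    return match if score >= min_score else None


def shares_token(a, b):
    """The 'Acme' case: does any whitespace-separated token of one
    name appear inside the other string?"""
    return any(tok and tok in b.lower() for tok in a.lower().split())
```

fuzzywuzzy's `fuzz.ratio` / `process.extractOne` do the same job with better normalization (token sorting, punctuation stripping), which is why it handles the `12949 ACME INC##` style of garbage better than a raw edit distance.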
CarForumPoster fucked around with this message at 17:08 on Feb 11, 2020 |
# ? Feb 11, 2020 16:51 |
|
I built out a metrics collector for a personal project, does this sound sane? The bucket brigade is API Gateway -> Lambda -> CloudWatch custom metric. I'm just counting events (each live client submits its own count every minute or so to the API and resets its counter). Being able to graph the event count without any particular persistence is all I want. Probably cheaper to plop down an actual tiny-sized app on Elastic Beanstalk or something if I get too many clients, but as is, it seems OK for several clients without costing a ton.
|
# ? Feb 17, 2020 01:11 |
|
That sounds fine to me. An alternative would be distributing credentials or otherwise allowing direct access to cloudwatch.put_metric_data() in the relevant AWS account, which IMO would be preferable if all of the clients were things that you controlled. It may seem like a lot of moving parts but api gateway to lambda function is standard enough that it won't especially be an ongoing support nightmare or cause other people who may work in your AWS account too much annoyance.
|
# ? Feb 17, 2020 07:10 |
|
If you're using PutMetricData, you might want to look into Embedded Metric Format with logs instead. IIRC, it can be cheaper from lambdas because lambdas can't do batching or something like that. That might do away with the cost problem.
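For reference, an EMF record is just a JSON log line with an `_aws` envelope; as I understand the format, you print it to stdout from the lambda and CloudWatch extracts the metric from the log, with no PutMetricData call. A sketch (namespace and metric name are made up):

```python
import json
import time


def emf_record(namespace, metric, value):
    """Build a CloudWatch Embedded Metric Format log line."""
    return json.dumps({
        "_aws": {
            "Timestamp": int(time.time() * 1000),  # epoch milliseconds
            "CloudWatchMetrics": [{
                "Namespace": namespace,
                "Dimensions": [[]],  # one empty dimension set
                "Metrics": [{"Name": metric, "Unit": "Count"}],
            }],
        },
        metric: value,  # the actual value lives at the top level
    })


# Inside a lambda handler you'd just:
# print(emf_record("MyApp", "Events", 42))
```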
|
# ? Feb 18, 2020 01:37 |
|
Maybe I'm a dummy, or my google skills suck. I'm trying to figure out a theoretical AWS problem. Let's say I want to have 2 ASGs, no load balancer in front of them. They'll be doing bulk processing. In one ASG will be Linux servers; in the other, Windows servers. The Windows servers will be running a service which connects to only one of the ASG Linux servers. The ASGs will always match in number of nodes, 1 to 1. Has anyone figured out a way to get the index number of the ASG instance so I can generate something like:

ASG1 (Linux)
Name: ENV-LINUX-1
Name: ENV-LINUX-2

ASG2 (Windows)
Name: ENV-WIN-1
Name: ENV-WIN-2

Then I can pull those values in to set hostname, generate certificates, etc. in the userdata and use the predictable naming to pair the Windows and Linux servers 1 to 1?
|
# ? Feb 19, 2020 00:03 |
|
Matt Zerella posted:Maybe I'm a dummy, or my google skills suck. ASG nodes don't have identity like that. If the number of instances you need is fixed, you could do one ASG per instance and set the hostname from an ASG tag.
|
# ? Feb 19, 2020 00:22 |
|
Nomnom Cookie posted:ASG nodes don't have identity like that. If the number of instances you need is fixed, you could do one ASG per instance and set the hostname from an ASG tag. I'm legit shocked I can't set the name tag in user data like this. All I need is an instance number!!! One thing we are thinking is some kind of queue or DB where each instance consumes a value (which disappears) and we can just set them that way?
|
# ? Feb 19, 2020 00:33 |
|
Maybe you could use the ami-launch-index for that:

curl -s http://169.254.169.254/latest/meta-data/ami-launch-index

Comedy answer: use metal instances and use VirtualBox to run Windows on each metal instance. Methanar fucked around with this message at 01:05 on Feb 19, 2020 |
# ? Feb 19, 2020 01:02 |
|
Matt Zerella posted:Maybe I'm a dummy, or my google skills suck. What is the workload? This is an odd architecture that resembles some kind of grid compute, but putting nodes into an ASG and then requiring them to connect to a specific partner has me curious. Linking one server to a partner locks an architecture into a static configuration. Instead you'd build a stateless configuration which sends a completed work request from column A into a queue, to be pulled down by the next available server in column B. Agrikk fucked around with this message at 01:15 on Feb 19, 2020 |
# ? Feb 19, 2020 01:05 |
|
Agrikk posted:What is the workload? This is an odd architecture that resembles some kind of grid compute but putting nodes into an ASG and the requiring them to connect to a specific partner has me curious. The idea is to spin this up, do a bunch of processing, and tear it down. The Linux-to-Windows link is due to a service that only runs on Windows and consumes data from the Linux machine and sends it back. Neither side can be load balanced. Yes, this sucks. Our software is definitely a square peg for the cloud's round hole, but we are trying to work around this. We are just in the brainstorming phase right now. The idea is to fire a command to the ASG to spin up instances which will consume an SQS queue to generate documents (tens of thousands), then somehow when the job is done it will fire a command to spin the ASG down to zero.
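The drain-until-empty part of that can be sketched with the SQS call abstracted out (`receive` and `process` are stand-ins; with boto3 you'd plug in `receive_message` plus `delete_message` on the real queue):

```python
def drain_queue(receive, process, empty_polls_before_exit=3):
    """Pull work until the queue looks empty, then stop.

    receive() -> list of messages (possibly empty)
    process(msg) -> handles one document-generation job
    Returns the number of messages processed.
    """
    processed = 0
    empty_polls = 0
    while empty_polls < empty_polls_before_exit:
        batch = receive()
        if not batch:
            # SQS reads are eventually consistent, so require a few
            # consecutive empty polls before deciding the job is done.
            empty_polls += 1
            continue
        empty_polls = 0
        for msg in batch:
            process(msg)
            processed += 1
    return processed
```

Once this returns, the instance could set the ASG's desired capacity to zero (or simply shut itself down, if instances terminate on shutdown) to get the tear-down step.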
|
# ? Feb 19, 2020 01:44 |
|
|
|
This is probably dumb, but the first idea I had (without totally rethinking the process or adding more scaffolding) was something like: ditch the ASGs, they aren't adding anything here. Whatever kicks off these jobs (Jenkins or w/e) uses the EC2 API directly to boot up the proper number of instances for the job and gives each Windows/Linux pair a matching tag. Each server can query the EC2 API to find its mate based on the tags. Also configure them to terminate on instance shutdown. App does its thing. When done, have the last step of the job shut down the operating system. It'll go down and delete itself.
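The find-your-mate step from that sketch, in isolation (the `PairId`/`Role` tag names are hypothetical, and the instance dicts stand in for what a describe-instances call would return):

```python
def find_mate(me, instances):
    """Find the instance carrying the same 'PairId' tag as this one
    but the opposite 'Role' tag (linux <-> windows)."""
    want_role = "windows" if me["tags"]["Role"] == "linux" else "linux"
    for inst in instances:
        tags = inst["tags"]
        if tags.get("PairId") == me["tags"]["PairId"] and tags.get("Role") == want_role:
            return inst
    return None  # mate not booted yet; caller should retry
```

In practice each server would poll this against the EC2 API on boot until its mate appears, since the pair won't come up at exactly the same time.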
|
# ? Feb 19, 2020 02:26 |