|
Pretty new to AWS and trying to understand autoscaling, specifically how I can initiate an event to increase group size based on a log file on one of my instances rather than a CloudWatch metric like CPU utilization. For instance, I would have an instance receiving calls that had a finite amount of telephony capacity. Once the maximum simultaneous calls reached a certain value, I would like to spin up another EC2 instance. I have log files on the EC2 instance that I could monitor and use to trigger the event, but I'm not sure exactly how to do that. The only thing I can think of would be to write a script that runs on the EC2 instance itself and, once the max simultaneous calls were reached, spins up another EC2 instance via the AWS CLI. Anyone doing something similar to this? Edit: It looks like the CloudWatch Agent may have the ability to do this... Scrapez fucked around with this message at 21:42 on Jan 3, 2019
# ¿ Jan 3, 2019 21:31 |
|
|
|
Arzakon posted:Assuming each server has X slots available for a call but the slots are eventually recycled at the end of the call, create a custom CloudWatch metric and have a script on each server report "Free Slots". Trigger your scale-up based on the sum of Free Slots going below some number, and scale down based on some other number. The latter gets a bit complicated because you don't want to terminate an instance that still has active calls, so you need to write a lifecycle hook and respond to it when active calls fall to zero, or skip automated scale-down termination and manage deprecating and terminating the instances yourself. That is very helpful. Thank you. I think scaling up is all I will need, so managing the scale-down termination manually is perfectly fine. I expect the scaling to be slow and predictable and likely will never need to scale down, as the platform will continue to grow and become busier over time.
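A minimal sketch of what that per-server reporter could look like. The metric and namespace names, the 500-slot capacity, and the log grep are all made up; swap in however your telephony stack actually counts active calls:

```shell
#!/usr/bin/env bash
# Hypothetical per-server "free slots" reporter, run from cron each
# minute. MAX_SLOTS, the namespace/metric names, and the log grep are
# assumptions -- adjust for your platform.

compute_free_slots() {
  # free capacity = max slots minus active calls, floored at zero
  local max=$1 active=$2
  local free=$(( max - active ))
  [ "$free" -lt 0 ] && free=0
  echo "$free"
}

report() {
  local max_slots=500
  local instance_id active
  instance_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
  active=$(grep -c 'CALL ACTIVE' /var/log/telephony/calls.log)

  aws cloudwatch put-metric-data \
    --namespace "Telephony" \
    --metric-name "FreeSlots" \
    --dimensions "InstanceId=${instance_id}" \
    --value "$(compute_free_slots "$max_slots" "$active")" \
    --unit Count
}

# Invoke as "freeslots.sh --report" from cron; sourcing the file only
# defines the functions.
if [ "${1:-}" = "--report" ]; then
  report
fi
```

The alarm side then sums or averages FreeSlots across the group, exactly as described in the quote.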
|
# ¿ Jan 3, 2019 22:16 |
|
Is there a way, from the command line on an EC2 instance, to retrieve just the public IP address associated with that instance based on the private IP? I can do `aws ec2 --region us-east-1 describe-addresses`, which returns a list of all addresses, and I could parse out the PublicIp of the instance I'm looking for with a combination of grep and awk, but is there a better way of doing this? I would be putting the private IP in a variable, as I can obtain that via ifconfig, and then I'd like to return the public IP based on the private IP. I'm writing a bootstrap script that will update a config file on the instance with the public IP of that instance. Thoughts?
|
# ¿ Jan 8, 2019 17:15 |
|
2nd Rate Poster posted:From within the instance you can use the metadata service to find the public ip. JHVH-1 posted:Also be aware if you use public-hostname it will resolve to the public IP or private IP depending on where it is resolved. That way you can do things like route the traffic internally for some systems with the same hostname. Much easier way of doing it. Thank you!
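For reference, a sketch of that metadata lookup. The `/latest/meta-data/` paths are standard IMDS; the sanity-check helper and the `--show` flag are my own additions for use in a bootstrap script:

```shell
#!/usr/bin/env bash
# Look up this instance's public IPv4 via the instance metadata service.

looks_like_ipv4() {
  # crude dotted-quad check on whatever the metadata service returned
  echo "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

get_public_ip() {
  curl -s http://169.254.169.254/latest/meta-data/public-ipv4
}

if [ "${1:-}" = "--show" ]; then
  ip=$(get_public_ip)
  if looks_like_ipv4 "$ip"; then
    echo "$ip"
  else
    echo "no public IP associated?" >&2
    exit 1
  fi
fi
```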
|
# ¿ Jan 8, 2019 17:46 |
|
Question on methodology. I want to create a CloudWatch event that will kick off when auto scaling launches a new instance successfully. Additionally, I want a script or a bunch of commands to be run on the EC2 instance that is launched. I've created the CloudWatch event with the correct service, event, and group name as the source. I've set the Target as SSM Run Command with Document AWS-RunShellScript (Linux). I have my Target key set to "tag:Server Type" and a target value of "kamailio". (I have the launch configuration of the autoscaling group set to tag new instances with tag Server Type and value kamailio.) Is the above the proper way to say "execute the following commands on new instances with the tag Server Type and value kamailio"? Additionally, is there a way to have it just execute a whole script rather than putting each command in separately as a Constant Configure Parameter? I hope the above makes sense. Ultimately, if an instance crashes, I want the autoscaling group to launch a replacement; I then want the CloudWatch event to be triggered and run a script that will grab the local and public IP address of the instance, put them into variables, write them out to application config files, and start the applications.
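For testing the Run Command half in isolation, something like this one-off invocation should hit the same instances the event target would. The script path is hypothetical, and note the tag key really does contain a space, so it has to be quoted exactly:

```shell
# One-off equivalent of the event's Run Command target, handy to verify
# SSM can reach the tagged instances at all. The tag key/value match
# the launch-configuration tags above; the script path is hypothetical.
aws ssm send-command \
  --document-name "AWS-RunShellScript" \
  --targets "Key=tag:Server Type,Values=kamailio" \
  --parameters 'commands=["/usr/local/bin/bootstrap.sh"]' \
  --region us-east-1
```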
|
# ¿ Jan 21, 2019 21:26 |
|
Arzakon posted:Have you looked into using user-data to execute the script on launch? You could bake the script into your AMI and just use the user-data to run the command, or put the whole script into the user-data. I have successfully done it this way but was hoping to move it to a CloudWatch event as I'll have a subsequent Event that will need to happen when a new instance is launched as well. I thought it'd be better to have all the items together there for easier management.
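If you do stay with user-data, the launch-time hook can be as small as this (the baked-in script path is hypothetical):

```shell
#!/bin/bash
# Example user-data: run a script baked into the AMI at launch and keep
# a log of what happened. The script path is a placeholder.
exec > /var/log/user-data.log 2>&1
/opt/bootstrap/configure.sh
```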
|
# ¿ Jan 21, 2019 21:51 |
|
My understanding was that I could select AWS-RunShellScript (Linux) as the Document type and then, in the Commands section, just add commands to be run on the command line. Below is how I have it set up currently for testing. My Auto Scaling group called kamailio successfully launches a new EC2 instance when I terminate one, but either that is not triggering this event or, once triggered, the event just isn't executing the commands. https://imgur.com/a/T7ZoqNN Edit: I'm working from this tutorial: https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/EC2_Run_Command.html Double Edit: I set up a CloudWatch alarm for invocation of my autoscaling group kamailio. Deleted my instance, which triggered the autoscale function, and I did get an alarm in CloudWatch. I'm stumped. Scrapez fucked around with this message at 15:38 on Jan 22, 2019
# ¿ Jan 22, 2019 15:05 |
|
JHVH-1 posted:Can you manually run the SSM command to make sure it works? You have the instances set up with the agent and everything right? (Depending on what your base image is I think there is a chance its not installed already). That could be the problem. I did not manually setup SSM at all on the image. I'll look into that. Thank you. Edit: I made sure the SSM agent was running, took a new image. Confirmed that when it launches a new instance, SSM agent is running on startup. Made sure IAM role for the Cloudwatch event has all permissions for SSM. No clue why it isn't working. 2nd Edit: This is everything related to ssm I see in /var/log/messages on the launched EC2 instance: code:
Scrapez fucked around with this message at 17:44 on Jan 22, 2019
# ¿ Jan 22, 2019 16:43 |
|
Bit of an obscure question. I'm trying to update a dns SRV record when a new instance is launched. I have a Lambda/python function that is performing a ChangeResourceRecordSets upsert and inserting the IP of the newly launched instance into the Value portion of the SRV record. The problem is that when I launch an additional instance, it replaces the value in the SRV record instead of appending the info for the new instance. I thought with using UPSERT it was supposed to just update a record if it already exists. I'm assuming this somehow doesn't apply to SRV records or the value section specifically. Is my only recourse to list-resource-record-sets for the record, throw the current value in a variable and then perform my ChangeResourceRecordSets adding the existing value and my new value? Just trying to understand if UPSERT should be overwriting the value as I'm seeing or if it's something I'm doing incorrectly.
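For what it's worth, a sketch of the read-merge-UPSERT workaround, assuming `jq` is available. UPSERT replaces the entire record set, so "appending" means listing the current values, adding the new one, and submitting all of them together. The zone ID, record name, and SRV values below are all placeholders:

```shell
#!/usr/bin/env bash
# Route 53 UPSERT replaces the whole record set, so "appending" means:
# list the current values, add the new one, and UPSERT all of them.
# Zone ID, record name, and SRV values are placeholders.

merge_values() {
  # combine the existing ResourceRecords array with one new value
  local existing=$1 new_value=$2
  echo "$existing" | jq --arg v "$new_value" '. + [{"Value": $v}]'
}

if [ "${1:-}" = "--run" ]; then
  zone_id="Z123EXAMPLE"
  name="_sip._udp.pool.production.instance."

  existing=$(aws route53 list-resource-record-sets \
      --hosted-zone-id "$zone_id" \
      --query "ResourceRecordSets[?Name=='$name'] | [0].ResourceRecords" \
      --output json)
  records=$(merge_values "$existing" "1 10 5060 ip-10-100-1-5.production.instance.")

  batch=$(jq -n --arg name "$name" --argjson rrs "$records" \
      '{Changes: [{Action: "UPSERT",
                   ResourceRecordSet: {Name: $name, Type: "SRV",
                                       TTL: 60, ResourceRecords: $rrs}}]}')
  aws route53 change-resource-record-sets \
      --hosted-zone-id "$zone_id" --change-batch "$batch"
fi
```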
|
# ¿ Jan 24, 2019 17:36 |
|
Docjowles posted:https://docs.aws.amazon.com/Route53/latest/APIReference/API_ChangeResourceRecordSets.html Thanks. Makes sense. I wish they had something that could just append. Seems like a useful function that people would use.
|
# ¿ Jan 24, 2019 20:08 |
|
What is the best method for triggering an autoscaling event based on output into a log file on the EC2 instances in the group? Use case is a SIP platform and I'd like to be able to trigger a scale out event when number of calls on any given instance reaches X.
|
# ¿ Feb 1, 2019 15:04 |
|
JHVH-1 posted:You can create a custom cloudwatch metric and then use it as your scaling criteria Thank you. That is exactly what I was looking for but google searches had not gotten me to that.
|
# ¿ Feb 1, 2019 16:02 |
|
JHVH-1 posted:Probably have to play around with the alarms and getting the right metrics so you have both scale up and scale down criteria based on something that covers the whole cluster. Definitely. I need to get my head around how I'm going to do that. I'll basically be starting with 3 instances, each having a call capacity of 500 calls. So the thought on scale-up was just to spin up another instance if any of the 3 reach 450 concurrent calls. The scale-down side becomes a bit more difficult. I'm thinking I can write the concurrent-calls value from all active instances and, if the average of that number falls below a certain value, terminate one of the instances. That will also require keeping track of how many of this type of instance are active. The good thing about this platform, at least right now, is that it will be very predictable. There won't be huge peak calling events or anything like that. It likely will just continue to slowly scale up over time. One of my fears is that I do this wrong and instances start launching willy nilly all crazy.
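One way the scale-out half might be wired up, assuming each instance publishes a custom ActiveCalls metric without an InstanceId dimension (so the Maximum statistic effectively means "the busiest instance"). The namespace, metric, alarm name, and policy ARN are all placeholders:

```shell
# Hypothetical scale-out alarm: fire when the busiest instance reports
# 450+ concurrent calls for two consecutive minutes. Assumes a custom
# Telephony/ActiveCalls metric and an existing scaling policy.
aws cloudwatch put-metric-alarm \
  --alarm-name "kamailio-scale-out" \
  --namespace "Telephony" \
  --metric-name "ActiveCalls" \
  --statistic Maximum \
  --period 60 \
  --evaluation-periods 2 \
  --threshold 450 \
  --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions "arn:aws:autoscaling:us-east-1:123456789012:scalingPolicy:example/scale-out"
```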
|
# ¿ Feb 1, 2019 17:30 |
|
JHVH-1 posted:You can do some tweaking based on time at least. I've done that before reducing the capacity over the weekends when I knew it wouldn't be that important. Yeah that makes sense. Overnight or on weekends potentially. At this point, I don't know exactly who the customer base is going to be so it's possible that it will be a lot of international and there won't be a truly "slow time." Being able to send anything you want back to Cloudwatch via the CLI is really awesome and I can use that for a whole host of other things from alarming problems to reports on various things.
|
# ¿ Feb 1, 2019 17:50 |
|
I'm sure there's a very reasonable explanation but why can't you set DHCP Options Sets at the subnet level? I have different types of machines in a single VPC and was hoping to be able to give them hostnames that would identify which type they are.
|
# ¿ Feb 8, 2019 15:54 |
|
Agrikk posted:This is a perfectly reasonable request and one that I have heard countless times before. Yeah, I guess I understand that. Perhaps they could make it so that only subnets of a certain size would be allowed to have dhcp options sets.
|
# ¿ Feb 8, 2019 16:57 |
|
Agrikk posted:Speaking personally, I believe there should just be a fixed limit, say 5 or 10, of option sets per vpc. One could implement a default option set for the vpc and then allocate subnet option sets for special cases that could override the default. Yeah that would be great. Second on my list of wants behind ELBs that can do UDP.
|
# ¿ Feb 9, 2019 03:26 |
|
I'm trying to get my head around whether I need a Route 53 outbound endpoint, an inbound endpoint, or both. I have two VPCs. Each has its own DHCP option set associated, with hostnames and DNS resolution enabled. VPC 1 has option set production.instance and VPC 2 has option set scrapez.com. I have a VPC peering connection set up between them. I want to be able to resolve the records in my production.instance. hosted zone from my instance in the scrapez.com VPC. I.e., I need Instance 1, ip-10-0-0-200.scrapez.com, to be able to resolve SRV record _sip._udp.pool.production.instance., which has underlying values of ip-10-100-73-19.production.instance. and ip-10-100-96-92.production.instance. Is a Route 53 outbound endpoint from the originating VPC the way to accomplish this? Or an inbound endpoint to the target VPC? Or other?
|
# ¿ Feb 12, 2019 20:32 |
|
Docjowles posted:You may be overthinking it. I think you just need to do this? I associated both VPCs with the private hosted zone production.instance but the following query still fails from the instances in the source VPC:
nslookup -type=SRV _sip._udp.pool.production.instance
Server: 10.0.0.2
Address: 10.0.0.2#53
** server can't find _sip._udp.bvr.production.instance: NXDOMAIN
|
# ¿ Feb 12, 2019 22:28 |
|
Docjowles posted:Just to ask the super dummy question, that same query works fine within the other VPC? Not dumb. I appreciate the feedback. It does work within the VPC. Edit: Follow-up: associating the VPC with the private hosted zone DID resolve the issue. I just still had inbound and outbound endpoints set up that were breaking things. Thanks, Docjowles! Scrapez fucked around with this message at 15:43 on Feb 13, 2019
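For reference, the association itself can also be done from the CLI (IDs are placeholders; a VPC in another account needs a create-vpc-association-authorization from the zone-owning account first):

```shell
# Associate the second VPC with the private hosted zone so its
# resolver can answer queries for the zone. IDs are placeholders.
aws route53 associate-vpc-with-hosted-zone \
  --hosted-zone-id Z123EXAMPLE \
  --vpc VPCRegion=us-east-1,VPCId=vpc-0abc1234
```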
# ¿ Feb 13, 2019 05:12 |
|
When performing a describe-network-interfaces, is there a way to do wildcards in the description filter to return all matching ENIs? For example, I have two ENIs with descriptions of: TestAdapter0 and TestAdapter1 Is there a way to do something like `aws ec2 describe-network-interfaces --filters Name=description,Values="TestAdapter*"` Edit: Gosh I'm dumb...that does work. I just wasn't putting the double quotes around the Value
|
# ¿ Feb 15, 2019 17:35 |
|
Is there a cost associated with Elastic Network Interfaces? I can't find anything that talks about pricing so I think they're free to use but I can't find anything definitive.
|
# ¿ Feb 18, 2019 17:05 |
|
Agrikk posted:FYI- So would it be better to set the description of all the ENIs to the same string (TestAdapter) and then instead do the query as: `aws ec2 describe-network-interfaces --filters Name=description,Values="TestAdapter"` There's really no reason I need to do it as a wildcard. I had just planned to set descriptions as TestAdapter1, TestAdapter2, etc but it isn't really a requirement to do that.
|
# ¿ Feb 20, 2019 15:34 |
|
Agrikk posted:That is correct. If this process is going to be anything other than a one-off you should probably build a tagging scheme and do your search based on tags. Right. Searching for them via tags does make much more sense. Then I can add unique descriptions. Thank you!
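A sketch of what the tag-based lookup could look like, assuming a made-up Role=test-adapter tagging scheme and `jq` for parsing:

```shell
#!/usr/bin/env bash
# Tag-based ENI lookup instead of matching on description strings.
# The Role=test-adapter tag is a hypothetical example scheme.

extract_eni_ids() {
  # pull interface IDs out of a describe-network-interfaces response
  jq -r '.NetworkInterfaces[].NetworkInterfaceId'
}

if [ "${1:-}" = "--run" ]; then
  aws ec2 describe-network-interfaces \
    --filters "Name=tag:Role,Values=test-adapter" \
    --output json | extract_eni_ids
fi
```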
|
# ¿ Feb 20, 2019 18:20 |
|
CloudFormation's 200-resource limit is a real bummer. I wanted to use CloudFormer to create a CloudFormation template with everything in my us-east-1 region to replicate it in us-west-2, but I have 337 resources. It would be nice if CloudFormer could recognize this and break the resources into multiple nested templates.
|
# ¿ Mar 12, 2019 21:29 |
|
Internet Explorer posted:For anyone who has gone through the AWS certification process, about how much studying did it take you? I see that it basically goes Foundational, Associate, Professional, with subcategories along the way. Would it be unrealistic to try to get an Associate cert in a month? Did you do the live trainings or were the self-paced materials enough? The recommended experience is 6 months for Foundational, 1 year for Associate, 2 years for Professional. Did you find that to be accurate or is it something you can pick up with only a surface level of AWS experience? I just took and passed the Certified Solutions Architect - Associate. I used a combination of A Cloud Guru video courses and Whizlabs exams to study. It's hard to say how many days I studied, as life and work were so busy it was hard to dedicate a month continuously. I think if you focused on studying every day for a month with the two, along with reading the white papers and doing the practice exercises, you'd have a decent chance of passing. It really just depends how quickly you learn and how well you retain knowledge. The exam was hard, and it isn't the type of exam where you can just memorize certain things and be good. You actually have to know what each AWS service does and how it can be used in conjunction with other services to solve an issue in the best and most cost-effective manner.
|
# ¿ Mar 13, 2019 04:23 |
|
Cloudformation drift detection...Does it just tell you that objects have changed since you launched your template or is there a way for it to produce an edited Cloudformation template that includes the changes? Or a separate template that only includes the additions/changes?
|
# ¿ Mar 14, 2019 14:30 |
|
Agrikk posted:No. Gotcha. It would be neat if they could sync up drift detection with cloudformer to have it automatically generate a replacement template. As it is, it isn't possible to use cloudformer to create a cloudformation template of say your objects in us-east-1 and restore said template to us-east-2 without manually building some objects.
|
# ¿ Mar 14, 2019 19:18 |
|
I feel like I should be able to figure this out but I'm kind of stumped. Trying to set up a NAT gateway for a private subnet. I have public subnet 10.10.1.0/24, which has an internet gateway attached, and I've set up NAT Gateway 10.10.1.99 in this subnet. I also have private subnet 10.100.96.0/21. I'm trying to set up the route table for 0.0.0.0/0 to go through the NAT Gateway 10.10.1.99, but it is not listed in the NAT Gateway pulldown list. The two VPCs that the two mentioned subnets are in have a VPC peering connection established between them, but I'm not sure that would have any impact on adding a route for the NAT gateway. I'm guessing it's something glaringly obvious but I'm not seeing it. Anyone help me out?
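For reference, once the NAT gateway and the private subnet's route table are in the same VPC, the route itself is one call (IDs are placeholders):

```shell
# Send the private subnet's internet-bound traffic through the NAT
# gateway. This only works when the route table and the NAT gateway
# are in the same VPC; IDs are placeholders.
aws ec2 create-route \
  --route-table-id rtb-0abc1234 \
  --destination-cidr-block 0.0.0.0/0 \
  --nat-gateway-id nat-0abc1234
```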
|
# ¿ Aug 15, 2019 16:53 |
|
You guys had it right. I was trying to share a NAT Gateway through a VPC peering connection which does not work. After I took a step back I kind of realized I didn't need multiple VPCs anyway. After consolidating into one, it works just fine. Thanks!
|
# ¿ Aug 16, 2019 14:22 |
|
Anyone ever run into the issue of receiving "Server refused our key" when attempting to log in to a machine? It worked fine a couple days ago, and everyone with access to the machine swears they've not been in it, but I receive that when trying to log in with the appropriate key now. AWS provides this as a resolution, placing code into user data to write the public key into the authorized_keys file: https://aws.amazon.com/premiumsupport/knowledge-center/ec2-server-refused-our-key/ This does not work for me. I thought perhaps the server didn't have the cloud-init package installed, so I put `yum -y install cloud-init` into user data, started and then stopped the instance, and tried adding the key again, to no avail. I'm at a bit of a loss here. The only thing I can think of is that the /home/centos/.ssh/authorized_keys file is somehow corrupt. Anyone have any ideas? Of course I did not take an AMI of the machine when it was in a healthy state, as I should have.
|
# ¿ Aug 29, 2019 23:31 |
|
deedee megadoodoo posted:You can take an image now then fire up an instance using that AMI and use a known good key. I tried that and still got the same error. I did use the same key when I launched it. I guess I could try it with a different key. Edit: Created a new key. Launched an instance using the AMI of the "bad server" and still receive "Server refused our key" when attempting to log in. I've tried using both cloud-init and a general bash script to copy the public key into the /home/centos/.ssh/authorized_keys file, and neither seems to work. Would that indicate that the file could be corrupt? user-data executes as root, right? So it shouldn't be a permissions issue? Not that I changed permissions on that file or directory structure anyway. This just happened out of the blue, seemingly for no reason. Scrapez fucked around with this message at 03:03 on Aug 30, 2019
# ¿ Aug 30, 2019 01:51 |
|
Docjowles posted:Riffing on your corrupt file theory, openssh is (rightly) very paranoid about file permissions. So maybe the .ssh dir or authorized_keys file is being created with inappropriate ownership or permissions? It should be 700 / 600 respectively and owned by the same user as the parent home directory. It's easy for these to be set overly broad in provisioning scripts because the defaults are usually like 755 / 644. If those didn't exist at all and were created as root during cloud-init, they probably have the wrong ownership and permissions unless you are actually logging in directly as root. Which is a bad idea, and also you're probably getting blocked by your sshd_config denying root login. That makes sense. This issue, I've discovered, is actually impacting any machine that uses this particular key pair. I'm using centos to login and my ppk file hasn't changed. I had someone else try to login using their ppk from a completely different machine and they also get the same error. It makes me think that the key pair within AWS where it shows the fingerprint has become corrupt or some sort of weirdness. For one of the machines, I heard from someone else that they still have an active SSH session up so I'm asking them to send me the contents of /home/centos/.ssh/authorized_keys file so I can compare to the original public PEM.
|
# ¿ Aug 30, 2019 14:26 |
|
So thankfully someone still had an SSH connection up to one of the impacted machines. I was able to jump onto her session and figure out that the permissions of /home/centos must have been changed. Once I changed them to 751, everything works fine on that machine. Unfortunately, I still have one other machine in this same state. I've tried adding a script to user data to change the permissions of /home/centos to 751, but that didn't seem to help. Will user data allow you to script the change of file permissions, or is it somehow restricted from doing that as a security risk?
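A sketch of the user-data fix, assuming a CentOS AMI (so the centos user). cloud-init runs user-data as root, so chmod/chown themselves aren't restricted; the usual failure is the script never running or targeting the wrong path:

```shell
#!/usr/bin/env bash
# Restore the ownership and permissions sshd expects on the login
# user's home directory. The centos user/home is an assumption.

fix_ssh_perms() {
  local home=$1 user=$2
  chown -R "$user": "$home/.ssh"
  chmod 700 "$home/.ssh"
  chmod 600 "$home/.ssh/authorized_keys"
  chmod 751 "$home"
}

if [ "${1:-}" = "--run" ]; then
  fix_ssh_perms /home/centos centos
fi
```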
|
# ¿ Aug 30, 2019 16:09 |
|
AWS provided me with a script that had a bit more to it. Running that resolved the issue. On another note, Is there a way to return only elastic IPs that are unattached through the CLI? I see where I can do an `aws ec2 describe-addresses` to return all elastic IPs. If I have to, I can script around that but thought perhaps I'm missing an easier way to only return unattached EIPs. Writing a user-data script that will go find an unattached EIP and attach it to the EC2 instance when it starts.
|
# ¿ Sep 5, 2019 20:45 |
|
This is a voip telephony application running on the ec2 instances and our outbound carrier has to whitelist IPs to allow them to make calls.
|
# ¿ Sep 8, 2019 01:48 |
|
Cancelbot posted:Based on this; https://docs.aws.amazon.com/cli/latest/reference/ec2/describe-addresses.html This is what I ended up using: aws ec2 describe-addresses --region us-east-2 --query 'Addresses[?AssociationId==null]' | jq -r '.[] | .PublicIp' | head -n 1
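Tying that together into the boot-time script, roughly (region as above; error handling is minimal and this is a sketch, not a battle-tested flow):

```shell
#!/usr/bin/env bash
# Sketch of a boot-time "grab a free EIP" flow. The jq pipeline mirrors
# the query above but picks the AllocationId, which associate-address
# needs for VPC addresses.

first_free_allocation() {
  # from a describe-addresses response, pick the first unassociated
  # address's AllocationId (empty output if none are free)
  jq -r '[.Addresses[] | select(.AssociationId == null)][0].AllocationId // empty'
}

if [ "${1:-}" = "--run" ]; then
  region=us-east-2
  instance_id=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
  alloc=$(aws ec2 describe-addresses --region "$region" --output json \
            | first_free_allocation)
  [ -n "$alloc" ] || { echo "no free EIPs" >&2; exit 1; }
  aws ec2 associate-address --region "$region" \
    --instance-id "$instance_id" --allocation-id "$alloc"
fi
```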
|
# ¿ Sep 12, 2019 15:21 |
|
Is there a way to move files into an EFS directly from my desktop machine? Right now, I have to SCP the files up to an EC2 instance and then copy them over to the mounted EFS.
|
# ¿ Sep 26, 2019 20:10 |
|
I did see DataSync. That seems like the method I'll have to go with. To expand on my reason for needing this: I've set up our environment in the secure way that AWS suggests, with a bastion EC2 host in a public subnet and all of our EC2 machines in private subnets. The EFS storage is mounted on all the EC2 machines in the private subnets. So, if I want to transfer something up, I have to scp the files to the bastion host and then scp them from there over to the EC2 instance in the private subnets. I don't want to attach the EFS to the bastion, as the EFS may contain sensitive data that I wouldn't want accessible from a machine that's in a public subnet. I'll give DataSync a try and see how that does. The crap part is that you have to pay for the service, but it does seem very cheap (4 cents per GB of transfer).
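Depending on your SSH setup, one way to skip the double copy entirely is scp's ProxyJump through the bastion, which tunnels the transfer without the file ever landing on the bastion's disk (hostnames, users, and paths below are placeholders):

```shell
# Copy a file from a private instance's EFS mount straight to the
# desktop in one hop, tunnelling through the bastion. The file never
# touches the bastion's disk.
scp -o ProxyJump=centos@bastion.example.com \
    centos@10.100.96.10:/mnt/efs/logs/app.log ./app.log
```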
|
# ¿ Oct 1, 2019 16:08 |
|
|
|
I have an environment setup in AWS where I have a bastion instance in a public subnet and multiple other ec2 instances in private subnets. I have an EFS setup and mounted on all the machines in the private subnets. What is the best method for transferring files between my PC and the EFS? As it sits now, to get a file from the EFS to my local machine, I have to scp it out to the bastion instance and then scp it from the bastion back to my PC. I noticed AWS DataSync but that seems to be for copying huge swaths of data to an EFS in one fell swoop rather than transferring individual log files from time to time like I'm trying to do. Is there a better way than secure copying the file twice to get it back to my machine?
|
# ¿ Nov 22, 2019 21:15 |