Amazon Web Services - Cloud Giant Hits Hard - The Something Awful Forums


Pollyanna: Mar 5, 2005; Milk's on them.

Has anyone here used DMS to migrate their Mongo database to DocumentDB? One of my migration tasks is horribly slow (which makes sense, the table is 190gb large) and it has a nasty tendency to silently fail partway through the multihour process. I tried changing the task settings with the CLI to enable debug logs, but it just gives me an "invalid task settings JSON" message when I try. Any way to get details on why it's a bad input?

# ? Aug 26, 2019 22:57


Cancelbot: Nov 22, 2006; Canceling spam since 1928

Agrikk posted:

-snip-

Also: get your sleep and eat well the day before and have stuff to snack on during your on-site loop. The five hours [or whatever it is these days] can be a grueling affair and you are best to be well-rested and well nourished.

Awesome! Thanks for that as I've been very focused on having everything answered "right" but our TAM can't know everything, but he knows who to ask to find out and that's what I imagine they want to see.

Also on the rest thing; AWS are paying for my travel and stay to London and have recommended I sleep in a hotel the night before, so I'd be insane to not take it up. Otherwise it's a 5am train which isn't a good idea.

I'll take note of the snacks too, am I allowed to take in notes? I know it's not an exam but most interviews it's only my CV/résumé that's in front of me, but they've done all that in the screening process.

# ? Aug 26, 2019 23:06

Agrikk: Oct 17, 2003; Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

You can bring in anything you want, but you are under NDA so you might not be able to take any notes out of the interview space.

It's enforced only when we talk about future roadmap type stuff but know that it does happen.

# ? Aug 26, 2019 23:51

Arzakon: Nov 24, 2002; "I hereby retire from Mafia"
Please turbo me if you catch me in a game.

Cancelbot posted:

Awesome! Thanks for that as I've been very focused on having everything answered "right" but our TAM can't know everything, but he knows who to ask to find out and that's what I imagine they want to see.

Don't focus on knowing stuff. At most, 2 of your interviewers will be spending only part of your interview on technical skills. If your interviewers aren't poo poo it won't be "recite this man page" and will instead focus on having you tell interesting technical stories from your background and digging into how well you actually know what you volunteered as something you know about. You might get some of the interviewers' favorite networking/OS/database fundamentals questions, but if you don't know them already, cramming probably isn't going to get you there.

Your stories are much more important and you need to be ready for follow-up questions on the scope of your role, actions you took, results you achieved, impact on your business, what you could have done differently, etc, etc. All of your interviewers will be assessing this.

Good luck!

# ? Aug 27, 2019 00:30

Agrikk: Oct 17, 2003; Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

^^^ this as well.

If you have "migrated from on-prem to a datacenter" at the top of your resume you can be very certain that I am going to grill you on it:

- what was your role?
- how did you contribute?

Not "your team" but "you". If the next word out of your mouth isn't "I..." then I'm now mad at you, and as your interviewer that's a bad thing.

- what did success look like?
- what would you do different next time? (What did you learn from this experience?)

It is really apparent to us if you were an active driver of the engagement or if you were a water-carrier.

# ? Aug 27, 2019 02:14

Cancelbot: Nov 22, 2006; Canceling spam since 1928

During my technical screen I was told the same thing: they directed me towards putting things more in "I" terms, as I did indeed do things and lead people, etc. They told me I was being too diplomatic about it; of course a platform migration is a "we" thing, but they only care about what I did to make it a success. Fortunately I have been a driver of change throughout my career, but British politeness & interviews tend to be anti self-promotion, so the ESM told me to be a massive show-off for the onsite.

Cancelbot fucked around with this message at 09:04 on Aug 27, 2019

# ? Aug 27, 2019 09:00

necrobobsledder: Mar 21, 2005; Lay down your soul to the gods rock 'n roll; Nap Ghost

"My role..."
"Thank you for applying to Amazon. We'll be in touch"
"Uh...."

I've done like... 6 migrations or extensions of different applications and platforms but had different roles each time so it's hard for me to pick a particular one. Each one is a distinct war movie with different actors, different resources, etc. and because they were across time and so many met failures because the companies were ailing my interviews unintentionally sound like I'm this guy I've realized.

"Entire production databases were lost"
"We have flattened production accounts 5 times before and have become exceedingly efficient at it"

# ? Aug 27, 2019 09:33

Pollyanna: Mar 5, 2005; Milk's on them.

Pollyanna posted:

Has anyone here used DMS to migrate their Mongo database to DocumentDB? One of my migration tasks is horribly slow (which makes sense, the table is 190gb large) and it has a nasty tendency to silently fail partway through the multihour process. I tried changing the task settings with the CLI to enable debug logs, but it just gives me an "invalid task settings JSON" message when I try. Any way to get details on why it's a bad input?

Figured this out: AWS expects you to append file:// to reference a local file for input, unlike literally every other program I've used. gently caress off, AWS.
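On the "why is it a bad input" part: one cheap way to get details is to parse the settings locally before handing them to the CLI. A minimal sketch, assuming plain Python and its standard json module; the function name and the sample settings fragment are only for illustration. Feeding it a bare path shows roughly what a JSON parser sees when the file:// prefix is missing.

```python
import json

def explain_json_error(text):
    """Return None if text parses as JSON, else a message with line and column."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"{err.msg} at line {err.lineno}, column {err.colno}"
    return None

# A bare path is not JSON, so the parser rejects it at the first character.
print(explain_json_error("/tmp/task-settings.json"))

# An illustrative settings fragment that does parse.
print(explain_json_error('{"Logging": {"EnableLogging": true}}'))
```

This only checks that the text is well-formed JSON; it says nothing about whether DMS accepts the keys inside it.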

# ? Aug 27, 2019 18:01

The Fool: Oct 16, 2003

Are you sure? this looks kinda weird

code:

/home/fool/butts.jsonfile://

# ? Aug 27, 2019 18:04

Pollyanna: Mar 5, 2005; Milk's on them.

s/ap/pre

# ? Aug 27, 2019 18:06

Startyde: Apr 19, 2007; come post with us, forever and ever and ever

If it makes you feel any better, it's also sometimes fileb:// !
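For anyone else tripping over this, here is a rough sketch of the convention being discussed: no prefix means the string itself is the value, file:// means "read this file as text", and fileb:// means "read this file as raw bytes". This is a toy imitation written for illustration, not the CLI's actual code, and resolve_param is a made-up name.

```python
import tempfile
from pathlib import Path

def resolve_param(value):
    """Imitate a file://-aware CLI deciding what a parameter value means."""
    if value.startswith("fileb://"):
        return Path(value[len("fileb://"):]).read_bytes()
    if value.startswith("file://"):
        return Path(value[len("file://"):]).read_text()
    return value  # no prefix: the string itself is treated as the value

with tempfile.TemporaryDirectory() as tmp:
    settings = Path(tmp) / "settings.json"
    settings.write_text('{"Logging": {"EnableLogging": true}}')
    print(resolve_param(str(settings)))           # the path string, unchanged
    print(resolve_param(f"file://{settings}"))    # the file's text
    print(resolve_param(f"fileb://{settings}"))   # the file's raw bytes
```

The first call is the failure mode from the DMS question: the path itself gets passed along as if it were the JSON.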

# ? Aug 28, 2019 01:38

PierreTheMime: Dec 9, 2004; Hero of hormagaunts everywhere!; Buglord

For AWS Step Functions, is there a way to allow other tasks to continue if a failure occurs in a parallel process and then just fail the parallel task "container" after everything else is done? I've got a series of file movements, extracts, etc. going followed by a simple ETL ingestion (Glue) in three parallel flows and I'd rather not have the whole thing die if an ETL task fails, since it clips what would otherwise be other perfectly valid running processes. I can sequester all the Glue tasks into a secondary parallel task, but then I'm losing out on some available processing time due to some file sets taking longer than others.

Do I just have to have all the jobs have a catch that touches off a simple "blank" task and then end? If I do, I don't want the whole parallel task to succeed.
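One way to sketch the "catch into a blank task, then fail at the end" idea: each branch catches its own errors and ends on a Pass state that marks the branch as failed, and a Choice state after the Parallel state inspects the markers and routes to a Fail state. The Python below only builds the state machine definition as a dict. The state names, the one-task-per-branch simplification, and the resource ARNs are all placeholders, and I have not run the result against the service.

```python
import json

def guarded_branch(name, resource):
    """Branch whose task failure is caught and recorded instead of raised."""
    return {
        "StartAt": name,
        "States": {
            name: {
                "Type": "Task",
                "Resource": resource,
                "Catch": [{"ErrorEquals": ["States.ALL"],
                           "ResultPath": "$.error",
                           "Next": f"{name}Failed"}],
                "Next": f"{name}Ok",
            },
            # Both endings emit the same shape so the later Choice can read it.
            f"{name}Ok": {"Type": "Pass",
                          "Result": {"task": name, "failed": False},
                          "End": True},
            f"{name}Failed": {"Type": "Pass",
                              "Result": {"task": name, "failed": True},
                              "End": True},
        },
    }

def state_machine(tasks):
    """tasks: mapping of branch name -> task resource ARN (placeholders)."""
    names = list(tasks)
    # A Parallel state outputs an array with one element per branch.
    any_failed = [{"Variable": f"$[{i}].failed", "BooleanEquals": True}
                  for i in range(len(names))]
    return {
        "StartAt": "RunAll",
        "States": {
            "RunAll": {"Type": "Parallel",
                       "Branches": [guarded_branch(n, tasks[n]) for n in names],
                       "Next": "AnyFailed"},
            "AnyFailed": {"Type": "Choice",
                          "Choices": [{"Or": any_failed, "Next": "FailRun"}],
                          "Default": "Done"},
            "FailRun": {"Type": "Fail", "Error": "BranchFailed",
                        "Cause": "At least one parallel branch failed"},
            "Done": {"Type": "Succeed"},
        },
    }

print(json.dumps(state_machine({"EtlA": "arn:placeholder:a",
                                "EtlB": "arn:placeholder:b"}), indent=2))
```

Because every branch ends normally, the other branches keep running; the overall failure is decided only once all of them have finished.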

# ? Aug 28, 2019 03:41

deedee megadoodoo: Sep 28, 2000; Two roads diverged in a wood, and I, I took the one to Flavortown, and that has made all the difference.

You could just remove that entire process from your step function and trigger it another way like triggering the lambda via API calls. Or pushing messages to a queue. That's how I'd handle a fragile process like that.

# ? Aug 28, 2019 15:20

Bhodi: Dec 9, 2007; Oh, it's just a cat.; Pillbug

Does anyone have any blogs they recommend on implementing federated IAM roles from directory services user groups via hashicorp vault's oidc (or another open source product)? We've got short lived CLI assumed role tokens working (this is neato) and SSH keys too but the oauth piece isn't nearly as well documented. We're trying to make directory services the source of truth. I found a blog from several years ago which talks about some of it but nothing recent. Difficulty: govcloud, otherwise I'd look at adfs.

I'm also investigating pre-shared keys as a second factor or yubi or some other method of having people not have to type in passwords. Gettin' real tired of loading up my google authenticator multiple times a day. There's a pretty astonishing lack of detail on the backend infra of implementing SSO for use with AWS for both console and ec2 instances and workspaces - mostly, I assume, because AWS doesn't offer a native solution and doesn't want to play favorites.

Related: Is there really no way to share workspaces? Not even if we use directory services and pinky-swear not to assume a user role? Since they take several minutes to boot, are there convenient ways to turn them on/off on a timer (lambda?) and provision them for specific users with an env as part of an onboarding process?

Bhodi fucked around with this message at 03:51 on Aug 29, 2019

# ? Aug 29, 2019 03:42

Cancelbot: Nov 22, 2006; Canceling spam since 1928

Got my TAM interview confirmation with all the associated details now, next Friday is my loop day.

Two questions;
1. Why do AWS folk refer to the interviews as "loops"?
2. The TAM test is... suspiciously easy. I don't think I can mention the contents but Agrikk this seems trap-like (but also not a trap). I appreciate I'll get an hour dedicated to the decisions I've made but I thought it'd be meatier.

# ? Aug 29, 2019 21:19

Agrikk: Oct 17, 2003; Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

It's not a trap. AWS doesn't play interview games. Our questions are straightforward and you won't see "why is a man hole cover round?" questions anywhere.

If you found it easy then congratulations: you are exactly who we are looking for.

Edit:
And they are loops because interviews are ultimately an iterative process: if we like you and think you have Amazon qualities, we will keep cycling you through positions until we find a fit for you.

But due to the revamped screening process, a loop is more of a straight line these days.

Agrikk fucked around with this message at 21:42 on Aug 29, 2019

# ? Aug 29, 2019 21:38

The Fool: Oct 16, 2003

Cancelbot posted:

1. Why do AWS folk refer to the interviews as "loops"?

Not an AWS person, but whenever I hear people talk about it I think of "feedback loops" where each interview is a loop, and the results of the previous one are fed into the next one.

# ? Aug 29, 2019 21:40

Scrapez: Feb 27, 2004

Anyone ever run into the issue of receiving "Server refused our key" when attempting to login to a machine? It worked fine a couple days ago and everyone with access to the machine swears they've not been in it but I receive that when trying to login with the appropriate key now.

AWS provides this as a resolution of placing code into user data to write the public key into the authorized_keys file:
https://aws.amazon.com/premiumsupport/knowledge-center/ec2-server-refused-our-key/

This does not work for me. I thought perhaps the server didn't have the cloud-init package installed so I put `yum -y install cloud-init` into user data and started the instance and then stopped and tried the adding of the key again to no avail.

I'm at a bit of a loss here. The only thing I can think of is that the /home/centos/.ssh/authorized_keys file is somehow corrupt.

Anyone have any ideas? Of course I did not take an AMI of the machine when it was in a healthy state as I should have.

# ? Aug 29, 2019 23:31

deedee megadoodoo: Sep 28, 2000; Two roads diverged in a wood, and I, I took the one to Flavortown, and that has made all the difference.

You can take an image now then fire up an instance using that AMI and use a known good key.

# ? Aug 29, 2019 23:58

Scrapez: Feb 27, 2004

deedee megadoodoo posted:

You can take an image now then fire up an instance using that AMI and use a known good key.

I tried that and still got the same error. I did use the same key when I launched it. I guess I could try it with a different key.

Edit: Created a new key. Launched an instance using the AMI of the "bad server" and still receive "Server refused our key" when attempting to login.

I've tried using both cloud-init and a general bash script to copy the public key into the /home/centos/.ssh/authorized_keys file and neither seem to work. Would that indicate that the file could be corrupt? user-data executes as root, right? So it shouldn't be a permissions issue? Not that I changed permissions on that file or directory structure anyway. This just happened out of the blue seemingly for no reason.

Scrapez fucked around with this message at 03:03 on Aug 30, 2019

# ? Aug 30, 2019 01:51

Docjowles: Apr 9, 2009

Riffing on your corrupt file theory, openssh is (rightly) very paranoid about file permissions. So maybe the .ssh dir or authorized_keys file is being created with inappropriate ownership or permissions? It should be 700 / 600 respectively and owned by the same user as the parent home directory. It's easy for these to be set overly broad in provisioning scripts because the defaults are usually like 755 / 644. If those didn't exist at all and were created as root during cloud-init, they probably have the wrong ownership and permissions unless you are actually logging in directly as root. Which is a bad idea, and also you're probably getting blocked by your sshd_config denying root login.

Also a simple thing, but make sure you are using the right username for your AMI.
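A minimal sketch of the fix described above, assuming it runs with enough privilege to change the files (as root from a provisioning script, for example). The function name is made up, and the uid/gid arguments are optional because chown needs root. The demo uses a throwaway directory rather than a real home.

```python
import os
import stat
import tempfile

def tighten_ssh_perms(home, uid=None, gid=None):
    """Set ~/.ssh to 700 and authorized_keys to 600, optionally fixing ownership."""
    ssh_dir = os.path.join(home, ".ssh")
    keys = os.path.join(ssh_dir, "authorized_keys")
    for path, mode in ((ssh_dir, 0o700), (keys, 0o600)):
        os.chmod(path, mode)
        if uid is not None and gid is not None:
            os.chown(path, uid, gid)  # needs root, e.g. in a boot-time script
    return [stat.S_IMODE(os.stat(p).st_mode) for p in (ssh_dir, keys)]

# Demo against a scratch directory with the overly broad defaults.
with tempfile.TemporaryDirectory() as home:
    os.mkdir(os.path.join(home, ".ssh"))
    os.chmod(os.path.join(home, ".ssh"), 0o755)
    keys = os.path.join(home, ".ssh", "authorized_keys")
    open(keys, "w").close()
    os.chmod(keys, 0o644)
    print([oct(m) for m in tighten_ssh_perms(home)])
```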

Docjowles fucked around with this message at 08:27 on Aug 30, 2019

# ? Aug 30, 2019 08:23

Scrapez: Feb 27, 2004

Docjowles posted:

Riffing on your corrupt file theory, openssh is (rightly) very paranoid about file permissions. So maybe the .ssh dir or authorized_keys file is being created with inappropriate ownership or permissions? It should be 700 / 600 respectively and owned by the same user as the parent home directory. It's easy for these to be set overly broad in provisioning scripts because the defaults are usually like 755 / 644. If those didn't exist at all and were created as root during cloud-init, they probably have the wrong ownership and permissions unless you are actually logging in directly as root. Which is a bad idea, and also you're probably getting blocked by your sshd_config denying root login.

Also a simple thing, but make sure you are using the right username for your AMI.

That makes sense. This issue, I've discovered, is actually impacting any machine that uses this particular key pair. I'm using centos to login and my ppk file hasn't changed. I had someone else try to login using their ppk from a completely different machine and they also get the same error. It makes me think that the key pair within AWS where it shows the fingerprint has become corrupt or some sort of weirdness.

For one of the machines, I heard from someone else that they still have an active SSH session up so I'm asking them to send me the contents of /home/centos/.ssh/authorized_keys file so I can compare to the original public PEM.

# ? Aug 30, 2019 14:26

Blinkz0rz: May 27, 2001; MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

If nothing works, consider spinning up a new instance and mounting the old volume on it to make sure the right key is set in authorized_keys and the file has the right permissions.

# ? Aug 30, 2019 15:53

Scrapez: Feb 27, 2004

So thankfully someone still had an SSH connection up to one of the impacted machines. I was able to jump onto her session and figure out that the permissions of /home/centos must have been changed. Once I changed them to 751, everything works fine to that machine.
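That fits how sshd behaves with StrictModes on (the default): it refuses key auth when the home directory or the key files are writable by group or others, and 751 clears both of those write bits. A rough sketch of that check, assuming you can get at the filesystem some other way (an existing session, or the volume mounted on another instance). It is a simplification that looks only at the write bits and ignores ownership, and loose_paths is a made-up name.

```python
import os
import stat
import tempfile

def loose_paths(home):
    """Return the paths under home that are writable by group or others."""
    candidates = [
        home,
        os.path.join(home, ".ssh"),
        os.path.join(home, ".ssh", "authorized_keys"),
    ]
    bad = []
    for path in candidates:
        if os.path.exists(path) and stat.S_IMODE(os.stat(path).st_mode) & 0o022:
            bad.append(path)
    return bad

# Demo: a group-writable home is flagged, 751 is not.
with tempfile.TemporaryDirectory() as home:
    os.mkdir(os.path.join(home, ".ssh"))
    os.chmod(os.path.join(home, ".ssh"), 0o700)
    os.chmod(home, 0o775)
    print(loose_paths(home))
    os.chmod(home, 0o751)
    print(loose_paths(home))
```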

Unfortunately, I still have one other machine in this same state. I've tried adding a script to user data to change permissions of /home/centos to 751 but that didn't seem to help. Will user data allow you to script the change of file permissions or is it somehow restricted from doing that as it could be a security risk?

# ? Aug 30, 2019 16:09

Cancelbot: Nov 22, 2006; Canceling spam since 1928

Couldn't you use SSM to get in? SSM doesn't rely on SSH IIRC as long as it's running the agent + has the IAM role, but you mentioned centos so there's a risk you don't have it. I don't know enough about userdata to know what user/permissions it has.

# ? Sep 4, 2019 08:51

deedee megadoodoo: Sep 28, 2000; Two roads diverged in a wood, and I, I took the one to Flavortown, and that has made all the difference.

I'm sure you've figured it out by now but yes, your user data scripts execute as root. A common use case is to put configuration steps in the user data so instances can finalize their setup on start. In our use case the script downloads an ansible playbook on startup and executes that.

# ? Sep 4, 2019 12:59

Scrapez: Feb 27, 2004

AWS provided me with a script that had a bit more to it. Running that resolved the issue.

On another note, Is there a way to return only elastic IPs that are unattached through the CLI? I see where I can do an `aws ec2 describe-addresses` to return all elastic IPs. If I have to, I can script around that but thought perhaps I'm missing an easier way to only return unattached EIPs.

Writing a user-data script that will go find an unattached EIP and attach it to the EC2 instance when it starts.
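As far as I know, an address that is attached carries an AssociationId in the describe-addresses output and a free one does not, so the filtering can be done on the response itself. A minimal sketch, assuming the response has already been fetched (for example the JSON printed by `aws ec2 describe-addresses`, or the dict returned by boto3's describe_addresses); the sample data and the function name are made up.

```python
def unattached_eips(response):
    """Return allocation IDs (or the public IP if there is none) with no association."""
    return [
        addr.get("AllocationId", addr.get("PublicIp"))
        for addr in response.get("Addresses", [])
        if "AssociationId" not in addr
    ]

# Made-up sample in the shape of a describe-addresses response.
sample = {
    "Addresses": [
        {"PublicIp": "198.51.100.10", "AllocationId": "eipalloc-aaa",
         "AssociationId": "eipassoc-111", "InstanceId": "i-123"},
        {"PublicIp": "198.51.100.11", "AllocationId": "eipalloc-bbb"},
    ]
}
print(unattached_eips(sample))
```

The CLI's --query option (JMESPath) can apply the same kind of test client-side if you would rather not script around it.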

# ? Sep 5, 2019 20:45

fluppet: Feb 10, 2009

You'll want to look at filtering possibly network-interface-id
https://docs.aws.amazon.com/cli/latest/reference/ec2/describe-addresses.html

Why are you using elastic ips over just using the ip that the instance comes up with?

# ? Sep 5, 2019 21:07

necrobobsledder: Mar 21, 2005; Lay down your soul to the gods rock 'n roll; Nap Ghost

I think the EIPs in those situations are primarily used to facilitate security groups/ firewalls that live outside one's VPCs. Having to do some back and forth tickets across companies to open up a firewall rule is a massive headache so I've oftentimes reserved a few EIPs, asked for exceptions for the lot of them, and moved on in life.

It's usually better to reference other security groups in your rules if possible but that is just not feasible right now and we resort to IP-based ACLs and such.

For this scenario, it sounds like the desired UX is something like DHCP for EIPs. I wrote something similar by having an external service based on DynamoDB claim / lock an EIP and ran a periodic job that cleaned up the locks when instances were terminated. That may be overkill but it kept me from giving way too many permissions to my instance profiles.
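The shape of that claim/lock idea, as a toy sketch: a table keyed by EIP records which instance holds it, claiming takes the first free entry, and a periodic job frees entries whose holder no longer exists. This is an in-memory stand-in I made up to show the flow, not the service described above; a real version would keep the table in DynamoDB and make the claim a conditional write so two instances cannot grab the same address.

```python
class EipLeases:
    """In-memory stand-in for a lock table keyed by EIP allocation ID."""

    def __init__(self, pool):
        self.owner = {eip: None for eip in pool}

    def claim(self, instance_id):
        """Give the instance the first free EIP, or None if the pool is exhausted."""
        for eip, holder in self.owner.items():
            if holder is None:
                self.owner[eip] = instance_id  # real table: conditional write
                return eip
        return None

    def reap(self, live_instances):
        """Periodic cleanup: free locks held by instances that no longer exist."""
        freed = [eip for eip, holder in self.owner.items()
                 if holder is not None and holder not in live_instances]
        for eip in freed:
            self.owner[eip] = None
        return freed

leases = EipLeases(["eipalloc-aaa", "eipalloc-bbb"])
print(leases.claim("i-1"), leases.claim("i-2"), leases.claim("i-3"))
print(leases.reap({"i-2"}))   # i-1 is gone, so its address is freed
print(leases.claim("i-3"))
```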

# ? Sep 6, 2019 00:57

Docjowles: Apr 9, 2009

Yeah exactly regarding why use an EIP. If the public IP matters and needs to survive your instance being rebuilt for whatever reason (external white lists being the biggie) you're gonna want that EIP.

# ? Sep 6, 2019 04:17

Cancelbot: Nov 22, 2006; Canceling spam since 1928

Just got back from TAM interview - what an exhausting and exhilarating experience. I think I only flubbed a question once; it was mid-answer where I got asked about RAID which isn't my strong suit and I think I mixed 0 and 1 up. But the rest of the day I feel like I did relatively well. I told my stories, didn't repeat myself and went deep at the right times. I should know next week if it was enough.

# ? Sep 6, 2019 20:13

Agrikk: Oct 17, 2003; Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

The interviews are rough, yep.

I hope the right thing happens for you!

# ? Sep 7, 2019 18:31

Scrapez: Feb 27, 2004

This is a voip telephony application running on the ec2 instances and our outbound carrier has to whitelist IPs to allow them to make calls.

# ? Sep 8, 2019 01:48

Cancelbot: Nov 22, 2006; Canceling spam since 1928

I loving got it! :woop:

I'll be a Senior TAM in 2 months (notice period boo).

Cancelbot fucked around with this message at 12:51 on Sep 10, 2019

# ? Sep 10, 2019 12:49

PierreTheMime: Dec 9, 2004; Hero of hormagaunts everywhere!; Buglord

Cancelbot posted:

I loving got it! I'll be a Senior TAM in 2 months (notice period boo).

Congratulations!

Step Function question for the thread: What's the best ASL for creating a catch that can send the failed task name and any error in a notification? I've played around with it a bit but for whatever reason playing with outputs and paths with $ variables just hasn't clicked for me yet.

I have a few parallel tasks that I want to alarm if they fail, but want to catch the failure so it doesn't stop the other tasks.

I also need to figure out how to eventually fail the whole parallel run at the end if one or more tasks failed.
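On the paths question, a sketch of one possible catch-and-notify shape, written as Python that emits the ASL so the pieces are easy to see. The Catch uses ResultPath to keep the original input and hang the error object (Error and Cause) under $.error; the follow-up state publishes to SNS, with the task name baked in as a static Subject and the message pulled from the input via a key ending in .$. State names, ARNs, and the choice of $.error.Cause as the message are assumptions for illustration, and this has not been run against the service.

```python
import json

def task_with_alarm(name, resource, topic_arn):
    """States for one task: on any error, publish to SNS, then end the branch."""
    return {
        name: {
            "Type": "Task",
            "Resource": resource,
            "Catch": [{
                "ErrorEquals": ["States.ALL"],
                # Keep the input; add {"Error": ..., "Cause": ...} under $.error.
                "ResultPath": "$.error",
                "Next": f"Notify{name}Failed",
            }],
            "End": True,
        },
        f"Notify{name}Failed": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sns:publish",
            "Parameters": {
                "TopicArn": topic_arn,
                "Subject": f"{name} failed",    # static text: the task name
                "Message.$": "$.error.Cause",   # ".$" key: value read from the input
            },
            "ResultPath": "$.notification",
            "End": True,
        },
    }

states = task_with_alarm("LoadOrders", "arn:placeholder:task", "arn:placeholder:topic")
print(json.dumps(states, indent=2))
```

Since the notify state ends the branch normally, the sibling branches are left alone. Failing the whole run afterwards needs a separate check once the Parallel state has finished, which this sketch does not cover.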

# ? Sep 10, 2019 12:58

Internet Explorer: Jun 1, 2005

Cancelbot posted:

I loving got it! I'll be a Senior TAM in 2 months (notice period boo).

That's really awesome. Congrats!

# ? Sep 10, 2019 13:43

Agrikk: Oct 17, 2003; Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

Cancelbot posted:

I loving got it! I'll be a Senior TAM in 2 months (notice period boo).

Congratulations Cancelbot!

:yotj:

# ? Sep 10, 2019 14:10

Adhemar: Jan 21, 2004; Kellner, da ist ein scheussliches Biest in meiner Suppe.

Cancelbot posted:

I loving got it! I'll be a Senior TAM in 2 months (notice period boo).

Congrats!

What org will you be in?

# ? Sep 10, 2019 17:04

Cancelbot: Nov 22, 2006; Canceling spam since 1928

Enterprise support I think... I've got the contracts & background check coming this week so I'll know for sure, but AFAIK it's in Enterprise.

# ? Sep 10, 2019 19:29


Arzakon: Nov 24, 2002; "I hereby retire from Mafia"
Please turbo me if you catch me in a game.

Congrats and shout when you get tired of your pager and want to be a lazy Solutions Architect, Agrikk never returns my calls.

# ? Sep 11, 2019 01:54
