|
I terminated the accounts for the CEO of the Japan office because his name landed on my terms sheet and we were told to do the terms on the sheet, no questions. The CEO of Japan was not meant to be terminated. It was a clerical error. I actually got brownie points for following orders, and I also wrote the logging and restore-from-logs scripts, so “heroically” getting him back online after I blew up his account made me look extra good. Work is silly.
|
# ? Jul 10, 2020 02:09 |
|
I deleted someone out of Exchange by accident which killed their AD account, but restoring from the AD recycle bin is trivial. Furiously googling how to restore from ad recycle bin.
|
# ? Jul 10, 2020 02:21 |
|
GreenNight posted:
I deleted someone out of Exchange by accident which killed their AD account, but restoring from the AD recycle bin is trivial.
We definitely enabled it after that. Man, I'm a gently caress-up.
|
# ? Jul 10, 2020 02:27 |
|
The only reason I ever enabled AD recycle bin was because someone in here hosed up and didn't have it enabled..
|
# ? Jul 10, 2020 02:44 |
|
Y’all have an AD recycle bin?
|
# ? Jul 10, 2020 03:58 |
|
yeah, it's an OU called "deleted users"
|
# ? Jul 10, 2020 03:58 |
|
Third week into my first IT job, having been given domain admin because I was a newly-certified MCSE and knew everything, I casually enabled DNS scavenging at the biggest client we had on a random Tuesday at 10 AM without having done any of the legwork to set the aging properties and make sure timestamps were actually being properly set on records. Like, say, all the records AD has defining where domain controllers are and how clients should reach domain resources. Which were all promptly scavenged. It's an easy fix, just restart Netlogon on the domain controllers and they'll reregister the records, but when the sysadmin calls and says "hey uh did you change anything recently because no one can log on", finding and implementing that easy fix is not a fun process.
Same client got bought by a much larger company a year later and they borged the Exchange mailboxes into their cluster and gave us Exchange admin rights (Exchange 2010). I was quite familiar with Exchange admin by that point, but I'd never worked in a situation like that, where there were actually different pieces of Exchange being used by different parts of the organization. Cue me, a few weeks later, editing the ActiveSync PIN rules for mobile devices... not paying attention to the fact those are in the ORGANIZATION section. As in, the entire 1500-person organization. Got a call an hour later from their main sysadmin asking "hey, uh, did you make a change to this because our CEO is asking why he has to enter a six-digit PIN on his phone" - loving pants-making GBS threads right there. I have a lot of gratitude for that sysadmin though, he was patient and asked nicely and accepted my profuse apologies, when he would have been well within his rights to tear me a new one and remove my Exchange access entirely.
But what mllaneza said is correct, poor judgment gives you experience which enables good judgment.
I've enabled DNS scavenging a million times since, but never without firmly checking that timestamps are being set properly (and I don't set scavenging to fire until after hours), and although I don't have to deal with Exchange anymore the wider lesson has stuck with me and I always pay attention to the scope of a change.
|
# ? Jul 10, 2020 04:48 |
|
Ghostlight posted:
yeah, it's an OU called "deleted users"
I laughed until I realized our OU is not even on the right functional level to enable AD recycle bin
|
# ? Jul 10, 2020 05:58 |
|
We've got Server 2008 DCs running functional level 2003. Tech debt? We have it! Amazingly enough we have a two-phase upgrade project in progress, with a serious hardware buy included. Security is going to have to start doing real work, instead of copying and pasting about 50 slides from the big deck of vulnerabilities every year.
|
# ? Jul 10, 2020 06:29 |
|
I accidentally did a rolling restart of 90% of the europe version of my saas product's microservices. twice. the second time it didn't come back on its own
Methanar fucked around with this message at 07:37 on Jul 10, 2020
|
# ? Jul 10, 2020 07:27 |
|
at my last job I don't even know how many disasters I caused one way or another. one of the last incidents was caused by me pushing a change to some logic in how our fanout video encoder processes register to the core. It just sends a callback URL that the service can be reached at, which the core uses for communication back to the fanout parts. did the change and it worked fine so I patted myself on the back for another job well done.
Except a few hours later everything burned to the ground when our cloud overflow capacity didn't turn on as it was meant to. When I had built the cloud overflow init stuff, I did it quite a bit differently from how the normal baseline DC stuff works. In particular an environment variable that exists in the DC isn't necessarily populated correctly in GCE, and that variable was a necessary part of how the callback URLs were constructed. So we had this great issue of unidirectional communication with the cloud overflow bits that came up but were non-functional. The fanout parts could speak to the core and register themselves, but the core couldn't speak back. This was Bad.
The pages started coming in when I was in the backseat of my boss's car with the entire rest of the SRE team as my boss was driving us back to long island from virginia.
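In hindsight, a cheap guard against that kind of thing is to make registration fail loudly when the variable is missing, instead of sending a half-built URL. A minimal shell sketch of the idea; `CALLBACK_HOST`, `CALLBACK_PORT`, the default port and the `/callback` path are all made-up names for illustration, not what that service actually used:

```shell
#!/bin/sh
# Hypothetical names throughout: CALLBACK_HOST, CALLBACK_PORT and the
# /callback path are illustrations, not the real service's settings.

# Print the callback URL, or fail with a message if the host is unset/empty.
build_callback_url() {
    # ${VAR:?msg} aborts the expansion when VAR is unset or empty.
    # The subshell keeps that abort from killing the calling script.
    (
        host=${CALLBACK_HOST:?CALLBACK_HOST is not set}
        port=${CALLBACK_PORT:-8080}
        printf 'http://%s:%s/callback\n' "$host" "$port"
    )
}
```

Calling `build_callback_url || exit 1` at startup would turn "registers fine, can never be called back" into a process that refuses to start, which is a lot easier to notice.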
|
# ? Jul 10, 2020 07:34 |
|
oh yeah and there was this thing that we'd ddos ourselves because in response to errors, we'd start logging a poo poo ton about all the errors and we'd kill our nat gateway and we'd then start logging the failure to log and just keep exponentially making the flood of traffic worse. that was only kind of my fault in the sense of I undersized the machine being used for nat.
Methanar fucked around with this message at 07:49 on Jul 10, 2020
|
# ? Jul 10, 2020 07:35 |
|
I once nearly caused a catastrophic outage that would have only become apparent during the next reboot of a large file server. There was a script that collected SMART information of all connected drives but there was some kind of problem with it on that server and as a debug step I wanted to write smartctl output into one file per device so I could diff it and see what was going wrong. So in effect:
for drive in /dev/disk/by-path/*; do smartctl -a $drive > $drive; done
|
# ? Jul 10, 2020 08:16 |
|
Antigravitas posted:
lmao
|
# ? Jul 10, 2020 08:18 |
|
To this day if you read the first few blocks on those drives you'll get parts of outdated smart data, doubtlessly confusing someone (probably myself) in the future. The reboot after fixing it was a harrowing experience.
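For anyone wondering what went wrong there: `> $drive` redirects into the device node itself, so each disk got its own SMART report written over its first blocks. What was presumably intended was one regular file per device. A sketch of that, where the output directory and the `.smart.txt` suffix are arbitrary choices for illustration:

```shell
#!/bin/sh
# Map a device path to a regular file inside a separate output directory,
# so the redirect can never point at the device node.
smart_report_path() {
    # $1 = output directory, $2 = device path
    printf '%s/%s.smart.txt\n' "$1" "$(basename "$2")"
}

# Write one SMART report per drive into the output directory.
dump_smart_reports() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    for drive in "$@"; do
        smartctl -a "$drive" > "$(smart_report_path "$outdir" "$drive")"
    done
}

# Usage (needs smartmontools and root):
#   dump_smart_reports /tmp/smart /dev/disk/by-path/*
```

Quoting `"$drive"` and building the output name from `basename` means the redirect target is always a plain file under the output directory.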
|
# ? Jul 10, 2020 09:27 |
Oh wow, I can absolutely feel the sinking feeling that you got when you did that. I remember it from when I did something like it with tcsh's 'foreach'.
|
|
# ? Jul 10, 2020 10:10 |
|
Reminds me of my earliest adventures in Linux backup attempts where I tried to make the backups of the OS on the same disk and fluffed the excluded directories in such a way that it simply ignored that flag and recursively backed up the backup directory forever.
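That failure mode (the backup directory living inside the tree being backed up) is easy to guard against by working out the exclusion from the paths instead of trusting a hand-typed flag. A rough sketch, assuming GNU tar; the function name and the archive naming scheme are invented for the example:

```shell
#!/bin/sh
# Archive the tree at $1 into a timestamped tar file under $2. If $2 lives
# inside $1, exclude it so the backup never backs up its own output.
# Assumes GNU tar; prints the archive path on success.
backup_tree() {
    src=$(cd "$1" && pwd -P) || return 1
    mkdir -p "$2" || return 1
    dest=$(cd "$2" && pwd -P) || return 1
    if [ "$dest" = "$src" ]; then
        echo "refusing: destination is the source directory" >&2
        return 1
    fi
    archive="$dest/backup-$(date +%Y%m%d-%H%M%S).tar"
    case $dest in
        "$src"/*)
            # Destination is below the source: exclude it by relative path.
            rel=${dest#"$src"/}
            tar -cf "$archive" -C "$src" --exclude="./$rel" . || return 1
            ;;
        *)
            # Destination is elsewhere: nothing to exclude.
            tar -cf "$archive" -C "$src" . || return 1
            ;;
    esac
    printf '%s\n' "$archive"
}
```

It does not handle a source of `/` or directory names containing glob characters, so treat it as an illustration of the check and not as a finished backup tool.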
|
# ? Jul 10, 2020 11:00 |
|
Computers are the mirrors that show us for the fools we are. Even with the monitor turned on.
|
# ? Jul 10, 2020 11:57 |
|
First day as a network engineer at my current position. I'm tasked with working on a project that involves us removing completely open trunks between devices and pruning them down to the vlans that are actually in use. Anyone who works with Cisco knows where this is going... I learned the difference between switchport trunk allowed vlan ADD and switchport trunk allowed vlan. Basically I cut off a huge segment of our company from the internet and our LAN for about 30 minutes until I could get to the site and reboot the switch. It's a mistake I've only made once.
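For anyone who hasn't been bitten yet: on Cisco IOS, `switchport trunk allowed vlan 10,20` replaces the trunk's entire allowed list with 10 and 20, while `switchport trunk allowed vlan add 10,20` appends to whatever is already there. One cheap safety net is to lint a change script before pasting it. A small shell sketch; the function names are made up, and it only distinguishes the `add` and `remove` keywords from everything else:

```shell
#!/bin/sh
# Hypothetical pre-paste lint for IOS config lines.
# Returns 0 (true) when a line would REPLACE a trunk's allowed-VLAN list
# rather than edit it in place with "add" or "remove".
replaces_allowed_vlans() {
    case $1 in
        *"switchport trunk allowed vlan add "*|*"switchport trunk allowed vlan remove "*)
            return 1 ;;   # incremental edit, existing VLANs are kept
        *"switchport trunk allowed vlan "*)
            return 0 ;;   # wholesale replacement of the list
        *)
            return 1 ;;   # unrelated line
    esac
}

# Print every risky line from a change file given as $1.
lint_change_file() {
    while IFS= read -r line || [ -n "$line" ]; do
        if replaces_allowed_vlans "$line"; then
            printf 'REPLACES LIST: %s\n' "$line"
        fi
    done < "$1"
}
```

A flagged line is not necessarily wrong, since sometimes replacing the list is the point. The lint just forces a second look before the trunk loses every VLAN that was not retyped.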
|
# ? Jul 10, 2020 13:13 |
|
Everything should have commit confirmed
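For anyone who hasn't used it: Junos `commit confirmed` applies the config and rolls it back automatically unless a follow-up commit arrives in time. The same idea can be faked for ad-hoc changes on boxes that lack it. A rough shell sketch; the function name, the marker-file convention and the commands passed in are all assumptions for illustration, not any vendor's mechanism:

```shell
#!/bin/sh
# apply_with_rollback APPLY_CMD ROLLBACK_CMD TIMEOUT_SECONDS CONFIRM_FILE
# Runs APPLY_CMD, then waits up to TIMEOUT_SECONDS for CONFIRM_FILE to
# appear. Returns 0 if confirmed, 1 after running ROLLBACK_CMD, and 2 if
# APPLY_CMD itself failed.
apply_with_rollback() {
    apply_cmd=$1
    rollback_cmd=$2
    timeout=$3
    confirm_file=$4

    rm -f "$confirm_file"               # a stale marker must not count
    sh -c "$apply_cmd" || return 2

    waited=0
    while :; do
        if [ -e "$confirm_file" ]; then
            rm -f "$confirm_file"
            return 0                    # operator confirmed, keep the change
        fi
        [ "$waited" -ge "$timeout" ] && break
        sleep 1
        waited=$((waited + 1))
    done

    sh -c "$rollback_cmd"               # nobody confirmed, undo the change
    return 1
}
```

The confirmation has to travel over the path the change could break, for example by touching the marker file from a fresh SSH session. If the change locked you out, the marker never appears and the rollback runs on its own.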
|
# ? Jul 10, 2020 13:27 |
|
I bricked a switch (as in return to manufacturer not "oops wrong config" (ps gently caress you Meraki)) from 1000 miles away once. Had to fly out with a spare the next morning. My first IT job was very seat of the pants like that but I learned a lot. I can't imagine just waltzing into some major company fresh out of Cisco bootcamp or whatever and touching big important network stuff, that's like buying your 16-year-old a BMW M3 instead of a beater Civic to learn on.
|
# ? Jul 10, 2020 13:45 |
|
I had domain admin creds when I was an intern. I was always nervous as poo poo that I'd gently caress everything up. But it was also eDirectory, Zenworks, GroupWise, etc.
|
# ? Jul 10, 2020 13:58 |
|
Wow, some of these stories are triggering uncomfortable memories and remind me why I got out of production infrastructure and moved into DevOps.
|
# ? Jul 10, 2020 14:22 |
|
Something every network guy has done at least once - cut off the branch you're sitting on. I had to barge into a board meeting with a laptop and a serial cable because the network switch that I had just accidentally dropped the wrong vlan on was in a closet only reachable from the meeting room. That was pretty awkward. I also had to spend an all-nighter restoring a database from tape because I ran a "drop users" in prod instead of test. After that I made sure the test and prod environments were firewalled off from each other, which in turn revealed that our developers had been using services in test for prod features to circumvent change control.
|
# ? Jul 10, 2020 14:25 |
|
The worst I've ever done was accidentally drop a table I was trying to truncate. But that was in a test environment and we were able to trash and refresh it while we were at lunch. The most instructive failures I witnessed taught me the importance of version control, and that you can get away with literally anything if the CTO is using you as an excuse to expense trips to strip clubs.
|
# ? Jul 10, 2020 14:29 |
|
Dravs posted:
Wow, some of these stories are triggering uncomfortable memories and remind me why I got out of production infrastructure and moved into DevOps.
Great, now you can code:
|
# ? Jul 10, 2020 14:46 |
|
Arquinsiel posted:
Reminds me of my earliest adventures in Linux backup attempts where I tried to make the backups of the OS on the same disk and fluffed the excluded directories in such a way that it simply ignored that flag and recursively backed up the backup directory forever.
actually reminds me of manually building a mirror in solaris by piping the partition table from the prod disk to the new mirror, getting the 2 drives confused (I was doing 2 drive replacements at the same time, and got the notes reversed) and overwriting the prod disk's partition table. I have never been so terrified of a reboot happening
|
# ? Jul 10, 2020 16:16 |
|
Dravs posted:
Wow, some of these stories are triggering uncomfortable memories and remind me why I got out of production infrastructure and moved into DevOps.
This doesn't absolve you of risk - my story earlier was puppet code that made a bad assumption about home directories being unique that resulted in significant irreplaceable data loss.
|
# ? Jul 10, 2020 16:39 |
|
Dravs posted:
Wow, some of these stories are triggering uncomfortable memories and remind me why I got out of production infrastructure and moved into DevOps.
devops isn't safer lol
|
# ? Jul 10, 2020 17:01 |
|
Yeah, devops just lets you automate your gently caress up at scale
|
# ? Jul 10, 2020 17:04 |
|
Emory University found that out the hard way: https://www.itprotoday.com/windows-78/aggressive-configmgr-based-windows-7-deployment-takes-down-emory-university
|
# ? Jul 10, 2020 18:24 |
Amazon US-East-1, brought down by automation.
|
|
# ? Jul 10, 2020 18:26 |
|
I migrated another client's database to an incoming new client's site once.
|
# ? Jul 10, 2020 18:32 |
|
If someone at work wants to get a rise out of me, all they have to do is quote the hapless "devops" guy who stole one of my projects, abandoned it, and was shocked (SHOCKED) that it needed maintenance, because:
This Fuckwit posted:
"But I automated the heck out of it!"
(Related, my previous boss emailed me the other day with a link to a plush Psyduck and the note, "You probably need this.")
|
# ? Jul 10, 2020 20:29 |
|
In the Windows 2000 era, one of the sysadmins at my employer, who was always irritated at login screens and thus made his daily driver account an enterprise administrator, forgot which RDP window had focus and shift-deleted the AD forest. Never seen my boss so happy, since his calendar was gone which de facto cancelled all meetings.
|
# ? Jul 11, 2020 00:29 |
|
Echoing everyone who has demonstrated that any good IT pro has some kind of history of loving everything up in some kind of way. It's seemingly the only way we learn. The worst people I've worked with are the ones who get burned by some stupid bullshit and either learn nothing at all or entirely the wrong lesson. My favourite was the dude who FOR YEARS insisted on typing certain commands by hand in specific ways because one time it didn't work, except it hadn't worked because of some entirely unrelated gently caress up prior to that step being run.
Mine was way back in 2012 when I was a junior, younger and far stupider. I was running my first system upgrade and thought I knew everything I ever needed to know. Anything that needed done I'd incorporate into my process and that'd be fine, including renaming ~16,000 users and migrating an entire healthcare DB (from SQL 2000 to SQL 2008). All in all my upgrade instructions were 150 steps. I learned my lesson about 60 steps in at 4am on a Saturday morning, after we'd already taken the healthcare software for ~1.2mil patients offline to begin the process, when one of my steps didn't work at all. All the remaining steps were of course dependent on this step completing so I was completely out of the designed process. In the end the senior guy I was working with was able to bail me out (by assisting in manually completing the bits that step would have done automatically) and we continued on to complete the upgrade approximately on time.
4 years later I was talking to people about what I'd learned from this process and how clearly the senior I'd been working with had been my safety net and knew what was going on, only for that person to explain that they had had no idea either and it was pure chance they'd been able to bail me out. From that singular experience I learned an awful lot about not overpromising, that I absolutely don't know everything and that nothing trumps making a go live boring and easy.
While I've screwed up plenty of things since, nothing has stuck with me as much or as long as that first upgrade.
|
# ? Jul 11, 2020 10:11 |
|
Dirt Road Junglist posted:(Related, my previous boss emailed me the other day with a link to a plush Psyduck and the note, "You probably need this.")
|
# ? Jul 11, 2020 11:29 |
|
Arquinsiel posted:
Is it a good Psyduck?
APAG
|
# ? Jul 11, 2020 13:14 |
|
Link this psyduck
|
# ? Jul 11, 2020 14:43 |
|
Schadenboner posted:
APAG
Yeah I don't know what I was thinking there.
|
# ? Jul 11, 2020 15:18 |