|
I'm told that true redundancy means you should be able to pull power, fiber or network out of any port at any time with no interruption to service, and I'm like yeah why don't you get your hands the hell off my infrastructure.
|
# ? Apr 19, 2016 13:37 |
|
The worst that I can recall was out in a village in Western Alaska, where I'd gone to fix one of our satellite internet connections in an office and reconfigure the little ASA we had with a new static IP. Some kids had climbed under the building (99% of buildings in that part of Alaska are on gravel pads with adjustable metal frame foundations), cut our cables, and done a bunch of other damage. I re-peaked the satellite dish and started running new cable but couldn't find the previous hole that had been drilled from the foundation through the floor; for some reason they hadn't put an access pipe in this building. Anyway, I go ahead and drill a new hole, have a little bit of difficulty but manage to punch through, run my cables and get everything fixed. Coincidentally, the phones in that office quit working. Later I found out that after the kids had vandalized the cables and the building foundation, someone had come by and put up new plywood under the building, covering the existing cables and holes so I couldn't see them. I drilled right through all of their phone lines. Thankfully this is Western Alaska, so you can't get much more relaxed: the employees were happy to have their internet back and used their cell phones in the meantime. I finished quick, which meant a guy I knew took me for a spin and we got to explore an old abandoned White Alice antenna site.
|
# ? Apr 19, 2016 13:57 |
|
MC Fruit Stripe posted: I'm told that true redundancy means you should be able to pull power, fiber or network out of any port at any time with no interruption to service, and I'm like yeah why don't you get your hands the hell off my infrastructure.
It was fun testing the SONAS PoC with an IBM engineer a few years back, and just walking over to the breaker panel and flipping switches under heavy I/O load to see what would happen.
|
# ? Apr 19, 2016 14:16 |
|
My favorite non-failover is enough packet loss to gently caress a VPN, but not enough to fail over to another provider.
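For what it's worth, that gap tends to show up when failover is keyed on link state rather than on measured loss. Here's a minimal Python sketch of a loss-based trigger, assuming you already have some probe (ping, HTTP check, whatever) feeding it results. The class name `LossMonitor`, the 100-probe window, and the 2% threshold are all made up for illustration, not what any particular vendor does:

```python
from collections import deque


class LossMonitor:
    """Track recent probe results and flag sustained packet loss."""

    def __init__(self, window: int = 100, max_loss: float = 0.02) -> None:
        # Hypothetical defaults: look at the last 100 probes, tolerate 2% loss.
        self.results = deque(maxlen=window)
        self.max_loss = max_loss

    def record(self, reply_received: bool) -> None:
        """Store one probe outcome; the oldest outcome drops off when full."""
        self.results.append(reply_received)

    def loss(self) -> float:
        """Fraction of probes in the window that got no reply."""
        if not self.results:
            return 0.0
        return self.results.count(False) / len(self.results)

    def should_fail_over(self) -> bool:
        """True only once the window is full and loss exceeds the threshold."""
        window_full = len(self.results) == self.results.maxlen
        return window_full and self.loss() > self.max_loss
```

A link that drops 5 probes out of 100 never loses link state, but it would trip `should_fail_over()` here. Waiting for a full window before deciding is one way to avoid flapping on a couple of unlucky probes; whether that trade-off suits a given circuit is a judgment call.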
|
# ? Apr 19, 2016 14:23 |
|
In no particular order of favorites:
Applied over-zealous CoPP policy to all switches in the metro area, caused about 3 million people to connect but not browse our metro-wifi for 8 hours
Applied router config to an ASA due to accidentally right clicking in putty for biggest oil company in America, crashed their intranet
Thought I was network jesus and solved some issue with a load balancer for a big fragrance company, ended up causing a spanning-tree loop and crashed their network during peak hours
Applied a NAT exemption on an ASA which somehow corrupted the flash then synced the corruption to the standby ASA, both crashed and needed remote hands to resolve. Screwed up payroll for some companies I think, not the coolest outage.
Most of those were done at an MSP that didn't have change control or maintenance windows for their clients. After having both of those things for the past 4 years, I can't imagine not having them.
Sepist fucked around with this message at 15:00 on Apr 19, 2016
|
# ? Apr 19, 2016 14:57 |
|
go3 posted: My favorite non-failover are enough packet loss to gently caress a VPN, but not enough to fail to another provider
gently caress this so much. I experienced that hell once before and spent the better part of a week troubleshooting it, driving myself insane, before I realized it was AT&T's lovely connection.
|
# ? Apr 19, 2016 14:57 |
|
My best (almost) fuckup was issuing a shutdown command to a core SAP server I was cleaning /var on, instead of the server we were replacing the CPU on. It was caused by me having WAY too many putty sessions open to way too many servers (I was running 3 concurrent hardware maintenances). Fortunately, I realized what I'd done, and Solaris can take so long to acknowledge a shutdown command that I was able to abort in time to save it.
|
# ? Apr 19, 2016 15:09 |
|
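Sepist posted: Applied router config to an ASA due to accidentally right clicking in putty for biggest oil company in america, crashed their intranet
This is why you always change the configuration of Putty from right-click-paste to right-click-menu.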
|
# ? Apr 19, 2016 15:12 |
|
I once restarted an entire rack of Cisco switches on accident because they were stacked and I had not learned to slow the gently caress down yet. Brought down the entire network. It was down for as long as it takes to reboot a stack (which isn't that long) and people just assumed it was an internet hiccup. But I knew......
Collateral Damage posted: This is why you always change the configuration of Putty from right-click-paste to right-click-menu.
As much as I am a die hard command line purist for all things Cisco, I refuse to use the IOS to configure an ASA. I might dump an initial config that way, but anything beyond that is going to be through the ASDM thing. Firewall configuration is just too loving long on IOS.
Sickening fucked around with this message at 15:20 on Apr 19, 2016
|
# ? Apr 19, 2016 15:16 |
|
I don't mind the CLI for the ASA. CLI on the FIC sitting in front of a UCS is definitely where you don't want to use CLI though.
|
# ? Apr 19, 2016 15:24 |
|
Hmm what are all these incidents about running out of inodes? Oh well if it's anything important the ops guy will call. Moral: never assume someone else is doing his job.
|
# ? Apr 19, 2016 15:34 |
|
I was trying to figure out how to fix a problem on a Dev server. I didn't take a snapshot because I was dumb. I'd replaced about five files, and remembered to save copies of four of them. Things broke, and the one file I didn't back up was the problem. I logged into the Prod server, copied the file and fixed dev. About five minutes later, Prod went down. Fast forward an hour, I got in touch with someone more senior. Turns out the database had coincidentally filled up right at that exact time. I totally thought I was gonna get fired.
|
# ? Apr 19, 2016 16:06 |
|
Dr. Arbitrary posted: I was trying to figure out how to fix a problem on a Dev server. I didn't take a snapshot because I was dumb.
This is why, even though I back up the original, I also comment out what I am changing and add a new line.
E: \/\/\/\/\/ Is there such a thing as too much safeguarding against losing your config files?
RFC2324 fucked around with this message at 16:22 on Apr 19, 2016
|
# ? Apr 19, 2016 16:13 |
|
I too wrote my own 0-line revision control system. It's really discoverable, I have 45 versions of the same config file in this directory with names like haproxy.cfg.20160316-2-working
|
# ? Apr 19, 2016 16:17 |
|
Vulture Culture posted: I too wrote my own 0-line revision control system. It's really discoverable, I have 45 versions of the same config file in this directory with names like haproxy.cfg.20160316-2-working
Get on my old coworker's level. He'd create all those lovely backup files... and then check them into source control along with the actual running config
|
# ? Apr 19, 2016 17:42 |
|
Vulture Culture posted: I too wrote my own 0-line revision control system. It's really discoverable, I have 45 versions of the same config file in this directory with names like haproxy.cfg.20160316-2-working
This is exactly what I do: rename it to the same name and add .yyyymmdd-#
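If you're going to do the .yyyymmdd-# thing anyway, it can at least be scripted so the numbering doesn't collide. A rough Python sketch, assuming the backups live next to the original file; `next_backup_path` and `backup` are names invented for this example, and real revision control is still the better answer:

```python
import datetime
import pathlib
import shutil
from typing import Optional


def next_backup_path(config: pathlib.Path, today: datetime.date) -> pathlib.Path:
    """Return the first unused '<name>.yyyymmdd-N' path next to config."""
    stamp = today.strftime("%Y%m%d")
    n = 1
    while True:
        candidate = config.with_name(f"{config.name}.{stamp}-{n}")
        if not candidate.exists():
            return candidate
        n += 1


def backup(config: pathlib.Path, today: Optional[datetime.date] = None) -> pathlib.Path:
    """Copy config to a dated, numbered sibling file and return the copy's path."""
    target = next_backup_path(config, today or datetime.date.today())
    # copy2 keeps timestamps and permission bits where the platform allows it.
    shutil.copy2(config, target)
    return target
```

Calling `backup(pathlib.Path("haproxy.cfg"))` twice on the same day would leave a `-1` and a `-2` copy beside the original. This copies rather than renames, so the live config stays in place while you edit it.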
|
# ? Apr 19, 2016 19:29 |
|
Ohhhh the crap I just pulled on a folder full of installs. Recreated here on some generically named MSIs, but this was a fun "why is every file in the folder called the same thing now?" moment as I realized I'd be piecing that back together.
So I'm looking around in the list of files trying to find the executable it wants me to update, and if I click on any one of them by accident, I RENAME IT! Oh no, it's been renamed *nothing*, it was some kind of important system file, and the computer crashes!
|
# ? Apr 19, 2016 21:18 |
|
MC Fruit Stripe posted:
Woah, blast from the past.
|
# ? Apr 19, 2016 21:25 |
|
It's what I thought of immediately, I was like, how am I this guy (worth mentioning that this actually happened, I didn't just suddenly have an urge to rehash a 10 year old joke we've all heard)
|
# ? Apr 19, 2016 22:08 |
|
Vulture Culture posted: This is true, but this annoys me because nobody understands partial failures or how they work in the real world. Like yeah, sure, pulling this fiber port and sending a physical link loss signal downwire to the adapter is great and all, but that's a really easy failure to recover from. What happens when one controller just stops responding to I/O? What happens when you start getting checksum errors on data across your link? What happens when data is flowing down the wire at 2% the speed it's supposed to?
This is my favorite thing in the world to demonstrate to people. Standing anywhere in a site talking to them about redundancy, reaching over blindly and just yanking a power cord out of something. (Or doing the PoC stuff with a specific vendor, same discussion, and same action. Or just popping out a line card. The more they squirm the less I trust them. It's more about their reaction than what the hardware actually does.)
"What are you doing?!"
"I DON'T KNOW!"
|
# ? Apr 19, 2016 23:08 |
|
MC Fruit Stripe posted: You're being facetious, but I've never been nervous at work, I don't think. Someone should regale us with a story about a time that you thought, terrified, to yourself, "well that's it, I'm getting fired". As long as you're okay now and we can look back and laugh. If you're posting from the library, please don't.
Long time ago, probably 1998 or so, I had just sold my small business and got my first 'real' corporate America job as the Sr. network engineer working for Xylan in Calabasas, CA. Day one was all HR stuff, meet the staff etc. Day two I'm reading through the Xylan manuals and taking stock of inventory. One main office, 16 US remote locations, 17 out-of-country remote locations, all brick and mortar, all in a spoke and hub configuration, no remote offices had internet. All Xylan products, switches, routers, everything. Xylan had a nifty little feature where you could pull configurations from remote routers with a single command. I entered that command. Every one of our 33 remote locations went offline within the next 4 minutes. I sat there as the monitoring software (intermapper) started BLARING the star trek alert sound in the data center, certain I was going to be frogmarched right out the door. Turns out there was a bug in the software build the core router was using that caused it to push its config to all the remote endpoints, not pull theirs. That wasn't supposed to even be possible, and wasn't documented at all.
|
# ? Apr 20, 2016 00:42 |
|
Collateral Damage posted: This is why you always change the configuration of Putty from right-click-paste to right-click-menu.
You can do this? Holy poo poo. Because gently caress accidentally pasting commands from your clipboard.
|
# ? Apr 20, 2016 01:09 |
|
MC Fruit Stripe posted: Ohhhh the crap I just pulled on a folder full of installs. Recreated here on some generically named MSIs, but this was a fun "why is every file in the folder called the same thing now?" moment as I realized I'd be piecing that back together.
Sometimes, ctrl-z can fix this. Windows is very strange about it, though, and sometimes it doesn't.
|
# ? Apr 20, 2016 02:58 |
|
MC Fruit Stripe posted: You're being facetious, but I've never been nervous at work, I don't think. Someone should regale us with a story about a time that you thought, terrified, to yourself, "well that's it, I'm getting fired". As long as you're okay now and we can look back and laugh. If you're posting from the library, please don't.
Here's a wall of text:
Iraq, 2003. I noticed a corrupt backup, and then proceeded to overwrite system files with the corrupt backup instead of the other way around (I still blame the awful GUI for that one). Knocked out military phone/data communications for our site and supporting sites, so pretty much all coalition forces operating in the city.
Since I just put our platoon out of work, I was volunteered for convoy duty to obtain a replacement drive. Drive up there was uneventful, but in the meantime, a prominent Shiite cleric was assassinated. You know those Walter Mitty types that swear they'd kick some hijacker rear end if it was them on the plane? Well, they're in Iraq too. The town is emptied out as everyone's taken to the streets in a show of force. AKs unslung, sandbag emplacements with RPKs on roofs, meanwhile we're just nudging our trucks through the crowds to get back to camp. Again, this is 2003, so our humvees have canvas for doors and covering instead of armor.
Kids start showing off, first by trying to put their finger in the barrel of our rifles, then once they realized we weren't going to start a massacre, the older ones get emboldened and start reaching inside the humvee. I'm trying to play this off the best I can, grabbing their hands and shaking them while pushing them out, while the driver is frantically shouting to do something to get them to stop. Well, when you're surrounded by a mob of angry people, and some of them can empty a drum magazine into your canvas roof, turns out there's not much you can do that doesn't play out like Black Hawk Down.
So I spent a few agonizing hours contemplating the notion that if anyone dies, it's on me. We all made it back though. On the bright side, I did see a kid in the crowd, no more than 10, wearing a "legalize weed" t-shirt, complete with smiling anthropomorphic pot leaf. How he got that shirt was later a topic of much speculation.
0/10, would not try to replace backups again.
|
# ? Apr 20, 2016 04:07 |
|
Anyone care to follow that?
|
# ? Apr 20, 2016 04:24 |
|
Where's the poo poo? Every good Army story involves poo poo in some way, shape, or form.
|
# ? Apr 20, 2016 04:25 |
|
"Cthulhuite, your backups from four weeks ago are borked. You must enter...the Gauntlet!" That story, Contingency, holy poo poo. I doubt there's many IT folk who can say that they've made a mistake which could have literally cost them their life.
|
# ? Apr 20, 2016 04:26 |
|
Vulture Culture posted: This is true, but this annoys me because nobody understands partial failures or how they work in the real world. Like yeah, sure, pulling this fiber port and sending a physical link loss signal downwire to the adapter is great and all, but that's a really easy failure to recover from. What happens when one controller just stops responding to I/O? What happens when you start getting checksum errors on data across your link? What happens when data is flowing down the wire at 2% the speed it's supposed to?
My favourite's one my boss told me about, where a multimode fiber cable broke on only one fiber core, so we saw everything as being up and fine while the site was down from their side.
|
# ? Apr 20, 2016 04:36 |
|
psydude posted: Where's the poo poo? Every good Army story involves poo poo in some way, shape, or form.
Sorry, no poo poo, only Shiites.
|
# ? Apr 20, 2016 04:50 |
|
Tbh, backing up corruption sounds pretty normal for the US in the third world.
|
# ? Apr 20, 2016 05:43 |
|
22 Eargesplitten posted: Tbh, backing up corruption sounds pretty normal for the US in the third world.
Contingency posted: Here's a wall of text:
Something something metaphor.
|
# ? Apr 20, 2016 05:48 |
|
MC Fruit Stripe posted:
Vilerat
|
# ? Apr 20, 2016 05:53 |
|
SaltLick posted:Vilerat
|
# ? Apr 20, 2016 06:14 |
|
Agrikk posted:You can do this? Holy poo poo.
|
# ? Apr 20, 2016 08:38 |
|
SaltLick posted: Vilerat
anyone but Hillary.
|
# ? Apr 20, 2016 12:23 |
|
Not gently caress-up related, but our RPG programmer who's been with the company for 30 years announced his retirement at the end of the year. Thanks for giving us 8 months' notice, but holy poo poo it's gonna be fun interviewing people for an RPG job. We're asking him if he wants to get paid $150/hr after retirement for project work and as a knowledge base.
|
# ? Apr 20, 2016 13:04 |
|
Anti tank or anti air RPG programmer?
|
# ? Apr 20, 2016 13:13 |
|
Swink posted: Anti tank or anti air RPG programmer?
http://www.rpgmakerweb.com
This poo poo looks easy. Where my $150 at?
|
# ? Apr 20, 2016 13:15 |
|
Even more dangerous - AS/400. Also needs to know the differences between v3 and v4 RPG and this: https://en.wikipedia.org/wiki/IBM_i_Control_Language
|
# ? Apr 20, 2016 13:15 |
|
At the end of the day an AS/400 is just another computer. Finding someone full-time may be difficult. You could try finding a consultant and outsourcing?
|
# ? Apr 20, 2016 13:52 |