|
Wibla posted:So mystery partially solved: A switch is starting to fail, rebooted and somehow triggered an STP root change that made a lot of noise
gently caress spanning tree and gently caress even harder everyone who turns it on without bothering to configure root bridge priorities
As an aside, you should never be using STP in an environment where you don't have intentional redundant paths. It causes more problems than it fixes.
|
# ? Oct 20, 2022 20:37 |
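For anyone wondering why untuned root bridge priorities bite, here is a minimal sketch of the election rule: the lowest bridge ID (priority first, MAC address as the tie-breaker) wins. The switch names and MAC addresses are made up for illustration; only the default priority of 32768 and the lowest-ID-wins rule come from how spanning tree works.

```python
from dataclasses import dataclass

DEFAULT_PRIORITY = 32768  # default bridge priority when nobody configures one


@dataclass(frozen=True)
class Bridge:
    name: str  # hypothetical switch name
    mac: str   # hypothetical base MAC address
    priority: int = DEFAULT_PRIORITY

    def bridge_id(self):
        # Lower priority wins; the MAC address only breaks ties.
        return (self.priority, int(self.mac.replace(":", ""), 16))


def elect_root(bridges):
    """Return the bridge with the lowest bridge ID."""
    return min(bridges, key=Bridge.bridge_id)


switches = [
    Bridge("core-1", "00:50:56:aa:00:10"),
    Bridge("core-2", "00:50:56:aa:00:20"),
    Bridge("old-access", "00:01:02:03:04:05"),
]

# Everything on defaults: the lowest MAC wins, which is often the oldest box.
untuned_root = elect_root(switches).name

# Same network, but the intended core switch is given a lower priority.
tuned = [Bridge("core-1", "00:50:56:aa:00:10", priority=4096)] + switches[1:]
tuned_root = elect_root(tuned).name

print(untuned_root, tuned_root)
```

With defaults everywhere, the old access switch ends up as root purely because of its MAC address, so a reboot of that one box moves the root. Pinning a low priority on the switch you actually want as root takes the MAC lottery out of it.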
|
Don't you need some sort of spanning tree protocol turned on to do stuff like BPDU guard?
|
# ? Oct 20, 2022 20:40 |
|
You need STP for the unintentional redundant paths.
|
# ? Oct 20, 2022 20:53 |
|
Unexpected Raw Anime posted:gently caress spanning tree and gently caress even harder everyone who turns it on without bothering to configure root bridge priorities
This part of the network has been 'less extensively tested' after a hasty go-live a few years ago, so we still have a lot of smaller stuff to fix. Some of this poo poo is surfacing now because the oldest switches are 10-15 years old and have started to fail.
We're doing a bottom-up redesign and consolidation of our industrial networks, and this network is one of them, so we'll see how much effort we put in beyond me learning poo poo on the job while causing (intermittent) planned downtime.
|
# ? Oct 20, 2022 21:34 |
|
Thanks Ants posted:It's meant to sort that out automatically
iOS/iPadOS 15.6+ only, older builds will stay on basic auth without user intervention.
|
# ? Oct 20, 2022 21:48 |
|
Blocked basic auth with conditional access over a year ago when they gave the No Really We Mean It This Time deadline, and worked through the handful of exceptions that the logging called out. There is no excuse for any org to have missed the deadline, that’s just rank incompetence.
|
# ? Oct 20, 2022 22:12 |
I'm trying to do a prioritization exercise. We are trying to go through a list of requests and determine which ones are most important. We defined our goal as X. We defined "Impact" as how the change is likely to improve X. We defined "Risk" as how likely the change is to cause a new problem that reduces Impact. Our goal is to identify those with high Impact and low Risk. That gives us our prioritized list of changes. I spent all day yesterday going through our (incredibly lovely) ticketing system to get all the enhancement requests that could help us reach X. Now the very same people who agreed to our definitions are not happy with the definitions, and want to do X and Y both. Usually I don't mind this so much, it's all part of the game, but these days the flip-floppers are coming from within my own team. It's starting to wear me down.
|
|
# ? Oct 20, 2022 22:24 |
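The high-Impact, low-Risk ranking described above can be sketched in a few lines. Everything here is an assumption for illustration: the 1 to 5 scales, the `impact - risk` score, the tie-break on lower risk, and the ticket names are not from the post.

```python
from dataclasses import dataclass


@dataclass
class Request:
    ticket: str  # hypothetical ticket ID
    impact: int  # assumed scale: 1 (low) to 5 (high) improvement to goal X
    risk: int    # assumed scale: 1 (low) to 5 (high) chance of a new problem


def prioritize(requests):
    """Sort by highest (impact - risk); on ties, prefer the lower risk."""
    return sorted(requests, key=lambda r: (-(r.impact - r.risk), r.risk))


backlog = [
    Request("REQ-B", impact=5, risk=4),
    Request("REQ-D", impact=3, risk=3),
    Request("REQ-A", impact=5, risk=1),
    Request("REQ-C", impact=2, risk=1),
]

ranked = [r.ticket for r in prioritize(backlog)]
print(ranked)
```

Writing the scoring rule down like this does not stop people from changing their minds, but it does make it obvious that adding a second goal Y means a second Impact score per ticket, not a reshuffle of the existing one.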
|
devmd01 posted:Blocked basic auth with conditional access over a year ago when they gave the No Really We Mean It This Time deadline, and worked through the handful of exceptions that the logging called out. There is no excuse for any org to have missed the deadline, that’s just rank incompetence.
Funny enough, certain Azure 3rd party enterprise applications are going to fail and the error isn't going to be very clear. For example, a helpdesk person was doing some Apple integrations with their MacBook management application and auth was failing over and over again. The details of the auth only said "interrupted". No other alerts or error codes existed. I, on a total loving hunch, looked at the log and saw "single factor auth". Knew right away after that.
Azure AD sign-in logs should be registering new errors about "unsupported authentication" or some poo poo, not just "interrupted". Because in the case of orgs who have been blocking basic auth through conditional access policies, tracking down this issue was clear. This? Not so much.
|
# ? Oct 20, 2022 22:49 |
|
Nothing better than half our GCP artifactory images failing to pull on a Friday. All our repos are federated so we can at least tell folks to pull cross cloud but there’s no rhyme or reason I can tell why some of these images pull successfully and some don’t. Adding insult to injury, the error message is a completely incorrect message about untrusted certificates, which I just know is going to waste a few hours with support.
|
# ? Oct 28, 2022 18:12 |
|
Can someone please explain why when someone accidentally clicked reply all, there is always that one person who has to reply with something like "Think about how much time has been wasted, replying to these emails"?
|
# ? Oct 28, 2022 18:51 |
People are pedantic know it all fucks sometimes
|
|
# ? Oct 28, 2022 19:51 |
|
The Iron Rose posted:Nothing better than half our GCP artifactory images failing to pull on a Friday. All our repos are federated so we can at least tell folks to pull cross cloud but there’s no rhyme or reason I can tell why some of these images pull successfully and some don’t.
Have the failing ones somehow managed to end up with a broken chain of trust?
|
# ? Oct 28, 2022 20:46 |
|
joebuddah posted:Can someone please explain why when someone accidentally clicked reply all, there is always that one person who has to reply with something like
I have only had power over a person like this once and it was one of the more satisfying days I ever had. They seemed crushed by the fact that I dressed them down for being an rear end in a top hat.
|
# ? Oct 28, 2022 21:07 |
|
Arquinsiel posted:Have the failing ones somehow managed to end up with a broken chain of trust?
I’d be surprised since there are good and bad images in the same repo. Some I can pull by SHA, some by tag, others no. The little docker icon identifying an image as an image is missing on all the bad ones so I’m more inclined to think something fucky with replication or the cache. Thankfully this is exactly why we have multiple repos we can fail back to and pull from.
We ran into this GitHub issue a few weeks back so I think it’s the same faulty error message at hand here: https://github.com/containerd/containerd/issues/6097
|
# ? Oct 28, 2022 21:21 |
|
That's a real interesting and fucky error.
|
# ? Oct 28, 2022 22:00 |
|
Unexpected Raw Anime posted:as an aside, you should never be using STP in an environment where you don't have intentional redundant paths. It causes more problems than it fixes.
Is it weird that in thirty years of networking I have never turned on STP?
|
# ? Oct 28, 2022 23:09 |
|
Agrikk posted:Is it weird that in thirty years of networking I have never turned on STP?
I haven't either, but that's because it's on by default for all the switches I use.
|
# ? Oct 28, 2022 23:14 |
|
Github actions is making me miss having gitlab, gently caress
e: snip
Potato Salad fucked around with this message at 03:06 on Oct 31, 2022
|
# ? Oct 31, 2022 03:02 |
|
Potato Salad posted:Github actions is making me miss having gitlab, gently caress
gitlab is pretty great.
|
# ? Oct 31, 2022 03:11 |
|
Actuarial Fables posted:I haven't either, but that's because it's on by default for all the switches I use.
I actually had to turn it off on a couple new replacement switches I obtained in the last year. It was causing a rather lengthy delay in users getting an IP address when first connecting to the network.
|
# ? Oct 31, 2022 19:23 |
|
PremiumSupport posted:I actually had to turn it off on a couple new replacement switches I obtained in the last year. It was causing a rather lengthy delay in users getting an IP address when first connecting to the network.
Enable portfast on the user facing ports instead. It allows the switch to skip the Listening/Learning steps and go straight to Forwarding.
|
# ? Oct 31, 2022 21:46 |
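To put rough numbers on the delay being discussed, here is a small sketch. It assumes the classic spanning tree default forward delay of 15 seconds for each of the Listening and Learning states; the 10 second client timeout is a made-up figure, and real switches and DHCP clients vary.

```python
FORWARD_DELAY_S = 15  # classic default forward delay per state (assumed here)


def seconds_until_forwarding(portfast: bool,
                             forward_delay: int = FORWARD_DELAY_S) -> int:
    """Time a freshly connected port spends before it forwards traffic."""
    if portfast:
        return 0  # Listening and Learning are skipped
    return 2 * forward_delay  # Listening, then Learning


def dhcp_works_first_try(portfast: bool, client_timeout_s: int) -> bool:
    """True if the port is forwarding before the client gives up."""
    return seconds_until_forwarding(portfast) < client_timeout_s


CLIENT_TIMEOUT_S = 10  # hypothetical first-attempt DHCP timeout

print(seconds_until_forwarding(portfast=False))                # 30
print(dhcp_works_first_try(False, CLIENT_TIMEOUT_S))           # False
print(dhcp_works_first_try(True, CLIENT_TIMEOUT_S))            # True
```

The point of portfast over disabling spanning tree outright is that, on the platforms I know of, the port still takes part in spanning tree and can react if it starts seeing BPDUs, so you keep the loop protection and lose only the wait.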
|
Filthy Lucre posted:Enable portfast on the user facing ports instead. It allows the switch to skip the Listening/Learning steps and go straight to Forwarding.
It was quicker and easier to turn off STP. We're not a complicated network and nobody but me dares touch a network cable anyway so STP is not needed.
Edit: they're also all user facing ports except the single one being used for uplink.
|
# ? Nov 1, 2022 16:46 |
|
Arquinsiel posted:That's a real interesting and fucky error.
The issue ended up being missing manifest.json files, which was caused because we had to delete the underlying storage for one of our clouds a while back before we realized we could orphan services using PVCs. Federated repos only update new files by default though, so images that hadn’t been updated in time didn’t get replacement manifests but did get the image itself apparently. Regardless, running a full sync fixed it nicely.
|
# ? Nov 1, 2022 16:49 |
|
PremiumSupport posted:It was quicker and easier to turn off STP. We're not a complicated network and nobody but me dares touch a network cable anyway so STP is not needed.
You say this... In my younger years I worked in an office with unmanaged (and thus, no STP) switches. Some end user managed to accidentally bridge their wireless adapter with the wired one on their notebook and subsequently brought down the entire network when they docked their computer.
|
# ? Nov 1, 2022 17:25 |
|
Yeah, not having STP on at endpoint switches is asking for pain, IMO.
|
# ? Nov 1, 2022 17:27 |
|
The Iron Rose posted:The issue ended up being missing manifest.json files, which was caused because we had to delete the underlying storage for one of our clouds a while back before we realized we could orphan services using PVCs. Federated repos only update new files by default though, so images that hadn’t been updated in time didn’t get replacement manifests but did get the image itself apparently. Regardless, running a full sync fixed it nicely.
So the manifest.json files were telling things where to check the certs and lacking updates they were pointing to certs that didn't exist?
|
# ? Nov 1, 2022 17:56 |
|
Arquinsiel posted:So the manifest.json files were telling things where to check the certs and lacking updates they were pointing to certs that didn't exist?
Not quite. This is a docker manifest, which has information about layers, size, and the digest. It can also give information about the OS and CPU arch an image was built for. This file was missing from the repositories we were trying to download from, so containerd failed to get the layers and download the image.
The error returned from artifactory to containerd *should* have been “invalid image”, and if we were using docker as our container runtime it would have been the error we got. However, due to the aforementioned GitHub issue I linked, artifactory instead told containerd it was a cert issue, even though it was nothing of the sort.
The Iron Rose fucked around with this message at 20:48 on Nov 1, 2022
|
# ? Nov 1, 2022 18:05 |
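For readers who have not looked inside one, here is a rough sketch of what a Docker image manifest (schema version 2) carries and why a pull cannot proceed without it. The digests and sizes are placeholders, and `layers_to_fetch` is a hypothetical helper for illustration, not anything containerd or artifactory actually exposes.

```python
import json


def placeholder_digest(ch: str) -> str:
    # Fake digest for illustration only; real ones are SHA-256 of the blob.
    return "sha256:" + ch * 64


# A trimmed-down image manifest with placeholder values.
MANIFEST_JSON = json.dumps({
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "config": {
        "mediaType": "application/vnd.docker.container.image.v1+json",
        "size": 1500,
        "digest": placeholder_digest("c"),
    },
    "layers": [
        {"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 1000, "digest": placeholder_digest("a")},
        {"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 2500, "digest": placeholder_digest("b")},
    ],
})


def layers_to_fetch(raw_manifest):
    """Return (digest, size) pairs a client would need to download."""
    if raw_manifest is None:
        # No manifest means no list of layers, even if the blobs exist.
        raise LookupError("manifest missing: image cannot be resolved")
    manifest = json.loads(raw_manifest)
    return [(layer["digest"], layer["size"]) for layer in manifest["layers"]]


plan = layers_to_fetch(MANIFEST_JSON)
total_bytes = sum(size for _, size in plan)
print(len(plan), total_bytes)
```

This lines up with the symptom described: the layer blobs can be sitting in the repo, but without the manifest there is nothing that says which blobs make up the image, so the pull fails before any layer is requested.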
|
Internet Explorer posted:Yeah, not having STP on at endpoint switches is asking for pain, IMO.
100%. Let's set a scene. A computer and IP phone were just removed from a cube. The tech left the cat5 cable that was connected to the phone dangling from its port. The other end of the cable just happens to be lying next to the port that the computer was plugged into. How long before a well-intentioned yet ignorant worker sees the disconnected cable conspicuously next to a hole it clearly fits in, and decides that must be the reason the printer on the other side of the cubicle wall "isn't working"?
Trick question - it already happened as you were reading this and now your day is hosed
|
# ? Nov 1, 2022 20:35 |
|
KillHour posted:100%. Let's set a scene. A computer and IP phone were just removed from a cube. The tech left the cat5 cable that was connected to the phone dangling from its port. The other end of the cable just happens to be lying next to the port that the computer was plugged into. How long before a well-intentioned yet ignorant worker sees the disconnected cable conspicuously next to a hole it clearly fits in, and decides that must be the reason the printer on the other side of the cubicle wall "isn't working"?
I'd have to take off both my shoes to use my fingers and toes to count how many times users looped the network using the PC port on a VOIP device.
|
# ? Nov 1, 2022 21:38 |
|
Yuuuuup. It's also one of those things where you get to run around like an idiot trying to figure out what's going on. Because if you don't have STP enabled on end-user facing ports, you definitely don't have things set up to easily be able to track down looping.
|
# ? Nov 1, 2022 21:40 |
|
I know in my case I had to sniff traffic to find the offending mac address, track down the vendor and only blind luck made it so there were only a handful of those in the office to physically look at. Took me about 45 minutes from onset of symptoms to tracking down what happened and with whom. Not too bad, but it could have been much worse.
|
# ? Nov 1, 2022 22:31 |
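The sniff-and-look-up-the-vendor approach described above boils down to two steps: find the source MAC that dominates the capture, then map its first three octets (the OUI) to a manufacturer. A minimal sketch follows; the capture data and the vendor table are invented for illustration, and a real lookup would use the IEEE registry or whatever your capture tool provides.

```python
from collections import Counter

# Hypothetical OUI-to-vendor table; real prefixes come from the IEEE registry.
VENDORS = {
    "00:11:22": "ExamplePhoneCo",
    "66:77:88": "ExampleLaptopCo",
}


def top_talker(source_macs):
    """Return (mac, frame_count) for the most frequent source MAC."""
    return Counter(source_macs).most_common(1)[0]


def oui(mac: str) -> str:
    """First three octets of a MAC address, normalized to aa:bb:cc form."""
    return mac.lower().replace("-", ":")[:8]


# Source MACs as they might be exported from a packet capture (made up).
capture = (
    ["00:11:22:33:44:55"] * 500
    + ["66:77:88:99:aa:bb"] * 3
    + ["66:77:88:00:00:01"] * 7
)

mac, frames = top_talker(capture)
vendor = VENDORS.get(oui(mac), "unknown vendor")
print(mac, frames, vendor)
```

As the post says, this only narrows it down to a vendor, so you still need luck or an inventory to get from "some phone" to a specific desk. Managed switches with MAC address tables shorten that last step considerably.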
somebody fidgeting in a conference room brought our entire office's network down because they plugged an ethernet cable's two ends into the same port. it really seems like something that should be more idiot proof.
|
|
# ? Nov 1, 2022 22:57 |
|
It is, if you have STP on and configured properly.
|
# ? Nov 1, 2022 22:58 |
|
Internet Explorer posted:It is, if you have STP on and configured properly.
But they said no more grunge rock in the conference rooms.
|
# ? Nov 1, 2022 23:14 |
|
An ip phone nearly took down one of our critical networks a few years ago. Not a good time
|
# ? Nov 1, 2022 23:16 |
|
Polio Vax Scene posted:somebody fidgeting in a conference room brought our entire office's network down because they plugged an ethernet cable's two ends into the same port. it really seems like something that should be more idiot proof.
Uhm... they got both ends of the cable into the port at once? How?
|
# ? Nov 1, 2022 23:32 |
|
That reminds me of back when I did IT support for a university campus of ~500 students. It was pretty calm usually, except when someone plugged in their router backwards every now and then, and inadvertently created a rogue DHCP server that took the network down. Until we scrambled to find the offending outlet and shut it down, then map it back to a dorm number, and have a stern talking-to with the resident. The only other thing I really did was forward copyright infringement letters to idiots who used public Torrents, instead of the underground student DC++ warez network.
|
# ? Nov 1, 2022 23:51 |
|
Arquinsiel posted:Uhm... they got both ends of the cable into the port at once? How?
I'm figuring it's more along the lines of a plate with two physical ports labeled something like 43A and 43B.
|
# ? Nov 2, 2022 00:21 |
|
Proteus Jones posted:I'm figuring it's more along the lines of a plate with two physical ports labeled something like 43A and 43B.
Yeah, this was my assumption as well.
|
# ? Nov 2, 2022 01:13 |
|
Someone doing that with a network cable is how our department took over switching. Person was at the staging line BSing with a tech, saw a loose cable, and plugged it into a jack and then headed out the door to lunch. I was called to see what was going on, I could get to our router, but nothing past it, and I could see the uplink port was going wild. I couldn't get an answer out of anyone if something had changed there. Thankfully I finally got to someone who recalled that cable being plugged in. They unplugged it, and things calmed down. This took our helpdesk down for about 30 minutes.
Later had it happen to a switch at a remote site, and I caused it. This location still had the old HP switches and I was onsite to turn up gear and a person was helping me reconnect all the cables. It was a giant mess on the floor, and we didn't notice we plugged in both ends until he noticed a local server there dropped. Thankfully, no one was on that switch yet.
CitizenKain fucked around with this message at 02:09 on Nov 2, 2022
|
# ? Nov 2, 2022 02:06 |