|
my homie dhall posted:I have no idea how they have convinced people that their product, which totally dumps decades of investment and research in memory management, is ready for prime time

it’s because linux is poo poo at managing swap, so you are better off not having swap, eating the OOMs, and moving pods off the overloaded node to another one. distributed systems are designed around fail-stop and swap turns fail-stop into fail-slow. it doesn’t help for the scenarios k8s is deployed into. that’s why k8s didn’t support swap originally. idk why it’s starting to—demand from people like you maybe? I won’t be enabling it
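(for reference: when kubernetes did eventually add swap support, it was opt-in behind the NodeSwap feature gate, alpha in 1.22. a hedged sketch of what enabling it looks like in the kubelet config; field names per the alpha KEP, so check them against your kubelet version)

```yaml
# KubeletConfiguration fragment (illustrative; NodeSwap alpha, k8s 1.22)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false            # kubelet normally refuses to start if swap is on
featureGates:
  NodeSwap: true
memorySwap:
  swapBehavior: LimitedSwap  # alpha also offered UnlimitedSwap
```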
|
# ? Dec 10, 2021 23:03 |
|
|
# ? May 28, 2024 16:16 |
|
Nomnom Cookie posted:it’s because linux is poo poo at managing swap so you are better off not having swap, eating the OOMs, and moving pods off the overloaded node to another one. distributed systems are designed around fail-stop and swap turns fail-stop into fail-slow. it doesn’t help for the scenarios k8s is deployed into. that’s why k8s didn’t support swap originally. idk why it’s starting to—demand from people like you maybe? I won’t be enabling it

I don't think it's a matter of fail-stop vs fail-slow; not having swap is strictly worse in low-memory conditions. you're going to be paging to disk no matter what, and you probably want it to be the memory a bunch of programs allocated at startup and never used again rather than hot items in the page cache or the executable sections of your programs

if you don't want your programs to be able to overcommit memory, that's your choice. but my understanding is that in v2 cgroups the accounting is going to (correctly) include things outside of anonymous memory, like page-cache pages and kernel allocations, so your users might try to allocate less than whatever their limit is and things will either be killed or "slow down" because you've reduced the number of pages they're allowed to have in the cache
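you can see this in a cgroup's memory.stat: anon, file (page cache), kernel_stack, slab and friends are all charged against memory.max. a small sketch parsing a memory.stat-style blob (the values below are invented, real files have many more keys):

```python
# Sketch: cgroup v2 charges more than anonymous memory against memory.max.
# The sample text is invented; it mimics the key/value format of memory.stat.
SAMPLE_MEMORY_STAT = """\
anon 104857600
file 524288000
kernel_stack 1048576
slab 33554432
"""

def charged_bytes(stat_text: str) -> dict:
    """Parse a cgroup v2 memory.stat-style blob into {counter: bytes}."""
    out = {}
    for line in stat_text.splitlines():
        key, _, value = line.partition(" ")
        out[key] = int(value)
    return out

stats = charged_bytes(SAMPLE_MEMORY_STAT)
# Page cache ("file") dwarfs anonymous memory in this sample, and all of it
# is charged to the cgroup -- so a tight memory.max squeezes the cache too.
total = sum(stats.values())
```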
|
# ? Dec 11, 2021 02:34 |
|
my homie dhall posted:I don't think it's a matter of fail-stop vs fail-slow, not having swap is strictly worse in low-memory conditions. you're going to be paging to disk no matter what, you probably want it to be the memory a bunch of programs allocated at startup and never used again rather than hot items in the page cache or the executable sections of your programs

in practice not having swap means the OOM killer comes out to play quite quickly. kubelet will also start evicting pods once free memory gets low enough. yeah, the kernel may swap out some text pages, but it can't reclaim enough memory that way to keep the system limping along; it will just give up and start axe murdering processes. which is what you want. especially with typical cloud deployments where all your storage is some kind of cloud disk throttled to like 1k iops, swapping is definitely not something you ever want to happen.
|
# ? Dec 11, 2021 04:58 |
|
Nomnom Cookie posted:it’s because linux is poo poo at managing swap

Swapping is probably the least important thing the page fault handler does. Not worth the performance hit to try to improve it.
|
# ? Dec 11, 2021 05:19 |
how about, i dunno, fixing the OOM killer? oh wait, people have been trying for over a decade

what it invariably comes down to is either people talking about performance as if that's the most important thing over system stability when the system is in production use, or it's some vague "you don't need that" post, as in the first response to the lwn.net article above

BlankSystemDaemon fucked around with this message at 06:30 on Dec 11, 2021 |
|
# ? Dec 11, 2021 06:10 |
|
the OOM killer exists and is terrible because overcommit is a necessarily idiotic implementation detail of the fundamental Unix design flaw called fork(2)
|
# ? Dec 11, 2021 07:54 |
|
There's nothing to fix. Exhausting all available physical and virtual memory is an error condition caused by bugs or bad design on the part of the user. "System stability" is kind of a moot point when the other option is to automatically flush all dirty buffers/cache to disk and halt the system, or to wait for a human oracle rather than a heuristic. Giving people (more) control over the heuristic just makes it more likely the OOM killer won't be able to bring the system into a safe state to even respond to sysrq keys. Truly important work gets checkpointed/has a commit log. Those mechanisms will fail if the kernel can't recover to the point that pending file writes can complete.
|
# ? Dec 11, 2021 08:47 |
|
"You don't need a swap partition" and "just set vm_swappiness to 0 or 1" and "swap kills SSDs" are some of the worst advice ever given by Linux "experts". Of course don't put swap on a IOPS-limited network disk or SD card or whatever, there's no reason to be daft, but unless you truly absolutely know that you have vastly more memory in your machine than your workload will ever need, you will want some swap to free up RAM for disk cache and actively running application that need to be responsive, rather than relying on their own private caching setup. This entire idea of swap only being something computers needed in the old days, because memory was expensive is a myth.
|
# ? Dec 11, 2021 09:24 |
|
on the servers at work i even have swap on the vm hosts, because even with 256gb of ram, vm performance is just better, and it lets me overprovision ram in a couple hungry vms that gets swapped out when unneeded

on the cloud vms we host the webshits on i have 0b of swap, because even though digitalocean is supposedly all SSD, filling up ram absolutely murders performance somehow, usually to the point where even ssh becomes unresponsive. if a website is leaking enough to cause swapping, i'd rather it gets killed, which then gets logged, than me trying to figure out what the gently caress was going on on a vm before i had to power cycle it because it was unresponsive due to a single rogue process eating up ram

on my shitbox at home i have 0b of swap because an old drive died, but i haven't bothered fixing it yet because i have 64gb ram and i no longer require running local vms, and don't use chrome, so my current peak usage is like 12-16gb with *everything* running. i'll be replacing the last of my hdds with flash next year and will fix it then
|
# ? Dec 11, 2021 10:08 |
|
Truga posted:on my shitbox at home i have 0b of swap because an old drive died, but i haven't bothered fixing it yet because i have 64gb ram and i no longer require running local vms, and don't use chome so my current peak usage is like 12-16gb with *everything* running

I have 32GB and run both Firefox and Chrome for different purposes. I definitely need swap
|
# ? Dec 11, 2021 10:23 |
|
for VMs you can also use zram for the swap partition
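(a minimal sketch of doing that with systemd's zram-generator, assuming it's installed; key names per its docs, double-check against your version)

```ini
# /etc/systemd/zram-generator.conf (illustrative)
[zram0]
zram-size = ram / 2
compression-algorithm = zstd
```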
|
# ? Dec 11, 2021 10:48 |
|
Any time my computer starts swapping I get paranoid that it's dying/freezing. Can't be the only one.
|
# ? Dec 11, 2021 11:03 |
|
If you know a bunch of your pages are inactive, swapping them out to have more cache is good, actually. How well does KSM perform with container weeaboo? I don't have big enough container hosts to test.
|
# ? Dec 11, 2021 11:16 |
|
Antigravitas posted:If you know a bunch of your pages are inactive, swapping them out to have more cache is good, actually.

Does the kernel still use LRU? On a desktop some unused pages have a lot more utility than others. You'll definitely notice if you come back to your computer and X has to be paged back in 4kb at a time. Dunno if you can give hints via malloc now either.
|
# ? Dec 11, 2021 11:23 |
|
You can tell Linux to keep your pages resident. Sadly, Linux does not have anything as good as the ARC…
|
# ? Dec 11, 2021 11:25 |
|
preemptively swapping means your OS is freeing up memory that's being used for no reason. if you have an SSD on a SATA3 port, that's 600 MB/s of maximum I/O bandwidth, and your random IOPS is going to be high enough that dumping a few hundred pages to disk doesn't matter. it can do that in preparation for something else being loaded, and native command queueing means that it can interleave swap flushes with loads for negligible I/O latency cost
|
# ? Dec 11, 2021 11:28 |
|
Kazinsal posted:preemptively swapping means your OS is freeing up memory that's being use for no reason. if you have an SSD on a SATA3 port that's 600 MB/s of maximum I/O bandwidth and your random IOPS is going to be high enough that dumping a few hundred pages to disk doesn't matter. it can do that in preparation for something else being loaded, and native command queueing means that it can interleave swap flushes with loads for negligible I/O latency cost

Why cause that negligible bit of wear on an SSD when I can just buy 32gb more ram instead?

Antigravitas posted:You can tell Linux to keep your pages resident.

Hey yeah, are you talking about mlock? Looking for ways to do it and you can force it, but you can also set per-cgroup swappiness now via systemd.
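(yeah, mlock(2)/mlockall(2) is the "keep resident" mechanism. on the systemd side the knob I know is real is MemorySwapMax=, which maps to cgroup v2 memory.swap.max; a drop-in sketch, unit name made up:)

```ini
# /etc/systemd/system/myservice.service.d/no-swap.conf (illustrative)
[Service]
MemorySwapMax=0
```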
|
# ? Dec 11, 2021 11:49 |
|
SYSV Fanfic posted:Why cause that negligible bit of wear on an SSD when I can just buy 32gb more ram instead?

My motherboard is officially limited to 16GB, but works with 32GB*, anything beyond that is nope.

*The memory controller in the Phenom II can address 32GB and absolutely no more.

I've been using this SSD (Samsung 840 Evo) since it was the fanciest and newest you could get and this PC had 4GB RAM, with swap enabled. According to the conservative values from SMART, it has 95% of its wear life left. I'm not worried about SSDs wearing out.
|
# ? Dec 11, 2021 12:09 |
|
Maybe with the SSDs you can buy on a KozmoNaut salary

old person joke, sorry

ur absolutely right though
|
# ? Dec 11, 2021 15:03 |
|
swapless oom killer should just unmap the lru page on the system, and have the page fault handler kill the process that touches a page unmapped this way.
|
# ? Dec 11, 2021 15:07 |
|
Cybernetic Vermin posted:swapless oom killer should just unmap the lru page on the system, and have the page fault handler kill the process that touches a page unmapped this way.

Rather than focusing on whether this idea is good or bad, I'm trying to envision how this policy would be represented in a metaphorical computer world you could go inside of, like TRON.
|
# ? Dec 11, 2021 15:18 |
SYSV Fanfic posted:There's nothing to fix. Exhausting all available physical and virtual memory is an error condition caused by bugs or bad design on the part of the user. "System stability" is kind of a moot point when the other option is to automatically flush all dirty buffers/cache to disk and halt the system or wait for a human oracle rather than a heuristic.

SYSV Fanfic posted:Any time my computer starts swapping I get paranoid that it's dying/freezing. Can't be the only one.
|
|
# ? Dec 11, 2021 15:43 |
|
BlankSystemDaemon posted:There's something deeply wrong with the entire loving virtual memory management when the response to fixing a tiny piece of software is to completely abandon (demand) paging and insist on going back to physical memorymapping, because there's apparently a guarantee that if any swapping happens, the system will crash.

Talking about being anxious my hardware is dying when swapping happens. Like... did this thing just freeze for five seconds ten times in a row b/c I need to repaste something, am I going to have to do an RMA/buy a new board? Thermal throttling? Oh no, just firefox using 10,000% CPU and 150% of available ram. Nope, just working as intended, swapping out thousands of 4k chunks of memory like it's 1994 and I'm running emacs or some poo poo.

On a desktop, OOM killer always gets its man b/c it's always the browser.
|
# ? Dec 11, 2021 15:56 |
|
Had it happen when I was helping my brother with his class on a true multi-user install of ubuntu. Some tab my mom had left open when we switched users went berserk. I'd just written him a nice little annuity calculator in python. System locked up, couldn't log in to a console, then BAM. OOM killer stepped in and saved me. Real hero of the hour.
|
# ? Dec 11, 2021 16:00 |
SYSV Fanfic posted:Talking about being anxious my hardware is dying when swapping happens. Like... did this thing just freeze for five seconds ten times in a row b/c I need to repaste something, am I going to have to do an RMA/buy a new board? Thermal throttling? Oh no, just firefox using 10,000% CPU and 150% available ram. Nope just working as intended, swapping out thousands of 4k chunks of memory like it's 1994 and I'm running emacs or some poo poo.
|
|
# ? Dec 11, 2021 17:10 |
|
BlankSystemDaemon posted:There's something deeply wrong with the entire loving virtual memory management when the response to fixing a tiny piece of software is to completely abandon (demand) paging and insist on going back to physical memorymapping, because there's apparently a guarantee that if any swapping happens, the system will crash.

you're conflating crashing and slowing down
|
# ? Dec 11, 2021 17:11 |
|
anyways bsd probably sucks just as much poo poo if not more when some docker crapware is spawning itself 300 times a second
|
# ? Dec 11, 2021 17:17 |
|
hifi posted:anyways bsd probably sucks just as much poo poo if not more when some docker crapware is spawning itself 300 times a second

freebsd oom killer uses the "new guy in the prison yard" approach: find the biggest process and take it out. linux attempts to use some heuristics when trying to figure out which process to whack
|
# ? Dec 11, 2021 18:04 |
|
I haven't had dedicated swap in years and I've never had an issue. RAM is cheap as poo poo; even with multiple desktop VMs, all with browsers overloaded with tabs because I'm incapable of closing anything, and memory-intensive games running at the same time, I've never noticed oom killer or any performance degradation/crashing due to memory pressure. I'm only using half my ram slots too.
|
# ? Dec 11, 2021 18:46 |
|
desktop linux these days uses user-mode systemd to contain every application within its own cgroup. you'd think that would prevent the system from getting hosed by a process with runaway memory consumption, and yet here we are.

of course, we would ideally migrate everything to posix_spawn or at the very least vfork, then configure our kernels not to write rubber checks for memory in the first place. but yeah that isn't happening any time soon.
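(posix_spawn is even reachable from Python as os.posix_spawn, 3.8+, which sidesteps fork's "duplicate the whole address-space commitment just to exec" problem. a minimal sketch:)

```python
import os
import sys

# Spawn a child without fork()'s copy-the-address-space semantics.
# os.posix_spawn takes an executable path, an argv list, and an environment.
pid = os.posix_spawn(
    sys.executable,
    [sys.executable, "-c", "pass"],
    dict(os.environ),
)

# Reap the child and pull out its exit code.
_, status = os.waitpid(pid, 0)
exit_code = os.WEXITSTATUS(status)  # 0 on success
```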
|
# ? Dec 11, 2021 18:55 |
|
I run zen on my desktop with no swap. I’ve had a few hard freezes — I wonder if maybe I shouldn’t do one of these things on 32 GB RAM?
|
# ? Dec 11, 2021 18:57 |
hifi posted:anyways bsd probably sucks just as much poo poo if not more when some docker crapware is spawning itself 300 times a second BlankSystemDaemon fucked around with this message at 19:57 on Dec 11, 2021 |
|
# ? Dec 11, 2021 19:43 |
|
so they succeeded?
|
# ? Dec 11, 2021 19:44 |
|
Tankakern posted:so they succeeded?
|
# ? Dec 11, 2021 19:46 |
Tankakern posted:so they succeeded?
|
|
# ? Dec 11, 2021 19:57 |
|
does the linux oom killer just blindly pick a pid at random or what

seems real good at killing the wrong things ime. init, kubelet, whatever *blam*
|
# ? Dec 11, 2021 21:09 |
The_Franz posted:freebsd oom killer uses the "new guy in the prison yard" approach: find the biggest process and take it out. linux attempts to use some heuristics when trying to figure out which process to whack
|
|
# ? Dec 11, 2021 21:36 |
|
Sapozhnik posted:desktop linux these days uses user-mode systemd to contain every application within its own cgroup

if you set a memory limit on the cgroup this will work. cgroups are how docker enforces limits. but the system can't set a limit on its own
|
# ? Dec 11, 2021 23:50 |
|
BlankSystemDaemon posted:There's something deeply wrong with the entire loving virtual memory management when the response to fixing a tiny piece of software is to completely abandon (demand) paging and insist on going back to physical memorymapping, because there's apparently a guarantee that if any swapping happens, the system will crash.

my earlier point, which you seem to have either missed or ignored, is that in the scenarios kubernetes is designed for and deployed into, crashing the system is better than slowing down. a crash is a fail-stop; fail-stops are well understood, and distributed systems are usually pretty good at dealing with them. fail-slows are a lot less well understood, are more complicated to handle, and some of the mechanisms used to detect fail-stops will aggravate the effects of a fail-slow. that is why kubernetes was built to operate without swap, and why people won't use kube with swap unless they really know what they're doing and really need it
|
# ? Dec 11, 2021 23:55 |
|
|
|
BlankSystemDaemon posted:it might also be worth mentioning that protect(1) is meant to be used to protect processes from it, and that the _oomprotect suffix for rc.conf can set it for services

there is a similar mechanism to give weights to processes for the linux oom killer. i have never looked into the details of how these are set, but ive seen a bunch of oom killer output in dmesg, and most systems are configured out of the box so e.g. sshd will never be oom-killed
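(the weights live in /proc/&lt;pid&gt;/oom_score_adj, -1000..1000, where -1000 exempts the process entirely; systemd units set it via OOMScoreAdjust=. the kernel's pick is roughly "most memory wins, shifted by the adjustment" — here's a simplified toy model of that scoring, not the exact oom_badness() code, which also counts swap and page-table pages:)

```python
def oom_badness(rss_pages: int, oom_score_adj: int, total_pages: int) -> int:
    """Toy model of the Linux OOM victim score: memory footprint in pages,
    shifted by oom_score_adj scaled to a fraction of total memory.
    oom_score_adj == -1000 means 'never kill this one'."""
    if oom_score_adj == -1000:
        return 0
    points = rss_pages + (oom_score_adj * total_pages) // 1000
    return max(points, 1)

# Many distros ship sshd with a negative adjustment, so a fat browser loses
# the comparison even though the daemon is also using memory.
total = 4 * 1024 * 1024                      # 16 GiB worth of 4 KiB pages
browser = oom_badness(2_000_000, 0, total)   # big, no adjustment
sshd = oom_badness(10_000, -900, total)      # small and protected
```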
|
# ? Dec 12, 2021 02:44 |