|
Speaking of mirrors and raidz. I'm setting up a replacement file server at work, with 15x 16TB drives. Last time I did this, I had the impression that using mirrors was also preferable because it makes resilvering way quicker. Is that still a thing, or is resilvering a drive in raidz quick enough in practice? I'm debating mirrors vs some level of raidz. The IO loads here are usually fairly nice; mostly reading and writing a single large file at the time (over NFS on 10gbit or SMB on 1gbit), though I'm sure there's sporadic "thousand tiny files at once" cases. I have enough disks to get enough space (for now) with mirrors, but more space is more space. Then again, more IOPS never hurts either. I also have 2x500GB NVME SSDs I'll need to consider the best use for, and a 300GB SAS SSD that Dell tossed into the server. Tempted to just go with 7x mirrored pairs, one hot spare. As for the SSDs, uh. To be considered and perhaps benchmarked. e: On a related note, I'm not overly impressed with the two PCIe slots in this server. There's fortunately also an OCP connector, but I'd happily trade that for more plain PCIe slots; there's plenty of space for them. Maybe even an NVME slot or three on the vast open expanses of motherboard. As is, NVME-on-PCIe-card + external SAS HBA + 10gbit NIC means I have zero PCIe connectivity left. Oh well, I assume the internal SAS controller eats a fair few lanes, and I don't know how many you get to play with on a modern Xeon. e2: The 15 disks are not a hard limit or an ideal number, it's just what our vendor had in stock. There's space for 20. Computer viking fucked around with this message at 10:56 on May 31, 2022 |
# ¿ May 31, 2022 10:44 |
|
|
BlankSystemDaemon posted:
Well, raidz resilver is limited by the speed that the CPU can do Reed-Solomon P (+Q, +R) - but if your CPU has AVX2 and you're using an implementation of OpenZFS that's new enough, the maths has been vectorized, and even without AVX2 it's still done using SIMD via SSE - so in practice you're usually limited by disk read speed rather than the CPU.

Right - that's quite useful; thanks. It's a Xeon Silver 4309Y, which is listed on ARK as having AVX-512 and AES-NI, so I think I'm good there. Also, IO is realistically not the bottleneck for most things we do - if it can deliver a few hundred MB/sec in most uses, that's more than enough. Doing so efficiently never hurts, of course. And thanks for the sha256 tip, that's a smart use of CPU features I wouldn't have thought of.

Also, it's good to hear the resilver should ... mostly be fast enough on raidz? I'm not entirely comfortable with the redundancy pattern of a large pack of two-disk mirrors, nor the 50% "waste" - though the performance is at least good. Still, I won't exactly mind using something else. 2x 9-disk raidz3 sounds very reasonable, though I'll have to wait for the last few disks; I'll test with 2x 7-disk raidz1 for now and prepare to nuke and reconfigure before putting it to use.

I like the idea of using the NVMe disks as a special vdev - presumably an NVMe mirror should be resilient enough. (And if it goes suddenly and completely bad, I guess that's why I set up the tape backup.) I do remember hearing the BSD Now guys talk about the "redirect small writes to SSD" idea, but somehow didn't consider doing it here. Maybe the 300GB SSD would work as an L2ARC, if it's decently fast? It's a role where it's fine that it's non-redundant, at least.

I think I'll just hold off on the dedup. I've briefly tested it, and it doesn't really do a lot on our data; there's very little redundancy to work with.

Computer viking fucked around with this message at 12:37 on Jun 1, 2022
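If I do end up with the 2x 9-disk raidz3 plus the NVMe special vdev, the pool would presumably look roughly like this - hypothetical device names, and the small-block cutoff is just an example value:
code:
# two 9-wide raidz3 vdevs plus a mirrored NVMe special vdev
zpool create tank \
  raidz3 da0 da1 da2 da3 da4 da5 da6 da7 da8 \
  raidz3 da9 da10 da11 da12 da13 da14 da15 da16 da17 \
  special mirror nvd0 nvd1
# optionally send small records to the NVMe mirror too, not just metadata:
zfs set special_small_blocks=64K tank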
# ¿ Jun 1, 2022 12:32 |
|
BlankSystemDaemon posted:
As for 300GB L2ARC, that might not be the most efficient use of your memory. What's your working set (ie. what do you get from zfs-stats -A, if you're on one of the BSDs?), and how much system memory do you have?

Ah, I didn't realize it was quite that expensive. I'll just use it as a boot drive, then. As for the working set, the current server has just 32GB of RAM, and the new one 64GB. I'd like a lot more, but realistically I don't think it'll be a huge problem?
code:
[zfs-stats -A output]
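(For reference, these are roughly the commands in question on FreeBSD - zfs-stats comes from the sysutils/zfs-stats port:)
code:
zfs-stats -A                           # ARC summary
sysctl kstat.zfs.misc.arcstats.size    # current ARC size, in bytes
sysctl kstat.zfs.misc.arcstats.c_max   # the ceiling the ARC can grow to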
|
# ¿ Jun 1, 2022 15:32 |
|
BlankSystemDaemon posted:
Yeah, L2ARC is expensive, memory-wise - but if you've got a working set that's several TB large, and don't have one of the high-end Xeon Platinum that can take multiple TB of memory, it's the only way to have any caching.

No, that's literally everything. Looking at it, I guess I should have used -AE or even -AEL? (The current server has an L2ARC set up.) Of course, I did just reboot to cold-swap in a new drive, so it'll take a bit to get representative numbers again.

e: And sorry for the repeated editing of this post, I should draft before posting.

Computer viking fucked around with this message at 23:21 on Jun 1, 2022
# ¿ Jun 1, 2022 23:11 |
|
For historical reasons, it's not set up for boot environments - I think this install started as a 10.x in 2015, and I haven't really bothered to try and retrofit them. But yes, it's really about time to jump to 13.1.
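The upgrade itself should just be the standard freebsd-update dance, something like:
code:
freebsd-update -r 13.1-RELEASE upgrade
freebsd-update install
shutdown -r now
# ...and after the reboot, run it again to finish up:
freebsd-update install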
|
# ¿ Jun 1, 2022 23:30 |
|
Also, I'm now on 12.3, which was uneventful. I'll try 13 later.
|
# ¿ Jun 3, 2022 15:57 |
|
Also, if you're doing a five-disk raidz3, you'd get the same capacity and better performance from two mirrors and a hot spare - five disks minus three parity disks leaves two disks of usable space, the same as two two-way mirrors - though of course that's traded against a bit higher risk. See the sketch below.
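That is, something like this, with hypothetical device names:
code:
# both give ~2 disks of usable capacity; the raidz3 survives any three
# failures, while the mirror layout dies if both halves of one mirror go
zpool create tank raidz3 da0 da1 da2 da3 da4
# ...or:
zpool create tank mirror da0 da1 mirror da2 da3 spare da4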
|
# ¿ Jun 12, 2022 00:44 |
|
While obviously worse than a z3, those are not atrocious numbers. There's also the effect of having a hot spare around - ideally, that should shrink the window of vulnerability down to the time it takes to resilver a mirror? Of course, real-life problems like batch effects and identical aging take those numbers way down, so there's nothing wrong with being more careful. I would at least consider it, though.
|
# ¿ Jun 12, 2022 10:23 |
|
BlankSystemDaemon posted:
A hotspare isn't going to give you more availability, it just means you can automate device replacement.

If we consider the chance of losing data from a mirror as "hours spent in single-disk operation × chance of failure per hour", wouldn't anything that reliably shortens that window (by immediately starting a resilver) reduce the overall chance of data loss? I must admit I haven't ever touched the automated disk replacement tools, though. Maybe on the new server.
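(From what I've read - hedging, since I haven't actually run it - the FreeBSD side of that automation is roughly:)
code:
# zfsd watches for failures and kicks in hot spares automatically
sysrc zfsd_enable="YES"
service zfsd start
# and let the pool auto-replace a new disk inserted into the same slot:
zpool set autoreplace=on tank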
|
# ¿ Jun 12, 2022 10:43 |
|
This feels like it's more of a question of definitions - in which case I assume you're right. Still, consider a hypothetical. Two identical companies run identical storage hardware with identical raid levels. One has a well tested automatic hot spare system, and one has a weekly "walk around and replace broken disks" routine. Everything else being equal, I would expect the latter to have more cases of the second disk dying and taking out the mirror before the first failed disk was replaced, and thus more downtime (and restores from backups)? Point taken on this not being a "just works" kind of thing, though.
|
# ¿ Jun 12, 2022 11:28 |
|
BlankSystemDaemon posted:
Sure, but ECC is good for much else besides this, so it's never a bad idea to have it if you can get it.

Oh, 100Gbit - that's fun. I'm still thinking about setting up some 10Gbit at home, and I genuinely don't have the storage or consumers to make use of more. Even at work, 10Gbit over Ethernet is honestly more than good enough - but it would be neat to play with RDMA just to have some experience with it.
|
# ¿ Jun 14, 2022 08:37 |
|
BlankSystemDaemon posted:
If you're doing iSCSI over Ethernet (especially if backed with NVMe storage, but even without), you benefit quite a bit from the ten times lower latency on fiber with SFP+ modules compared with RJ45.

I have played with it, but no - it's all plain file storage. Makes sense, though.

wolrah posted:
For homelab-level fuckery 40G is the real sweet spot IMO. The hardware isn't much more expensive than 10G (in some cases it's cheaper) and it's generally compatible with 10G using inexpensive SFP+>QSFP+ adapters because it's literally just 4x10G acting as a single interface. Some 40G hardware, most commonly switches but also nicer NICs, can even break out those links into four independent 10G interfaces.

That is an interesting idea - though I'm very dependent on what shows up used; I'm not paying full price just for the

Computer viking fucked around with this message at 17:33 on Jun 14, 2022
# ¿ Jun 14, 2022 17:31 |
|
It's worth noting that I'm in Norway, and something about being in the EEA and Schengen and whatever else is relevant, but not the EU, makes the cost of international shipping here really unpredictable. I'll keep the Chelsio cards in mind, though.
|
# ¿ Jun 14, 2022 18:13 |
|
Looks like I can get Mellanox 40gbit cards without transceivers for €67 from Germany, which isn't half bad. It's another €35 for shipping if I'm not happy with "early July to early August", but ... I may be.
|
# ¿ Jun 14, 2022 18:46 |
|
necrobobsledder: Can I guess that your use is something like "I have a wheeled rack of gear, and I'd like to ship it to some foreign country and back and have it work by just swapping the power lead into the UPS"? (Where the gear itself isn't overly picky about voltage and frequency) Or is the gear itself picky, so you'd also like "always output 120V/60Hz"? Not that I can help you at all; I'm just wondering if I'm understanding it right.
|
# ¿ Jul 4, 2022 18:38 |
|
Depends on the Realtek NIC - some of them work perfectly fine. On the other hand, I have an Intel PCIe NIC in my FreeBSD NAS box for similar reasons, so maybe I shouldn't say anything.
|
# ¿ Jul 8, 2022 13:08 |
|
BlankSystemDaemon posted:
Speaking of notifications, it's important to remember that email is not a stable protocol for notification delivery; you need something that'll do push notifications to a process running on your laptop/desktop or mobile device.

Worth keeping in mind that some parts of the mail infrastructure see enough use that people notice when they fail, while almost every other solution has a similar amount of jank but less user exposure.
|
# ¿ Jul 12, 2022 01:45 |
|
It looks like WD has moved to Red drives being SMR, while Red Plus and Pro are CMR. If you're not familiar, the short summary is that CMR is "normal", while SMR is more space efficient (so you can get more TB per platter) but requires you to rewrite long stretches of data if you want to change them.

SMR drives typically have a CMR region to buffer incoming writes, then quietly move things around in the background - so you may not notice if you only write moderate amounts of data at a time, but they get dog slow if that buffer region fills up before it has a chance to drain. (Reads should still be fine, though.) They are a cheaper way to get more storage, and given your use they're probably fine. Personally I'd still get a CMR drive just in case, though - they're not that much more expensive.

Red Pro drives mostly seem to be faster: 7200 rpm instead of the ~5600 of a Plus means lower latency and more throughput, but louder and warmer. Probably not important for you.

Computer viking fucked around with this message at 01:55 on Jul 13, 2022
# ¿ Jul 13, 2022 01:53 |
|
Yeah, all my experience with SATA port multipliers suggests that it's supremely cursed technology. SAS has the crucial benefit of actually working with no real fuss. You may have to mess with the controller settings to get true JBOD (that is, just present the disks as-is to the OS), but it should support SATA drives just fine.
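(On FreeBSD, a quick sanity check that the HBA really is handing the disks over as-is:)
code:
camcontrol devlist    # each disk should show up individually, not as one RAID volume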
|
# ¿ Jul 20, 2022 19:26 |
|
Something like this as the enclosure (I have no idea if that's a good choice, but just to illustrate the category), plus some sort of LSI SAS HBA with external ports (the LSI SAS 9300-8e seems to be recommended, but may be overkill - a 9200-8e may be more than enough). And whatever cable is appropriate, of course; SAS cables are their own fun thing. Most likely this will be an SFF-8088 cable or an SFF-8088 to SFF-8644.
Computer viking fucked around with this message at 19:40 on Jul 20, 2022 |
# ¿ Jul 20, 2022 19:36 |
|
This is incidentally how the file servers at work ... work, except that both the server and the SAS enclosures are 2U units from HP or Dell. LSI adapters in IT mode, server and enclosure full of disks, ZFS. I have my share of problems at work, but that part of the hardware has so far not been among them - it's strangely painless.
|
# ¿ Jul 20, 2022 19:54 |
|
fletcher posted:
I thought I wanted hot swap bays. Then I realized what a nightmare the thermals are in all of the cases that support them, and realized that I didn't actually need hot swap support outside of the novelty of it. My Node 804 is nice and quiet and my drive temps are great!

They're fun in the enterprise gear that has them, but ... I've never actually needed to hotswap a drive.
|
# ¿ Jul 22, 2022 02:07 |
|
Ha, it feels like the overlap between Maxtor drives, a Noctua fan, that case, the non-modular PSU, and the red SATA cables is enough to date this build very precisely. (I'll guess 2006?)
|
# ¿ Jul 22, 2022 12:56 |
|
Mr. Crow posted:
It's bolted to a piece of wood that's only slightly lighter than the floor.
|
# ¿ Jul 22, 2022 21:06 |
|
Sometimes I hate enterprise hardware. Oh, you dare put a third-party hard drive (that's identical to the certified ones we sell at a 5x markup) in our server? 11000 rpm fan speeds forever as a punishment. That IPMI command to knock the fan speeds back down that we've sometimes mentioned? Naah, we locked those out in a firmware update. Next time I'm buying Supermicro, preferred suppliers be damned. (Dell R550.)

E: Ah yes, the same goes for adding a third-party PCIe card - they even have a specific guide on how to disable their paranoid/punishing cooling algorithm for those, which supposedly doesn't work on the newest generation.

E2: Ah, the racadm command to disable the PCIe fan speed thing still works. Shame about the drives, but I think I can move enough non-Dell drives to the external DAS.

E3: Never mind - with the PCIe cards calmed down, the minimum fan speed with third-party drives jumps to 36% for a bit but then falls down to 13% again on its own. Huh.

Computer viking fucked around with this message at 15:33 on Aug 10, 2022
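(For the curious, the racadm bit is presumably this one, run over SSH to the iDRAC - Dell has moved these settings around between generations, so no promises:)
code:
racadm set system.thermalsettings.ThirdPartyPCIFanResponse 0    # 0 = Disabled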
# ¿ Aug 10, 2022 13:12 |
|
Aware posted:
Somewhat ironically adding a second eBay CPU and filling out all the fan slots on an R740xd actually made the drat thing quieter overall. Either way don't buy Dell servers for home is my advice now.

It's actually at work, but yeah, the same thing applies. If I had a dedicated server room and the budget to pay for an all-Dell setup, I'm sure it'd be a great piece of kit. That said, I don't really have a lot of great alternatives; HPE are worse, and I'm very limited in suppliers. They have a couple of Lenovo and Fujitsu servers that don't really work for my purpose, and one or two annoyingly outdated Supermicro parts. I can order anything from Dell, though, so ... R550 + MD1400 it is.

And yeah, I fully believe you; the fan speed algorithms on these machines seem to be mostly magic.
|
# ¿ Aug 10, 2022 23:41 |
|
I did think about that, yeah - though disconnecting one didn't seem to change anything. As of right now the MD1400 box is noisier, which I guess is good enough.

As for the MD1400, if you should ever need to deal with one: the SAS connector units at the back have mini-USB debug ports (in 2022). If you plug one into a PC, it presents as a USB serial adapter - with two ports, though only one seems to do anything. Connect at 115200 baud, and you get a console asking for a password. Dell does not give that password out, but there is exactly one reddit post and zero other pages that have the right one. In the interest of doubling that count, it's "bluemoon". You can then use the help command - it accepts the same commands as the earlier MD1xxx boxes did over serial, including "shutup 20 0" to set fan 0 to 20%. Any lower and it tends to go back up to 50% on its own shortly afterwards. There are two fans - one in each PSU - numbered 0 and 1. As far as I can tell it does not matter which of the two SAS units you connect the USB cable to.

Beware, though - it looks like there are some foot-pointing guns scattered around in there.

Computer viking fucked around with this message at 01:25 on Aug 11, 2022
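(In concrete terms, from a FreeBSD or Linux machine - the device names will vary:)
code:
cu -l /dev/ttyU0 -s 115200    # FreeBSD; on Linux: screen /dev/ttyUSB0 115200
# log in with "bluemoon", then e.g.:
shutup 20 0                   # fan 0 to 20%; "help" lists the rest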
# ¿ Aug 11, 2022 01:19 |
|
I guess you could also use gstat - if the queue lengths start climbing above 1, you are probably bottlenecked by the disks?
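(Something like this - L(q) is the queue length column:)
code:
gstat -p -I 1s    # physical disks only, one-second updates; watch L(q) and %busy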
|
# ¿ Aug 15, 2022 12:47 |
|
Do all consumer 2.5gbit NICs suck? We have two Intel i225 cards in different machines, and both have a coin-toss chance of just not working on any given boot. The Realtek RTL8125 on my motherboard doesn't work at all, which seems to be a common problem across multiple motherboard manufacturers that use it. I see the embedded Intel i225 on the Mikrotik router I'm waiting for has a dedicated forum thread about its issues, too. I happen to have a 2.5gbit switch, so I optimistically thought the NIC side would be a solved problem by now, but it seems kind of dire.
|
# ¿ Aug 16, 2022 01:07 |
|
BlankSystemDaemon posted:
2.5G is a bad stopgap solution for people who have RJ45 already embedded inside walls, and can't easily replace it with fiber.

Sure, but I would have expected the problems to be "it's hard to get full speed over most cabling" or "it uses too much power", not "the hardware, firmware and drivers all seem to have been made by the less competent interns".
|
# ¿ Aug 16, 2022 09:55 |
|
The Realtek or the Intel? Either way, I guess that's promising - it proves it's possible to make it work well.
|
# ¿ Aug 16, 2022 12:25 |
|
Worst thing is, grub2 has support for reading a whole bunch of file systems, and is (as far as I can tell without trying) designed to make it easy to plug in more. They just like their convoluted initrd designs over in Linux land, I guess.
|
# ¿ Aug 17, 2022 14:54 |
|
Hughlander posted:
I don't understand. I've been doing proxmox with zfs boot on Linux for the past 5 years. What's lacking?

Nothing - if it works, it works. It's just not a given in the larger Linux ecosystem; IIRC, Fedora routinely breaks ZFS if you install their recommended kernel upgrades. Also, a more ZFS-first OS may have some neat extra tools. The boot environments mentioned are basically the ability to make clones of the boot drive before upgrades (or indeed at any point you want), and boot from any of them or roll back to them at will. It's possible to make that work on Linux, and it's not the end of the world to not have it ... but it is neat.
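(On FreeBSD the whole thing is roughly:)
code:
bectl create pre-upgrade      # clone the current boot environment
bectl list                    # list them, with sizes and active flags
bectl activate pre-upgrade    # roll back; takes effect on next reboot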
|
# ¿ Aug 17, 2022 22:35 |
|
My impression of Microsoft and consumer file systems is that they want to treat Windows PCs as fat clients, with primary storage being OneDrive, or a NAS for business desktops. If the local disk is only really for impersonal data (software that can be downloaded again) and checked-out copies from cloud/NAS, then there's less incentive to develop a fancier new file system.
|
# ¿ Aug 19, 2022 12:14 |
|
TrueNAS is perfectly fine for this - both Core and Scale should let you configure containers for things like that fairly easily. I use Core (the FreeBSD one), and I think most of the things you listed are available as plugins (i.e. preconfigured jails), and the rest should be doable. I bet they're mostly available as containers on Scale (the Linux-based one), too. As for their Storj partnership, I bet it's way overblown for marketing, and in practice means they're an option in a dropdown somewhere.
|
# ¿ Sep 16, 2022 12:50 |
|
My TrueNAS (Core) machine is my old i5 6600K gaming machine with extra RAM, a pair of large spinning disks, and an M.2 SSD left over from an upgrade. It's fine. The one concession I've made is to throw an AliExpress Special™ Realtek 2.5Gbit network card in it - I got 190MB/s from the Steam cache we set up earlier today, so it seems to be working OK.
|
# ¿ Sep 16, 2022 23:22 |
|
I fully expect there to be at least one "I turned a battery pack from a crashed Tesla into a UPS" video on YouTube. But yeah, you'd think putting a bunch of 4680 cells and a new charge controller into an existing UPS design, and selling it for a non-silly price, would be within the capabilities of most companies.
|
# ¿ Sep 18, 2022 17:54 |
|
Just go full industrial and install a motor-generator set in the basement; with a suitable flywheel it should be able to smooth out a lot of noise and transient events. (Note: Do not do this)
|
# ¿ Sep 19, 2022 17:55 |
|
Klyith posted:
I've never understood why twice as much of a thing costs more money. Just an amazingly hosed up state of affairs when you think about it.

That kind of made me curious about how much of the cost of a consumer-grade UPS is the batteries. Looking at CyberPower, and rounding a bit, their $200 UPS has $100 replacement batteries. Which suggests that twice the capacity would cost $300, plus whatever it costs to ship the more expensive unit, build the larger chassis, design and support another variant, and whatever modifications are needed to actually charge and use the extra cells. Plus the risk of running into the regulations you mention, of course.

Basically, it's not "twice as much" - it's "twice as much of something that makes up about half the price of the unit, plus a complicated overhead". Which I guess in a way is the answer to "why does it cost twice as much to double the capacity".
|
# ¿ Sep 20, 2022 01:37 |
|
|
BlankSystemDaemon posted:
You'll want a disk shelf with a SAS connector and a SAS HBA with external ports, as SAS is compatible with SATA.

Though be careful - this is apparently not 100%: I just found out the Dell MD1400 somehow manages to not support SATA disks.

e: According to the internet, anyway. I've got one I'm not actively using yet, so I can throw a SATA disk in it and see what happens.
|
# ¿ Sep 28, 2022 21:53 |