Moey
Oct 22, 2010

I LIKE TO MOVE IT
I'm still using PuTTY like a heathen.


r u ready to WALK
Sep 29, 2001

MobaXterm is the one you really want, mostly for its ability to auto-colorize text based on keywords and its built-in SFTP browser.

IOwnCalculus
Apr 2, 2003





Cantide posted:

Is it also possible to retrofit my current OS drive into a bootable mirrored zpool?
I went with mdadm because I read doing that with zfs would be hell. (this is roughly the guide I followed: https://feeding.cloud.geek.nz/posts/setting-up-raid-on-existing/)

I will admit I have never attempted root-on-ZFS with Ubuntu before. I've kept the whole "boot media is disposable" mindset from FreeNAS/NAS4Free builds, so I just boot off of whatever SSD I have handy and keep it entirely separate from the pool. I don't even run any sort of hardware redundancy on that. Next time I have to do a clean-slate reinstall I'll probably enable root-on-ZFS, but only on a single drive, because that's vastly easier than root on a ZFS pool.

The only things on my fileserver that I care about that aren't already inside /tank are my various docker mounts that point at SSDs instead of the pool, and I just cover those with a basic backup script.
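For anyone wondering, "basic" really does mean basic - something like this sketch, with made-up paths and a made-up dataset name:
code:
#!/bin/sh
# Hypothetical nightly job: mirror the SSD-hosted docker bind mounts into the pool.
# SRC and DEST are placeholders - point them at your own layout.
SRC="/ssd/docker"
DEST="/tank/backups/docker"

# -a preserves ownership/permissions/timestamps; --delete mirrors deletions too.
rsync -a --delete "$SRC/" "$DEST/" || exit 1

# Optionally snapshot the destination dataset so older versions stick around.
zfs snapshot "tank/backups/docker@$(date +%Y%m%d)"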

Combat Pretzel
Jun 23, 2004

No, seriously... what kurds?!
When you're using Windows' built-in SSH, you'd best make sure that you use Windows Terminal too, because it's actually compliant with all the *nix terminal escape codes and whatnot.

BlankSystemDaemon
Mar 13, 2009



CopperHound posted:

Okay, I'm dumb. All those errors are from me swapping drive bays. I thought my logs were in UTC instead of local time. I have no CAM messages from the past several days other than that.
Welp.

Shaking my head at Windows users who configure servers to wallclock instead of UTC, just because DOS couldn't handle anything but wallclock.
Even Windows nowadays keeps UTC time in the kernel and just displays wallclock based on the timezone offset you configure.

CopperHound posted:

Okay, the last errors in messages were from Feb 2nd even though I could have sworn I saw the activity light on that drive going the day before yesterday.
code:
Feb  2 19:21:54 nas     (da0:mps0:0:0:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 length 0 SMID 282 Command timeout on target 0(0x0010) 60000 set, 60.6313204 elapsed
Feb  2 19:21:54 nas mps0: Sending abort to target 0 for SMID 282
Feb  2 19:21:54 nas     (da0:mps0:0:0:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00 length 0 SMID 282 Aborting command 0xfffffe00e9499af0
Feb  2 19:21:54 nas mps0: mpssas_action_scsiio: Freezing devq for target ID 0
Feb  2 19:21:54 nas (da0:mps0:0:0:0): READ(16). CDB: 88 00 00 00 00 01 d7 e3 49 c8 00 00 00 08 00 00
Feb  2 19:21:54 nas (da0:mps0:0:0:0): CAM status: CAM subsystem is busy
Feb  2 19:21:54 nas (da0:mps0:0:0:0): Retrying command, 3 more tries remain
Feb  2 19:21:54 nas mps0: Controller reported scsi ioc terminated tgt 0 SMID 1595 loginfo 31130000
Feb  2 19:21:54 nas mps0: Controller reported scsi ioc terminated tgt 0 SMID 559 loginfo 31130000
Feb  2 19:21:54 nas mps0: Controller reported scsi ioc terminated tgt 0 SMID 545 loginfo 31130000
Feb  2 19:21:54 nas mps0: Controller reported scsi ioc terminated tgt 0 SMID 1192 loginfo 31130000
Feb  2 19:21:54 nas (da0:mps0:0:0:0): READ(16). CDB: 88 00 00 00 00 01 6c 2d ff 80 00 00 00 18 00 00
Feb  2 19:21:54 nas mps0: (da0:mps0:0:0:0): CAM status: CCB request completed with an error
Feb  2 19:21:54 nas (da0:mps0:0:0:0): Retrying command, 3 more tries remain
Feb  2 19:21:54 nas (da0:mps0:0:0:0): READ(16). CDB: 88 00 00 00 00 01 50 22 73 20 00 00 00 18 00 00
Feb  2 19:21:54 nas (da0:mps0:0:0:0): CAM status: CCB request completed with an error
Feb  2 19:21:54 nas (da0:mps0:0:0:0): Retrying command, 3 more tries remain
Feb  2 19:21:54 nas (da0:mps0:0:0:0): READ(16). CDB: 88 00 00 00 00 01 50 22 73 08 00 00 00 18 00 00
Feb  2 19:21:54 nas Finished abort recovery for target 0
Feb  2 19:21:54 nas (da0:mps0:0:0:0): CAM status: CCB request completed with an error
Feb  2 19:21:54 nas (da0:mps0:0:0:0): Retrying command, 3 more tries remain
Feb  2 19:21:54 nas (da0:mps0:0:0:0): READ(16). CDB: 88 00 00 00 00 01 50 22 72 f0 00 00 00 18 00 00
Feb  2 19:21:54 nas mps0: (da0:mps0:0:0:0): CAM status: CCB request completed with an error
Feb  2 19:21:54 nas (da0:mps0:0:0:0): Retrying command, 3 more tries remain
Feb  2 19:21:54 nas Unfreezing devq for target ID 0
Feb  2 19:21:54 nas (da0:mps0:0:0:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
Feb  2 19:21:54 nas (da0:mps0:0:0:0): CAM status: Command timeout
Feb  2 19:21:54 nas (da0:mps0:0:0:0): Retrying command, 0 more tries remain
Feb  2 19:21:54 nas (da0:mps0:0:0:0): SYNCHRONIZE CACHE(10). CDB: 35 00 00 00 00 00 00 00 00 00
Feb  2 19:21:54 nas (da0:mps0:0:0:0): CAM status: SCSI Status Error
Feb  2 19:21:54 nas (da0:mps0:0:0:0): SCSI status: Check Condition
Feb  2 19:21:54 nas (da0:mps0:0:0:0): SCSI sense: UNIT ATTENTION asc:29,0 (Power on, reset, or bus device reset occurred)
Feb  2 19:21:54 nas (da0:mps0:0:0:0): Error 6, Retries exhausted
Feb  2 19:21:54 nas (da0:mps0:0:0:0): Invalidating pack
Also, the TrueNAS Core web terminal sucks for copying text.
Could you please follow the advice in the second paragraph under the BUGS heading of recoverdisk(1) and decrease the retry counts, if they're not already at 0?
If they aren't at 0, you can run into a probe-effect issue where you either don't detect the errors or detect too many of them.
Also, since you mention TrueNAS, you'll probably need to set it somewhere in their configuration rather than on the command line if you want it to persist across reboots.
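If memory serves, those are the knobs below on stock FreeBSD; on TrueNAS Core you'd add them as sysctl tunables in the web UI instead so they survive a reboot. Treat this as a sketch and double-check the names against recoverdisk(1) on your own system:
code:
# Check the current CAM retry counts (da = SCSI/SAS disks, ada = ATA disks).
sysctl kern.cam.da.retry_count kern.cam.ada.retry_count

# Set them to 0 so errors surface immediately instead of being retried away.
sysctl kern.cam.da.retry_count=0
sysctl kern.cam.ada.retry_count=0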

wolrah
May 8, 2006
what?

Motronic posted:

Holy crap, it's an iTerm rip off. I love it. Thank you.
It's missing a few key features you might expect from other modern terminals, most notably the ability to tear off tabs and to have tabs with different privilege levels in the same window.

That said, there are good technical reasons why those features aren't there currently, and there is a project to rearchitect the entire application so they can be done right.

edit: Details - https://github.com/microsoft/terminal/issues/5000

wolrah fucked around with this message at 18:07 on Feb 12, 2022

AKA Pseudonym
May 16, 2004

A dashing and sophisticated young man
Doctor Rope
I just upgraded from a 1 TB HDD to a 2 TB SSD. I cloned the disk, but after a failed attempt I guess I made an oopsy and didn't set up my next attempt right: I forgot to resize the C partition, and now I have a bunch of the drive unused.

Cool, no problem. Just delete the D partition, extend C, recreate the recovery drive. But Disk Management isn't letting me do anything with the D drive.

Right-clicking gets me this:



Same thing for that 980 MB partition over to the left.

Googling mostly gets me pages trying to sell me partition management tools. Is there a fix? Is there another approach I should be trying? I really don't want to reinstall my old disk and try again.

wolrah
May 8, 2006
what?
Windows won't let you touch system or recovery partitions from the Disk Management GUI. I think the command line tools can do it, but I usually just boot a Linux Live USB with GParted.

Moey
Oct 22, 2010

I LIKE TO MOVE IT
Diskpart can nuke that without issue.
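Roughly like this from an elevated prompt - the disk and partition numbers below are made up, so triple-check the list output before deleting anything:
code:
diskpart
rem Inside diskpart - pick the numbers from the list output, these are examples.
list disk
select disk 1
list partition
select partition 4
rem "override" is what lets you delete protected/recovery partitions.
delete partition override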

Nulldevice
Jun 17, 2006
Toilet Rascal

AKA Pseudonym posted:

I just upgraded from a 1 TB HDD to a 2 TB SSD. I cloned the disk, but after a failed attempt I guess I made an oopsy and didn't set up my next attempt right: I forgot to resize the C partition, and now I have a bunch of the drive unused.

Cool, no problem. Just delete the D partition, extend C, recreate the recovery drive. But Disk Management isn't letting me do anything with the D drive.

Right-clicking gets me this:



Same thing for that 980 MB partition over to the left.

Googling mostly gets me pages trying to sell me partition management tools. Is there a fix? Is there another approach I should be trying? I really don't want to reinstall my old disk and try again.

EaseUS Partition Manager will let you move the partitions around for free - just had to do this with a Windows 10 VM. I used it when I made the same oops on a disk clone on my laptop.

Javid
Oct 21, 2004

:jpmf:
The drive in my mom's email-and-wordpad laptop is failing. She was using basically none of her existing 500gb HDD, so I figure just the cheapest decent SSD on Amazon is going to be both a cheap fix AND a huge upgrade from the 5400 rpm thing it had. I'm not familiar with the brands, but there's a bunch of random ones in the teens-to-twenties range like https://www.amazon.com/gp/product/B088KLN1HF/ on there; what should I be looking to spend to get something good enough, here?

CoolCab
Apr 17, 2005

glem
literally anything will be a big jump, but when i did the exact thing for my mom i spent a little more to get one with a DRAM cache. in my region you can get cheap 128 gig ones for like 15-20 pounds, sometimes. i went with a little larger (500 gig) and a little nicer and more reputable in terms of warranty/support (WD digital blue) because proportionately it was still only 40-50 quid and i suspect that if it had an intermittent fault or odd behaviour (possibly from being close to capacity, windows gets fucky) i would ultimately be the one who fixed it.

Klyith
Aug 3, 2007

GBS Pledge Week

Javid posted:

The drive in my mom's email-and-wordpad laptop is failing. She was using basically none of her existing 500gb HDD, so I figure just the cheapest decent SSD on Amazon is going to be both a cheap fix AND a huge upgrade from the 5400 rpm thing it had. I'm not familiar with the brands, but there's a bunch of random ones in the teens-to-twenties range like https://www.amazon.com/gp/product/B088KLN1HF/ on there; what should I be looking to spend to get something good enough, here?

Inland is Microcenter's house brand; they're fine. That drive in particular is a fine pick if you're positive 128GB is all you'll ever need.

Minimum-size SSDs are not great in any application that needs performance - they only have one NAND chip, and high transfer speeds rely on the parallelism of reading/writing multiple chips at once. But they're still an improvement over a 5400 RPM laptop drive, and as you say, this isn't a performance situation.



There are suspiciously cheap SSDs on Amazon that are actually a USB stick on the inside, so it's best not to sort by cheapest and just throw a dart.

BlankSystemDaemon
Mar 13, 2009



Very cheap TLC-based SSDs probably aren't going to be very reliable, since TLC is a lot less reliable than MLC or SLC - and the cheaper it is, the bigger the chance of failure.

So basically, what I'm saying is, ensure you have a backup solution in place.

Klyith
Aug 3, 2007

GBS Pledge Week

BlankSystemDaemon posted:

Very cheap TLC-based SSDs probably aren't going to be very reliable, since TLC is a lot less reliable than MLC or SLC - and the cheaper it is, the bigger the chance of failure.

lol :wtfdude:

TLC (or QLC) has been in every consumer SSD for the last 7-8 years. It's been absolutely fine. MLC or SLC drives don't even exist anymore outside of super-expensive enterprise stuff.




Like, yeah have backups. Everything should have backups. But your views on how often you think poo poo goes bad are really whack. Like if your standards for reliability are such that you only buy gold-plated enterprise gear for 10x the price, ok. But in the consumer space poo poo is not constantly failing. Normal people don't use SSDs like a server, they write a TB per year.

Arivia
Mar 17, 2011

Klyith posted:

lol :wtfdude:

TLC (or QLC) has been in every consumer SSD for the last 7-8 years. It's been absolutely fine. MLC or SLC drives don't even exist anymore outside of super-expensive enterprise stuff.




Like, yeah have backups. Everything should have backups. But your views on how often you think poo poo goes bad are really whack. Like if your standards for reliability are such that you only buy gold-plated enterprise gear for 10x the price, ok. But in the consumer space poo poo is not constantly failing. Normal people don't use SSDs like a server, they write a TB per year.

My old main system drive (a Samsung 850 Pro 500GB) reports 43TB written in HWiNFO64 over the five years I used it, but I'm big on gaming and stuff, so that's probably hammered it a bit extra.
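If anyone wants to check theirs without HWiNFO, smartmontools exposes the same counter; attribute names vary a bit by vendor, so treat this as a rough sketch:
code:
# Samsung (and most SATA) SSDs report lifetime writes as Total_LBAs_Written.
smartctl -A /dev/sda | grep -i total_lbas_written

# The raw value is in 512-byte sectors, so multiply by 512 to get bytes written.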

Javid
Oct 21, 2004

:jpmf:
My backup records tell me that the failing drive has a whole 286 MB in the user data folder; I am BEYOND certain that a 128GB drive will outlive the rest of the (aging, cheap) laptop I'm shoving it into. Thanks for the advice!

CerealKilla420
Jan 3, 2014

"I need a handle man..."
I remember when smaller SSDs first started to really get affordable back in 2010 or so and everyone was worried that the drives would burn out after 2 years... Hell, I was worried about it myself and decided against getting one in favor of an upgraded 500GB 2.5in HDD for my laptop at the time.

To this day I have not even heard OF someone's 64GB drive burning out on them, honestly. I'm not saying that it doesn't happen, but in that same time period (the past 12 years), I've had at least three 3.5in HDDs fail on me.

That said, I'm sure things are very different in a real production environment where the drives are responsible for something more important than delivering my 10-bit Chinese cartoons to my Chromecast or reading GameCube ISO files lol.

Scruff McGruff
Feb 13, 2007

Jesus, kid, you're almost a detective. All you need now is a gun, a gut, and three ex-wives.

CerealKilla420 posted:

I remember when smaller SSDs first started to really get affordable back in 2010 or so and everyone was worried that the drives would burn out after 2 years... Hell, I was worried about it myself and decided against getting one in favor of an upgraded 500GB 2.5in HDD for my laptop at the time.

To this day I have not even heard OF someone's 64GB drive burning out on them, honestly. I'm not saying that it doesn't happen, but in that same time period (the past 12 years), I've had at least three 3.5in HDDs fail on me.

That said, I'm sure things are very different in a real production environment where the drives are responsible for something more important than delivering my 10-bit Chinese cartoons to my Chromecast or reading GameCube ISO files lol.

Yeah, my gaming rig is still booting from a 250GB Samsung 850 Evo I bought in like 2015. 56TB written, apparently (though admittedly I've been looking to replace it, and my gaming rig is designed to be nuked regularly, so I'm not worried about losing anything on it).

Shrimp or Shrimps
Feb 14, 2012


I was still using an X25-M G2 as a game drive as recently as last month, and I bought it when it first came out. Prior to the 850 Evo series it was my main OS drive. I don't know how many years that is, but the sucker is still chugging. And it's going to be my TrueNAS OS drive that I'll be setting up today.

Building in the Fractal Node 304 with four 3.5" drives and one 2.5" SSD was actually quite easy. I even fit a Fuma 2 in there, but only with the middle fan, because of SATA cable clearance. I had to remove the rear 140mm fan, though; I may mount that on the outside back of the case. The Fuma 2 is overkill for the CPU (a 6700K with a modest undervolt) but I had it lying around.

A quick power-on test shows my new drives are at least detected in the BIOS and in Disk Management in Windows. Is there a quick-ish test I can do to check the drives are working fine, or is the only good test the write-to-every-single-sector thing which takes weeks? Because if so I'm just going to yolo this.

Saukkis
May 16, 2003

Unless I'm on the inside curve pointing straight at oncoming traffic the high beams stay on and I laugh at your puny protest flashes.
I am Most Important Man. Most Important Man in the World.

CerealKilla420 posted:

I remember when smaller SSDs first started to really get affordable back in 2010 or so and everyone was worried that the drives would burn out after 2 years... Hell, I was worried about it myself and decided against getting one in favor of an upgraded 500GB 2.5in HDD for my laptop at the time.

To this day I have not even heard OF someone's 64GB drive burning out on them, honestly. I'm not saying that it doesn't happen, but in that same time period (the past 12 years), I've had at least three 3.5in HDDs fail on me.

That said, I'm sure things are very different in a real production environment where the drives are responsible for something more important than delivering my 10-bit Chinese cartoons to my Chromecast or reading GameCube ISO files lol.

Just a while ago I checked the SSD status on a Moodle server at work. For the 800GB SAS drives with a "mixed use" rating storing the database, the parameter "Estimated Life Remaining based on workload to date" was around 21746 days, or 59.5 years. For the 7.68 TB SATA drives with a "read intensive" rating, which store all the file data and are filling up (another pair is on order to extend them), the same parameter was at 450776 days, or 1234 years.

Kivi
Aug 1, 2006
I care
I'm currently using Linux MD RAID and it's fine. However, the discussion over the last few pages has made me think about what I should do in the future. My current setup is 8 x 8 TB disks in RAID6. I'm looking to migrate to larger disks, keeping the performance and capacity mainly the same. The use case is basically write once, read many for media (movies, TV on Plex), plus light backups and general file storage. OS, software, download scratch disk, VMs etc. reside on the 2 TB boot SSD, and media is written to the spinners once it's done downloading.

I'm familiar with MD RAID recovery, so I'm a bit on edge about migrating to something I've not used, and reading around the net, ZFS seems to require tons of extra horsepower and hardware (1 GB of RAM per TB, log and cache disks) for benefits that are not applicable to my use case? What am I missing?

E: I haven't had bit rot (knocks on wood) since I migrated to an ECC setup in 2015.

Kivi fucked around with this message at 09:10 on Feb 15, 2022

Keito
Jul 21, 2005

WHAT DO I CHOOSE ?

Kivi posted:

I'm a bit on edge about migrating to something I've not used, and reading around the net, ZFS seems to require tons of extra horsepower and hardware (1 GB of RAM per TB, log and cache disks) for benefits that are not applicable to my use case? What am I missing?

I've been running a ZFS pool with 8 x 14 TB disks on a (virtual) machine with 2 CPU cores and 16 GB of RAM for the past year, with no special device disks either. Have yet to experience any issues with it.

I went with ZFS because of its design focus on data integrity, but for me at least, snapshots, filesystem-level compression, and being able to apply different properties per filesystem/dataset have really been killer features as well. I've also found the management interface quite nice to use.
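Day to day it's just a handful of commands, something like this (dataset names made up):
code:
# Per-dataset properties: compress the media dataset without touching anything else.
zfs set compression=lz4 tank/media

# Take a snapshot, list snapshots, roll back if something went sideways.
zfs snapshot tank/media@before-reorg
zfs list -t snapshot -r tank/media
zfs rollback tank/media@before-reorg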

BlankSystemDaemon
Mar 13, 2009



Kivi posted:

I'm currently using Linux MD RAID and it's fine. However, the discussion over the last few pages has made me think about what I should do in the future. My current setup is 8 x 8 TB disks in RAID6. I'm looking to migrate to larger disks, keeping the performance and capacity mainly the same. The use case is basically write once, read many for media (movies, TV on Plex), plus light backups and general file storage. OS, software, download scratch disk, VMs etc. reside on the 2 TB boot SSD, and media is written to the spinners once it's done downloading.

I'm familiar with MD RAID recovery, so I'm a bit on edge about migrating to something I've not used, and reading around the net, ZFS seems to require tons of extra horsepower and hardware (1 GB of RAM per TB, log and cache disks) for benefits that are not applicable to my use case? What am I missing?

E: I haven't had bit rot (knocks on wood) since I migrated to an ECC setup in 2015.
"It's fine" isn't really something you can prove with any confidence, since nothing in MD devices or EXT4 implements complete checksumming, transactional+atomic operations, or writes things in a hash trie or similar persistent data-structure - all of which are needed for the highest forms of fault tolerance that it's possible to build into a storage stack without introducing the risk of write holes that come with traditional RAID implementations or single-disk filesystems.

And as for bitrot, how're you going to know if you have the silent kind?
It doesn't show up unless you're dealing with files that're susceptible, ie. some images, some forms of compressed data, and text files whereas things like movie files and other formats which're the type to take up a large amount of space - and even then, you still have to open the files in order to notice that kind of bitrot, if you can (it's difficult-to-impossible to spot a single bitflip in a video stream).

The hardware requirements of ZFS that you're listing are for production environments where a company is hosting a database or something with similar IOPS requirements and with a resident set (ie. the amount of data being actively used) is bigger than can fit into the memory of any system it's possible to build.
I've run ZFS on a RPI3 (ie. 64bit CPU at ~1.6GHz) with 512MB memory, and while it wasn't fast, it didn't need to be to just put log files on a pair of mirrored harddrives connected via USB.
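For the curious, that whole setup boils down to a couple of commands - the pool/dataset names and the ashift assumption here are purely illustrative:
code:
# Two USB-attached disks as a mirror; ashift=12 assumes 4K-sector drives.
zpool create -o ashift=12 logpool mirror /dev/da0 /dev/da1

# A dataset for the logs, with cheap compression since text compresses well.
zfs create -o compression=lz4 logpool/logs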

some kinda jackal
Feb 25, 2003

 
 
Is there any serious competitor to the QNAP line for a simple and "low priced" NAS with 10GbE?

I would like to outfit my two Proxmox servers with 10GbE cards and offload the local datastores to NFS or iSCSI, but I'm chafing a bit at the price tag of the Synology offering. At the same time, I think this might be one of those "buy once, cry once" situations, because I'm really happy with my DS920+ and I'm not sure I really want two discrete NAS systems taking up power and physical space.

I could theoretically ditch Synology for QNAP since I don't do anything involving external access, and this is just a home device, so I'm not really sure I'd be that fussed if their security posture is lacking. At the same time, if I'm already going to spend the money, I could do exactly the above - buy once, cry once - and just pay the premium for the lowest-tier Synology with 10GbE.

I am pretty sure I could do what I want for cheaper if I built my own solution, but I think at this point I'm pretty firmly in the camp of just buying something off the shelf. QNAP and Synology are the only two I've seriously considered, mainly because they're the only two I realistically believe would still be supporting a product 5+ years after I buy it, but I'd be interested to see if I'm wrong and there are other players to consider.

Motronic
Nov 6, 2009

On the topic of 10GbE:

Motronic posted:

That's good enough to splash out on a couple of NICs to give things a try. Thank you!

(e: just ordered $200 worth of crap to throw at the rack)

This all just worked. I've got some input errors from the switch to the ESXi box that I need to figure out, but they are minor (and not on the switch, so this is something else - potentially even a reporting error, because I'm not too confident in the SNMP data I'm getting from ESXi). The first port of each of these cards goes to the switch, and I just slammed the second ports together with a twinax cable and gave them their own subnet between the ESXi box and the TrueNAS box. All iSCSI goes over that and it works great.

So thanks again to BSD and anyone else who brought up those Dell cards. Cheap, everything already had drivers, and they seem to pass the "been beating on this for a week+" test. And it's low stakes... worst case, I bring up the GigE LAGGs again if something goes wrong.

alo
May 1, 2005


I ended up getting the DS4246. Got it in, racked it up, and swapped over all of my drives. It's a big improvement on my old Norco, as the backplane isn't dropping drives every few days anymore (it's probably just old and brittle).

The drive closures are great - much better than the Norco (and the Supermicro stuff too), and actually better than most of the Dell/HP stuff I've worked with. I'm mostly referring to insertion: with some drive sleds you can make contact before you close the latch, or you miss the catch when inserting and the drive isn't fully seated but is still powered on.

In addition, since it's an actual SAS JBOD, I can query it to get the drive locations.

Less Fat Luke
May 23, 2003

Exciting Lemon
How's the noise? Did you swap any of the PSU fans or leave it as is?

I'm seeing a lot of those and other DAS arrays on eBay thanks to Chia coin cratering (lol)

Zorak of Michigan
Jun 10, 2006


BlankSystemDaemon posted:

"It's fine" isn't really something you can prove with any confidence, since nothing in MD devices or EXT4 implements complete checksumming, transactional+atomic operations, or writes things in a hash trie or similar persistent data-structure - all of which are needed for the highest forms of fault tolerance that it's possible to build into a storage stack without introducing the risk of write holes that come with traditional RAID implementations or single-disk filesystems.

And as for bitrot, how're you going to know if you have the silent kind?
It doesn't show up unless you're dealing with files that're susceptible, ie. some images, some forms of compressed data, and text files whereas things like movie files and other formats which're the type to take up a large amount of space - and even then, you still have to open the files in order to notice that kind of bitrot, if you can (it's difficult-to-impossible to spot a single bitflip in a video stream).

The hardware requirements of ZFS that you're listing are for production environments where a company is hosting a database or something with similar IOPS requirements and with a resident set (ie. the amount of data being actively used) is bigger than can fit into the memory of any system it's possible to build.
I've run ZFS on a RPI3 (ie. 64bit CPU at ~1.6GHz) with 512MB memory, and while it wasn't fast, it didn't need to be to just put log files on a pair of mirrored harddrives connected via USB.

I was going to say a lot of this, but BSD beat me to it, so kudos there. The only thing I'd add is that ZFS has scrubs, and in particular periodic scrubs, which give me a very warm and fuzzy feeling. My current RAIDZ2 pool (8x8TB) takes 4-5 hours to scrub, and gets scrubbed automatically every two weeks. That means that if I lose a disk, I know for certain that the blocks providing redundancy for that disk were completely valid as of (at most) two weeks ago. That doesn't mean there can't be bad blocks today or tomorrow, but to actually lose data, I'd have to have the corresponding parity data from two different drives get corrupted in the same two-week period. I like my chances. By comparison, with most other RAID implementations, you can't tell if your redundancy data is still good without actually forcing a failure and a rebuild.
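For reference, the scrub itself is a single command and the schedule is just cron (or whatever periodic mechanism your OS uses) - pool name and timing below are only an example:
code:
# Kick off a scrub by hand and check on its progress.
zpool scrub tank
zpool status tank

# Example crontab entry: scrub on the 1st and 15th of each month at 03:00.
# 0 3 1,15 * * /sbin/zpool scrub tank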

alo
May 1, 2005


Less Fat Luke posted:

How's the noise? Did you swap any of the PSU fans or left it as is?

I'm seeing a lot of those and other DAS arrays on eBay thanks to Chia coin cratering (lol)

The noise is ok. It’s loud on first boot, but then the fans slow down to the quieter side of enterprise gear. I have a basement so noise isn’t really much of an issue.

Of course my wife was in the next room the other night and said “is that sound the new [disk shelf]” so perhaps I’m not the best judge of noise.

BlankSystemDaemon
Mar 13, 2009



Zorak of Michigan posted:

I was going to say a lot of this, but BSD beat me to it, so kudos there. The only thing I'd add is that ZFS has scrubs, and in particular periodic scrubs, which give me a very warm and fuzzy feeling. My current RAIDZ2 pool (8x8TB) takes 4-5 hours to scrub, and gets scrubbed automatically every two weeks. That means that if I lose a disk, I know for certain that the blocks providing redundancy for that disk were completely valid as of (at most) two weeks ago. That doesn't mean there can't be bad blocks today or tomorrow, but to actually lose data, I'd have to have the corresponding parity data from two different drives get corrupted in the same two-week period. I like my chances. By comparison, with most other RAID implementations, you can't tell if your redundancy data is still good without actually forcing a failure and a rebuild.
All RAID implementations have some form of consistency checking, because that's the only way to know that everything on the disks, at every RAID level, returns data instead of an error - it's just that ZFS also knows what the record should be, because of the checksum that's stored in the metadata of the parent record.

Also, another ZFS hardware performance datapoint that I forgot to mention:
My previous always-on storage was four 2TB disks in a raidz, and with a bit of tweaking that was fast enough to saturate a 1/1Gbps LAN with 9k jumbo frames, and it regularly provided ~120MBps (roughly equivalent to spinning rust) for the iSCSI disk I had mounted in Windows to store games I wasn't actively playing (this was before I got 1/1Gbps FTTH).
That machine had a 1.3GHz AMD N36L, which is a dual-core mobile chip with a measly 12W TDP (originally introduced back in 2007), and 8GB of ECC memory.

BlankSystemDaemon fucked around with this message at 18:52 on Feb 15, 2022

Less Fat Luke
May 23, 2003

Exciting Lemon

alo posted:

The noise is ok. It’s loud on first boot, but then the fans slow down to the quieter side of enterprise gear. I have a basement so noise isn’t really much of an issue.

Of course my wife was in the next room the other night and said “is that sound the new [disk shelf]” so perhaps I’m not the best judge of noise.
Nice, I'm kind of torn on getting one; I can stick it in the basement as well but I don't really need that many drives or hotswap and might just go with a Meshify 2 XL instead.

Methylethylaldehyde
Oct 23, 2004

BAKA BAKA

BlankSystemDaemon posted:

All RAID implementations have some form of consistency checking, because that's the only way to know that everything on the disks, at every RAID level, returns data instead of an error - it's just that ZFS also knows what the record should be, because of the checksum that's stored in the metadata of the parent record.

Also, another ZFS hardware performance datapoint that I forgot to mention:
My previous always-on storage was four 2TB disks in a raidz, and with a bit of tweaking that was fast enough to saturate a 1/1Gbps LAN with 9k jumbo frames, and it regularly provided ~120MBps (roughly equivalent to spinning rust) for the iSCSI disk I had mounted in Windows to store games I wasn't actively playing (this was before I got 1/1Gbps FTTH).
That machine had a 1.3GHz AMD N36L, which is a dual-core mobile chip with a measly 12W TDP (originally introduced back in 2007), and 8GB of ECC memory.

I have my 30-disk hobo-SAN hanging off some ancient 1.8GHz Xeon E3 v3 with DDR3 RAM. Basic file serving doesn't really do anything to the CPU unless you have gzip compression enabled, and even an 800MB/sec scrub didn't do much more than peg one core while it ran.

You can do some pretty cool things with a 24-disk Supermicro/Norco/Fleabay Dell rackmount case and whatever old computer guts you have handy, with ZFS.

BlankSystemDaemon
Mar 13, 2009



Methylethylaldehyde posted:

I have my 30-disk hobo-SAN hanging off some ancient 1.8GHz Xeon E3 v3 with DDR3 RAM. Basic file serving doesn't really do anything to the CPU unless you have gzip compression enabled, and even an 800MB/sec scrub didn't do much more than peg one core while it ran.

You can do some pretty cool things with a 24-disk Supermicro/Norco/Fleabay Dell rackmount case and whatever old computer guts you have handy, with ZFS.
Well, the reason it doesn't take up a lot of CPU is that the fletcher family of checksums - of which fletcher4 is the default for ZFS - isn't very costly in terms of CPU time per record, and also that the implementation found in OpenZFS comes with superscalar, SSE, and SIMD optimizations, and even features vectorized checksum and raidz primitives as well as a SIMD raidz implementation.

SHA256 and SHA512/256 are either offloaded on newer CPUs or feature hand-rolled assembly optimizations from frameworks like OpenCrypto (which work in FreeBSD, and I assume in Linux as well); the alternatives are Skein and EDON-R, which seem to be for paranoid people and were, respectively, almost never used and not implemented in FreeBSD until late last year.
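If you want to play with it, the checksum algorithm is just another per-dataset property; the dataset name below is a placeholder, and the change only applies to newly written blocks:
code:
# See what a dataset is using now (fletcher4 is the default for data).
zfs get checksum tank/data

# Switch to sha256 (or sha512, skein, edonr where supported) for new writes.
zfs set checksum=sha256 tank/data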

BlankSystemDaemon fucked around with this message at 21:23 on Feb 15, 2022

IOwnCalculus
Apr 2, 2003





Biggest reason to me to run ZFS over mdraid: unless they've somehow fixed this since I last used it, mdraid will poo poo your entire array on a single URE if you're down to N drives out of N+x redundancy.

ZFS will note the URE, tell you the specific files it couldn't recover, and bring up as much of the array as it can.
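In practice that looks something like this (pool name made up) - the bottom of the status output lists the busted files so you know exactly what to restore:
code:
# -v lists permanent errors, including the full paths of unrecoverable files.
zpool status -v tank

# After restoring the damaged files from backup, clear the error counters.
zpool clear tank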

Shrimp or Shrimps
Feb 14, 2012


Okay, so I think I've yoloed into this without really understanding what I'm doing and am having a bit of a panic. So I've set up TrueNAS on an old SSD, and I've got 2x10TB and 2x16TB drives. 6700K/16GB RAM - just an old PC I had after a recent upgrade.

I've created 1 pool of 2 vdevs, each pair of drives mirrored. Primary use case is media storage to be played back over 3 computers, game storage (not currently playing), and additional backups for important files (also on cloud, local computers, and some USB drives).

Have I screwed up somehow? Should I have 4 disks of the same capacity to run raidz2 for better data resiliency at the same space efficiency?

Should I use 2 pools so I can distribute files evenly across the drives for some reason? I don't even know how or in what order these drives fill up.

I bought the 16TB drives because in my market they have far and away the best price per TB (almost 3 dollars cheaper), but I really didn't think it through or do much research. I already had a 10TB drive, so I bought a second because I had always assumed I was just going to do mirrors, but now I'm wondering if I should have gotten two more 10TB drives instead of the 16s. I was thinking about cost efficiency before even knowing what I was doing.

I'm completely new to all of this and really should have read up on this more.

IOwnCalculus
Apr 2, 2003





You're not on an entirely bad track there, actually. The primary downsides of your setup as it sits are that you're only getting 50% capacity out of your drives (versus 75% if you did 4x10TB in RAIDZ), and that if you have a catastrophic failure of two drives in a given vdev, you'll lose the whole array.

The upside is that you can expand inexpensively by adding another pair of drives, or by upgrading a pair of drives to larger drives. Rebuilds of a mirror are also quick.

ZFS will try to balance writes proportional to the free space on each vdev - so it'll send more data to the 16TB drives at first.

Do you have room to add more drives in the future, or is growth going to happen only by swapping out for larger drives? I'd maybe consider raidz (not z2) in this scenario so you get ~30TB usable, and your initial expansion would be by swapping out the 10TB drives for 16TB (or larger) in the future. ZFS doesn't care if the drives all match; it will treat larger drives as if they're the same size as your smallest drive until you swap them out and expand the array (or have autoexpand enabled). raidz expansion by adding a single drive is also close to production-ready.
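If you go the swap-for-bigger-drives route later, the dance is roughly this - pool and device names are placeholders, not your actual setup:
code:
# Let vdevs grow automatically once every member drive has been upgraded.
zpool set autoexpand=on tank

# Swap one drive at a time and wait for the resilver to finish before the next.
zpool replace tank /dev/sdb /dev/sdf
zpool status tank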

Methylethylaldehyde
Oct 23, 2004

BAKA BAKA

Shrimp or Shrimps posted:

Okay, so I think I've yoloed into this without really understanding what I'm doing and am having a bit of a panic. So I've set up TrueNAS on an old SSD, and I've got 2x10TB and 2x16TB drives. 6700K/16GB RAM - just an old PC I had after a recent upgrade.

I've created 1 pool of 2 vdevs, each pair of drives mirrored. Primary use case is media storage to be played back over 3 computers, game storage (not currently playing), and additional backups for important files (also on cloud, local computers, and some USB drives).

Have I screwed up somehow? Should I have 4 disks of the same capacity to run raidz2 for better data resiliency at the same space efficiency?

Should I use 2 pools so I can distribute files evenly across the drives for some reason? I don't even know how or in what order these drives fill up.

I bought the 16TB drives because in my market they have far and away the best price per TB (almost 3 dollars cheaper), but I really didn't think it through or do much research. I already had a 10TB drive, so I bought a second because I had always assumed I was just going to do mirrors, but now I'm wondering if I should have gotten two more 10TB drives instead of the 16s. I was thinking about cost efficiency before even knowing what I was doing.

I'm completely new to all of this and really should have read up on this more.

You can play with https://magj.github.io/raid-failure/ to see what the math says. It's very much a rough calculation with a ton of assumptions, but the basics are there. More likely than an unrecoverable read error is that a second drive in the group - already old and tired - dies during the rebuild, potentially destroying all your data if you have RAID5.

On drives bigger than 8ish TB, and in pools larger than 4 drives, RAID6/raidz2 is preferred over RAID1/RAID10 because any two disks can die, versus only one disk in each specific pair. What you did isn't wrong per se, but it could have been done differently. If the 16s are still returnable, it might make sense to swap them for two or three 10s and make a raidz2 array with 4 or 5 drives, even if the cost efficiency isn't the greatest.

Shrimp or Shrimps
Feb 14, 2012


Thanks for the replies, the help is much appreciated.

Expansion: my motherboard is ITX with 6 SATA ports, and I'm currently using 5 (4 HDD, 1 SSD). In terms of adding drives, I'm guessing I have the option of moving TrueNAS to an M.2 NVMe drive in that unused slot, and then using the remaining 2 SATA ports for 2 more drives. The Node 304 case can hold 6 HDDs.

I don't think I can return the drives; I need to check. Assuming I'm stuck with them, my options are to either 1) continue mirroring, for 26TB; 2) do raidz1, with the 16TB drives treated as 10TB, for 30TB; or 3) do raidz2, with the 16TB drives treated as 10TB, for 20TB.

Z1 gives me more space, but only 1 drive can fail at a time.
Z2 gives me less space, but any 2 drives can fail at once.
Mirroring gives me space in between Z1 and Z2, but if 2 drives fail, they have to be from separate pairs for my data to be recoverable.

Would that be correct?

I'm struggling to even figure out what I want/need here. All the absolutely mission-critical stuff will also be on the cloud (assuming it's not privacy-sensitive, so like photos of my cat), on local computer SSDs, and backed up onto a variety of USB drives. The rest of it (media) can be lost and it wouldn't be the end of the world.

Methylethylaldehyde posted:

You can play with https://magj.github.io/raid-failure/ to see what the math says.

So I'm trying to figure out what I'm looking at here. I set the drive size to 12TB as that's the highest it goes, and with 4 drives I get a 6% chance of an errorless rebuild in Z1 and a 19% chance of an errorless rebuild in Z2? So basically, odds are that in a Z1 or Z2 rebuild some files are going to get corrupted, assuming a drive doesn't outright fail during the rebuild?

So during a rebuild, if it is likely a second drive is going to die, then Z1 only offers technical redundancy? That's why Z2 is preferred.

So the same is going to apply to mirrors, right? If one disk dies, the second will also be old and tired and likely to fail during the rebuild, which would then mean the whole array is lost?

Dang. Hmm, I really should have looked into this more before buying the 16tb disks lmao

Shrimp or Shrimps fucked around with this message at 23:07 on Feb 15, 2022


IOwnCalculus
Apr 2, 2003





That calculator is built with traditional RAID in mind, not ZFS. Traditional hardware RAID, mdraid, etc, will all fault a drive completely on the first read error, and in a single-parity RAID5, that means one URE on one "working" drive during a rebuild nukes the entire array.

It's not entirely relevant for ZFS because if the rest of the drive can still be read, ZFS will recover the rest of the array and will tell you which files it could not safely recover, so you can restore them from the backup that you definitely have. And while not zero, the odds of having two catastrophic drive failures at the same time are much lower than the odds of a single URE across any of your working drives during a rebuild.

I abuse the everloving gently caress out of ZFS with a 20-drive pool made up of four-drive raidz vdevs. My drives are anywhere from old to ancient; some of them ran for years in my garage at 100+ degree ambient temps in the summer, and I've got an entirely unapproved mishmash of SAS/SATA drives and controllers. It should implode at any instant. I've never lost the whole pool, but I have had it recover multiple times from URE-during-rebuild events.

IOwnCalculus fucked around with this message at 23:28 on Feb 15, 2022

  • Reply