taqueso
Mar 8, 2004



Backblaze B2 is the best deal I was able to find. 42TB is a lot of data though; this might be a situation where you put a box of HDDs in the attic of a friend who lives on the other side of the country.


Zorak of Michigan
Jun 10, 2006


I have the same issue but no solution. In my case I own discs for my entire collection, so my backups are the original media.

Hughlander
May 11, 2005

Google Apps business account with Duplicacy and rclone.

Duck and Cover
Apr 6, 2007

Brain Issues posted:

What are you guys using for offsite backup that doesn't destroy your wallet?

I have about 42TB of movies/tv shows/music on a single (SHR2) volume on a Synology DS1618+, and I'm getting a bit nervous if something were to happen, but creating a backup of this much data seems like it would be very expensive.

None of it is absolutely critical, but would be extremely saddening to lose this collection that I've built up over nearly a decade.

Also, I have Comcast, with pitiful 40 Mbit upload, so uploading all of this to a Cloud backup would take months.



My opinion is that most files are easy to redownload through usenet/torrent/whatever, so if you just back up the stuff that's difficult to get, you're saving yourself a lot of money/time. That rip of Breaking Bad? You don't need a backup. Game of Thrones? You don't need a backup. That series you had to grab from YouTube because it doesn't seem to be anywhere else? Yeah, that might be worth backing up.

Duck and Cover fucked around with this message at 20:20 on Jun 14, 2020

fletcher
Jun 27, 2003


Brain Issues posted:

What are you guys using for offsite backup that doesn't destroy your wallet?

I have about 42TB of movies/tv shows/music on a single (SHR2) volume on a Synology DS1618+, and I'm getting a bit nervous if something were to happen, but creating a backup of this much data seems like it would be very expensive.

None of it is absolutely critical, but would be extremely saddening to lose this collection that I've built up over nearly a decade.

Also, I have Comcast, with pitiful 40 Mbit upload, so uploading all of this to a Cloud backup would take months.



Backblaze B2. I have about 10TB up there and it costs me about $50/mo. It took months to do the initial upload.

Chris Knight
Jun 5, 2002


Brain Issues posted:

What are you guys using for offsite backup that doesn't destroy your wallet?
Not bothering with video stuff tbh. Anything even remotely mainstream can be found again. If any of your disks starts going bad you should be getting alerts from your NAS to replace it right away. If multiple disks go bad you may be hosed.

Raymond T. Racing
Jun 11, 2019

In my opinion yeah. As long as your list of ISOs is safe, I don't care too much about actually losing anything on the NAS.

Sneeze Party
Apr 26, 2002


Brain Issues posted:

What are you guys using for offsite backup that doesn't destroy your wallet?

I have about 42TB of movies/tv shows/music on a single (SHR2) volume on a Synology DS1618+, and I'm getting a bit nervous if something were to happen, but creating a backup of this much data seems like it would be very expensive.

None of it is absolutely critical, but would be extremely saddening to lose this collection that I've built up over nearly a decade.

Also, I have Comcast, with pitiful 40 Mbit upload, so uploading all of this to a Cloud backup would take months.


And it would take longer than that, since you probably wouldn't want to saturate your upstream for 102 days straight. Have you considered just doing a USB backup once a month and storing it at a friend's or family member's place? That'd probably cost you a few hundred bucks in USB devices. Even Google Cloud would run you like 120 bucks per month if you did their archive storage option, or 160 or so for their coldline option. You could also install another DS1618 at somebody else's house and replicate your NAS to it.
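Back-of-the-envelope, assuming decimal terabytes and a perfectly saturated 40 Mbit/s link (real transfers will be slower):
code:
# 42 TB pushed through a 40 Mbit/s uplink, ignoring protocol overhead
echo '42 * 10^12 * 8 / (40 * 10^6) / 86400' | bc
97
So roughly 97 days at full line rate, which is where the 100+ day figure comes from once you account for overhead and not hogging the upstream 24/7.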

Backblaze is another option, but that'd cost over 2 grand for the first year. Pretty pricey, still. https://www.backblaze.com/b2/cloud-storage-pricing.html#calculator

H110Hawk
Dec 28, 2006
Don't back up your Linux ISOs unless they're literally irreplaceable, in which case I would just find the data hoarder at your office and give it to them. Back up your pictures and your documents, but your 41.99TB of <whatever awful media>? Don't.

Takes No Damage
Nov 20, 2004


DrDork posted:

However, buried near the start of the questioning was a notion of slowly expanding the array as more drives are needed / bigger drives get cheaper, and that's the one thing RAIDZ basically can't do in an economical way.

I probably worded it poorly, but by 'expanding the array' I was talking about increasing drive size, not number of drives.

DrDork posted:


That you can't have a "reasonable person" vdev of like 4x8-12TB and expand it by tossing in another 1-2 drives when you start to get full is a very real limit for a lot of home users.

This was my original assumption, that all the drives just kind of blobbed together and I could Borg-like add new storage whenever. Obviously that isn't the case, and I assume that has a lot to do with the other features that make ZFS so good at everything else. That's why I kept looking for ways to start things off with a larger drive count without dropping $1000+ all at once.

I may just split the difference and go with 8 drives. My parents have started making noises about 'cutting the cable' which has led me to research a lot of apps that end in 'rr'. I'm probably never going to be a power user compared to most other people who have gone to the trouble to build their own NAS, but the prospect of becoming a home DVR service does potentially increase my future disk use requirements.

An unrelated question: do any of you run a Windows 10 VM from your NAS? Or is there some online calculator I can use to estimate the resource requirements for using a VM as a daily driver? Some of the PCs around here are coming due for a rebuild; if I could virtualize them and just serve them up through a cheap Linux laptop connected to a monitor, that would be a lot cheaper than buying/building another whole PC. But I didn't really build the NAS with that in mind, so I'm currently just rocking an X3440 and 16GB RAM in a SuperMicro S8XLI. The PC in question would just be doing office work and YouTube with some light photo/video editing; it doesn't need to be a gaming rig.

Sniep
Mar 28, 2004


Zorak of Michigan posted:

I have the same issue but no solution. In my case I own discs for my entire collection, so my backups are the original media.

Same here, my media is all from physical discs, but I still back up the library to the cheapest 8TB Elements externals and keep them in a storage unit just in case.

I can't imagine the work (time) involved in re-ripping everything if I "oops, I accidentally'd" my library.

BlankSystemDaemon
Mar 13, 2009



I'm not sure if I'm brilliant or stupid, but I just thought of putting the new 'special' vdev, containing metadata from a secondary zpool of 15 disks in raidz3, on a zvol (possibly with checksumming, compression, and caching disabled) belonging to a primary zpool consisting of four fast disks in striped mirrors.
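For anyone trying to picture that layering, a minimal sketch of the idea; the pool names, zvol size, and device path are all invented, and zpool will complain about the redundancy mismatch against raidz3 unless you force it:
code:
# fast pool of striped mirrors hosts a zvol with the duplicate-work knobs turned off
zfs create -V 200G -o checksum=off -o compression=off -o primarycache=none fast/meta
# hang that zvol off the bulk raidz3 pool as its 'special' (metadata) vdev
zpool add -f bulk special /dev/zvol/fast/meta
Whether the extra layer of indirection actually buys anything is exactly the open question.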

Zorak of Michigan
Jun 10, 2006


You lost me at metadata and checksumming disabled.

Paul MaudDib
May 3, 2006

Your post made me wonder - sometimes people do a thing where they know they’re going to expand an array and so they temporarily set up multiple vdevs on a single disk using lvm or similar.

Why can’t you take that idea to the max and set up a “meta layer” where zfs runs on a multitude of small mirror vdevs (let’s say hundreds) with the logical volumes inside the mirror vdevs scattered across the drives for redundancy - essentially a second layer of striping where the stripes themselves are moveable.

This would let you expand an array simply by moving some of the vdev mirrors to the new disk and then expanding them (both the ones you moved and the ones you left behind). Space efficiency might not be 100% but surely a lot better than having to upgrade 8 disks at a time, or the “redirects” solution in the official array expansion code that permanently leaves you with O(N) additional disk reads for everything on the pool in perpetuity.

If instead of using lvm or similar, you had the vdevs as non-mirrored files on a pool with redundancy, then you wouldn’t even have to have full mirrors. Performance probably wouldn’t be as great but again, enterprise doesn’t care about dropping 8 drives at a time for expansion, it might still be fast enough for home users. The alternative isn’t ideal either.

Obviously the tooling for working with that arrangement safely doesn’t exist, but I don’t see any technical reason it wouldn’t work. Perhaps ZFS is not really optimized for handling pools with hundreds/thousands of vdevs though?
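To make the shape of that concrete, a purely illustrative sketch of the "lots of small movable vdevs" idea using sparse file slabs; the paths and sizes are made up, and none of this is a supported or sensible production layout:
code:
# carve out file-backed "slabs" on two physical disks
truncate -s 64G /disk1/slab0 /disk1/slab1 /disk2/slab0 /disk2/slab1
# build the pool from many small mirrors whose halves live on different disks
zpool create -f meta \
    mirror /disk1/slab0 /disk2/slab0 \
    mirror /disk1/slab1 /disk2/slab1
# "expansion" then means migrating some slab files to a new disk and growing
# them, instead of buying a whole 8-wide vdev at once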

Paul MaudDib fucked around with this message at 05:17 on Jun 15, 2020

Moey
Oct 22, 2010


Brain Issues posted:

What are you guys using for offsite backup that doesn't destroy your wallet?

Selective backups of important materials.

Aka not your entire Plex library.

Axe-man
Apr 16, 2005

I've recommended this and done it myself: get another Synology unit (since you indicated you have a DS1618+) and do the first backup on-site; that way, any backups after that already have the majority of the data done and will be tiny. Then give it to your parents, siblings, or close friends, and have it update weekly. Unless your data grows a lot between sessions, at most you will transfer a few gigs per month.

You can pay them back by allowing them access to your Plex or whatever; I know a lot of people who aren't tech-savvy enough to delve into this stuff, and they're really amazed by it.

If your data is growing by terabytes per month, then yeah, you will hit bottlenecks no matter what, and having a local backup at the very least is best. There are a few that are fireproof and the like. One actually runs off a licensed version of the Synology operating system, if you want to make sure it is all the same:

https://iosafe.com/

Hadlock
Nov 9, 2004

H110Hawk posted:

Don't back up your Linux ISOs unless they're literally irreplaceable, in which case I would just find the data hoarder at your office and give it to them. Back up your pictures and your documents, but your 41.99TB of <whatever awful media>? Don't.

What? This OSX installer for VLC 0.7.2 from July 2001 is irreplaceable, how dare you sir

Also that Google Earth installer from 2011

:fuckoff:

BlankSystemDaemon
Mar 13, 2009



Zorak of Michigan posted:

You lost me at metadata and checksumming disabled.
The 'special' vdev already does checksumming, compression, and caching on its own, which is why I was thinking it would be smart to not go through those codepaths twice.

Paul MaudDib posted:

Your post made me wonder - sometimes people do a thing where they know they’re going to expand an array and so they temporarily set up multiple vdevs on a single disk using lvm or similar.

Why can’t you take that idea to the max and set up a “meta layer” where zfs runs on a multitude of small mirror vdevs (let’s say hundreds) with the logical volumes inside the mirror vdevs scattered across the drives for redundancy - essentially a second layer of striping where the stripes themselves are moveable.

This would let you expand an array simply by moving some of the vdev mirrors to the new disk and then expanding them (both the ones you moved and the ones you left behind). Space efficiency might not be 100% but surely a lot better than having to upgrade 8 disks at a time, or the “redirects” solution in the official array expansion code that permanently leaves you with O(N) additional disk reads for everything on the pool in perpetuity.

If instead of using lvm or similar, you had the vdevs as non-mirrored files on a pool with redundancy, then you wouldn’t even have to have full mirrors. Performance probably wouldn’t be as great but again, enterprise doesn’t care about dropping 8 drives at a time for expansion, it might still be fast enough for home users. The alternative isn’t ideal either.

Obviously the tooling for working with that arrangement safely doesn’t exist, but I don’t see any technical reason it wouldn’t work. Perhaps ZFS is not really optimized for handling pools with hundreds/thousands of vdevs though?
You absolutely could do that, sure. All that's needed, as you say, is the script.
In fact, the basics of it are how the UEFI boot partition + FreeBSD root-on-ZFS partition is laid out in bsdinstall, which is done in the Almquist shell.
With FreeBSD, a GEOM class can be layered any way you want, and since a partition is just a GEOM class via the gpart provider, you can absolutely create hundreds or thousands of partitions, as long as each one is above 64MB (that's the minimum size, which is also the reason you can't do ZFS on floppies - ZFS on floppies has come up more than once..).
I suspect if there's a maximum number of devices that can be in a vdev, it's likely (2^64)-1 - ZFS doesn't really use datatypes below that.


The "proper" way to expand a pool is to add a new vdev and simply do zfs send | receive on the local pool to redistribute the data onto the new disks - effectively what block pointer rewrite would do for raidz expansion. Block pointer rewrite isn't likely to happen, though - because as Matt Ahrens put it in a recent ZFS BoF at BSDCan, block pointer rewrite would be the final feature added to ZFS, since it complicates everything so massively.


With FreeBSD, you can even use the GEOM classes ggated and ggatec to share GEOM classes over the network, so I imagine you could do some sort of clustering, if you took the time to write the scripts for setting it up.
Another trick with GEOM: when you have a single disk with any filesystem on it in a machine and want to add raidzN, what you can do is load the geom_zero kernel module (which discards writes and reads back zeros), create a raidzN zpool out of some number of disks plus the geom_zero device, and then, once you've moved the contents of the single hard disk over to the new pool, do zpool replace tank geom_zero-device new-device.
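A rough sketch of that trick, with disk names made up and assuming the provider shows up as gzero; the offline step isn't mentioned above, but it keeps the pool from "storing" part of every write on a device that forgets everything:
code:
kldload geom_zero                                  # loads the gzero provider
zpool create -f tank raidz1 ada1 ada2 ada3 gzero   # placeholder stands in for the 4th disk
zpool offline tank gzero                           # run degraded until the real disk is free
# ...copy everything off the old single disk (ada0) onto tank...
zpool replace tank gzero ada0                      # resilver onto the real disk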

BlankSystemDaemon fucked around with this message at 11:07 on Jun 15, 2020

Yaoi Gagarin
Feb 20, 2014

D. Ebdrup posted:

The 'special' vdev already does checksumming, compression, and caching on its own, which is why I was thinking it would be smart to not go through those codepaths twice.

You absolutely could do that, sure. All that's needed, as you say, is the script.
In fact, the basics of it are how the UEFI boot partition + FreeBSD root-on-ZFS partition is laid out in bsdinstall, which is done in the Almquist shell.
With FreeBSD, a GEOM class can be layered any way you want, and since a partition is just a GEOM class via the gpart provider, you can absolutely create hundreds or thousands of partitions, as long as each one is above 64MB (that's the minimum size, which is also the reason you can't do ZFS on floppies - ZFS on floppies has come up more than once..).
I suspect if there's a maximum number of devices that can be in a vdev, it's likely (2^64)-1 - ZFS doesn't really use datatypes below that.


The "proper" way to expand a pool is to add a new vdev and simply do zfs send | receive on the local pool to redistribute the data onto the new disks - effectively what block pointer rewrite would do for raidz expansion. Block pointer rewrite isn't likely to happen, though - because as Matt Ahrens put it in a recent ZFS BoF at BSDCan, block pointer rewrite would be the final feature added to ZFS, since it complicates everything so massively.


With FreeBSD, you can even use the GEOM classes ggated and ggatec to share GEOM classes over the network, so I imagine you could do some sort of clustering, if you took the time to write the scripts for setting it up.
Another trick with GEOM: when you have a single disk with any filesystem on it in a machine and want to add raidzN, what you can do is load the geom_zero kernel module (which discards writes and reads back zeros), create a raidzN zpool out of some number of disks plus the geom_zero device, and then, once you've moved the contents of the single hard disk over to the new pool, do zpool replace tank geom_zero-device new-device.

Does send | receive onto the same pool actually work? That seems pretty crazy

BlankSystemDaemon
Mar 13, 2009



VostokProgram posted:

Does send | receive onto the same pool actually work? That seems pretty crazy
I'm having a hard time answering this, because it seems like a pretty fundamental misconception is going on here.
ZFS is pooled storage consisting of DMU nodes with types such as block storage or filesystems provided by the ZFS POSIX layer, the same way memory is pooled by the MMU. Of course they can be copied around by zfs send | receive. What you're essentially saying is "copying things around in memory is crazy".

IOwnCalculus
Apr 2, 2003





The question in my mind is, does it work if the pool is over 50% full? And if so, does it actually rebalance / possibly defragment the data in a meaningful way?

cr0y
Mar 24, 2005



So I am waiting on parts for an unRAID build that I will be putting together this week, and I'm trying to think about how I want to do my 7TB-ish migration to it when it's complete, and then re-use the existing 10TB drive that the above 7TB lives on in the new unRAID pool. I have 2x10TB drives coming, and the (in use) 10TB that my media currently lives on.

Does this sound like a proper plan?

1) Build the damn thing
2) Create a 1 drive + 1 parity drive pool (so, use 2x10TB drives that are new, resulting in 10TB of usable space)
3) Copy all the media to that newly created network share
4) Make sure the media copied and is all safe and sound on the unRAID box.
5) Yank the old 10TB out of its existing home, format it and add it (??) as additional capacity for the above 1+1 pool
6) Profit?

Is there a better/safer way to do this?

Smashing Link
Jul 8, 2003


cr0y posted:

So I am waiting on parts for an unRAID build that I will be putting together this week, and I'm trying to think about how I want to do my 7TB-ish migration to it when it's complete, and then re-use the existing 10TB drive that the above 7TB lives on in the new unRAID pool. I have 2x10TB drives coming, and the (in use) 10TB that my media currently lives on.

Does this sound like a proper plan?

1) Build the damn thing
2) Create a 1 drive + 1 parity drive pool (so, use 2x10TB drives that are new, resulting in 10TB of usable space)
3) Copy all the media to that newly created network share
4) Make sure the media copied and is all safe and sound on the unRAID box.
5) Yank the old 10TB out of its existing home, format it and add it (??) as additional capacity for the above 1+1 pool
6) Profit?

Is there a better/safer way to do this?

I believe the thread will tell you to keep the old 10TB as an off-site backup. Add it to the pool when you can afford an additional 10TB. But it depends on what other backups you have.

cr0y
Mar 24, 2005



Smashing Link posted:

I believe the thread will tell you to keep the old 10TB as an off-site backup. Add it to the pool when you can afford an additional 10TB. But it depends on what other backups you have.

It's all basically replaceable downloaded shit. Everything important has a couple of copies elsewhere in the world.

Basically I want to be able to tolerate a failed drive, if my house burns down I have bigger concerns.

cr0y fucked around with this message at 01:49 on Jun 16, 2020

Rooted Vegetable
Jun 1, 2002
Seems fine, since you've got a backup of the irreplaceable stuff.

THF13
Sep 26, 2007

Leave the parity off while you're doing the initial copying over and turn it on afterwards, parity calculations will slow down your transfer.
edit: and disable your cache for any shares you're copying to as well

THF13 fucked around with this message at 03:19 on Jun 16, 2020

Chilled Milk
Jun 22, 2003

I also use B2, but I don't back up my movie/TV rips, only my personal files and my music collection. I held onto all those CDs, but I never want to rip them again (FLAC on 'em), nor try to reassemble and MusicBrainz-tag all the downloaded stuff.

Raymond T. Racing
Jun 11, 2019

THF13 posted:

Leave the parity off while you're doing the initial copying over and turn it on afterwards, parity calculations will slow down your transfer.
edit: and disable your cache for any shares you're copying to as well

If you set your minimum free space/share split levels right, you don't need to micromanage turning cache off on ingest. If minimum free space is set correctly, it'll write past the cache when the cache is full.

Plus, it's good practice to get them set right to begin with, avoiding problems down the line if you ever write more than the cache size and haven't planned ahead for it.

Yaoi Gagarin
Feb 20, 2014

D. Ebdrup posted:

I'm having a hard time answering this, because it seems like a pretty fundamental misconception is going on here.
ZFS is pooled storage consisting of DMU nodes with types such as block storage or filesystems provided by the ZFS POSIX layer, the same way memory is pooled by the MMU. Of course they can be copied around by zfs send | receive. What you're essentially saying is "copying things around in memory is crazy".

Well what happens to the old blocks when the copies are written to the pool? Do I now have 2x everything in the pool?

BlankSystemDaemon
Mar 13, 2009



IOwnCalculus posted:

The question in my mind is, does it work if the pool is over 50% full? And if so, does it actually rebalance / possibly defragment the data in a meaningful way?
The question is, why the fuck do you have one dataset taking up more than 50% of your pool when you could have hundreds of thousands?

THF13 posted:

Leave the parity off while you're doing the initial copying over and turn it on afterwards, parity calculations will slow down your transfer.
edit: and disable your cache for any shares you're copying to as well
Yeah, who would want such a silly thing as parity calculations? Who needs them, what've they ever done for us, et cetera.

VostokProgram posted:

Well what happens to the old blocks when the copies are written to the pool? Do I now have 2x everything in the pool?
ZFS is copy-on-write, so until you delete the "old" datasets, the data exists multiple times on the pool - but that's also true if you're using mirroring, striping with distributed parity, or even ditto blocks.

THF13
Sep 26, 2007


quote:

Yeah, who would want such a silly thing as parity calculations? Who needs them, what've they ever done for us, et cetera.
This advice is just for the initial copy to a new unRAID server. If something goes wrong, all his data is still there on the original hard drive he copied from. With unRAID you don't need to set up parity right away; you can add parity, or expand to 2-disk parity, at any time.

BlankSystemDaemon
Mar 13, 2009



And yet the thing most people worry about when replacing a drive in a distributed-parity situation is UREs, despite having backups, because rebuilding from backup is inevitably a week-long thing at least.
Always assuming, of course, that people know their backups to be good - something most people with homelabs don't even know, let alone everyone else.

Chris Knight
Jun 5, 2002


The Milkman posted:

I also use B2, but I don't back up my movie/TV rips, only my personal files and my music collection. I held onto all those CDs, but I never want to rip them again (FLAC on 'em), nor try to reassemble and MusicBrainz-tag all the downloaded stuff.

Yeah music I have like 3 copies of at all times.

Munkeymon
Aug 14, 2003




D. Ebdrup posted:

The question is, why the fuck do you have one dataset taking up more than 50% of your pool when you could have hundreds of thousands?

Newing up a filesystem for every different, ah, "Linux distro" you want to preserve on your NAS isn't going to occur to most people, including me!

BlankSystemDaemon
Mar 13, 2009



Munkeymon posted:

Newing up a filesystem for every different, ah, "Linux distro" you want to preserve on your NAS isn't going to occur to most people, including me!
Ever since filesystems started implementing hierarchies of folders in the 70s, there's been very little distinction between a folder and a filesystem.
It disappeared completely with ZFS, because 'zfs create' and 'mkdir' both accept -p, and 'zfs create' also lets you specify ZFS-specific options via -o parameter=value, the way 'mount' and 'newfs' do.
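For example (pool and dataset names invented):
code:
zfs create -p tank/media/movies            # parents created as needed, like mkdir -p
zfs create -o compression=off tank/music   # per-dataset properties at creation time
zfs get compression tank/music             # confirm the property took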

fletcher
Jun 27, 2003

I'm in the process of setting up my new NAS to migrate from NAS4Free over to FreeNAS. Previously with NAS4Free, I just had my fletch_vdev pool with no datasets, and the /mnt/fletch_vdev was shared via Samba for my Windows machine.

Over in FreeNAS, it will let me share /mnt/fletch_vdev via Samba, but it won't let me set ACLs after that. Reading more about it, apparently I should have been using ZFS datasets this whole time. Then I can set up a Samba share for each dataset and control the ACLs that way. I thought about just creating one big dataset, but after reading more about datasets I think it makes sense to break up my data into a few different ones, so I have more flexibility with snapshots and of course the ACLs.

I was planning on using zfs send/recv with nc to send it over the wire. To facilitate that, on my old NAS I created the datasets and started moving the files around, then I can do the zfs send/recv on each of the datasets.

I had just about completed that part on the old NAS when I noticed it started going very very slow. Sure enough:
code:
fletchn40l: mp3 # zpool status
  pool: fletch_vdev
 state: ONLINE
  scan: scrub in progress since Tue Jun 16 04:00:01 2020
        13.6T scanned out of 22.2T at 101M/s, 24h56m to go
        213M repaired, 61.26% done
config:

        NAME          STATE     READ WRITE CKSUM
        fletch_vdev   ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            ada0.nop  ONLINE       0     0     0
            ada1.nop  ONLINE       0     0     0
            ada2.nop  ONLINE       0     0     0  (repairing)
            ada3.nop  ONLINE       0     0     0
            ada4.nop  ONLINE       0     0     0

errors: No known data errors
smartctl not looking so good on that drive now either:
code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   169   169   051    Pre-fail  Always       -       11039
  3 Spin_Up_Time            0x0027   195   194   021    Pre-fail  Always       -       9233
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       45
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   049   049   000    Old_age   Always       -       37932
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       45
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       4
193 Load_Cycle_Count        0x0032   181   181   000    Old_age   Always       -       58108
194 Temperature_Celsius     0x0022   116   110   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       29
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
I've got cloud backups so I'm not too worried about data loss. I was planning on letting the repair finish and then see how it looks, then start copying data over to the new NAS.

BlankSystemDaemon
Mar 13, 2009



I've broken this up into sections; please excuse the multi-quoting.

fletcher posted:

I'm in the process of setting up my new NAS to migrate from NAS4Free over to FreeNAS. Previously with NAS4Free, I just had my fletch_vdev pool with no datasets, and the /mnt/fletch_vdev was shared via Samba for my Windows machine.

Over in FreeNAS, it will let me share /mnt/fletch_vdev via Samba, but it won't let me set ACLs after that. Reading more about it, apparently I should have been using ZFS datasets this whole time. Then I can set up a Samba share for each dataset and control the ACLs that way. I thought about just creating one big dataset, but after reading more about datasets I think it makes sense to break up my data into a few different ones, so I have more flexibility with snapshots and of course the ACLs.
Depending on how you access your shares, you'll need to be very careful about ACLs. Is the dataset only going to be accessed by Windows, now or in the future? If so, you can set the share type to SMB instead, and Samba will use the winacl and smbcacls commands instead of chown and chmod. On the other hand, it means that you won't be able to use chmod and chown if you're accessing the dataset from the FreeBSD-like CLI, or if you suddenly decide you want to use NFSv4 sharing in Windows (which it supports, and which at least for me runs better than SMB).
Another benefit of datasets is that if you have one for your home movies or your own music, or stuff that's otherwise compressed, you can disable in-line compression on those datasets - but you don't have to, since lz4 has an early-abort feature whereby if compression doesn't reach 17% it'll stop trying.

fletcher posted:

I was planning on using zfs send/recv with nc to send it over the wire. To facilitate that, on my old NAS I created the datasets and started moving the files around, then I can do the zfs send/recv on each of the datasets.
I would recommend using mbuffer -O and -I since it does exactly the same as netcat (ie. send data over the wire, unencrypted) but also acts as a buffer.
The reason using a buffer is a good idea is that ZFS send will first compute what needs to be sent over the wire (and try to arrange the data sequentially, if possible), and then send it - each of these steps is serialized rather than parallelized, though, so if you don't create a memory buffer (which is what mbuffer exists for, the network I/O is just an added benefit), it can take longer than is strictly necessary.
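A minimal sketch of that transfer; the dataset, snapshot name, host, and port are placeholders:
code:
# on the receiving NAS: listen on a port, buffer in RAM, receive without mounting
mbuffer -s 128k -m 1G -I 9090 | zfs receive -u tank/media
# on the sending NAS: stream the snapshot through a matching buffer over the wire
zfs send fletch_vdev/media@migrate | mbuffer -s 128k -m 1G -O newnas:9090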

fletcher posted:

I had just about completed that part on the old NAS when I noticed it started going very very slow. Sure enough:
code:
fletchn40l: mp3 # zpool status
  pool: fletch_vdev
 state: ONLINE
  scan: scrub in progress since Tue Jun 16 04:00:01 2020
        13.6T scanned out of 22.2T at 101M/s, 24h56m to go
        213M repaired, 61.26% done
config:

        NAME          STATE     READ WRITE CKSUM
        fletch_vdev   ONLINE       0     0     0
          raidz2-0    ONLINE       0     0     0
            ada0.nop  ONLINE       0     0     0
            ada1.nop  ONLINE       0     0     0
            ada2.nop  ONLINE       0     0     0  (repairing)
            ada3.nop  ONLINE       0     0     0
            ada4.nop  ONLINE       0     0     0

errors: No known data errors
smartctl not looking so good on that drive now either:
code:
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   169   169   051    Pre-fail  Always       -       11039
  3 Spin_Up_Time            0x0027   195   194   021    Pre-fail  Always       -       9233
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       45
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   049   049   000    Old_age   Always       -       37932
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       45
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       4
193 Load_Cycle_Count        0x0032   181   181   000    Old_age   Always       -       58108
194 Temperature_Celsius     0x0022   116   110   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       29
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
I've got cloud backups so I'm not too worried about data loss. I was planning on letting the repair finish and then see how it looks, then start copying data over to the new NAS.
That's a big yikes.
I've finally got a better solution for local backup now, but the 78000 hours on my server's disks still make me wish I could afford to replace them before they die. :ohdear:

fletcher
Jun 27, 2003


D. Ebdrup posted:

Depending on how you access your shares, you'll need to be very careful about ACLs. Is the dataset only going to be accessed by Windows, now or in the future? If so, you can set the share type to SMB instead, and Samba will use the winacl and smbcacls commands instead of chown and chmod. On the other hand, it means that you won't be able to use chmod and chown if you're accessing the dataset from the FreeBSD-like CLI, or if you suddenly decide you want to use NFSv4 sharing in Windows (which it supports, and which at least for me runs better than SMB).

Interesting! I didn't know that Windows supported NFSv4. It sounds like Windows doesn't have an official NFSv4 client though, only server. Are you using the client from University of Michigan? I am only planning on accessing these datasets from Windows machines both now and in the future, so I was planning on using the SMB share type on the new NAS.

D. Ebdrup posted:

Another benefit of datasets is that if you have one for your home movies or your own music, or stuff that's otherwise compressed, you can disable in-line compression on those datasets - but you don't have to, since lz4 has an early-abort feature whereby if compression doesn't reach 17% it'll stop trying.

I didn't really consider the compression setting; thinking about my data, maybe it makes sense to disable compression on all my datasets.

D. Ebdrup posted:

I would recommend using mbuffer -O and -I since it does exactly the same as netcat (ie. send data over the wire, unencrypted) but also acts as a buffer.
The reason using a buffer is a good idea is that ZFS send will first compute what needs to be sent over the wire (and try to arrange the data sequentially, if possible), and then send it - each of these steps is serialized rather than parallelized, though, so if you don't create a memory buffer (which is what mbuffer exists for, the network I/O is just an added benefit), it can take longer than is strictly necessary.

Good to know! I guess one of the advantages of netcat is that it was already available on both machines. I'll see if there's a way to get it on my ancient version of NAS4Free.

D. Ebdrup posted:

That's a big yikes.
I've finally got a better solution for local backup now, but the 78000 hours on my server's disks still make me wish I could afford to replace them before they die. :ohdear:

"scrub repaired 213M in 47h1m with 0 errors" hopefully it makes it through the big transfers! Thanks for the tips D. Edbdrup.

Munkeymon
Aug 14, 2003




D. Ebdrup posted:

Ever since filesystems started implementing hierarchies of folders in the 70s, there's been very little distinction between a folder and a filesystem.
It disappeared completely with ZFS, because 'zfs create' and 'mkdir' both accept -p, and 'zfs create' also lets you specify ZFS-specific options via -o parameter=value, the way 'mount' and 'newfs' do.

OK, but a folder isn't automatically a new dataset as far as I can tell, and "alias mkdir to 'zfs create'" is a neat idea but not something that's going to occur to most people, including me! Probably not hard to replace a directory hierarchy with a bunch of hived-off datasets after the fact, but still not the obvious thing to do when you're a home gamerserver janitor.

fletcher posted:

Interesting! I didn't know that Windows supported NFSv4. It sounds like Windows doesn't have an official NFSv4 client though, only server. Are you using the client from University of Michigan? I am only planning on accessing these datasets from Windows machines both now and in the future, so I was planning on using the SMB share type on the new NAS.

Could've sworn Explorer became a client after you installed the feature but now I can't find the feature to install since they dumbed down the UI 😑

Munkeymon fucked around with this message at 16:43 on Jun 18, 2020


Less Fat Luke
May 23, 2003


Munkeymon posted:

OK, but a folder isn't automatically a new dataset as far as I can tell, and "alias mkdir to 'zfs create'" is a neat idea but not something that's going to occur to most people, including me! Probably not hard to replace a directory hierarchy with a bunch of hived-off datasets after the fact, but still not the obvious thing to do when you're a home gamerserver janitor.
As long as moving files between datasets is a full read, write, and delete, I wouldn't go that route - it's way too slow.

  • Reply