Yaoi Gagarin
Feb 20, 2014

Eletriarnation posted:

Maybe in a mirror, but as far as I understand it the performance characteristics of distributed-parity topologies like RAID-5/6/Z* have more in common with striped arrays. You of course have a lot more CPU overhead, and at any given time some subset of your disks is reading/writing parity blocks that don't contribute to your final application-available bandwidth. Still, modern CPUs are fast, so that's not much of a bottleneck compared to the HDDs themselves, and you can absolutely get very fast numbers for sustained, sequential transfers.

Ah, neat. In that case I second what Wibla said. Make a raidz1 or raidz2 of your drives and you're good


Yaoi Gagarin
Feb 20, 2014

You could also run whatever OS you want in a VM on top of truenas. It is a little more configuration effort than a second box because you have to set up a network bridge, but after that it'll be the same.

Yaoi Gagarin
Feb 20, 2014

You're also committing to only ever using that pool on machines with that much RAM. What if you're poorer in the future?

Yaoi Gagarin
Feb 20, 2014

I want to read some data from a bunch of snapshots.

A while ago I was playing around with truenas and I set up a "backups" dataset with a recurring snapshot task. The idea was that I could simply use it as a dumb mirror of my other computer's files via rsync or syncthing or robocopy, and rely on the snapshots to provide versioning. I never got around to actually using it, though.
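Roughly the kind of one-way mirror I mean - hostname and paths here are made up for illustration, run from the other computer:
code:
rsync -a --delete /home/me/Documents/ truenas:/mnt/tank/backups/desktop/
The -a flag preserves permissions and timestamps, and --delete keeps the destination an exact mirror, which is exactly why you'd want the snapshots underneath it for versioning.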

Now I'm trying to use this truenas machine again, and I see that those backup snapshots are holding on to 30 GB:

code:
tank/backups@auto-2022-05-29_00-00                     36.3M      -     11.6G  -
tank/backups@auto-2022-06-05_00-00                      660K      -     17.2G  -
tank/backups@auto-2022-06-12_00-00                      240K      -     17.2G  -
tank/backups@auto-2022-06-19_00-00                     41.9M      -     19.1G  -
tank/backups@auto-2022-06-26_00-00                     36.0M      -     19.1G  -
tank/backups@auto-2022-07-03_00-00                     38.4M      -     19.1G  -
tank/backups@auto-2022-07-10_00-00                     58.5M      -     26.5G  -
tank/backups@auto-2022-07-17_00-00                     6.81M      -     28.1G  -
tank/backups@auto-2022-07-24_00-00                      220K      -     28.1G  -
tank/backups@auto-2022-07-31_00-00                      204K      -     28.1G  -
tank/backups@auto-2022-08-07_00-00                     86.6M      -     28.4G  -
tank/backups@auto-2022-08-14_00-00                     97.8M      -     28.5G  -
tank/backups@auto-2022-08-21_00-00                      131M      -     28.8G  -
tank/backups@auto-2022-08-28_00-00                      308K      -     28.7G  -
tank/backups@auto-2022-09-25_00-00                        0B      -     28.7G  -
tank/backups@auto-2022-10-02_00-00                        0B      -     28.7G  -
tank/backups@auto-2022-10-09_00-00                        0B      -     28.7G  -
tank/backups@auto-2022-10-16_00-00                        0B      -     28.7G  -
tank/backups@auto-2022-10-23_00-00                        0B      -     28.7G  -
tank/backups@auto-2022-10-30_00-00                        0B      -     28.7G  -
tank/backups@auto-2022-12-18_00-00                        0B      -     28.7G  -
tank/backups@auto-2022-12-25_00-00                        0B      -     28.7G  -
tank/backups@auto-2023-01-01_00-00                        0B      -     28.7G  -
tank/backups@auto-2023-01-08_00-00                        0B      -     28.7G  -
tank/backups@auto-2023-01-12_00-00                        0B      -     28.7G  -
tank/backups@auto-2023-01-13_00-00                        0B      -     28.7G  -
tank/backups@auto-2023-01-14_00-00                        0B      -     28.7G  -
tank/backups@auto-2023-01-15_00-00                        0B      -     28.7G  -
tank/backups@auto-2023-01-16_00-00                        0B      -     28.7G  -
tank/backups@auto-2023-01-17_00-00                        0B      -     28.7G  -
tank/backups@auto-2023-01-18_00-00                        0B      -     28.7G  -
tank/backups@auto-2023-01-19_00-00                        0B      -     28.7G  -
tank/backups@auto-2023-05-11_21-30                        0B      -     28.7G  -
tank/backups@auto-2023-05-11_21-45                        0B      -     28.7G  -
I want to figure out what's using that space and make sure it's not important before I blow all this away and try again. But tank/backups has nothing in it besides 4 other datasets, which are themselves also empty. So what do I do here?

E: I figured it out! /mnt/tank/backups/.zfs/snapshot has them. ls -a will not show .zfs/ even though it exists (it stays hidden unless the dataset's snapdir property is set to visible).

What confused me is that USED for tank/backups is at 30G even though the dataset itself is empty, while the snapshots have low USED numbers even though I can see the files in them. I would have expected the USED number on the snapshots to account for the data, since it was deleted from tank/backups itself.
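For anyone else hitting this: a snapshot's USED column only counts space unique to that one snapshot, so data still referenced by several snapshots isn't charged to any of them individually - it shows up in the dataset's usedbysnapshots rollup instead. A quick way to see the breakdown (dataset name as in my pool above):
code:
zfs list -o space tank/backups
zfs get usedbydataset,usedbysnapshots,usedbychildren tank/backups
The USEDSNAP / usedbysnapshots number should account for most of that 30G.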

Yaoi Gagarin fucked around with this message at 02:19 on May 13, 2023

Yaoi Gagarin
Feb 20, 2014

Moey posted:

Seems like the migration from TrueNAS Core to Scale is pretty straightforward. Thinking about giving it a rip in the next few days.

Anyone done this recently?

Lotta Linux isos, no full backup, not worth it for this data (4x8tb RAID-Z and 4x14tb RAID-Z, single pool). All replaceable, just would be very very annoying.

I have not come across any horror stories about the process, so I'm feeling alright about it.

TrueNAS (13.0-U4) is a VM running on ESXi 7.0, with PCIe passthrough for the LSI HBA (9207-8i). Zero issues for the past few years with this setup.

I have done it twice in the last few days since I've been experimenting with different setups. Both times it went through flawlessly. Even kept my shares configured. Go for it imo

Yaoi Gagarin
Feb 20, 2014

BlankSystemDaemon posted:

On a traditional filesystem, fragmentation happens because whenever you do a partial overwrite to an existing file, that file gets written in full in a new place on the physical platters.
Defragmentation helps with this by laying everything out so the data is contiguous; the data also benefits from sequential I/O patterns and from being placed on the outer part of the disk, where the linear velocity is highest.

ZFS, being copy-on-write, will instead write those partial writes as a delta of changed bits to a subsequent record, and leave the data unchanged.
This has the side-effect that when you delete files, you typically end up with much larger chunks of contiguous free space once the records are finally freed after they're no longer referenced (an asynchronous task done in the background by a ZFS kernel thread, not part of the unlink(2) call). It should also be said that before spacemaps, this was one of the things that kept ZFS from being very performant on a filer with a lot of churn - but thankfully, the last of those issues were solved by spacemap v2.

Mind you, it's not perfect - you can get into scenarios where compression means that a record which would've fit neatly into where a previous record sat instead leaves a tiny bit of space in between - but in that instance, it's typically a matter of a single byte or at most a few.
But it's still a damned sight better than traditional filesystems with fragmentation - and for us packrats who don't know the meaning of deletions, it's a non-issue.

Incidentally, if you're using FreeBSD, you can spot the difference between the outer and inner part of the platter by doing diskinfo -tv [/path/to]<device>.
Here's an example from one of the disks in my always-on server:
pre:
/dev/ada0
        512             # sectorsize
        6001175126016   # mediasize in bytes (5.5T)
        11721045168     # mediasize in sectors
        4096            # stripesize
        0               # stripeoffset
        11628021        # Cylinders according to firmware.
        16              # Heads according to firmware.
        63              # Sectors according to firmware.
        WDC WD60EFRX-68L0BN1    # Disk descr.
        WD-WXQ1H26U79HX # Disk ident.
        ahcich0         # Attachment
        id1,enc@n3061686369656d30/type@0/slot@1/elmdesc@Slot_00 # Physical path
        No              # TRIM/UNMAP support
        5700            # Rotation rate in RPM
        Not_Zoned       # Zone Mode

Transfer rates:
        outside:       102400 kbytes in   0.581062 sec =   176229 kbytes/sec
        middle:        102400 kbytes in   0.697880 sec =   146730 kbytes/sec
        inside:        102400 kbytes in   1.115892 sec =    91765 kbytes/sec
As you can see, it makes a pretty big difference.

Also, I've no idea when it happened, but at some point enclosure physical path support was added to the HPE Smart Array S100i SR Gen10 SATA controller.

I don't think this is true. Most filesystems just mutate the file in place. Not doing that is what makes COW, COW.

Yaoi Gagarin
Feb 20, 2014

In ext* if you grow the file and there isn't enough room after it, a new extent gets allocated somewhere else for that part of the file. So you can get fragmentation within the file itself, not just in the free space. On spinny disks this is of course bad because now you need to move the head twice to read the entire file.

On SSDs I don't think it matters at all
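If you want to see this for yourself on ext4, filefrag from e2fsprogs prints a file's extent map (the path is just a placeholder):
code:
filefrag -v /path/to/some-large-file
It reports how many extents the file is split into; more than a handful usually means the file is scattered across non-contiguous regions of the disk.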

Yaoi Gagarin
Feb 20, 2014

the user inside the container isn't uid 0, and the container itself has limited access to stuff, so i don't think it's any more vulnerable than just running a normal process as a normal user
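if you want to check rather than take my word for it, something like this shows the mapping (container name is just an example):
code:
podman exec sabnzbd id -u          # uid as seen inside the container
podman top sabnzbd user huser      # container-side user vs. the host user it maps to
if the huser column is your unprivileged account, a breakout lands with that account's permissions, not root's.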

Yaoi Gagarin
Feb 20, 2014

Windows 98 posted:



mega pool achieved

My dude I think you should make each vdev raidz2. You don't want to sit there restoring all that from backup if you get a double failure

Yaoi Gagarin
Feb 20, 2014

Just don't turn on dedupe or L2ARC

Yaoi Gagarin
Feb 20, 2014

The Meshify 2 XL is basically the Define 7 XL with a breathable front panel. I don't know if anyone's actually tested thermals with all 18 drives populated, though. I would probably upgrade to high static pressure fans if you fill it, and maybe rig up some cardboard so that the air is forced to go between the drives instead of escaping through the side

Yaoi Gagarin
Feb 20, 2014

Btrfs can do arbitrary disk mixing, but it has the write hole problem :whitewater:

Yaoi Gagarin
Feb 20, 2014

maybe bcachefs will be the one

Yaoi Gagarin
Feb 20, 2014

Worth noting that the speed on a hard drive also depends on how far from the center the data is

Yaoi Gagarin
Feb 20, 2014

Nitrousoxide posted:

Personally, I have a vm that I have docker or podman installed on and I do the hosting in those docker/podman containers. It's still the same container workflow, but you get the benefits of easily backing up the whole vm and individual containers, which gives you a ton of flexibility on how you can restore stuff if (when) it goes wrong. A container blows up after you pull a new image and the config is hosed? Roll back just that container with your backup solution. You do a major version upgrade on the docker host and it fucks up the containerd engine? Roll that back from the VM interface in proxmox. You get new hardware because your server is old or failing? Just add it as a member of the proxmox cluster, migrate the VMs over to it, and start them back up. Want to experiment with a service or a new way to architect your homelab? Spin up some VMs with Proxmox, play around, and nuke them if they don't serve your purpose.

I'm never going back to having the server run on the bare metal again.

I tried this for my sonarr+sabnzbd setup but I had a lot of trouble with permissions errors. I had truenas core hosting both the data and a VM running Fedora Server, and then I tried to run the linuxcontainer.io images in podman.

I'm sure I could have figured it out eventually, but I gave up because there are so many layers that could be configured wrong: ZFS permissions, NFS share permissions, NFS mount permissions, permissions inside the container. A lot of effort.

Now I'm using Scale and while I don't love TrueCharts, at least it wasn't that hard to set it up

Yaoi Gagarin
Feb 20, 2014

Sub Rosa posted:

I just built a ten* 3.5" disk NAS in a Fractal Node 804. Plus one NVMe drive, and I already ran SATA cables to where I can add two more 2.5" drives later.


As someone with an 804: how the gently caress did you fit all that in

Yaoi Gagarin
Feb 20, 2014

Speaking of ashift - what is a good value for an SSD?

Yaoi Gagarin
Feb 20, 2014

raidz expansion is in upstream zfs so truenas will eventually get that too

Yaoi Gagarin
Feb 20, 2014

does having 2 disks of redundancy solve that or do you need 3?

Yaoi Gagarin
Feb 20, 2014

Perhaps someone can recommend a case for me? I've done a lot of googling but cannot seem to find anything that meets all these requirements:
1. supports at least micro-ATX motherboards
2. has 8 3.5" hot swap bays
3. can use a normal ATX power supply
4. can keep the 8 hard drives at safe temperature
5. quiet
6. not a heavy rackmount box

The closest thing I have found is the Silverstone CS381, but I see lots of people on the internet saying their drives run hot in that thing. They also make a CS382, which has fans directly on the drive cage, but they are half blocked by its SAS backplane. I think the latter is new enough that I can't find any info about how hot it runs; maybe half-blocked fans are OK?

Yaoi Gagarin
Feb 20, 2014

fletcher posted:

I was on a quest for this as well, you can see my posts about the CS381 in this thread. I gave up on it though, the drives were running too hot for my tastes. Ended up getting a Node 804 - silent and drive temps are great. Just had to give up hot-swap - do you really need it?

I want to stick my NAS in a rack now, but I'm holding out for the Sliger CX3750. They teased me with a drawing a year ago in an email; still no word on a release date though!



funny, I have a node 804 right now and I feel like it's such a pain to work with. drives are mounted upside down hanging from the top so you have to plug in cables from the bottom. I only have two 3.5" drives right now so it's not a huge deal, but even so I haven't been able to put them in adjacent slots because then my power supply cable bunches up. And I've got 3 2.5" drives just floating around too.


Maybe on the weekend I'll tear everything out and see if I can find a neater way to route cables. As it is, I can't see myself cramming 8 drives into it even though it can theoretically handle that many.

Yaoi Gagarin
Feb 20, 2014

Stux posted:

looking to put together a nas, largely for plex with some random storage on the side, and need some help picking a raid setup and a sanity check on the cpu ive been looking at.

im looking at cases with capacity for 10~ 3.5" drives, but i really want to be able to add smaller sets at a time rather than filling up the entire thing in one go, both because im not really sure how much space im going to end up needing until i start using it, and for cost both on the upfront and if i ever need to upgrade drives later. i think i have basically three options: 3 drive raidz1 vdevs, 5 drive raidz2 vdevs, or straight up mirrored pairs. this is how i understand these three but idk if ive got everything completely correct.

5 drive raidz2: will have to buy 5 drives to build the system, 5 to add a second vdev, then 5 for any upgrades, but will have 2-4 drives of failure tolerance. 60% space efficiency

3 drive raidz1: 3 drives to build, then can add another two sets of 3, upgrades will be 3 drives. 1-3 drives of failure but much more precarious. 66% efficiency, but because i would only be able to fill 9 bays would end up with the same overall space as the raidz2 setup unless i go for a case that can take 12 drives.

mirrored pairs: always just adding and upgrading 2 drives at a time. 1-5 drives of failure, statistically better redundancy but a very small chance of losing both halves of a mirror. 50% of the storage i buy.

i feel like the raidz2 setup is the technical best in the trade off of long term cost efficiency vs not having the entire thing die from drive failures, but the costs are heavily front loaded and buying 5 drives to find out i need more space and have to buy another 5 drives immediately isnt very appealing. the raidz1 setup feels like the best trade off in that regard as 3 drives are easier to stomach, but even though the data isnt going to be very important it still feels like its uhh just going to die at some point to bad luck? lol. the mirror setup feels like extreme overkill on the side of data redundancy and the long term cost, but being able to add pairs and be flexible w scaling up the storage is appealing in the short term. that and as far as i understand the resilvering is a lot quicker and wont have me worrying about a cascade of drive failures as much. but im not sure if im looking at this wrong or missing a better option, like is raidz the wrong choice here, would just regular raid 5 or 6 be a better fit?

as for cpu i was looking at the intel n100 as it has quicksync for transcoding, but also has a super low tdp (euro power prices) and the cooling requirements are extremely low. but im not sure exactly how well its going to deal with stuff like plex pushing subtitles etc or if ive missed something thats going to cause issues with it. ive seen that i would need to ensure i have an up to date kernel for it to work, but im not settled on what im going to use yet and so im not sure how much of a pita thats going to be with various distros


supposedly raidz1 and raid5 are both a bad idea on modern drives because resilvering takes so long that another drive might die in the process.

if you have backups of this data, you can start with a mirror, and when it's full buy 3 more drives, destroy the mirror, and put all 5 into raidz2.

also, openzfs added raidz expansion a few months ago; once distros pick that up you can grow your vdev one drive at a time until you fill your case.
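rough sketch of that path, with made-up pool and device names - treat it as an outline, not exact commands for your setup:
code:
zpool create tank mirror /dev/sda /dev/sdb    # start with a 2-drive mirror
# later, once backups exist elsewhere and 3 more drives arrive:
zpool destroy tank
zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde
# and once raidz expansion is in your distro's openzfs, growing by one disk is just:
zpool attach tank raidz2-0 /dev/sdf
the vdev name (raidz2-0 here) comes from zpool status.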

Yaoi Gagarin
Feb 20, 2014

wait gently caress they're killing podcasts I actually use that

Yaoi Gagarin
Feb 20, 2014

Shumagorath posted:

The last file system converter I remember was Windows 2K/XP FAT32 -> NTFS. Are they going to provide an offramp?

just copy the files to your new storage pool?

Yaoi Gagarin
Feb 20, 2014

Can't hardlink across filesystems, but you could symlink, or use a bind mount.

But if you want to "merge" the pools - why even make a second pool? You could just add those drives as a new vdev in the original pool.
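For the symlink or bind-mount option, something like this (paths invented for the example):
code:
ln -s /mnt/pool2/media /mnt/pool1/media            # symlink pointing into the second pool
mount --bind /mnt/pool2/media /mnt/pool1/media     # or a bind mount; add it to fstab to survive reboots
Note that some apps and NFS/SMB shares treat symlinks and bind mounts differently, so test with whatever is actually going to consume the path.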

Yaoi Gagarin
Feb 20, 2014

Would you mind pasting the output of `zpool status -v` here?

Yaoi Gagarin
Feb 20, 2014

mekyabetsu posted:

I read about this, but I don't understand why it's a problem. I mean, obviously more data is going to be written to the larger drives because they're... bigger.

Basically, if you care a lot about throughput and IOPS you want writes spread as evenly as possible among the vdevs. For example, if you need to write 10 GB and you have 4 vdevs, the fastest outcome is for each vdev to take 2.5 GB so they all finish at the same time. However, ZFS tries to balance the percentage used, so bigger vdevs get more writes, as do new, empty vdevs. That means right after you add a new one, most of that 10 GB write goes to the new vdev, which is slower because you aren't writing in parallel.

For home use this is not a problem.
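If you're curious how lopsided your pool actually is, per-vdev allocation is visible with (pool name is an example):
code:
zpool list -v tank
The per-vdev CAP column shows the imbalance that new writes get steered toward evening out.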

Yaoi Gagarin
Feb 20, 2014

aside from a weekly scrub you should set up timed snapshots. there's scripts floating around that handle all the retention logic, like keeping X daily snapshots and Y monthly snapshots.
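the core of what those scripts do is roughly this (dataset name and the 14-day retention are made up; the real tools also handle hourly/monthly tiers and edge cases):
code:
zfs snapshot tank/data@daily-$(date +%Y-%m-%d)
zfs list -H -t snapshot -o name -s creation -d 1 tank/data \
    | grep '@daily-' | head -n -14 | xargs -r -n1 zfs destroy
that second command lists the dataset's snapshots oldest-first and destroys all but the newest 14 daily ones (head -n -14 is GNU head).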

Yaoi Gagarin
Feb 20, 2014

i'd just wait for the correct heatsink, otherwise you'll have to clean off the thermal goop and reapply it

Yaoi Gagarin
Feb 20, 2014

afaik it is not a mirror of the metadata; when you make a special vdev, it holds all the metadata.


Yaoi Gagarin
Feb 20, 2014

It should probably throw a bunch of warnings at you if you try to make a special vdev without redundancy.
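Which is also why the usual advice is to give the special vdev the same level of redundancy as the rest of the pool, e.g. (pool and device names are examples):
code:
zpool add tank special mirror /dev/nvme0n1 /dev/nvme1n1
Lose an unmirrored special vdev and the whole pool goes with it.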
