Yaoi Gagarin
Feb 20, 2014

BlankSystemDaemon posted:

Where are you going to page to, if not a swap device?
Are you talking about the user being able to prioritize things? If so, FreeBSD accomplishes this with the sysctls in the vm.swap_idle_* OID and vm.disable_swapspace_pageouts.

Paging's most important purpose is to allow translation between virtual and physical addresses


Yaoi Gagarin
Feb 20, 2014

BlankSystemDaemon posted:

No, paging is the act of moving resident memory around - translation is handled by the page table/map (and hardware MMU, if available), which is another part of the VM subsystem that paging is also a part of.

In FreeBSD nomenclature (because that's what I know) it's the difference between src/sys/vm/vm_page.c which is the resident memory management module (and src/sys/vm/vm_pageout.c which handles swapping to disk, specifically) and, as an example, src/sys/amd64/amd64/pmap.c which is the pagemap for amd64.
The reason they live in different parts of the tree is that FreeBSD uses a machine-dependent/machine-independent code separation, which it inherited from BSD, to try and keep code duplication to a minimum.

You know what, you're right. Paging does specifically refer to swapping. My bad :eng99:.

But even in the absence of a swap device, the OS can map parts of files into the fs cache, or into any VMA a process asks it to. My point was that swap is good because it gives the OS the option to use physical memory in a more useful way, even when the working set does not exceed physical memory capacity.
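For what it's worth, here's a minimal sketch of what I mean by mapping a file in (Python's mmap wrapper around mmap(2); the filename is made up and the file has to already exist and be non-empty):

```python
import mmap

# Map an existing file into this process's address space. The kernel pages the
# contents in from the file on demand, and clean pages can be evicted without
# touching swap, because the file itself is the backing store.
with open("example.dat", "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as m:   # length 0 = map the whole file
        head = m[:16]                     # touching this can trigger a page-in
        m[0:4] = b"ZFS!"                  # dirty pages get written back to the file
```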

Yaoi Gagarin
Feb 20, 2014

BlankSystemDaemon posted:

I swear, I didn't know this was going to happen, but following the conversation VostokProgram and I had on paging, Mark Johnson - probably one of the smartest people in the FreeBSD project - wrote an article on how FreeBSD handles swap.

Good article! Makes me wonder if TrueNAS enables swap.

Yaoi Gagarin
Feb 20, 2014

BlankSystemDaemon posted:

So if it's bit-level parity, doesn't that mean that there's an implication that if you have, say, 4 drives of 10TB each, your parity drive should be 40TB?


I don't think so? It only needs to be as large as the largest drive. XOR bit i from every drive, write to bit i of the parity drive.
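A toy sketch of that math (untested, and byte-wise rather than bit-wise just for readability):

```python
from functools import reduce

# Toy illustration of single-parity XOR. The parity "drive" is the bitwise XOR
# of the data drives, so it only needs to be as large as the largest data
# drive, not the sum of them.
data_drives = [
    bytes([0b10110010, 0b00001111]),  # drive 0
    bytes([0b01100001, 0b11110000]),  # drive 1
    bytes([0b11111111, 0b10101010]),  # drive 2
    bytes([0b00000000, 0b01010101]),  # drive 3
]

# Parity byte i = XOR of byte i across all data drives.
parity = bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_drives))

# Reconstruct a lost drive by XORing the parity with the surviving drives.
lost = 2
survivors = [d for i, d in enumerate(data_drives) if i != lost] + [parity]
rebuilt = bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*survivors))
assert rebuilt == data_drives[lost]
```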

Yaoi Gagarin
Feb 20, 2014

Scuttle_SE posted:

Not that much...maybe 150-200TB

What hardware and software do you use?

Yaoi Gagarin
Feb 20, 2014

Paul MaudDib posted:

so it's to store the metadata and act kind of like a WAL/SLOG then? What's the difference between an allocation class device and a SLOG then?

SLOG is only used for synchronous writes, e.g. through a file descriptor opened with O_SYNC. It's not a general-purpose write cache. Non-synchronous writes are cached in memory.

IIRC special device isn't even really a cache for metadata, it's actually just the primary storage for it?
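To make the distinction concrete, a rough sketch of the two kinds of writes (paths are made up; on ZFS the first one is what goes through the ZIL, and therefore the SLOG if you have one):

```python
import os

# Synchronous write: O_SYNC makes write() block until the data is on stable
# storage, which is exactly the case the ZIL/SLOG exists to make fast.
fd = os.open("/tank/db/journal", os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o644)
os.write(fd, b"commit record\n")
os.close(fd)

# Ordinary write: returns once the data is in the in-memory dirty buffer and
# only reaches disk when the next transaction group is flushed.
fd = os.open("/tank/media/file.bin", os.O_WRONLY | os.O_CREAT, 0o644)
os.write(fd, b"bulk data\n")
os.close(fd)
```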

Yaoi Gagarin
Feb 20, 2014

BlankSystemDaemon posted:

Writes aren't the big issue with DM-SMR drives, in so far as while they're worse than PMR drives they're not that much worse.
The issue with DM-SMR drives is when it comes to any kind of I/O that isn't strictly sequential - and even then, PMR drives are better at sequential I/O than SMR.

There is, in theory, one use-case where DM-SMR drives would make sense, if DM-SMR was used to make the biggest capacity drives:
Using the drives as a sequential access tape drive, where you write the standard I/O stream from zfs send to the character device itself, and then either use zfs receive or zfs corrective receive (once that lands) to restore data seems like it'd get you better-than-tape density for the biggest drives, all the while making use of the only workload where DM-SMR has any chance.

If we had HM-SMR, things might look different - but even then it'd require a pretty substantial amount of code-addition to ZFS, the various device drivers, I/O schedulers (such as in CAM on FreeBSD), and other places, for it to make sense.

Here's a question - let's say you do this zfs send/receive once, fully overwriting the drive. Then you go to do it again - won't the drive go crazy trying to rewrite all the shingles, because it still thinks that is useful information?

Yaoi Gagarin
Feb 20, 2014

BlankSystemDaemon posted:

The entire point of WORM is to write once and read many. If you end up overwriting anything, you're not using it as WORM media.



I mean, there's WORM, and then there's "mostly" WORM. I can't imagine a use case where you would literally only write to a drive once in its entire life.

Yaoi Gagarin
Feb 20, 2014

BlankSystemDaemon posted:

My point is, that's all DM-SMR is good for - and given that tape comes out to about 110 USD for 12TB, the price difference isn't as big as you'd expect.

If the drives had some factory-reset command so you could tell them "everything on this is useless, pretend you never wrote anything at all", they would be so, so much more useful :(

Yaoi Gagarin
Feb 20, 2014

Why does a duplicate picture finder need to run in a Docker container...?

Yaoi Gagarin
Feb 20, 2014

Legends tell of a forbidden spell, "unlink(2)", but no one has used such power in millennia...

Yaoi Gagarin
Feb 20, 2014

BlankSystemDaemon posted:

I'm not sure I've ever been responsible for the name of a thread before.

The Separate Intent Log only records synchronous writes - so unless you're dealing with databases, a short list of other userspace programs doing (A|F|O)_SYNC, or doing administrative tasks, there is absolutely no need to have one.
You also need it to be mirrored, because if the SLOG drive disappears, so does any data that was on it prior to being flushed to disk.

The ZIL exists to replay lost transaction groups in case of a power outage, crash, or failure modes that aren't catastrophic enough to take down the entire array. While the system is in normal operational mode, the ZIL isn't used at all.
You might be thinking of the dirty data buffer that ZFS has, which is a 5 second/1GB buffer (or until an administrative command is issued, since that triggers an automatic flush) where data is stored until it gets flushed to disk as a transaction group.

AFAIK the ZIL always exists, it's just that if no SLOG device is provided it is stored on the pool with everything else.


Twerk from Home posted:

If I wanted a fast NAS and was willing to splash for a couple terabytes of all flash, what's a sane way to do that?

Is ZFS RAIDZ going to be a huge bottleneck for NVMe disks? Do SSDs fail so rarely that people just span them together with LVM or run RAID0? Are SATA disks still enough cheaper to do 2.5" SATA SSDs instead of NVMe?

Whether you want redundancy, or just a striped/spanned config, really just comes down to: how painful is restoring in your backup scheme?

Redundant option: Since you want fast, and only "a couple terabytes", IMO a striped ZFS mirror with 4x 1 TB NVMe drives is the way to go. Meaning, 2 mirror vdevs, each with 2 of the drives. And then you've got no need for an SLOG even if you're running a high-speed database, because your pool is already fast.

Non-redundant option: Buy 2x 1 TB NVMe drives, stripe them in ZFS, and just rely on restoring from backup. In this scenario your first-tier "backup" can even just be 2x spinning-rust drives in a mirror within the same machine, on a second pool you never use directly. Then a cron job backs up the SSD pool to the HDD pool periodically using ZFS send/recv.

Obviously in either scenario you have more backups like an offsite one or w/e depending on how valuable this data is.
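For the non-redundant option, that cron job could be as simple as this sketch (pool and snapshot names are placeholders, and it does a full send for simplicity; a real script would track the previous snapshot and send incrementally with zfs send -i):

```python
import subprocess
import time

SRC = "fastpool/data"      # striped NVMe pool (placeholder name)
DST = "slowpool/backup"    # mirrored spinning-rust pool (placeholder name)

# Take a snapshot of the SSD pool and replicate it to the HDD pool.
snap = f"{SRC}@backup-{int(time.time())}"
subprocess.run(["zfs", "snapshot", snap], check=True)

send = subprocess.Popen(["zfs", "send", snap], stdout=subprocess.PIPE)
subprocess.run(["zfs", "recv", "-F", DST], stdin=send.stdout, check=True)
send.stdout.close()
send.wait()
```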

Yaoi Gagarin
Feb 20, 2014

Z the IVth posted:

Embarrassing confession time.

So after messing about with drive health checking etc, my system rebooted itself overnight and the problem with my un-interactable mystery files revealed itself. I think because the drive had come from another PC the files required administrator privileges to move, and prior to the reboot explorer decided not to show me the popups, making everything hang.

Now the popups appear, I click authorise and everything moves as it should.

There isn't a :blush: big enough to describe my utter failure as a goon.

its ok, windows just be like that sometimes

Yaoi Gagarin
Feb 20, 2014


Various thoughts in no particular order:

- i'm pretty sure you can't install normal 120mm fans in a 3U case, so you're going to be stuck with LOUD fans. a 4U is probably better from that perspective
- water cooling unnecessary assuming you have enough airflow going through this case
- UPS is a must have
- transfers between VMs over the hypervisor's network bridge are pretty much at RAM speed
- if your gaming machine is a VM on this server you'll have to look at GPU passthrough. doable but takes effort
- you'll definitely want all the VMs to be installed on SSD-based storage (especially the VM you use as a PC), but hard drives will be fine for bulk data
- RAID is not a backup, it exists purely to improve availability (uptime) so plan to have a proper backup for all this stuff
- consider having your main PC as a dedicated desktop anyway so you can still check your email and pay your bills online whenever this homelab inevitably explodes, figuratively or literally
- consider going AMD and getting a 5950X
- personally I would rather do all this on TrueNAS than unraid but that will take away your GPU passthrough option. there's also Proxmox as an option
- do NOT buy a single DIMM of RAM, buy a kit that populates your channels (which is 2 on most platforms)

Yaoi Gagarin
Feb 20, 2014

really weird rear end question but i figure this is also the unofficial data hoarder thread - anyone got tribute.avi?

Yaoi Gagarin
Feb 20, 2014

ive never used bsd but i also enjoy the bsd-posting

Yaoi Gagarin
Feb 20, 2014

BlankSystemDaemon posted:

A new feature named vdev properties just landed in OpenZFS.
Some people might remember me talking about it, because it makes it possible to correlate SES paths with disks used in ZFS, so that when you type zpool status it gives you the location information in a SAS enclosure without having to use GPT information or GEOM labels (or similar meta-data).

However, I also just learned today that another thing it's paving the way for is a feature called removal queueing - basically being able to remove one drive after another, simply by turning off allocations to a device temporarily.
What was also mentioned in that conversation is that that feature can be used as a way of more efficiently rebalancing pools with multiple vdevs, since the ability to turn off allocations is simply a property that can be toggled at runtime. So that's kinda neat.

Well, raidz expansion should land within the next year or two, if we assume that people are going to put in effort to review it.
Unfortunately there's never enough domain experts to do review for any opensource project, so there are no guarantees that can be made.

What you can do sort-of depends on how much allocated disk-space you need.
Let's say you've got 3x 3TB disks, which adds up to roughly 8TB of allocated space. If you buy 4x 8TB disks (the minimum size if you want to avoid SMR when shucking WDs) and put them into a RAIDz2 that should give you approximately double the amount of diskspace you have now, and still leave room to use 8 disks in RAIDz2 when you've eventually expanded your array.

I imagine you'd want to keep an eye out on ebay/craigslist/whatever's convenient and equivalent for you - the recent sales have probably made a lot of people upgrade, so that's likely the best bet.

Could you in theory write a script to rebalance a ZFS pool by disabling allocations on the fuller vdevs and then cp-ing a bunch of files around until the vdevs are mostly balanced?
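Something like this, maybe - a completely untested sketch, where the pool/vdev names are placeholders and the "allocating" vdev property name is a guess based on the description above:

```python
import os
import shutil
import subprocess

POOL = "tank"                  # placeholder pool name
FULL_VDEVS = ["raidz2-0"]      # vdevs considered "too full" (placeholders)
DATASET_MOUNT = "/tank/media"  # files to rewrite (placeholder)

# Step 1: stop new allocations on the fullest vdevs.
for vdev in FULL_VDEVS:
    subprocess.run(["zpool", "set", "allocating=off", POOL, vdev], check=True)

# Step 2: rewrite files in place (copy + rename), which forces their blocks to
# be reallocated on the vdevs that still accept allocations.
for root, _dirs, files in os.walk(DATASET_MOUNT):
    for name in files:
        path = os.path.join(root, name)
        tmp = path + ".rebalance.tmp"
        shutil.copy2(path, tmp)
        os.replace(tmp, path)

# Step 3: turn allocations back on.
for vdev in FULL_VDEVS:
    subprocess.run(["zpool", "set", "allocating=on", POOL, vdev], check=True)
```

It'd churn a lot of I/O rewriting everything, but in principle the new copies would land on the emptier vdevs.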

Yaoi Gagarin
Feb 20, 2014

Re: lack of apps on truenas - can't you install whatever you want into a jail?

Yaoi Gagarin
Feb 20, 2014

5436 posted:

That a good price?

Maybe you can swing a discount if you buy all 17?

Yaoi Gagarin
Feb 20, 2014

Arivia posted:



they've done a bunch of videos where they make a NAS server for iJustine or similar other youtubers.

It's extremely funny to me that there's a class of "tech" YouTubers, in comparison to whom Linus seems like a professional. Like, what the heck do those other people do?

Yaoi Gagarin
Feb 20, 2014

It's pretty normal to hard-code a lookup table or a blob like that; I don't think it's a smoking gun or anything. You just generate the .c file with an external script. Also, I'd rather have that in a .c file than a .h file, so you aren't relying on the linker to clean up multiple definitions.
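E.g. a generator script can be as small as this (the table here is just a made-up example, a CRC-8 table with polynomial 0x07):

```python
def crc8_entry(byte: int, poly: int = 0x07) -> int:
    """Compute one CRC-8 table entry for the given byte value."""
    crc = byte
    for _ in range(8):
        if crc & 0x80:
            crc = ((crc << 1) ^ poly) & 0xFF
        else:
            crc = (crc << 1) & 0xFF
    return crc

table = [crc8_entry(i) for i in range(256)]

# Emit a self-contained .c file so the table doesn't live in a header.
with open("crc8_table.c", "w") as out:
    out.write("/* Generated by gen_crc8_table.py - do not edit by hand. */\n")
    out.write("#include <stdint.h>\n\n")
    out.write("const uint8_t crc8_table[256] = {\n")
    for i in range(0, 256, 8):
        row = ", ".join(f"0x{v:02x}" for v in table[i:i + 8])
        out.write(f"    {row},\n")
    out.write("};\n")
```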

Yaoi Gagarin
Feb 20, 2014

Yeah, the difference is that if you were using ext4 you still would have ended up with hundreds of GB of bad data on disk, but you wouldn't even have the console logs, just the random crash.

Also, can't zfs be made to send an email or something when it detects a bad block? I'm pretty sure truenas has a feature like that

Yaoi Gagarin
Feb 20, 2014

We live in a strange time. You make more money by making products worse.

Yaoi Gagarin
Feb 20, 2014

Jellyfin has a bunch of apps too though? https://jellyfin.org/clients/

It's a Plex clone, so I don't see why it would be any harder for non-technical people to use.

Yaoi Gagarin
Feb 20, 2014

If you use striped mirrors you can expand your pool easily and you can even use different size drives as long as each mirrored pair is matched. It's very convenient

Yaoi Gagarin
Feb 20, 2014

BlankSystemDaemon posted:

It also means if two disks in a mirror fail, you lose your entire pool.

With a large enough number of mirrored vdevs, you end up with a lower MTTDL (mean time to data loss) than a single disk.

You should have backups!

Yaoi Gagarin
Feb 20, 2014

BlankSystemDaemon posted:

The ironic part is that the first time they tried to opensource Solaris was back in the 90s, but they couldn't because there was a shitload of drivers in Solaris written by second-party companies or subcontractors who they couldn't source release forms from.

Getting Pandora to come out of her box is not as easy as all that, and it's a marvel of evil that Larry Ellison managed to put her back in there once she got free.

nit: Pandora is the one who opens the box, she isn't in it

Yaoi Gagarin
Feb 20, 2014

Thanks Ants posted:

If you have guest Wi-Fi at work then get a cheapish tablet and use that to watch your Plex stuff. Doing VPNs on your work PC is just going to give them excuses if they decide they want you gone one day.

I think it would get them immediately fired, or at least land them in hot water. To the infosec people, won't it look like an employee doing industrial espionage or something?

Yaoi Gagarin
Feb 20, 2014

This whole space is just begging for someone to create a convenient, easy-to-use WireGuard endpoint in a box

Yaoi Gagarin
Feb 20, 2014

I'd probably stick all the internet-facing services in a VM or something. I'm pretty sure Docker does not provide any security guarantees

Yaoi Gagarin
Feb 20, 2014

Wild EEPROM posted:

I have truenas running on an old dell workstation with an E3 v3 xeon. it has 2 pools, each with 1 pair of hdds (2x 14tb mirror, 2x 8tb mirror)

After maxing out the ram, what's the next thing I can do to improve performance? it's most noticeable when I navigate to a network share with lots of small files in Finder, or when i try to find a specific video by opening them one after another.

l2arc ssd? zfs "special" vdev? slog ssd?

SSDs for a special vdev should help with the navigation part. I believe there's a setting to allow small files to live on it but by default it only holds metadata. Nothing will really help with opening a bunch of videos one by one though
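Roughly what that looks like, if it helps - the knob I'm thinking of is the special_small_blocks dataset property, and the pool/device/dataset names here are placeholders:

```python
import subprocess

POOL = "tank"  # placeholder names throughout

# Add a mirrored special allocation class vdev to hold metadata.
subprocess.run(
    ["zpool", "add", POOL, "special", "mirror", "/dev/ada4", "/dev/ada5"],
    check=True,
)

# Optionally let small file blocks (here, <=32K) land on the special vdev too;
# by default it only stores metadata.
subprocess.run(
    ["zfs", "set", "special_small_blocks=32K", f"{POOL}/share"],
    check=True,
)
```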

Yaoi Gagarin
Feb 20, 2014

IOwnCalculus posted:

Once you get to "servers sold to businesses", interoperability standards with things like power supplies go right back out the window. HPE doesn't care that you can't swap that PSU with a generic one, their entire concern for that server is that you either buy Official Spare HPE parts to repair it, or replace it with a More Better HPE Server when things do start breaking.

I think system vendors would love to go back to the days where everybody had their own ISA and their own flavor of Unix, but that business model is not viable, so instead we get almost-but-not-quite-interchangeable commodity hardware

Yaoi Gagarin
Feb 20, 2014

BlankSystemDaemon posted:

It's not exactly difficult to set up headscale, which lets you self-host a beacon for tailscale.

It's even available in FreeBSD Ports, so it can be installed in a jail.

I feel like we're getting close to the point where it'll be possible to buy a cheap device that runs a WireGuard endpoint, send it to your non-technical friends and family, tell them to plug it in, and have it automatically bridge to your network so everything is super easy

Yaoi Gagarin
Feb 20, 2014

IOwnCalculus posted:

It's this. If you're really that worried about the attack surface of Plex into your home network, set up Plex on its own VLAN, which is locked down to only access the internet and read-only access to your NAS.

At some point you have to trust whatever software you're exposing to the internet to be regularly updated and patched against known vulnerabilities, whether it's nginx, Wireguard, or Plex.

sure but your trust for nginx or wireguard should be about 100 times higher than plex

Yaoi Gagarin
Feb 20, 2014

Klyith posted:

Yeah btrfs is gonna be vastly easier for someone doing a "linux learning project", since it's a first-class citizen on linux. Depending on what distro and what install options, it may well be the out-of-the-box default.

And for a simple mirror setup does ZFS have super-compelling advantages?

As compared to btrfs, probably not. Compared to something like md, it does.

Yaoi Gagarin
Feb 20, 2014


Lol that's worse than I thought. That's such terrible UX

Yaoi Gagarin
Feb 20, 2014

With any sort of journaling filesystem a hard power off is unlikely to corrupt or lose your data. And with something like ZFS I think it would be fair to say it's impossible. Anything actually on the disk will be perfectly safe.

However, your OS will probably buffer a few seconds' worth of data in system RAM, and some SSDs will do that again in their own RAM. So there's a risk of corrupting data there if, for example, you had just edited and saved a file and the SSD was still writing the new blocks.
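That window is what fsync(2) is for, if an application cares - it flushes the OS buffers for that file and asks the drive to commit its cache too (assuming the drive honors flush commands). A minimal sketch, with a made-up path:

```python
import os

# Write a file and push it through the write-back caches before relying on it.
with open("/data/important.cfg", "w") as f:
    f.write("setting = value\n")
    f.flush()              # drain Python's userspace buffer into the kernel
    os.fsync(f.fileno())   # flush kernel buffers and request a drive cache flush
```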

Yaoi Gagarin
Feb 20, 2014

The term is "write-back cache".

Yaoi Gagarin
Feb 20, 2014

Jim Silly-Balls posted:

Its a 10GB SPF Direct cable

Just to be extremely clear: gigabits or gigabytes?


Yaoi Gagarin
Feb 20, 2014

I thought each vdev only gets the bandwidth of its slowest drive?

I was going to recommend ZFS with striped mirrors. You'll only get half the space, but if you really want to saturate the network it might be worth it
