|
Thanks Ants posted:Did nobody run the numbers on trading it in or is there a reason you're keeping it around?
|
# ? Mar 25, 2017 04:45 |
|
Walked posted:Sure; we're aiming for something 20-30TB for a very generic "virtualization" workload; no particular high-IO workloads, but significant VM usage. Clustered environment. We just had to look into a few cheap storage solutions for our HPC cluster, and Quantum QXS was the cheapest option that still worked for us feature-wise.
|
# ? Mar 27, 2017 19:29 |
|
Has anybody ever quoted out FreeNAS certified servers? I'm doing a little consulting with an academic lab who is looking to improve their setup from "~30TB of usable storage on a single RAID controller in a single SUSE server". They need at least 50TB of space, preferably more like 70TB. The budget is more like a Synology or QNAP filled with 8TB WD Reds than an all-flash NetApp. The only hard requirements are reasonable continuous read performance for fewer than 5 users at once, NFS and CIFS shares, and 10GbE. No need for high availability, and it will be backed up to tape regularly. It's a bioinformatics lab needing somewhere to centralize storage of research data. No home folders stored or high IOPS needed, but I would think that one of the certified FreeNAS boxes would give them a better long-term experience than the QNAP or Synology 12-bay NASes. Any other budget options that would give a reasonable experience?
|
# ? Mar 28, 2017 22:17 |
|
Twerk from Home posted:Has anybody ever quoted out FreeNAS certified servers? I'm doing a little consulting with an academic lab who is looking to improve their setup from "~30TB of usable storage on a single RAID controller in a single SUSE server". The claim is that they don't need high IOPS, but I've never known a quantitative biology group to estimate this correctly. Is this just archival storage for occasional data analysis, or is this actually supposed to be hooked up to a compute cluster?
|
# ? Mar 28, 2017 22:26 |
|
Vulture Culture posted:What kind of bioinformatics are they doing? How many files are on their filesystem? (It sounds like a ridiculous question, but I once supported a lab that routinely filled their filesystems with a billion tiny files doing de novo assembly.) Good question; the data I've seen involves fewer, much larger files. They do mostly GWAS and pedigree reconstruction, so the genomes are coming to them already assembled. Also, their worker machines have 384GB of RAM each or more, and they tend to slurp whole files into memory rather than doing random disk I/O or paging, but that also might be to deal with how much their current storage sucks. This is where the data is going to be analyzed from, not an archival backup. Good point that they might not know what they want. Twerk from Home fucked around with this message at 22:30 on Mar 28, 2017 |
# ? Mar 28, 2017 22:28 |
|
Twerk from Home posted:Good question; the data I've seen involves fewer, much larger files. They do mostly GWAS and pedigree reconstruction, so the genomes are coming to them already assembled. Also, their worker machines have 384GB of RAM each or more, and they tend to slurp whole files into memory rather than doing random disk I/O or paging, but that also might be to deal with how much their current storage sucks. They may also legitimately have no idea how much their storage is costing them in terms of time to complete a run. They might just think it's supposed to be that slow, and they would never say anything unless a known quantity starts taking significantly longer to finish. It's worth doing an analysis to see the access patterns over time (thankfully easy on Linux with minimal setup).
|
# ? Mar 28, 2017 22:44 |
|
Vulture Culture posted:They may also legitimately have no idea how much their storage is costing them in terms of time to complete a run. They might just think it's supposed to be that slow, and they would never say anything unless a known quantity starts taking significantly longer to finish. It's worth doing an analysis to see the access patterns over time (thankfully easy on Linux with minimal setup). So what's the best bang for buck way to make it faster? They're hoping to spend less than $15k to get that 50-70TB of space, hence my comment that the price range is more like "Synology with WD Reds in it".
|
# ? Mar 28, 2017 22:52 |
|
Twerk from Home posted:So what's the best bang for buck way to make it faster? They're hoping to spend less than $15k to get that 50-70TB of space, hence my comment that the price range is more like "Synology with WD Reds in it". When it comes down to it, their budget is their budget, but a holistic approach would be to look at their whole tech spend and see if they can avoid, say, adding cluster nodes by making their existing runs faster. Vulture Culture fucked around with this message at 01:34 on Mar 29, 2017 |
# ? Mar 29, 2017 01:30 |
|
Twerk from Home posted:Has anybody ever quoted out FreeNAS certified servers? I'm doing a little consulting with an academic lab who is looking to improve their setup from "~30TB of usable storage on a single RAID controller in a single SUSE server". Twerk from Home posted:This is where the data is going to be analyzed from, not an archival backup. Good point about they might not know what they want. Vulture Culture posted:When it comes down to it, their budget is their budget, but a holistic approach would be to look at their whole tech spend and see if they can avoid, say, adding cluster nodes by making their existing runs faster. evil_bunnY fucked around with this message at 13:35 on Mar 29, 2017 |
# ? Mar 29, 2017 13:33 |
|
maxallen posted:Curious if anyone has any thoughts why this happened: https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2006849 Looks like a Microsoft/iSCSI error that was kicked off by a snapshot.
|
# ? Mar 29, 2017 19:03 |
|
Twerk from Home posted:Has anybody ever quoted out FreeNAS certified servers? I'm doing a little consulting with an academic lab who is looking to improve their setup from "~30TB of usable storage on a single RAID controller in a single SUSE server". We talked to iXsystems about their stuff for secondary storage and the quotes they came back with were substantially higher than our primary storage costs. I went through things with them a number of times to make sure they didn't misunderstand our use cases/requirements and that did not appear to be the case.
|
# ? Mar 29, 2017 19:47 |
|
maxallen posted:Curious if anyone has any thoughts why this happened: The SCSI status 0x18 indicates a reservation conflict, meaning some other IQN has reserved the device. If this is a cluster, then it could indicate an issue with the cluster configuration where one node tried to take control of the resource without the services failing over properly. If it's a standalone server, then it probably means someone hosed up and presented that LUN to another server and attempted to initialize it there. The NTFS error just means that it could not complete write operations to the device, which could be any number of issues, absent other info.
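For reference, the status byte values come straight out of the SCSI Architecture Model spec, so you can sanity-check whatever hex code shows up in the event log yourself. A quick throwaway sketch in Python (this covers the common values, not the full spec):

```python
# Decode a SCSI status byte (values per the SCSI Architecture Model).
# 0x18 is the reservation conflict discussed above.
SCSI_STATUS = {
    0x00: "GOOD",
    0x02: "CHECK CONDITION",
    0x04: "CONDITION MET",
    0x08: "BUSY",
    0x18: "RESERVATION CONFLICT",  # another initiator holds the reservation
    0x28: "TASK SET FULL",
    0x30: "ACA ACTIVE",
    0x40: "TASK ABORTED",
}

def decode_scsi_status(status: int) -> str:
    """Return the human-readable name for a SCSI status byte."""
    return SCSI_STATUS.get(status, "unknown status 0x{:02x}".format(status))

print(decode_scsi_status(0x18))  # -> RESERVATION CONFLICT
```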
|
# ? Mar 29, 2017 20:20 |
|
Twerk from Home posted:Has anybody ever quoted out FreeNAS certified servers? I'm doing a little consulting with an academic lab who is looking to improve their setup from "~30TB of usable storage on a single RAID controller in a single SUSE server". Would this be such a bad option? The performance requirements don't sound that high, and it doesn't sound like you would need the advanced features NetApp and such could provide. At work I could get a 10x8TB PowerEdge server for $5k. Buy a second for backup duties; everyone can just switch to using the backup server until the repairman shows up NBD. Hardware support from the manufacturer, software support from Red Hat/Microsoft.
|
# ? Mar 29, 2017 23:06 |
|
X-Post from the Packrat thread. eames posted:The CTO of iXsystems and project leader of FreeNAS is leaving the company in a week to move on to "the nanotech / biomedical field"
|
# ? Mar 30, 2017 15:35 |
|
Whenever we've gone down the iX/Nexenta/Tintri/(other ZFS-derivative) route for quotes, they always seem to be priced at 10% less than everyday NetApp instead of the commodity-plus-20% that I'd prefer to see. That may be an unrealistic expectation; ymmv.
|
# ? Mar 31, 2017 03:43 |
|
Vulture Culture posted:They may also legitimately have no idea how much their storage is costing them in terms of time to complete a run. They might just think it's supposed to be that slow, and they would never say anything unless a known quantity starts taking significantly longer to finish. It's worth doing an analysis to see the access patterns over time (thankfully easy on Linux with minimal setup). I'd be interested in a thumbnail sketch (which tools?) on how you are doing that analysis if you wouldn't mind sharing.
|
# ? Mar 31, 2017 03:49 |
|
I just got a quote for two ZFS appliances from Oracle, both fully licensed, both with 68 1.2TB disks and the associated SSDs for ZIL and L2ARC, for a fabulous price that I can't share, but trust me, it's low enough that it is worth looking into if you are looking at storage.
|
# ? Mar 31, 2017 04:06 |
|
PCjr sidecar posted:I'd be interested in a thumbnail sketch (which tools?) on how you are doing that analysis if you wouldn't mind sharing.

We ran a couple of different storage systems -- BlueArc (now Hitachi HNAS), IBM SONAS (now hopefully dead in a loving fire), Isilon, and a bunch of one-off Linux/OpenIndiana/FreeBSD boxes -- so we had a mixture of vendor-specific tools, standard Linux utilities, and a bunch of client-side analysis we were doing.

Understanding the user workloads was the first step. We dealt with a lot of the same applications between research groups (ABySS, Velvet, etc.), so we had a pretty good idea of how they were supposed to perform with different file formats. If someone's runs were taking abnormally long on the cluster, we'd start looking at storage latencies, access patterns, etc. Some file formats like HDF5 do well with random access, while others like FASTA generally work better when the whole thing is slurped into memory sequentially (though random access is possible if it's preprocessed with .fai or FASTQ indexes first). You want to stagger those sequential jobs so they aren't all doing their I/O from the same volume at the same time.

Most storage vendors don't give you very good insight into what the workload is doing on disk. Where we could, we relied a lot on ftrace and perf_events to see what was happening. Brendan Gregg has an awesome utility called iolatency that can give you a latency histogram, sort of like VMware's vscsiStats. This is mostly useful once you've already reached the saturation point where your latencies plateau and you want to figure out what's going on in the workload underneath. That helps you figure out stuff like: okay, should I be adding SSD caching, should I be adding more disks to serve reads, or should I look at replacing this whole volume with a RAID-10 or adding a massive fast ZIL to speed up writes?

For some really insidious cases, we ended up trawling /proc/self/mountstats on each compute cluster node to get per-mount counters on each NFS operation. I actually wrote a Diamond mountstats collector to pull these numbers at 60-second intervals and forward them on to Graphite, where they could be filtered, aggregated, and graphed -- this gave us a lot of really useful heuristic data on stuff like "how often is this job opening files?" and "does this application stream reads, or does it slurp everything in at start?" (We actually spotted a regression in SONAS's GPFS file locking behavior by diving deep into the performance characteristics associated with each specific NFS operation.)

PCjr sidecar posted:Whenever we've gone down the iX/Nexenta/Tintri/(other ZFS-derivative) route for quotes, they always seem to be priced at 10% less than everyday NetApp instead of the commodity-plus-20% that I'd prefer to see. That may be an unrealistic expectation; ymmv.

This is also true when you're looking at Gluster, Lustre, Ceph, etc.

Vulture Culture fucked around with this message at 13:24 on Mar 31, 2017 |
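To give a flavor of what that collector did -- this is a from-memory sketch in the same spirit, not the original Diamond code, and the field layout assumes NFS mounts reporting statvers=1.1 (ops, transmissions, major timeouts, bytes sent/received, then cumulative queue/RTT/total times in ms):

```python
# Rough sketch of a /proc/self/mountstats reader for per-op NFS counters.
from collections import defaultdict

def parse_mountstats(path="/proc/self/mountstats"):
    stats = defaultdict(dict)  # mountpoint -> op name -> counters
    mountpoint, in_per_op = None, False
    with open(path) as f:
        for line in f:
            words = line.split()
            if words and words[0] == "device":
                # "device srv:/exp mounted on /mnt with fstype nfs statvers=1.1"
                in_per_op = False
                is_nfs = len(words) >= 8 and words[7].startswith("nfs")
                mountpoint = words[4] if is_nfs else None
            elif "per-op statistics" in line:
                in_per_op = True
            elif in_per_op and mountpoint and words and words[0].endswith(":") and len(words) >= 9:
                op = words[0].rstrip(":")
                ops, _, timeouts, sent, recv, _, _, total_ms = map(int, words[1:9])
                stats[mountpoint][op] = {
                    "ops": ops,
                    "major_timeouts": timeouts,
                    "bytes_sent": sent,
                    "bytes_recv": recv,
                    # average total latency per operation
                    "avg_ms": total_ms / ops if ops else 0.0,
                }
    return stats

# Sample these counters periodically and ship the deltas to your metrics
# system; GETATTR/READ/WRITE rates answer "how often is this job opening files?"
for mnt, ops in parse_mountstats().items():
    for name in ("GETATTR", "READ", "WRITE"):
        if name in ops:
            print(mnt, name, ops[name])
```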
# ? Mar 31, 2017 04:13 |
|
If you run your own storage on Linux, one other fun thing you can do when you're profiling applications is to use dm-delay, which allows you to insert I/O delays into the stream. This gives you a very easy and very reproducible path to see what happens to a specific workload when disk latency spikes. e: for dynamically-linked binaries, you could probably do this for NFS by using LD_PRELOAD to override glibc's read and write functions with a wrapper that inserts local delays Vulture Culture fucked around with this message at 04:29 on Mar 31, 2017 |
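For the curious, the dm-delay table format is just <start> <length> delay <device> <offset> <delay_ms>. A hedged sketch of the setup (device and target names are made up for illustration; needs root and the dm_delay kernel module):

```python
# Wrap a block device in a dm-delay target so every I/O takes a fixed extra
# latency. Purely illustrative -- the device/name arguments are examples.
import subprocess

def create_delayed_device(backing_dev, name, delay_ms):
    # Size of the backing device in 512-byte sectors
    sectors = subprocess.check_output(
        ["blockdev", "--getsz", backing_dev], text=True).strip()
    # dm-delay table: <start> <length> delay <device> <offset> <delay_ms>
    table = "0 {} delay {} 0 {}".format(sectors, backing_dev, delay_ms)
    subprocess.run(["dmsetup", "create", name],
                   input=table, text=True, check=True)

def remove_delayed_device(name):
    subprocess.run(["dmsetup", "remove", name], check=True)

# /dev/mapper/slowdisk now behaves like /dev/sdb1 plus ~50 ms per I/O;
# mount it where the application expects its data and rerun the workload.
create_delayed_device("/dev/sdb1", "slowdisk", 50)
```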
# ? Mar 31, 2017 04:22 |
|
Vulture Culture posted:This is also true when you're looking at Gluster, Lustre, Ceph, etc. I don't get it. The offering isn't bad, but these companies never provide you the global logistics and support of the big vendors, so people buying must either be clueless or be pressing them down to discount levels way lower than I was ever able to get.
|
# ? Mar 31, 2017 10:40 |
|
adorai posted:I just got a quote for two ZFS appliances from Oracle, both fully licensed, both with 68 1.2TB disks and the associated SSDs for ZIL and L2ARC, for a fabulous price that I can't share, but trust me, it's low enough that it is worth looking into if you are looking at storage. This has been my experience.
|
# ? Mar 31, 2017 12:24 |
|
evil_bunnY posted:This has been my experience. The terrible irony of storage: if you can't afford a storage engineer, you have to run a roll-your-own storage system. And then on top of that you can't go out and hire a semi-decent storage engineer because there's fuckall install base.
|
# ? Mar 31, 2017 13:30 |
|
Vulture Culture posted:The terrible irony of storage: if you can't afford a storage engineer, you have to run a roll-your-own storage system. Whoever rolls their own because they "can't afford" an engineer 100% has a politics problem (you're paying the dude less, but he'll be busy forever). When we take over other research groups/departments, we spend half our time just ripping out bullshit string and tape contraptions and replacing them with standard stuff that's maybe 50% more expensive on paper but in practice ends up with a 50% lower TCO because you just plonk it down and update it once in a while, and when it fails all you do is let the nerd with the spare through the door. evil_bunnY fucked around with this message at 13:55 on Mar 31, 2017 |
# ? Mar 31, 2017 13:53 |
|
I will never run a SAN in production without a maintenance contract, period. If the company is willing to accept that risk, then it's time to find a new company. At the job I just left, I migrated the data center to a colo space on a new SAN, but there was a 3-month period where the old SAN was out of support and still running business-critical applications, due to various delays. I still puckered anytime a disk failed in the old SAN, even though I had spare drives sitting on the shelf ready to go.
|
# ? Mar 31, 2017 14:51 |
|
Vulture Culture posted:I haven't worked there in a good amount of time, but sure. I'm going to ignore general stuff on managing storage performance like keeping an eye on your device queue sizes, because there's plenty of good information out there already on that.

Thanks; we're already pulling per-client NFS stats into Graphite, but per-mount will be more useful; we've been looking server-side (perf, network, etc.) for usage information. Lustre's generic per-client stats aren't bad, but I want to start using the jobstats feature to tag each in-flight I/O with a job ID. Brendan Gregg's book is good.

quote:This is also true when you're looking at Gluster, Lustre, Ceph, etc. I don't get it. The offering isn't bad, but these companies never provide you the global logistics and support of the big vendors, so people buying must either be clueless or be pressing them down to discount levels way lower than I was ever able to get.

In our experience, if you look at GB/s/$, Lustre is very difficult for NetApp/EMC to come close to, for sufficiently large values of GB/s. That's assuming your workload will actually run well on Lustre (or any other clustered FS); while it is POSIX-compliant, it really isn't a general-purpose file system. Highly recommend a partner that has experience with Lustre and has a contract with Intel for L2/L3 support.
|
# ? Mar 31, 2017 15:42 |
|
devmd01 posted:I will never run a SAN in production without a maintenance contract, period. If the company is willing to accept that risk, then it's time to find a new company. Yeah no poo poo. Servers these days are just the delivery boys and processing oomph for what the SAN delivers (if you are doing virtualization like a smart person). You could take a hammer to every server in my rack and I wouldn't flinch. I can recover. My SAN? Makes me pucker.
|
# ? Apr 7, 2017 21:43 |
|
Rhymenoserous posted:Yeah no poo poo. Servers these days are just the delivery boys and processing oomph for what the SAN delivers (if you are doing virtualization like a smart person). You could take a hammer to every server in my rack and I wouldn't flinch. I can recover. My SAN? Makes me pucker.
|
# ? Apr 7, 2017 23:21 |
|
They'll be on the cloud by 2030 then!
|
# ? Apr 7, 2017 23:25 |
|
Catalogic ECX: rough price? I'm reading something like $5k/controller? I may be getting a client off tape forever.
|
# ? Apr 13, 2017 03:45 |
|
Thanks Ants posted:They'll be on the cloud by 2030 then! It's just someone else's server.
|
# ? Apr 17, 2017 01:27 |
|
RIP Nimble... you were great. HPE is the loving worst; they destroy everything they touch. It has already started: Nimble canceled their Guardians of the Galaxy movie screening in my area. I can already smell the HPE stench all over this. I just hope they can keep their support at least half decent until our Nimbles are EOL.
|
# ? May 4, 2017 00:07 |
|
tehfeer posted:RIP Nimble... you were great. HPE is the loving worst; they destroy everything they touch. It has already started: Nimble canceled their Guardians of the Galaxy movie screening in my area. I can already smell the HPE stench all over this. I just hope they can keep their support at least half decent until our Nimbles are EOL. Honestly, the movie poo poo is annoying anyway.
|
# ? May 4, 2017 00:20 |
|
adorai posted:Honestly, the movie poo poo is annoying anyway. Free movies that you're not obligated to attend are annoying?
|
# ? May 4, 2017 00:34 |
|
I'll take a short sales pitch for a free screening in a nice cinema
|
# ? May 4, 2017 08:54 |
|
big money big clit posted:Free movies that you're not obligated to attend are annoying?
|
# ? May 4, 2017 13:39 |
|
tehfeer posted:RIP Nimble... you were great. HPE is the loving worst; they destroy everything they touch. It has already started: Nimble canceled their Guardians of the Galaxy movie screening in my area. I can already smell the HPE stench all over this. I just hope they can keep their support at least half decent until our Nimbles are EOL. A client I'm working with bought Nimble for their AIX system; now I can't wait to see which way that goes...
|
# ? May 5, 2017 16:16 |
|
I really, really like Nimble's offerings and their integration with Veeam. It's a real shame they got bought by HPE because they were my #1 choice for hardware replacement coming up next year, but HPE ruins everything they touch.
|
# ? May 5, 2017 17:12 |
|
loving robocopy is steadfastly refusing to copy security info and it's making my job way harder than it has to be :<
|
# ? May 17, 2017 15:18 |
|
Internet Explorer posted:I really, really like Nimble's offerings and their integration with Veeam. It's a real shame they got bought by HPE because they were my #1 choice for hardware replacement coming up next year, but HPE ruins everything they touch. What are you going to look at instead?
|
# ? May 17, 2017 15:44 |
|
I'm really not sure. We're finally having a discussion internally on what we are going to do when this hardware comes up for renewal: whether to continue maintaining our private cloud or go all-in on public cloud IaaS/SaaS. I'm hoping things will shake up a bit in the storage world before I have to make a decision, but quick, granular backups and replication are a huge selling point to me. Veeam is great, but having a ~12 hour RPO (backups + replication) kind of sucks, and I do not want to give our Veeam VMs read/write access to our VMFS-formatted LUNs. Just sounds like a bad day waiting to happen.
|
# ? May 17, 2017 15:57 |