Crowley
Mar 13, 2003

FISHMANPET posted:

Well Crowley's here now so he can respond, but I think for bulk storage he just uses some homebuilt SuperMicro chassis stuffed with drives, right?

That's just emergency storage for the graphics department, since data usage grew far beyond the quadrupling we expected when we went from SD to HD production. We're looking into getting something serious to take over the task.

For non-media file types we have a regular ol' HP EVA 4000, and for media archiving we use an HP X9720. Media projects that are currently being worked on are housed on the Avid ISIS.

Edit: I just filled the last disk slots in the Supermicro server last week. 46 TB total, and I'm already down to 15 TB free. :cry:

Erwin
Feb 17, 2006

Goon Matchmaker posted:

We want local copies of the VMs in a local repository. It'd be somewhat stupid to back up straight off-site since we might (and have) needed to restore a VM at some point, and waiting for the thing to come back over a 100Mb line would be painful to say the least.

You can have more than one backup job per VM. Back up locally, and also back up offsite.

Syano
Jul 13, 2005

Erwin posted:

You can have more than one backup job per VM. Back up locally, and also back up offsite.

I do exactly this. One job backs up VMs every 4 hours. Another grabs them overnight to the offsite repository.

Goon Matchmaker
Oct 23, 2003

I play too much EVE-Online
Edit: Wrong button, sorry.

evil_bunnY
Apr 2, 2003

larchesdanrew posted:

Essentially they want a vast place to dump terabytes of video files that will, apparently, sort said files and allow for them to be searched for later. Anyone familiar with something along these lines?
This is two layers: back-end storage and a metadata management solution.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.
For that type of workload, I'm thinking Isilon should be your first call on the video storage front.

Goon Matchmaker
Oct 23, 2003

I play too much EVE-Online
Looks like we're going to wait for Veeam 7. It does WAN acceleration and will apparently do what we need. I'm going to confirm with a sales engineer tomorrow.

Aquila
Jan 24, 2003

Oh god what have I done:



HUS 150 with SSD Tier is what I've done :getin:

Thanks Ants
May 21, 2004

#essereFerrari


Crazy design, I've heard good things though. What sort of stuff are you storing on it?

Vanilla
Feb 24, 2002

Hay guys what's going on in th

larchesdanrew posted:

I just got my first real "IT" job at the local news studio and have been tasked with researching a networked storage solution to archive news footage. Essentially they want a vast place to dump terabytes of video files that will, apparently, sort said files and allow for them to be searched for later. Anyone familiar with something along these lines?

So you are looking for two different things: a storage platform and some kind of media management software. If they want to be able to search old video, someone needs to review it all and categorise it! Otherwise all they have is Windows search and file names...

Dilbert as gently caress raises a lot of the initial, key questions which you need to be asking upwards - they are all questions for the business to decide and answer, not you.

Isilon is a big thing in the media space. See below for a list of some customers back in 2011 and note the number of media outlets:
http://www.storagenewsletter.com/news/business/apple-isilon-itunes

Isilon fits the media mould well as it's often not maintained by IT but by media techies (big difference). Very easy to manage - there's just one big pot of storage. Not multiple LUNs, configuration, etc. It's all disk - so instantly accessible.

With regard to fast search, you can place metadata on SSD so users receive fast results.
Start off by getting answers to Dilbert as gently caress's questions and also to the following (rough sizing sketch below the list):

- Current capacity of files to be moved over (total GB/TB)
- Daily ingest (GB/TB)
- Growth rates for the next five years
- Protocols required: CIFS, NFS, etc.
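To see why those numbers matter, here's a rough, purely illustrative sizing sketch (none of these figures come from larchesdanrew's environment): say 100TB of existing footage and 0.5TB of daily ingest. Year one adds roughly 0.5 x 365 ≈ 180TB, and at 20% annual growth the five-year ingest comes to about 180 x (1 + 1.2 + 1.44 + 1.73 + 2.07) ≈ 1,340TB, so you would be budgeting for something like 1.4-1.5PB usable before any protection overhead. Modest-sounding daily ingest turns into petabytes quickly, which is why the business has to own these answers up front.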

Syano
Jul 13, 2005

Goon Matchmaker posted:

Looks like we're going to wait for Veeam 7. It does WAN acceleration and will apparently do what we need. I'm going to confirm with a sales engineer tomorrow.

Veeam NOW will do what you need. Once the initial backup repository is seeded all you are going to transfer is changed blocks.

Aquila
Jan 24, 2003

Caged posted:

Crazy design, I've heard good things though. What sort of stuff are you storing on it?

PostgreSQL with both lots of little stuff that needs to be very fast and lots of big stuff that also needs to be very fast. Lots of joins. I'm trying out dynamic tiering between SAS and SSD, but also testing straight SSD:

/dev/mapper/360060e80101392b0058b38fb00000000 493G 70M 467G 1% /mnt/ssd
/dev/mapper/360060e80101392b0058b38fb00000009 493G 70M 467G 1% /mnt/sas
/dev/mapper/360060e80101392b0058b38fb00000005 493G 70M 467G 1% /mnt/hdt

I also kept one RAID group of SAS out for VM root volumes, because why not, but the DBs are bare metal right now. Hitachi threw in a free file module, but our SAN consultant didn't even want to hook it up; that's the box at the side of the picture. I am not sure what I'm going to do with that.

I am wondering though, does anyone know of an FC-to-IP solution for long-distance replication that's either cheaper than a Brocade SW7800 or has 10GigE ports? The boss is very interested in Hitachi TrueCopy Extended Distance between colos, but right now the hardware cost is kinda nuts (~$25k on each end).

YOLOsubmarine
Oct 19, 2004

When asked which Pokemon he evolved into, Kamara pauses.

"Motherfucking, what's that big dragon shit? That orange motherfucker. Charizard."

Aquila posted:

PostgreSQL with both lots of little stuff that needs to be very fast and lots of big stuff that also needs to be very fast. Lots of joins. I'm trying out dynamic tiering between SAS and SSD, but also testing straight SSD:

/dev/mapper/360060e80101392b0058b38fb00000000 493G 70M 467G 1% /mnt/ssd
/dev/mapper/360060e80101392b0058b38fb00000009 493G 70M 467G 1% /mnt/sas
/dev/mapper/360060e80101392b0058b38fb00000005 493G 70M 467G 1% /mnt/hdt

I also kept one RAID group of SAS out for VM root volumes, because why not, but the DBs are bare metal right now. Hitachi threw in a free file module, but our SAN consultant didn't even want to hook it up; that's the box at the side of the picture. I am not sure what I'm going to do with that.

I am wondering though, does anyone know of an FC-to-IP solution for long-distance replication that's either cheaper than a Brocade SW7800 or has 10GigE ports? The boss is very interested in Hitachi TrueCopy Extended Distance between colos, but right now the hardware cost is kinda nuts (~$25k on each end).

What filesystem are you running on top of this?

Regarding FC over IP, we used McData 1620s for that but they are discontinued now. They worked fine replicating data via HUR. Any FCIP router *should* work, and HP probably has some pretty cheap ones given that their network gear is always pretty cheap. Bandwidth likely won't be a concern since you'll be limited by what's available at the LAN side, but since you're talking about synchronous replication you'll want something that doesn't introduce much latency. Speaking of which, syncrep is really tough to get right, and on something like a DB workload, where it's going to be highly sensitive to write latencies, I can see it being very problematic.
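As a rough illustration of why that matters (illustrative numbers, not from your setup): light in fibre covers roughly 200km per millisecond one way, so colos 100km apart add about 1ms of round-trip time before the FCIP gear and WAN add their share. With synchronous replication every write has to be acknowledged by the remote array before it completes, so a commit that takes ~0.5ms on local SSD becomes ~1.5ms or more, and anything that serializes on commits (WAL flushes, small transactions) slows down in proportion. Async avoids that penalty at the cost of a small window of potential data loss.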

Aquila
Jan 24, 2003

NippleFloss posted:

What filesystem are you running on top of this?

Regarding FC over IP, we used McData 1620s for that but they are discontinued now. They worked fine replicating data via HUR. Any FCIP router *should* work, and HP probably has some pretty cheap ones given that their network gear is always pretty cheap. Bandwidth likely won't be a concern since you'll be limited by what's available at the LAN side, but since you're talking about synchronous replication you'll want something that doesn't introduce much latency. Speaking of which, syncrep is really tough to get right, and on something like a DB workload, where it's going to be highly sensitive to write latencies, I can see it being very problematic.

ext4

Hitachi TCED can do sync or async replication. I think I'll do async for a near-realtime copy of data in another location, so that if things go down I can bring my operation up in the other location quickly. I don't see how this is limited by LAN speed; I'll likely get a 10gig link for these purposes.

YOLOsubmarine
Oct 19, 2004

When asked which Pokemon he evolved into, Kamara pauses.

"Motherfucking, what's that big dragon shit? That orange motherfucker. Charizard."

Aquila posted:

ext4

Hitachi TCED can do sync or async replication. I think I'll do async for a near-realtime copy of data in another location, so that if things go down I can bring my operation up in the other location quickly. I don't see how this is limited by LAN speed; I'll likely get a 10gig link for these purposes.

Whoops, I meant to say WAN. 10G is overkill for replication traffic because unless you have dedicated fiber you aren't getting nearly that once it hits the WAN anyway. And yea, I've used async Hitachi replication before, and it sucked. It worked fine technically but it was an absolutely giant pain in the rear end to manage. HORCM sucked as a management product, the requirement for separate journal volumes for each consistency group ate up a lot of additional storage, and because replication happened on the device level and was completely divorced from actual hosts or applications it was common that LUNs got provisioned or reclaimed from hosts but the replication sets never got updated.

A lot of that would have been solved by better management tools, but those weren't available at the time. I haven't spent any time with the new Command Suite stuff so I have no idea if it makes replication more manageable. In general I think replicating at the app level is a better proposition in most cases than trying to do it at the array level, particularly the logical device level.

Pile Of Garbage
May 28, 2007



Aquila posted:

PostgreSQL with both lots of little stuff that needs to be very fast and lots of big stuff that also needs to be very fast. Lots of joins. I'm trying out dynamic tiering between SAS and SSD, but also testing straight SSD:

/dev/mapper/360060e80101392b0058b38fb00000000 493G 70M 467G 1% /mnt/ssd
/dev/mapper/360060e80101392b0058b38fb00000009 493G 70M 467G 1% /mnt/sas
/dev/mapper/360060e80101392b0058b38fb00000005 493G 70M 467G 1% /mnt/hdt

I also kept one RAID group of SAS out for VM root volumes, because why not, but the DBs are bare metal right now. Hitachi threw in a free file module, but our SAN consultant didn't even want to hook it up; that's the box at the side of the picture. I am not sure what I'm going to do with that.

I am wondering though, does anyone know of an FC-to-IP solution for long-distance replication that's either cheaper than a Brocade SW7800 or has 10GigE ports? The boss is very interested in Hitachi TrueCopy Extended Distance between colos, but right now the hardware cost is kinda nuts (~$25k on each end).

I've used IBM SAN06B-R (Re-branded Brocade 7800) MPR devices in the past to do FCIP tunnelling and they are pretty solid. Of course as you mentioned they are drat expensive. From memory they were around $20k but there was also extra licensing on top of that.

Aquila
Jan 24, 2003

NippleFloss posted:

Whoops, I meant to say WAN. 10G is overkill for replication traffic because unless you have dedicated fiber you aren't getting nearly that once it hits the WAN anyway. And yea, I've used async Hitachi replication before, and it sucked. It worked fine technically but it was an absolutely giant pain in the rear end to manage. HORCM sucked as a management product, the requirement for separate journal volumes for each consistency group ate up a lot of additional storage, and because replication happened on the device level and was completely divorced from actual hosts or applications it was common that LUNs got provisioned or reclaimed from hosts but the replication sets never got updated.

A lot of that would have been solved by better management tools, but those weren't available at the time. I haven't spent any time with the new Command Suite stuff so I have no idea if it makes replication more manageable. In general I think replicating at the app level is a better proposition in most cases than trying to do it at the array level, particularly the logical device level.

I am budgeting for 10G because I don't really know how much I'll need until we're up and running in production. Also, I have an almost free 10Gig dark fiber to a separate geo, I think through my current facility. Of course the CEO wants to do it to the EU; I've told him how spendy trans-Atlantic 10G is. Command Suite is totally acceptable for now, though our vendor did pull a fast one on us and not inform us that it only runs on Windows until we were pretty late in the purchasing process (we're 100% Ubuntu), so I just made them throw in a Hitachi Windows server. We're hoping there's enough command-line support to completely automate volume creation, allocation, and snapshotting. My (amazing) systems dev has already made some significant improvements to Foreman with respect to choosing multipath devices, which we are going to be submitting back upstream. Our goal is to make as much of this kind of stuff as we can available back to the open source community, just in case there are any other startups out there crazy enough to buy Hitachi SANs. Also, we're slowly working on Hitachi to get them to realize there's more to Linux than RHEL and SUSE. Many steps along the way have been an interesting collision of startup vs. (very) big business methodology.

As for replication, I haven't tried it yet obviously, but so far everything this vendor has told me works as stated.

cheese-cube posted:

I've used IBM SAN06B-R (Re-branded Brocade 7800) MPR devices in the past to do FCIP tunnelling and they are pretty solid. Of course as you mentioned they are drat expensive. From memory they were around $20k but there was also extra licensing on top of that.

My SAN consultant actually quoted two 7800s on each end, which seemed excessive to me unless I get redundant paths, which I suppose I probably should.

And because I'm working on it right now, here are the mount options I think I'm going to use:

/dev/mapper/360060e80101392b0058b38fb00000014 /mnt/ssd-barrier-tests ext4 defaults,data=writeback,noatime,barrier=0,journal_checksum 0 0

This is still untested. I'm also trying the noop and deadline schedulers, and turning off the dirty page ratios and swappiness (not that I have swap allocated): echo 0 > /proc/sys/vm/swappiness; echo 0 > /proc/sys/vm/dirty_ratio; echo 0 > /proc/sys/vm/dirty_background_ratio. I'm still investigating discard, which I think should be issuing SCSI UNMAP in our case (as opposed to SATA TRIM) and is probably desirable for the SAN.
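In case it's useful to anyone, here's roughly what that looks like gathered in one place (device names and paths are placeholders, and whether any of it actually helps is exactly what I'm still testing):

# scheduler is typically set on the underlying sdX path devices, not on the dm-multipath node
echo noop > /sys/block/sdb/queue/scheduler
# persist the vm tunables instead of echoing them at every boot
cat >> /etc/sysctl.conf <<'EOF'
vm.swappiness = 0
vm.dirty_ratio = 0
vm.dirty_background_ratio = 0
EOF
sysctl -p
# if discard/UNMAP turns out to matter, a periodic fstrim is the alternative to mounting with -o discard
fstrim -v /mnt/ssd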

This is the format command I'm using:

mkfs.ext4 -E lazy_itable_init=0 -E lazy_journal_init=0 /dev/mapper/blah

The format is super fast due to SAN magic or something. I am investigating whether I need custom stride and stripe-width settings.
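If I do end up needing stride/stripe-width, the arithmetic is simple (the numbers below are a made-up example, not my actual RAID geometry): stride = RAID chunk size / filesystem block size, and stripe-width = stride * number of data disks. So for a 256KB chunk, 4KB blocks, and a 7+1 RAID-5 group:

mkfs.ext4 -E stride=64,stripe-width=448 /dev/mapper/blah

(256/4 = 64, and 64 * 7 data disks = 448.)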

I've been using iostat (v10+), ioping, and fio for testing so far, plus actually loading lots of data into postgresql.
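For example, something along these lines (file path, size, and runtime are arbitrary):

# random 4K reads at queue depth 32, the usual "how many IOPS can this thing do" test
fio --name=randread --filename=/mnt/ssd/fio.test --size=4G --direct=1 --ioengine=libaio --rw=randread --bs=4k --iodepth=32 --runtime=60 --time_based
# quick latency spot check against the mounted filesystem
ioping -c 20 /mnt/ssd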

Pile Of Garbage
May 28, 2007



Aquila posted:

My SAN consultant actually quoted two 7800s on each end, which seemed excessive to me unless I get redundant paths, which I suppose I probably should.

IIRC you can have multiple circuits in an FCIP tunnel so you could utilise multiple links with only a single MPR at each end. I'd say that getting a redundant link would be more important than getting redundant MPRs, but it all really depends on the level of availability you want (and how much you're willing to spend).

Blame Pyrrhus
May 6, 2003

Me reaping: Well this fucking sucks. What the fuck.
Pillbug

Goon Matchmaker posted:

I don't remember seeing a backups thread so I'll ask here. What's the best way to replicate/copy a backup repository (Veeam) to an offsite location? Right now we're looking at a contrived scheme of converting our iSCSI backup LUN into an NFS LUN and using rsync to move the files offsite, which I think is just pure lunacy. There have been rumblings of "Data Domain" but the budget for this particular project is $0. We're trying to move ~250GB of weekly backups over a 100Mb line shared with some other stuff.

We use Exagrid to facilitate some site-to-site replication and de-duplication of things that aren't best served by our Avamar nodes. It's dirt cheap and pretty bulletproof, and I _think_ it works natively with Veeam.
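For what it's worth, the raw math on that 100Mb line (ballpark only): 250GB is about 2,000Gb, so even at a sustained 100Mb/s that's 20,000 seconds, call it five and a half hours, and realistically longer on a shared link. If changed-block tracking or dedupe/compression cuts the weekly delta to 10-20% of that, you're down to roughly an hour or less, which is why seeding once and shipping deltas (Veeam, Exagrid, whatever) beats rsyncing whole backup files every week.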

Blame Pyrrhus
May 6, 2003

Me reaping: Well this fucking sucks. What the fuck.
Pillbug

ZombieReagan posted:

drat that is a nice setup, we're in the process of ripping out all of our NetApp hardware and replacing it with VNX after 2 years of our 7500 not having a problem that the end users noticed vs lots of problems with the NetApp arrays. VPLEX looks great on paper and all, but I'm just not sure I can justify the expense to our upper management at all right now. At least our EMC rep found out they were competing with HP 3PAR and sold us all 2nd gen VNX equipment which is being delivered in late August. :cool:

Otherwise, it sounds like you handle things similar to us - same blades and all and little to no over-committing going on. Not worth the risk, especially with the hardware being relatively cheap. We also have a lot of AIX, Solaris and HPUX systems as well, which was why VPLEX was really interesting to me. I'd like to have the ability to have something similar to vmotion on the backend for things that aren't able to run under vmware. I'm just assuming there's no way I can justify the cost to management since it's mostly a "nice to have" sort of thing.

If any of you guys want to come check out our VPlex deployment after it's complete and are in the Phoenix Area, our doors are pretty much open. We love showing off this kind of stuff.

I suspect it will be fully deployed around the end of Sept.

Langolas
Feb 12, 2011

My mustache makes me sexy, not the hat

ZombieReagan posted:

drat that is a nice setup, we're in the process of ripping out all of our NetApp hardware and replacing it with VNX after 2 years of our 7500 not having a problem that the end users noticed vs lots of problems with the NetApp arrays. VPLEX looks great on paper and all, but I'm just not sure I can justify the expense to our upper management at all right now. At least our EMC rep found out they were competing with HP 3PAR and sold us all 2nd gen VNX equipment which is being delivered in late August. :cool:

Otherwise, it sounds like you handle things similar to us - same blades and all and little to no over-committing going on. Not worth the risk, especially with the hardware being relatively cheap. We also have a lot of AIX, Solaris and HPUX systems as well, which was why VPLEX was really interesting to me. I'd like to have the ability to have something similar to vmotion on the backend for things that aren't able to run under vmware. I'm just assuming there's no way I can justify the cost to management since it's mostly a "nice to have" sort of thing.

Next-gen VNX is loving awesome. Been playing with it the past few weeks; it's been solid. Anyone looking for mid-tier storage should wait a few months to compare the new VNX to everyone else's offerings.

Italy's Chicken
Feb 25, 2001

cs is for cheaters
Does anyone know how the file system of an LTO tape works? Specifically, if I'm only ever using 50GB of a tape, am I wearing out the beginning of the tape by constantly overwriting the first 50GB over and over again, or does the tape wear evenly?

(if you're wondering why I'm wasting a tape, it's the only way modern backup solutions work with VAX systems... if you're wondering what a VAX system is, DON'T! ...and pray you never come across one)

Wicaeed
Feb 8, 2005
Small/Mid-Size storage time:

I'm right in the middle of a poorly planned project, which is partly my fault (first time doing virtualization on this scale (~150 VMs, all in a test environment)), partly because I had absolutely zero budget aside from spare parts lying around our shop, and partly because our parent company came in and said "Let's test this new product feature," which would require a massive hardware purchase to do on physical boxes.

Our shop is small enough/new enough (and in the past, poorly managed enough) to not really have much in the way of properly implemented shared storage. The only enterprise-level SAN equipment we have is an aging EqualLogic PS3000 series array and a new EqualLogic PS4110/PS6110 array duo that is going to be used in our production datacenter to host a new billing environment.

I'm looking into storage solutions right now for this Virtualization project (hopefully with the additional storage/performance capacity to handle another upcoming Virtualization project as we rebuild our company server room (~30 servers, most lightly used)) with roughly the following requirements:

~20TB Raw capacity
1Gbit/10Gbit redundant controllers
Snapshot support
Thin Provisioning support
Deduplication Support (This one is huge. I don't understand very much about dedup, other than the fact that the potential savings on storage is too important to ignore). I've been toying with Dedup on Windows Server 2012, and am impressed so far.

I should mention at this point that the budget I've been tentatively given is in the $20,000 area.

As I said before we have roughly 150 VM's in a test environment, with plans to do further virtualization projects as we can. I'm starting to look at vendor offerings with the above requirements, and so far I've found the following:

EMC VNXe3150
Nimble CS220
Tegile HA2100

I found the EMC VNXe3150 around that price point, but I'm curious about the other two vendors. Has anyone worked/partnered with them and is familiar with their pricing/drawbacks/feature sets?

edit: I should also mention at this point that we are a Dell shop, so we may be getting good pricing on EMC hardware.

Wicaeed fucked around with this message at 09:26 on Aug 19, 2013

Pile Of Garbage
May 28, 2007



Italy's Chicken posted:

Does anyone know how the file system of an LTO tape works? Specifically, if I'm only ever using 50GB of a tape, am I wearing out the beginning of the tape by constantly overwriting the first 50GB over and over again, or does the tape wear evenly?

(if you're wondering why I'm wasting a tape, it's the only way modern backup solutions work with VAX systems... if you're wondering what a VAX system is, DON'T! ...and pray you never come across one)

Here is an excellent guide on how data is written to LTO tapes: http://www.lto.org/technology/primer1.html.

To answer your question: no, you are not increasing the level of wear on the tape. The main measure of wear on a tape is the number of end-to-end passes, and the less you write to a tape, the fewer end-to-end passes are performed.

Edit: I should mention that there's also a limit to the number of cartridge load/unload operations, which is around 5,000; however, you're unlikely to hit that number, and the cartridges will probably exceed the recommended maximum number of end-to-end passes well before that.
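To put rough numbers on it (assuming LTO-5 geometry as an example; other generations differ, but the shape of the argument is the same): an LTO-5 cartridge holds 1.5TB native across roughly 80 end-to-end wraps, so each wrap is on the order of 19GB. A 50GB job therefore costs about 3 end-to-end passes versus ~80 for filling the tape, and against the commonly quoted rating of around 16,000 end-to-end passes you would get thousands of those 50GB jobs in before pass wear becomes the thing that retires the cartridge.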

Pile Of Garbage fucked around with this message at 13:52 on Aug 19, 2013

skipdogg
Nov 29, 2004
Resident SRT-4 Expert

Wicaeed, the budget is too low. You're not getting the SAN you need at that price point.

YOLOsubmarine
Oct 19, 2004

When asked which Pokemon he evolved into, Kamara pauses.

"Motherfucking, what's that big dragon shit? That orange motherfucker. Charizard."

skipdogg posted:

Wicaeed, the budget is too low. You're not getting the SAN you need at that price point.

Eh, he might with a smaller vendor who is eager to make sales. Tintri and Nimble come to mind as two vendors that have been very aggressive with pricing in an effort to establish some marketshare.

skipdogg
Nov 29, 2004
Resident SRT-4 Expert

NippleFloss posted:

Eh, he might with a smaller vendor who is eager to make sales. Tintri and Nimble come to mind as two vendors that have been very aggressive with pricing in an effort to establish some marketshare.

For some reason I thought he was putting 150 VMs on it, not 30, now that I reread it. Still, I think 20TB raw with redundant 10G-capable controllers is going to be really tough for 20K.

My VNXe cost more than that with less than half the space and no 10G networking.

Wicaeed
Feb 8, 2005
No, you're right. Ideally we would be putting all of our VMs on this storage. As I said before though, it's for a test environment, so our IOPS requirements aren't insane.

We might be able to get away with multiple Gbit NICs instead of 10Gbit, or a single Gbit controller.

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.
Over in the lab thread I made a post about building a DIY SAN for shits and experience, and I am looking at the venerable Brocade Silkworm 200E 4Gb fibre switches for sale there for cheap as the center of my FC SAN.

It's been a few years since I've touched a Brocade switch and I'm wondering if there are any compatibility issues between it and newer clustering technologies found in Windows Server 2012 or HA in ESXi 5.1.


The reason why I ask is that once upon a time I had an MSA1000 that wouldn't cluster on Server 2008 R2 due to it failing the "Validate a Configuration" tests (I cannot remember which test it failed, though).

Before I lay down some cash for a switch and a grip of fiber cards, I want to make sure they'll actually work.

Can anyone shed some light on the Silkworm switches?

Agrikk fucked around with this message at 23:36 on Aug 19, 2013

Pile Of Garbage
May 28, 2007



What HBAs are you going to be using?

YOLOsubmarine
Oct 19, 2004

When asked which Pokemon he evolved into, Kamara pauses.

"Motherfucking, what's that big dragon shit? That orange motherfucker. Charizard."

Wicaeed posted:

No, you're right. Ideally we would be putting all of our VMs on this storage. As I said before though, it's for a test environment, so our IOPS requirements aren't insane.

We might be able to get away with multiple Gbit NICs instead of 10Gbit, or a single Gbit controller.

I still think it's doable. Our sales teams (NetApp) see some really crazy discounts from startups when they are competing against us and other major vendors. Just get some quotes from NetApp and EMC, and then let Nimble, Tintri, Tegile, and whoever else you are talking to know that you find their products interesting but that you're really concerned about their lack of history and customer base in the industry. And don't tell them that the big players are out of your price range. See how low they will go to try to win your business. You might be surprised.

Dilbert As FUCK
Sep 8, 2007

by Cowcaster
Pillbug

Wicaeed posted:

Our shop is small enough/new enough (and in the past, poorly managed enough) to not really have much in the way of properly implemented shared storage. The only enterprise-level SAN equipment we have is an aging EqualLogic PS3000 series array and a new EqualLogic PS4110/PS6110 array duo that is going to be used in our production datacenter to host a new billing environment.

I'm looking into storage solutions right now for this Virtualization project (hopefully with the additional storage/performance capacity to handle another upcoming Virtualization project as we rebuild our company server room (~30 servers, most lightly used)) with roughly the following requirements:

Just because I am curious: are you planning to use that PS3000 in your new environment as some "low performance storage"? Reclaiming that device, applying any needed firmware updates, and renewing the warranty can be a load cheaper than buying another storage device, granted of course that there are no known issues/incompatibilities with it.

quote:

~20TB Raw capacity
1Gbit/10Gbit redundant controllers
Snapshot support
Thin Provisioning support
Deduplication Support (This one is huge. I don't understand very much about dedup, other than the fact that the potential savings on storage is too important to ignore). I've been toying with Dedup on Windows Server 2012, and am impressed so far.

I should mention at this point that the budget I've been tentatively given is in the $20,000 area.

You could probably get all that for 20k, aside from the dedupe. 3PAR, Dell/EqualLogic, or Nimble might squeeze it in there, but I would be crossing my fingers really hard. Maybe find a small VAR with someone who's looking to make that first sale and can bend some.

Aside from price, while de-dupe is cool, is the data residing on that storage actually a good candidate for dedupe? There are a good number of arrays out there that won't start deduping until the data has been 'cold' for X days, not to mention that dedupe processes can destroy the IOPS available on the array.

quote:

As I said before we have roughly 150 VM's in a test environment, with plans to do further virtualization projects as we can. I'm starting to look at vendor offerings with the above requirements, and so far I've found the following:

EMC VNXe3150
Nimble CS220
Tegile HA2100

I found the EMC VNXe3150 around that price point, but I'm curious about the other two vendors. Has anyone worked/partnered with them and is familiar with their pricing/drawbacks/feature sets?

edit: I should also mention at this point that we are a Dell shop, so we may be getting good pricing on EMC hardware.

The VNXe 3150 is really good for the price; if you are getting all those features (dedupe) plus support, that is a very, very good deal.


I guess my questions are:
Is dedupe a hard requirement? Are you planning to run VMs on this storage or is it just going to be for cold data? You could run into some serious headaches trying to run dedupe where VMs are active.
If you want 10Gb, do you have 10Gb line of sight from storage to switch to host? Why are you going 10Gb?
Without dedupe, what kind of storage capacities are you looking at from when this project goes live out to 2-3 years, or until you have the budget to add additional storage?
What are your IOPS averaging for this storage? What kind of activity? Mostly reads?

Dilbert As FUCK fucked around with this message at 03:46 on Aug 20, 2013

Agrikk
Oct 17, 2003

Take care with that! We have not fully ascertained its function, and the ticking is accelerating.

cheese-cube posted:

What HBAs are you going to be using?

QLogic QLE2460 4Gb Fibre Channel HBAs. They have Server 2012 support and are cheap as hell on eBay. Will these guys work with a Silkworm switch and behave properly in a 2012/ESXi cluster?

Wicaeed
Feb 8, 2005

Dilbert As gently caress posted:

Just because I am curious: are you planning to use that PS3000 in your new environment as some "low performance storage"? Reclaiming that device, applying any needed firmware updates, and renewing the warranty can be a load cheaper than buying another storage device, granted of course that there are no known issues/incompatibilities with it.

I'm not really sure what's running on it, which kind of scares me. I would assume that at some point in the future, as we move away from our old equipment, this hardware would be freed up, but there's no ETA at this point.

Dilbert As gently caress posted:


The VNXe 3150 is really good for the price; if you are getting all those features (dedupe) plus support, that is a very, very good deal.

I guess my questions are:
Is dedupe a hard requirement? Are you planning to run VMs on this storage or is it just going to be for cold data? You could run into some serious headaches trying to run dedupe where VMs are active.
If you want 10Gb, do you have 10Gb line of sight from storage to switch to host? Why are you going 10Gb?
Without dedupe, what kind of storage capacities are you looking at from when this project goes live out to 2-3 years, or until you have the budget to add additional storage?
What are your IOPS averaging for this storage? What kind of activity? Mostly reads?

Dedup would be a "nice as gently caress to have" feature. The 150 VMs that we currently run have many of the same files on them (same OS, mostly the same packages), but I'm not quite sure how dedup works for virtualization workloads vs. file-level workloads. I suppose if it's a choice between getting dedup and getting extra raw storage, we could make do with the extra storage.

That's actually a really good point about the Gbit thing. Right now we don't have any 10Gbit networking available to us, and we have enough quad-port gigabit NICs lying around that we could easily make do without the 10Gbit and upgrade at a later date.

As for the IOPS load, I'm pretty sure that 95% of the time these machines are idle/light IO for reading, except for when some dev hops on 8 members of the cluster to run an update script which absolutely cripples the current DAS storage (4x 600GB 10K RPM drives in a RAID-5 array). The script they run deploys a ~4GB package to each cluster's member machines. Each cluster manager PXE boots its own members, so it's 8 managers deploying a 4GB package to three members of their own cluster at once (4GB file * 3 members on each cluster * 8 clusters = 96GB of writing occurring). The devs aren't used to virtualization, so they assume that they can run this script with no performance hit on the cluster :rolleyes:

I think, per VM, the write IOPS is about 200 when updating. It's pretty negligible otherwise.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.
The most important thing to remember re: virtualization and dedupe: It might sound neat to remove a bunch of that redundant data so you don't need to buy as much disk, but with modern disks as large as they are relative to the IOPS they provide, chances are you're not buying enough disk to satisfy your I/O requirements anyway if you're filling them. Dedupe will only make the situation worse.

It's a lot more useful for backups than your raw data stores, but most VM backup products will deduplicate already anyway.
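A quick illustration of the capacity-versus-IOPS squeeze (generic numbers): ten 2TB 7.2K drives give you ~20TB raw but only somewhere around 750-800 random IOPS between them. If dedupe halves your footprint and you buy five drives instead, you have also halved your IOPS while the VMs generate exactly as much I/O as before. The space savings only pay off if there's enough cache or flash in front to absorb the I/O the thinner spindle count can't.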

Aquila
Jan 24, 2003

So one of those tunings I mentioned managed to make a 20MB local copy take 18 seconds to return on my test server. That's on the root partition (local Samsung 840s in RAID 1), mind you. Oops.

Dilbert As FUCK
Sep 8, 2007

by Cowcaster
Pillbug

quote:

Dedup would be a "nice as gently caress to have" feature. The 150 VMs that we currently run have many of the same files on them (same OS, mostly the same packages), but I'm not quite sure how dedup works for virtualization workloads vs. file-level workloads. I suppose if it's a choice between getting dedup and getting extra raw storage, we could make do with the extra storage.

Dedupe is great for things such as archived data, backups, and "cold" data (i.e. not touched but a few times a year). De-dupe would be great if, say, you have a bunch of backups or files that share a lot of identical data among one another. For things like VMs it may not even see the data as dedupable, and if run, it may cause a meltdown of the environment, as deduping is a very I/O intensive process.

quote:

As for the IOPS load, I'm pretty sure that 95% of the time these machines are idle/light IO for reading, except for when some dev hops on 8 members of the cluster to run an update script which absolutely cripples the current DAS storage (4x 600GB 10K RPM drives in a RAID-5 array).

RAID 5 carries a higher write penalty; is RAID 0+1 doable? It may not fix everything, but it may make things a bit more bearable.

Dilbert As FUCK fucked around with this message at 17:45 on Aug 20, 2013

sanchez
Feb 26, 2003
Remember that there is a difference between NetApp-style dedupe and compression like Nimble's, too; the latter might be a better fit.

YOLOsubmarine
Oct 19, 2004

When asked which Pokemon he evolved into, Kamara pauses.

"Motherfucking, what's that big dragon shit? That orange motherfucker. Charizard."

Dilbert As gently caress posted:

Dedupe is great for things such as archived data, backups, and "cold" data (i.e. not touched but a few times a year). De-dupe would be great if, say, you have a bunch of backups or files that share a lot of identical data among one another. For things like VMs it may not even see the data as dedupable, and if run, it may cause a meltdown of the environment, as deduping is a very I/O intensive process.

This is all highly dependent on how the dedupe process is implemented. File level dedupe might be problematic but most SAN vendors are performing dedupe at the level of a block or page which is generally small enough that redundant data within VMDKs is going to lead to a ton of shared blocks or pages. The performance hit is also wholly dependent on the implementation details. If done properly dedupe jobs will be paced to cede resources to user work and run in the spare cycles in between, and will run on cores that aren't heavily used. It can also provide some performance improvements if your cache is dedupe aware since reading a block into cache will effectively read that block into cache for every machine that shares it, which can be very useful during a boot storm.

Some vendors are doing inline dedupe or compression as an always on feature now, so that the performance penalty is baked into the spec sheet. Nimble does this with their compression and from what I've heard from users they get very respectable compression ratios on general VM data. Not having to worry about scheduling the process or potential negative interactions when turning it on is a nice feature and one of the things I think Nimble is doing better than many more established vendors.


Misogynist posted:

The most important thing to remember re: virtualization and dedupe: It might sound neat to remove a bunch of that redundant data so you don't need to buy as much disk, but with modern disks as large as they are relative to the IOPS they provide, chances are you're not buying enough disk to satisfy your I/O requirements anyway if you're filling them. Dedupe will only make the situation worse.

It's a lot more useful for backups than your raw data stores, but most VM backup products will deduplicate already anyway.

I've seen this problem come up with some frequency, but on hybrid arrays the IO capacity of slow disk tier is generally a lot less important and you'll often be trying to get as much space out of that disk as possible without much regard for how many IOPs it can drive, because it's going to be lightly used, and used in very efficient ways. As mentioned above, dedupe can actually improve performance if you have a hybrid array where the cache is dedupe aware. As long as you can handle the writes and sequential read IO on the SATA disks you'll generally be fine, and those things are a lot easier to manage without throwing a ton of spindles at the problem.

Wicaeed posted:

As for the IOPS load, I'm pretty sure that 95% of the time these machines are idle/light IO for reading, except for when some dev hops on 8 members of the cluster to run an update script which absolutely cripples the current DAS storage (4x 600GB 10K RPM drives in a RAID-5 array). The script they run deploys a ~4GB package to each cluster's member machines. Each cluster manager PXE boots its own members, so it's 8 managers deploying a 4GB package to three members of their own cluster at once (4GB file * 3 members on each cluster * 8 clusters = 96GB of writing occurring). The devs aren't used to virtualization, so they assume that they can run this script with no performance hit on the cluster :rolleyes:

Even with a better SAN you may still have trouble with that sort of activity. That's a lot of write IO all at once and write IO can be much harder to keep up with than read IO since it can't truly be cached. That's down to pure spindle count and if you're truly hitting 4800 write IOPs during those operations you'll need a lot more spindles than I think you'll get for your budget. Some arrays do writes better than others, so how badly it affects things will be dependent on the architecture you settle on. In any case, buying something with a substantial amount of read cache will lessen the impact to users since reads can still come out of cache while the disks are being crushed by the write workload.
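To spell out where that number comes from (rough math based on the figures above): 8 clusters x 3 members x ~200 write IOPS each is ~4,800 front-end write IOPS. On RAID 5 every random write costs roughly four back-end operations, so that's ~19,200 disk IOPS landing on four 10K spindles that can do maybe 600 between them; even RAID 10's penalty of two still wants ~9,600. That's why the current DAS falls over, and why spindle/SSD count and write handling matter more here than whose badge is on the array.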

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

NippleFloss posted:

I've seen this problem come up with some frequency, but on hybrid arrays the IO capacity of slow disk tier is generally a lot less important and you'll often be trying to get as much space out of that disk as possible without much regard for how many IOPs it can drive, because it's going to be lightly used, and used in very efficient ways. As mentioned above, dedupe can actually improve performance if you have a hybrid array where the cache is dedupe aware. As long as you can handle the writes and sequential read IO on the SATA disks you'll generally be fine, and those things are a lot easier to manage without throwing a ton of spindles at the problem.
Very true. My earlier assumption was based on the $20k budget, which prices hybrid arrays right out, but this is a valuable clarification for others looking for similar recommendations (or this one, if the budget ends up being more realistic).
