1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Vanilla posted:

Not doing the big I-am, but I spend all day talking to different customers, mostly in finance, about all aspects of virtualisation and storage in a pre-sales manner, and honestly dedup on production is still a hot potato. A lot won't deploy it on production even though it is free - a simple code upgrade and you're ready.

We might be talking to a lot of the same sorts of people then. Some people look at it (especially in the virtualization space) and are genuinely excited about what it's going to bring.

quote:

It'll come along slowly, but in my opinion dedup on production is making a very slow entrance to the world, and the only people I've seen use it in production are using it only on systems deemed unimportant. Usually this is to grab themselves another few weeks before they need another shelf, not to hugely reduce storage.

I know of two pretty large customers who make 100% of their revenue from de-duplicated volumes.

I think NetApp's a little bit ahead of the curve in regards to where they are with their de-duplication technology.

Also consider things like single-file FlexClone as an option to "de-duplicate" on the fly. We're seeing some pretty awesome results with RCU and an automation product integrated with VMware.


1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!
Honestly, the CPU in the 2020 is a little anemic; that said you can probably see some pretty good benefits. How frequently do those files get read? Are you actually performing 1200 IOPS or do you just have enough capacity to do it?

How far away is the replication target going to be? What connectivity? You might even get synchronous depending on how much data actually changes.

As far as VAR shopping, you can probably get a little bit more off the top since they're going to hope that you'll keep using them for other poo poo.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!
That sounds like it should be fine then.

There is some CPU overhead from putting things back together, and there isn't a hell of a lot of cache in the box. The biggest hit to the CPU typically comes when it checks the blocks to de-dupe them.

I don't think anyone offers half shelf options on the 2020 but they might on the 2050.

'find' won't be too expensive since it's just going to do a bunch of inode lookups. Opening all of them at once could make certain disk blocks get hammered to hell (it should go up to read cache at that point unless you've got a shitload of writes going in).
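
If you want to sanity-check that, something like this shows the difference between a metadata-only walk and actually reading every file (a rough sketch; /mnt/datavol is a made-up mount point):

code:
# metadata only: stat() every inode, relatively cheap for the filer
find /mnt/datavol -type f | wc -l

# actually read every file: this is what hammers the same disk blocks
find /mnt/datavol -type f -exec cat {} + > /dev/null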

Are you buying SnapMirror too?

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!
The 2050 is nice since you can also expand the heads with a PCI card, and of course it's got a little more growth room.

SnapMirror is the NetApp product that's used to replicate data from one site to the next. It's pretty simple to set up and it's WAAS/WAN acceleration friendly. It can be kicked off as frequently as you like so long as there is enough bandwidth to copy the changes over to the other site.
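
As a rough sketch of what the scheduling looks like in 7-mode (the filer and volume names here are made up), the destination filer's /etc/snapmirror.conf carries cron-style entries like:

code:
# source:volume    destination:volume     args   minute hour day-of-month day-of-week
filer-a:vmdata     filer-b:vmdata_mirror  -      0,15,30,45 * * *

That example kicks off a transfer every 15 minutes; whether it keeps up depends entirely on the change rate and the pipe.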

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

adorai posted:

We are currently looking at replacing our aging IBM SAN with something new. The top two on our list are a pair of NetApp 3020s and a 2050 for our offsite, or a pair of EMC Clariions. I am interested in looking at a dual-head Sun Unified Storage 7310 and a lower-end Sun at our offsite. The numbers seem to be literally half for the Sun solution, so I feel like I have to be missing something on it.

For usage, the primary purpose will be backend storage for about 100 VMs, some cifs, and some iSCSI/fibre storage for a few database servers.

Any thoughts from you guys?

Is your offsite storage intended to be a replication target?

What applications are you looking at running on the storage? MSSQL? Exchange, etc?

A NetApp box will do CIFS, NFS, iSCSI, and FCP all in one unit. I think the EMC Clariion only does iSCSI/FCP out of the box, but you can throw a Celerra in front of it or build out a fileserver.

What licensed options did you include with your storage from the other two vendors?

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!
To clarify some things on the NetApp:

It supports RAID 4 and RAID 6 (NetApp's RAID 6 implementation is called RAID-DP).

It does in fact support asynchronous replication based on your RPO, or it can do synchronous.

One thing I found with some Sun storage is that it doesn't appear to do active/active very well; which is to say, if I have a LUN presented off controller A and somebody tries to access that same LUN via controller B, B ends up usurping the LUN.

I had a customer who built their SAN design where controller A was entirely on fabric A and controller B was entirely on fabric B. When one of his ESX hosts had a path failure the LUNs kept hopping back and forth between controllers, which dragged performance down.

Not sure if this is still an issue (it's simple enough to fix) so if someone has a similar setup and could confirm that would be great.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!
I think NetApp GX may address this particular need.

Here's a whitepaper on using a CX4-960 for data warehousing:
http://www.emc.com/collateral/hardware/white-papers/h5548-deploying-clariion-dss-workloads-wp.pdf

Either way you go you're going to have to aggregate multiple devices into a single logical volume. If you stick with NetApp you'll just have to create several 16TB LUNs and stripe them together using LVM or Veritas or something like that. On EMC (correct me if I'm wrong, Vanilla) you'll end up with a bunch of 32TB metaLUNs to achieve the same goal.
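
On the host side, the LVM striping would look roughly like this (a sketch only; the device names, stripe count, and stripe size are made up):

code:
# each /dev/mapper/lunN is one 16TB LUN presented from the array
pvcreate /dev/mapper/lun0 /dev/mapper/lun1 /dev/mapper/lun2 /dev/mapper/lun3
vgcreate dwvg /dev/mapper/lun0 /dev/mapper/lun1 /dev/mapper/lun2 /dev/mapper/lun3
# stripe across all four LUNs with a 256KB stripe size
lvcreate -i 4 -I 256 -l 100%FREE -n dwlv dwvg
mkfs.ext3 /dev/dwvg/dwlv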

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Got KarmA? posted:

Oracle will handle the replication for you. Don't do it at the SAN level. I'm not sure any kind of storage-level replication technology will net you anything better than crash-equivalent data.

http://www.oracle.com/technology/products/dataint/index.html

EMC RecoverPoint and potentially NetApp SnapMirror can both give you application-consistent replication.

Depending on distance between endpoints, RPO, and latency you may need to live with crash consistent replication for business continuity/disaster recovery.

How does Oracle's built-in replication handle it? Does it wait for the other side to acknowledge a write before committing it to logs at the local site? How would, say, 70ms of latency impact the performance of the production database?

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

rage-saq posted:

I'm pretty sure synchronous replication (where it ACKs a mirrored write before continuing on the source side) and 70ms of latency would destroy nearly any production database performance. At a certain point you can't overcome latency.

Clearly this is the case, but I'm not sure how Oracle handles it. Is Oracle going to acknowledge every write? If so then yeah, 70ms would make the DBA want to blow their brains out. Or does it just ship logs on some kind of interval? More of an Oracle question than a storage question, but it would be helpful in deciding if you should use array-based replication or application-based.

Typically when we're doing offsite replication and the RTT is >10ms we tend to use async replication but it's often crash consistent. Exceptions are when you use tools like SnapManager to do snapmirror updates as part of your DR process. It's a larger RPO but you're going to be application consistent on the other side.



quote:

Hrms. What would you suggest as the best route to copy the database off-site to another SAN? We actually just want a snapshot every hour or so, not live replication. I'm very new to Oracle, but have good MySQL experience, and to a lesser extent MSSQL experience. Could we just do something similar to a dump and import like would be done with MySQL? Although that wouldn't be optimal since we'd have to lock the entire DB to dump it, no?

Knowing little about Oracle what I might do is something like this on an hourly schedule:

1. Place Oracle in hot backup mode
2. Dump an Oracle DB backup (if feasible; may not be depending on DB size)
3. Snapshot the array
4. Take Oracle out of hot backup mode
5. Replicate the recent snapshot offsite.

Step 2 may be completely redundant though. This is not unlike how something like NetApp SnapMirror works (kick off the Exchange VSS writers, kick off an array snapshot, turn off the VSS writers and tell the array to update SnapMirror, which sends the recent snap offsite.)

The bandwidth requirement is basically whatever it takes to replicate the difference between each snapshot. So if you're read-heavy you could probably use less than 128kbit/s, or if you're write-heavy it could get pretty insane. It is definitely something to keep an eye on.
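
For what it's worth, a minimal sketch of that hourly sequence as a shell script (assumes OS authentication for sqlplus; the volume and filer names are hypothetical, and the array commands are NetApp 7-mode just as an example):

code:
#!/bin/sh
# 1. put Oracle into hot backup mode
sqlplus -s "/ as sysdba" <<EOF
alter database begin backup;
EOF

# 2. (optional) logical dump -- skipped here, probably redundant for large DBs

# 3. take an array snapshot of the volume holding the datafiles
#    (a real script would rotate/delete old snapshots first)
ssh filer-a snap create oravol hourly.$(date +%H)

# 4. take Oracle back out of hot backup mode
sqlplus -s "/ as sysdba" <<EOF
alter database end backup;
EOF

# 5. push the new snapshot offsite
ssh filer-b snapmirror update filer-b:oravol_mirror

The same shape works with other arrays; only the snapshot and replication commands change.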

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

adorai posted:

On our NetApp 3140, when running the command: priv set diag; stats show lun; priv set admin

I am seeing a large percentage of partial writes. There does not seem to be a corresponding number of misaligned writes, as you can see in the below output:

lun:/vol/iscsivol0/lun0-W-OMCoT2A9Iw:write_align_histo.0:72%
lun:/vol/iscsivol0/lun0-W-OMCoT2A9Iw:write_align_histo.1:0%
lun:/vol/iscsivol0/lun0-W-OMCoT2A9Iw:write_align_histo.2:0%
lun:/vol/iscsivol0/lun0-W-OMCoT2A9Iw:write_align_histo.3:0%
lun:/vol/iscsivol0/lun0-W-OMCoT2A9Iw:write_align_histo.4:0%
lun:/vol/iscsivol0/lun0-W-OMCoT2A9Iw:write_align_histo.5:0%
lun:/vol/iscsivol0/lun0-W-OMCoT2A9Iw:write_align_histo.6:0%
lun:/vol/iscsivol0/lun0-W-OMCoT2A9Iw:write_align_histo.7:0%
lun:/vol/iscsivol0/lun0-W-OMCoT2A9Iw:read_partial_blocks:0%
lun:/vol/iscsivol0/lun0-W-OMCoT2A9Iw:write_partial_blocks:27%

Should I be worried about this? All of my VMDKs are aligned (I went through every single one to double check), plus there are no writes to .1-.7, so the evidence also shows no alignment issue. I submitted a case to have an engineer verify there is no issue, but I was wondering if anyone else has seen partial writes like this on a VMware iSCSI LUN. The partial writes typically hover around 30%, but can vary. They were at 75% at one point today.

This is odd....

Are you using FlexClone or A-SIS (dedupe) or anything like that? When you allocated the LUN, which LUN type did you set?

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!
Lots of local storage? If you've got no specific need to replicate, you're okay running at diminished capacity for a short term, and you can't get budget for another HP EVA, it may not be worth building another SAN off site.

Does that EVA support data replication? If it does and you buy something that it's not compatible with today, you might be shooting yourself in the foot a year from now.

Optionally, if that's not a big deal, then iSCSI with LeftHand might be appropriate.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

adorai posted:

you need logzillas and readzillas to get any kind of performance out of the sun gear.


You should align both the VMFS volume and the VMDKs. Otherwise, a single 4k write could require 2 reads and 3 writes, instead of just one write. By aligning all of your data you will likely see a 10% to 50% performance improvement.

edit: about the alignment. Here is your unaligned data:
code:
VMDK                    -------111111112222222233333333
VMFS             -------11111111222222223333333344444444
SAN       -------1111111122222222333333334444444455555555 
Each set of numbers is a 4k block, and each ------- is the final 3.5k of your 63-sector (31.5k) offset. Notice how, to write the block of 2s at the VMDK level, you have to write to both the 2s and the 3s of the VMFS level, which in turn requires you to write to the 2s, 3s, and 4s at the SAN level. More importantly, a partial write requires you to read the existing data first. The problem is amplified at the single-block level; with larger datasets the impact is less dramatic, but it still exists. Here is what it would look like with an extra 512 bytes (aligned at the 32k boundary):
code:
VMDK                      ------- 111111112222222233333333
VMFS              ------- 11111111222222223333333344444444
SAN       ------- 1111111122222222333333334444444455555555 
A write of the 2s at the VMDK level requires you to write the 3s of the VMFS level, which only requires a write of the 4s at the SAN level, and does not require ANY reads of the SAN.

I'm stealing your 1's 2's and 3's diagram to help explain this concept to customers. It communicates this very well.
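
If anyone wants to fix this on a new guest, alignment is set when the partition is created; a hedged example for a Linux guest (device name made up, old sfdisk syntax), starting the first partition at sector 128 (a 64KB boundary) instead of the legacy default of 63:

code:
# one partition, type 83, starting at sector 128 and using the rest of the disk
echo "128,,83" | sfdisk -uS /dev/sdb

Windows guests get the same treatment with diskpart's align option; existing VMDKs have to be migrated or realigned with a tool, you can't just shift them in place.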

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!
Rolling your own or using software solutions does have its place, but I wouldn't say they're a better alternative to purpose-built devices. Just an alternative.

Depending on requirements, desired features, and how much management pain you want to deal with, even a NetApp FAS 2040 could be a decent solution. Also, I think most if not all of the volume size limitations go away with ONTAP 8.0.

That, and thanks to call-home support features, sometimes it's nice to have a replacement drive sitting on your desk the morning after one failed.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Vanilla posted:

So I have a ton of perfmon stats from a certain server.

What tools do you use to analyse these? I know there's the Windows Performance Monitor tool but I've found it a bit 'hard'.

Do you know of any third-party tools for analysing perfmon outputs?

Export the data as .csv files and you can probably feed them into esxplot:

http://labs.vmware.com/flings/esxplot

I regularly use this to parse through 1-2GB of esxtop data at a time when doing performance troubleshooting.

It might work with generic Windows counters too. I'm guessing it just plots whatever is in the CSV.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

adorai posted:

I always see how many NICs people use for iSCSI and wonder, are they going overkill or are we woefully underestimating our needs? We use 3 NICs on each host for iSCSI, 2 for the host iSCSI connections and 1 for the guest connections to our iSCSI network. We have 6 total NICs on each of our filers, set up as two 3-NIC trunks in an active/passive config. We have roughly 100-120 guests (depends if you include test or not) and don't come close to the max throughput of our NICs on either side.

All I can say is monitor your server's throughput and see if you're bouncing off the 1 gig ceiling. Most of my customers use 1 active gigE NIC for iSCSI with a standby and have no performance issues.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

adorai posted:

I certainly agree with the idea, and as an individual system administrator I am all for combining the links, but it's hard to go against such an easy to implement best practice. However, I am very glad you posted the links, because we are rebuilding our entire VMware infrastructure over the next few weeks so we'll certainly be able to consider doing so.

One thing you can do is set up your vSwitch uplinks as active/standby so your VMs and VMotion traffic all hang out on one 10-gig link while iSCSI uses the other. The only point at which they share a link is when a NIC/path fails.

I think with 10gigE that best practice will gradually phase out and be replaced with logical separation and traffic shaping. It takes a LOT of work to saturate a 10 gig link.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Some Consultant posted:

CONCLUSION
It is the conclusion of the consultingCompany team that during the times of high disk utilization, the four (4) 1Gbe iSCSI connections are experiencing periods of link saturation. Due to the overhead associated with the TCP/IP protocol, 1Gbe connections typically have a sustained throughput ceiling of approximately 40MB (megabytes) per second, as is reflected in the collected data. This limitation is causing disk contention as the demand for disk access is higher than the available throughput. The high level of disk queuing during these periods of increased utilization further supports this finding.

Is this screenshot all the data he is basing his findings on? I can't get anything conclusive out of that chart except that whatever system he is on is queuing on occasion.

There is even a spike to the far right which appears to refute his claim.

Did he look at the storage via any sort of performance tools it has at all? You can't really troubleshoot a storage performance issue by looking at one host connected to it. I hope he had more data to share, but it's possible he doesn't.


quote:

RECOMMENDATIONS AND NEXT STEPS
The performance data shows that the current four (4) 1Gbe connections to the iSCSI SAN does not offer enough throughput for the hosted workload during times of high utilization. The configuration of the underlying iSCSI implementation adheres to most of Microsoft's best practices. Due to the throughput limitations associated with implementing iSCSI over 1Gbe, there are few options for increasing the disk performance with this current infrastructure. Setting the Delayed Acknowledgement option on DB1 and DB2 to “0” could reduce the number of packet retransmits and enabling Jumbo Frames (MTU size increased to 9000 bytes) will provide a lower number of packets sent across the network. These two options could provide better efficiency over the iSCSI network; however, the 1Gbe links will still pose a limitation to the total available throughput to the storage system.

As the system's workload continues to increase, it would be recommended to consider upgrading the existing iSCSI implementation to 10Gbe or converting the storage system to 4Gb or 8Gb Fiber Channel. Both alternatives provide greater throughput to the storage system and can better accommodate the increased workload.
It was noted while on site that companyName has a number of similar storage system within the production environment. It would be further recommended to investigate the performance of these systems as well and determine if an enterprise class consolidated storage system would better suit the business needs. An enterprise level storage systems can also provide multiple connection options, such as iSCSI, FCP, CIFS and NFS and provides the flexibility to allocate storage based on service requirements and performance demand while lowering the associated administrative overhead.

Before you jump on this boat I'd make sure of the following:

1. You have enough disks to support the workload
2. You're using an appropriate RAID level for the workload
3. Enable flow control on your switches if your storage supports it
4. Make sure your storage is on the same switch as your database servers (multiple hops with iSCSI can be bad depending on network topology)
5. Look at storage specific performance counters and look for queuing etc.
6. What are your servers' CPUs doing at these peak times? It's possible that your application is choking out the CPU, which means less time spent on a software iSCSI initiator.
7. What are your switches doing?

Basically he is proposing a complete rip and replace, which can be costly (and has a low probability of actually being needed), but I don't think he has done an adequate job of deep-diving the root cause. I've personally seen 90-100MB/sec throughput on Celerra and NetApp over gigabit. There is just no way there is over 60% protocol overhead from TCP at play here.

Basically, any one of the factors I listed could cause things to look like "gigabit sucks" when it might actually be something like using RAID 6 on a platform with a slow implementation (I picked this because EQL RAID 6 is pretty slow.) So what happens is the disks spend a shitload of time writing parity, the controllers go apeshit and ignore everything while your host is spraying shitloads of data at them, and all the controller does is drop it and tell the host to resend later.

On the host side you start queuing things up and your network utilization goes up, but the iSCSI initiator isn't getting any confirmation on writes, or even able to read, hence the network looks busier.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

oblomov posted:

I keep waiting for someone other than Sun to do a decent SSD caching deployment scenario with a SATA back-end. NetApp PAM cards are not quite the same (and super pricey). EMC is supposedly coming out (or maybe it's already out) with SSD caching, but we are not an EMC shop, so I am not up to date with all EMC developments.

Equallogic has a mixed SSD/SAS array for VDI which we are about to start testing right now, not sure how that's going to work out in larger quantities due to pricing. They really need dedupe and NFS as part of their feature suite.

EMC sort of does it via what they call FAST; specifically with "sub LUN tiering."

Essentially they break the LUN up into 1GB pieces and promote data to its relevant spot. Where data lands depends on frequency of access. I dunno if writes default to SSD though and I don't have a Powerlink account to get the information.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Vanilla posted:

Good kit. Simple to use, some good features, does the job but expensive - the cost is between mid tier and high end. Those that use it love it.

They get great survey ratings. Those industry surveys that ask if you'd use your current vendor again - 3PAR get 100%.

Also - 3PAR is now owned by HP just FYI.

We generally hear nothing but positive feedback about 3PAR in the field as well. In fact the only real negative thing I've heard is that it's hard to find people who really understand 3PAR/know it inside and out for services.

quote:

How am I getting myself in trouble though? If the customer has all software on their Oracle application tier (not just our product) running off this mount and the mount locks up during heavy load times how is this our problem? All we hear is third-hand information from the unix admins who relay to the DBAs who relay to the application support people who tell us that there is "some sort of iNode limit that has been reached and you need to fix it". Our product has no file leaks and we verify all calls to open/read/write to files. At peak times we maybe have a maximum of 20 files open.

I guess the problem is it's hard to put a finger on your core issue because you keep flipping between SAN/NAS contexts. If it's an inode limit on an NFS volume then you can check it pretty quickly with a 'df -i' on the NFS server.

I can't be absolutely certain though because all of this inode talk is peppered in with SAN talk and you said yourself a lot of this information is 57th hand so god knows what the telephone game has done with it.
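
For reference, that 'df -i' check looks something like this (the output below is illustrative, not from a real system):

code:
$ df -i /export/appdata
Filesystem      Inodes   IUsed  IFree IUse% Mounted on
/dev/sdb1      1310720 1310720      0  100% /export/appdata

If IUse% is pegged at 100%, the filesystem is out of inodes even though 'df -h' may still show plenty of free space.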

quote:

How are we supposed to anticipate any type of issue that may come up with whatever file system implementation our customer may have? I suppose it would be easy to say hire somebody with experience who knows about Veritas clusters if we only supported Veritas clusters but we have a vast number of customers on SANs and NAS running Linux, AIX, Solaris, HP-UX, Windows, etc. and god knows how many different storage vendors.

I think really getting people trained up on or knowledgeable of the core technologies can be tremendously helpful. I don't think it's really cost-effective to go get an EMC expert, a NetApp expert, an HDS expert, etc. You can however look for folks with some experience managing iSCSI/FCP environments who can help you understand what you're getting into and, more importantly, communicate it clearly to both your support staff and your customers.

At the end of the day, iSCSI and FCP are means to get SCSI commands from your server to your disks and nothing more. If your filesystem says "oh poo poo I'm out of inodes!" then the SAN is just going to deliver the message back up to the host but it plays no part in the game beyond that.


quote:

I guess my point is that I don't think it is out of the realm of reasonable thought for a software vendor to expect a customer to provide a stable file system for our software to run on, yet so many people think it is.

Remember that iSCSI is no more a filesystem than SCSI, SATA, or SAS. It's a means to get to a disk; plumbing if you will.

Once you present your iSCSI device to a server (either via a hardware HBA or software initiator) it shows up as just another "plain Jane" SCSI device. You're going to fdisk it and mkfs it just like you would if you'd plugged a disk into a SAS card.

Here is the caveat: you want to make sure the network supporting iSCSI is absolutely healthy and at the very least gigabit. You probably want switches with relatively deep packet buffers and flow control (read: not D-Link/Linksys specials from Best Buy.)

quote:

In general SANs had been fine up until lately when we have one customer with a SAN that has mounted partitions hanging indefinitely when they run into some type of iNode limit.

Are you sure this is the problem? It doesn't make a lot of sense in my head when I think about it. Inode limitations would be a filesystem issue, not a block-device issue.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

adorai posted:

maybe he has a netapp with a million luns in a single volume

You win this round Adorai.....

That said, if it is a NetApp then provision a new volume (not a qtree in an existing volume) to house your Oracle database. Ideally you want this on an aggregate not already supporting a billion other things.

I would only do this with Oracle though. Never MSSQL/MySQL/PGSQL/etc. With those I would use iSCSI with NTFS/ext3/whatever if I didn't have FC available.
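
A hedged sketch of what that provisioning might look like in 7-mode (the aggregate, volume, and host names are made up, and the mount options are just the commonly recommended Oracle-over-NFS set):

code:
# on the filer: a dedicated flexible volume on a quiet aggregate
vol create oravol aggr_db 500g
snap reserve oravol 20
exportfs -p rw=dbhost01,root=dbhost01 /vol/oravol

# on the database host
mount -t nfs -o rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,vers=3,timeo=600 \
  filer-a:/vol/oravol /u02/oradata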

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!
Dell! Where good technology goes to die!

I hope they're forking out good amounts of cash to keep the engineers on for a couple more years. I just can't fathom much good coming out of that merger.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

InferiorWang posted:

The piece I really need to dig into is how I would then make that DR site live. I'm thinking a third VMWare host would be at the DR site and we'd bring the most critical virtual machines online. I work at a K-12 school district so downtime doesn't mean millions of dollars in business being lost every second of downtime. Automatic fail over is not a concern. However, having a relatively straight forward and documented plan to cut over to a DR site is what I'm looking for, even if that cut over takes a couple of hours to get going.


I think you're walking down the right road for your budget. I assume the DR site is probably running free ESXi so you'll have to do a couple things:

1. Enable sshd/remote support shell on the ESXi server
2. Create yourself a handy recovery script to bring the volumes up! Now you have something you can type in, then get a cup of coffee and come back.

Things you'll want to know how to use in the script:

Check with the VSA to figure out how to present your replicated LUN to your server. It probably entails breaking replication and making it writable. Ideally this can be scripted via the CLI.

Next some VMware specific stuff:
'esxcfg-volume' - This command lets you tell VMware it's okay to mount a replicated volume. You'll want to let it resignature the LUNs in question.

'esxcfg-rescan' - Use this to rescan the iSCSI initiator after you present the LUN and allow for a re-sign (don't recall if this is 100% required in 4.x anymore; the last script I wrote for this was for 3.5.)

Since it's ESXi you're going to want to fiddle about in 'vmware-vim-cmd' (this is a way to get into the VMware management agent via the CLI) and feed it arguments via a find of .vmx files in your new datastore after the re-scan.

At this point you can actually use vmware-vim-cmd to power everything on for you and answer the question "hey did you move this or copy this?" (you probably just want to say you moved it.)

I had to build something like this for one of my customers whose outsourced IT is probably worse than herding cats. I use SSH key authentication for everything and all some guy has to do is run "StartDR.sh" and the script does everything he needs.
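
A stripped-down sketch of what that StartDR.sh can look like on ESX(i) 4.x (the adapter name, datastore label, and the VSA step are placeholders; exact flags vary by version, so treat this as an outline rather than something to paste in):

code:
#!/bin/sh
# 1. promote the replicated LUN on the VSA (vendor-specific, left as a placeholder)
# ssh vsa-mgmt <break replication / make the LUN writable>

# 2. rescan the software iSCSI initiator, then resignature/mount the replica
esxcfg-rescan vmhba33
esxcfg-volume -l                       # list volumes detected as snapshots/replicas
esxcfg-volume -r recovered_datastore   # resignature by label (or UUID)

# 3. register and power on every VM found on the recovered datastore
for vmx in /vmfs/volumes/recovered_datastore/*/*.vmx; do
  vmid=$(vmware-vim-cmd solo/registervm "$vmx")
  vmware-vim-cmd vmsvc/power.on "$vmid"
  # if ESX asks the "did you move it or copy it?" question, answer it
  # with vmware-vim-cmd vmsvc/message "$vmid" <msgid> <choice>
done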

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

szlevi posted:

Well, MPIO can be configured several ways... generally speaking it's typically not for redundancy but rather for better bandwidth utilization over multiple ports - if you are running on a dual or quad eth card then it cannot protect you from any failure anyway.
Also vendor's DSM typically provides LUN/volume location awareness as well - when you have larger network it makes a difference.

I picked this post because MPIO typically is for redundancy in conjunction with distributing load over a lot of front end ports on an array. Even if you're using iSCSI with a quad port gigabit card and you lose half your switches you'll still be able to get traffic to your storage. Every IP storage design we've ever done has involved a pair of physical switches just to provide redundancy.

quote:

Err,

1. FOM is exactly for that, right, not to manage anything....
2. ...maybe you're confusing it with P4000 CMC?
3. Wait, that's free too...

I have no idea what FOM is, but does it also send the correct instructions to VMware to attach the volumes? Will it also allow you to automatically re-IP your virtual machines? Does it handle orchestration with external dependencies as well as reporting/event notification? Does it also provide network fencing and set up things like a split-off third copy to avoid interrupting production processing during DR testing? Does it handle priorities and virtual machine sequencing? Does it integrate with DRS and DPM?

quote:

Which is exactly HA, with a remote node, right.
DR would be if he would recover from it, as in Disaster Recovery.

HA typically refers to protection against localized failures, i.e. one of my servers' motherboards just died on me and I want to bring my applications back up from that failure quickly.

When we talk in terms of DR, we typically speak in one of two things:

1. Someone just caused a major data loss
2. My data center is a smoking hole in the ground

HA does NOT protect you against #1 and you're still going to tape. That said, in the event of #1 you're not going to necessarily do a site failover (unless you're like one of my customers who had a major security breach.)

In the event of #2; we're talking about moving all of the business processes and applications to some other location which goes far above and beyond typical HA.

quote:


They exist but not for free.

Which features that are worth it are you talking about? I guess you could say a centralized management console (which I believe actually is still free with Xen, as well as live migration.)

Also, for the love of whatever's important to you, consolidate your posts into one. There's no reason to reply to 6 different people separately.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!
This is longer than I intended it to be.

szlevi posted:

You're right - hey, even I designed our system the same way - but I think you're forgetting the fact these iSCSI-boxes are the low-end SAN systems, mainly bought by SMB; you can argue about it but almost every time I talked to someone about MPIO they all said the only reason they use it is the higher throughput and no, they didn't have a second switch...

Maybe in the low end market where sales guys let customers make bad decisions this is true.

I've been hard-pressed to find a lot of small businesses that actually need more than ~1Gbps of bandwidth to the storage. I've been to plenty of shops running 1500+ user Exchange databases over a single gigabit iSCSI link with a second link strictly for failover.

What you're seeing is not the norm in any sane organization.

I guess the exception is in dev/QA environments. I'll give you that!

quote:

Well, that's the whole point: some of it I'd think you do at the SAN level - e.g. detecting that you need to redirect everything to the remote synced box - and some you will do in your virtual environment (cleaning up after the fallback etc.) I don't run VMware on LeftHand so I'm not the best to argue about details, but I've read about it enough in the past few months to know what it's supposed to do.

Redirecting data to a remotely synced up site doesn't provide you everything. If I move 2000 or 20 virtual machines from one physical location to another physical location then there are good odds I have a shitload more work to do than just moving the systems.

The parts you do at the SAN level would be getting the data offsite. Once the data is at the new site you have a host of questions to answer. Stuff like:

Am I using spanned VLANs? If not how am I changing my IP addresses?
How are my users going to access the servers now?
Since I only have so much disk performance at my DR site, how do I prioritize what applications I bring online first?
What about data that I'm not replicating that needs to be restored from tape/VTL?
Do I need to procure additional hardware?
...

What about testing all of this without impacting production OR replication?

quote:

Well, that's kinda moot point to argue about when Lefthand's big selling point is the remote sync option (well, up to ~5ms latency between sites) - they do support HA over remote links (and so does almost every bigger SAN vendor if I remember correctly.)

This is synchronous replication and can really only happen within about 60-75 miles. Hardly appropriate for a good portion of disaster recovery scenarios people typically plan for (hurricanes, earthquakes, extended power failures.)

Yes, I can use SRDF/S or VPLEX and move my whole datacenter about an hour's drive away. Is this sufficient for disaster recovery planning and execution? Probably not if, say, Hurricane Katrina comes and blows through your town and knocks out power in a couple-hundred-mile radius.


quote:

Correct but HA has never been against data corruption, I have never argued that - data corruption or loss is where your carefully designed snapshot rotation should come in: you recover it almost immediately.
OTOH if I think about it a lagged, async remote option might be even enough for corruption issues... ;)

Are you a sales guy?

I said that there are a lot more components to DR than just "failing over" as a generic catch-all.

Depending on async replication to protect you against data corruption issues is insane. "Oh poo poo I've got about 30 seconds to pull the plug before it goes to the remote site!" There's no "might even" about it. Depending on that is a good way to cause a major data loss.

No one does this.

Also snapshots shouldn't be the only component of your disaster recovery plan.


quote:

Not anymore. Almost every SAN vendor offers some sort of remote sync with failover - I'd consider these HA but it's true that lines are blurring more and more.

HA != DR, which was the point I was trying to make earlier. Let's take one of my customers as an example:

Site A and Site B are approximately 50 miles apart. Site C is approximately 2000 miles from Site B.

Site A and Site B are basically a giant geocluster. Same exact layer 2 networks spanned between datacenters and sync replication between the facilities. Site B is intended to provide HA to Site A.

Site B also does asynchronous replication to site C. This is intended to provide DR in the event of a regional outage (say, an earthquake.) Coincidentally this site also houses all of the Data Domain remote backups.

So site B provides HA to site A, but site C is strictly for disaster recovery purposes. You plan completely differently for each event.

In the event of a major disaster a whole lot of things need to happen at site C. Examples include firewall reconfiguration, attaching volumes to servers and accessing the data, and the lengthy restore process from the Data Domains. ESX, for example, won't just attach a snapshot copy of a LUN; you need to instruct it that it's okay to do so.


quote:

The very feature we're talking about here: failover clustering. :)

No, we're talking about "oh poo poo my local site is gone and I need to move my business to a remote site"

Depending on a number of factors it may not be as simple as just clicking a button in an interface and saying "well there we go! We are up and running!"

Failover clustering won't handle ANY of the orchestration that's required for bringing a shitload of systems online in DR, nor does it actually provide much functionality for testing without impact to production.

quote:

I think it's more polite especially when I wrote about different things to each person - they don't need to hunt down my reply to them...

It needlessly fills up the thread with a bunch of posts and makes it harder to follow what you're saying. In a way it's less polite to keep doing this.

Anyway the TL;DR portion was that not every product your company sells fills every niche/use case and you should really look at core customer requirements before you run off screaming gospel.


Mausi posted:

Well unless your definition of SMB scales up to state government then your assertion draws from limited experience. Lefthand kit is used, in my experience, in both state government of some countries as well as enterprises, albeit outside the core datacentre.

There's a company hosting around 6000 virtual machines for their production/public facing infrastructure that is entirely supported by scaled out LeftHand boxes and they have over 10,000 employees.

iSCSI is a very popular storage protocol because it's cheap and easy and completely appropriate for a lot of applications. So I'd say his definition of SMB should include the enterprise space as well.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

szlevi posted:

No offense but that's the typical problem with all "regular" storage architects: they can only think about Exchange/OLTP/SQL/CRM/SAP etc.

I look at the core business processes and the applications that support them. I build from there.

If these core applications don't work/behave then the storage is a waste of money and I have done my customer a huge disservice.

Or are you proposing we do "cowboy IT" and just assume that more bandwidth and more disks is the solution to every problem?

quote:

Let's step back for a sec: you know what most SMBs say? The "network is slow" which in reality means they are not happy with file I/O speeds. Your 1Gb/s is 125MB/s theoretical - which is literally nothing when you have 10-20 working with files especially when they use some older CIFS server (eg Windows 2003 or R2).

I don't understand what you're saying. Are you saying that 100+MB/sec isn't sufficient, or are you saying that the older CIFS server isn't sufficient? 10-20 people sharing gigabit works out to ~6-12MB/sec per user, which is generally pretty close to the performance of local disks, so it's not going to be a plumbing issue. This of course assumes that 100% of the users are trying to use 100% of the bandwidth at the same time.

It sounds more to me like there may not be enough disks to reach that peak performance number so maybe I'd look there instead of saying "dude you need to add more bandwidth via MPIO!"

The rare exception might be media production in which case yes, 125MB/sec isn't enough. Of course we're talking about core business applications (like for some it might be Exchange) and that's the primary driver for storage.

I'll say it again since you seem to have missed it the first time:

...you should really look at core customer requirements before you run off screaming gospel.

quote:

True but that's what these vendors promise when they market their sync replication and failover.


What vendors? If we look at your favorite, EMC, they promote SRDF, MirrorView, and RecoverPoint as a way to get data somewhere else. They speak nothing of the recovery of that data but they do provide enough of a framework to make it happen. It's still up to me to do things like replay transaction logs or start services or whatever has to happen.

I guess you could look at NetApp but they also offer some nice application integration tools to easily restore.


quote:

I'm not sure why are you asking me but here's how I understand their claims: your site A goes down but your FOM redirects everything to your site B, sync'd all the time, without any of your VMs or file shares noticing it (besides being a bit slower.) This is in lieu w/ your virtual HV and its capabilities, of course.
Heck, even Equallogic supports failover albeit it won't be completely sync'd (they only do async box-to-box replication.)
Am I missing something?

For this to work you have to assume that the remote site has the same IP address space in it, routers, switches, other dependencies, firewalls, load balancers, etc. You also have to consider whether or not your end users can even access the servers in the new location and how that access happens.

You're basically missing the other 98% when we talk about moving poo poo into a DR site.

So yeah, my VMs might come up on the other side, but what can they talk to? FOM sounds like it's great if you're scaling out in the same datacenter for availability. If I need to move all of my virtual machines to a remote site 600 miles away it sounds like a completely inappropriate solution without layer 2 stretching.

quote:

I disagree.

If you actually do believe this then you're nuts and probably have no business managing, touching, or consulting on storage and especially disaster recovery.


quote:

Never said that - I said it's good against a crazy sysadmin or idiotic user, that's it.

Will cont, have to pick up my daughter. :)

It's not nuts. Let's assume the systems admin is a moron and he goes to replace a failed disk in the array: "oops, that whole RAID group is gone!" and there go his snapshots.

At that point you're going to the part of your DR plan which involves restoring from tape. Depending on your storage you may end up re-initializing replication after the restore (highly likely.)

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Misogynist posted:

I don't get it. If you had replication in the first place, why would you need to go to tape?

If you're replicating to another node in the same datacenter then yeah, hit the failover button, because that's why you're doing that.

If it's offsite though, things might change.

Let's assume I've got 500GB at site A being replicated offsite to another datacenter and my link to that datacenter is 10Mbps.

Some reasons not to fail over completely might be that it's only 1 of 3 or 4 RAID groups, so only one major application is actually down. Do we move all of our operations over to the DR site for this one outage?

When you answer that question, consider the costs of re-syncing arrays if you do fail everything over. We had a customer do this to an 800GB-ish data volume by accident over a 6Mbps MPLS circuit. Thankfully it was NetApp, so we SnapMirrored everything to tape, overnighted the tape, and gave the remote array a good starting point. Instead of copying 800GB of data we only had to replicate about a day and a half's worth of changes.

With other vendors you might be shipping entire disk arrays back and forth.

If my whole business is run off that one RAID group then yeah, it may qualify as a disaster, so let's go ahead and move everything offsite.

I see it happen a lot when people talk about DR planning. The focus is always on getting the data somewhere and maybe getting the data restored. I end up worrying about all the crap that comes afterwards though. Stuff like "okay, I moved my poo poo to the Houston datacenter, but how do the users access it now?" My general assumption is that small to medium businesses aren't likely to have stretched VLANs or anything like that, so there are probably a lot of other things that need to happen at the DR site to bring your apps online.
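
For the curious, the tape-seeding trick above is a standard 7-mode workflow, roughly like this (command names from memory, so check the syntax against your Data ONTAP version before relying on it):

code:
# on the source filer: dump a baseline copy of the volume to tape
snapmirror store srcvol rst0a
# ship the tape, then on the destination filer: load the baseline
snapmirror retrieve dstvol rst0a
# from then on only the incremental changes cross the WAN
snapmirror update -S filer-a:srcvol filer-b:dstvol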

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

conntrack posted:

You mean getting the same effect as giving the applications dedicated spindles? :)

On our VMAX we have different service tiers defined so we can get the best bang for our buck. This is a process/SLA-driven thing to make sure apps that need performance all the time can get it.

Some apps will live entirely on dedicated spindles and we charge back more to the business unit that owns the app.



quote:

That depends on who you talk to, I personally share your view in this matter. A lot of people see it as a way to fire all those crusty storage guys though.

Why doesn't the vmax virtualize external storage? Any one?

It isn't really what the VMAX was designed to do. The VMAX was built as a powerhouse array that's intended to be extremely reliable and extremely fast. I think EMC support/engineering would rather focus on keeping it that way than spend resources making it talk to other arrays.

Edit:

There is always the comedy Invista option! It's just not something EMC has been very interested in doing.

1000101 fucked around with this message at 18:05 on Mar 4, 2011

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Maneki Neko posted:

iSCSI is generally free on NetApp filers; it's NFS that costs $$$ (although I'm not sure what kind of terrible stuff IBM might pull licensing-wise on these). There's some nice things about NetApp + NFS + ESX, but iSCSI works fine there too if you're in a pinch and your company doesn't want to get bent over on the NFS license.

There's some circumstances where you might want to have a VM mounting an NFS volume or an iSCSI LUN vs just creating a local disk on a datastore, but that really depends on application needs, etc.

As you mentioned, CIFS is generally just used on the filers as a replacement for an existing windows file server.


I believe NetApp has switched to a "First protocol is free" pricing scheme.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Serfer posted:

Good thing Windows Servers rarely have important security patches that require reboots.

Maybe you can reboot the storage servers while you're rebooting the servers that are connecting to it for their patches.

edit:

Looks like it supports some form of HA, so you can patch node A, reboot it, then when it's back repeat on node B.

edit2:
http://technet.microsoft.com/en-us/library/gg232621%28WS.10%29.aspx

1000101 fucked around with this message at 18:46 on Apr 5, 2011

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Syano posted:

This may be better suited to the virtualization thread but whatev, I will give it a go here: We are setting up a XenServer environment on top of HP LeftHand storage. Reading through the best practice guide published by HP they recommend as best practice having 1 VM per 1 volume per 1 storage repository. Is there any reason why?

It likely comes from XenServer not necessarily having access to a clustered filesystem out of the box besides NFS. I THINK it can work with GFS, which can allow you to break that rule, but I'd check with your Xen vendor's support matrix just in case.

Otherwise you're probably using ext3, which isn't clustered, and you'll have to do 1 VM per volume. If you shared a volume you'd have to migrate ALL the VMs to another box at the same time when you're doing things like XenMotion or experiencing failover, etc.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Internet Explorer posted:

That doesn't sound right to me unless I am misunderstanding you. You can have two Xenserver hosts sharing the same data store. That is how they can move VMs from one server to the other. Not sure what the LUNs are formatted with specifically.

Edit2:

I think things have changed enough since I last used Xen to safely ignore anything I have to say on the subject!

Thanks for the clarification!


That's probably why they say 1 VM per volume though.

edit: hard as hell to confirm; but can someone with Xen confirm whether or not 2 or more hosts can have concurrent write access to the same LUN?

1000101 fucked around with this message at 04:32 on Jun 16, 2011

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Internet Explorer posted:

Like I said, maybe I am misunderstanding. I run a small XenServer farm and have a single LUN that is mapped to multiple servers, using iSCSI, and they both can allocate space to it at the same time, both can put vDisks in it, and both can read or write on the vDisks. Obviously they can't write to the same vDisk at the same time, but as far as I know no one can do that if Windows is the guest OS.

So basically it is setup like this:

XenServerGeneric LUN, which has GuestOS1 vDisk, GuestOS2 vDisk, GuestOS3 vDisk. Both hosts, XenServer1 and XenServer2 can access the XenServerGeneric LUN and read/write to vDisks on it.

[Edit: Just remoted in to the XenServer farm in question and XenCenter calls it specifically an iSCSI SR (Storage Repository) and under type it says LVM over iSCSI. But this is presented from our SAN as just a standard iSCSI LUN, so I am assuming LVM is the file system XenServer uses for that. I think locally they use ext3.]

Ah, so it's smart enough to let multiple hosts access the LUN but not screw with each other's VMs! Snazzy, and I appreciate your confirming it! It's been a few years and most of what I do is in the VMware world.
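
For reference, creating that kind of shared LVM-over-iSCSI SR from the XenServer CLI looks roughly like this (the target address, IQN, and SCSI ID are placeholders):

code:
xe sr-create name-label="shared iSCSI SR" shared=true type=lvmoiscsi \
  device-config:target=192.168.10.50 \
  device-config:targetIQN=iqn.2011-06.com.example:storage.lun0 \
  device-config:SCSIid=360a98000503356734b5d4f6e4a2f4d31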

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

adorai posted:

for the second time in two years, NetApp has told us that we have to completely power down our HA pair to clear an error.

Thanks alot assholes.

Just curious; what error was it?

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

marketingman posted:

Even more simply, when I'm doing a requirements gathering exercise I am without fail told "oh don't worry about those data sets, they aren't important" and my response is the same every single time:

"Not important? Then why are you storing it?"

Well to be fair there are plenty of use cases for transient storage that are not worth spending a lot of money to protect.

A good example might be a bunch of VMs in VMware lab manager/vCloud director. These things all get spun off of base templates that might actually be backed up but the storage you need is mostly a scratch space because you need a place to store this temporary data to run tests.

The test results and source code of course are safely stored/backed up/protected but the actual VMs hold little to no value. If I lost 100 VMs because my cheap storage fell over then I can get them back in a day's worth of work. Unless that day's work of lost productivity costs me more than buying a bigger/better storage array then it doesn't make sense.

That said I ask that question too and hope to get some sort of intelligent answer back. Some people can articulate the above and others can't. If they can't I tend to err on the side of caution.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

somecallmetim posted:

Anyone here have any experience with Dot Hill? We have about a 20k budget for this project and have looked at EMC's offering already (VNXe3300). No room in the budget for a second SAN right now, but we do have monies for a DR project later on this year. I might be able to cut out some of the software off of the quote, but I hear it is more expensive later on.

We are looking at using it for a small SAP upgrade running on VMWare. As of now we are using VMWare (Vsphere 4.1) using local storage.

Dot Hill... Now that's a name I haven't heard in a long time. Mostly because they typically OEM to folks like Sun and HP. The products are okay but won't have a lot of frills.

That said, you can probably get into NetApp or EMC in the same price range. You'll want to make sure you consider the replication technologies available with each vendor if you're planning to do that. Also any recovery tools you might be using (such as VMware Site Recovery Manager, which also offers host-based replication.)

I'm personally a huge advocate of SnapMirror from NetApp and RecoverPoint from EMC. I've worked with both products extensively for the last few years and have had nothing but glowing feedback and successes.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

Internet Explorer posted:

For a small business, the difference in administering an Equallogic SAN or an EMC/NetApp SAN is like night and day. The only time I ever had to go into command line on our Equallogic boxes was for one of the very first firmware upgrades and the single and only time I had to call in for support. In over 5 years I went into command line twice and called in for support once.

One of my old clients is a small/medium-sized business running Exchange on a NetApp and I don't think they've ever touched the CLI once. They do, however, love SnapManager for Exchange and Single Mailbox Restore for Exchange. NetApp has been fairly easy to manage even for non-storage people at least since ONTAP 7. If you can figure out how to run a Windows server you can figure out how to manage a NetApp filer.

The other plus with NetApp is I find that it's harder for companies without dedicated storage people to back themselves into a corner with their configuration.

Also FlexClone is awesome.

Not being confrontational either; I just thought I'd share my experiences. NetApp's third-party tie-in tools are fantastic things to have and usually worth the extra cost.

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!
Madsushi is absolutely right for most low-end to midrange (and some higher-end) arrays, though some storage arrays can support every disk being accessed/written to by every controller at the same time. The EMC VMAX, for example, is just such an array. I could have 8 controllers in it and access the same LUN from all 8 controllers without having to go through any cluster interconnects or resort to "LUN trespassing."

I think it's more accurate to say that a NetApp or VNX array would be active/passive and passive/active, i.e. you'll expose half your storage resources from one controller node and the other half through the other controller node. Then you turn on ALUA to sort everything out for the hosts connecting.

Here's a blogpost I dug up on the subject (not mine): http://virtualeverything.wordpress.com/2011/04/25/vmax-on-a-clariion-planet-part1/

1000101 fucked around with this message at 22:32 on Aug 12, 2012

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

BnT posted:

In general I agree with you, but one of the major issues with non-dedicated iSCSI switches is you then have to consider and design for STP convergence. A convergence can have a nasty impact on iSCSI traffic, even if the failed switch doesn't touch your iSCSI traffic.

You can avoid this by turning on 'portfast' (assuming Cisco switches) so the iSCSI initiator/target ports don't trigger a TCN when they flap.

You can also opt to use rapid spanning tree (also with portfast), or just use a VLAN that doesn't have loops in the topology and shut off spanning tree for that specific VLAN.

Upping your SCSI timeout to account for STP convergence is a good idea as well.

quote:

The reason FC is considered more stable is because it uses its own switches and people don't usually gently caress with it and gently caress it up.

FC also scales better as your SAN gets larger (more initiators and targets.) I would hate to have to manage 1000 hosts on my SAN using iSCSI, but it's not uncommon to see that many initiators on one SAN in larger datacenters. FC load balancing is also generally better overall (at least until more of these datacenter fabrics become commonplace.) Four 8-gig FC links used as an ISL can generally be treated as a logical 32-gig link, whereas four 10-gig links may not balance traffic perfectly among all members of that bundle, so it's possible not to see even half of 40 gigabit.

Also FC switches are aware of all the other switches in the topology and where everyone is plugged in. This means you don't need to worry about things like loop prevention since it's inherent to the protocol.

Note when I say "SAN" I'm not just referring to the storage but the network, servers and the storage.

VMware MPIO (and EMC PowerPath/VE) sorts out the multi-path issues on the host side, but once you have to get traffic to another switch you're beholden to all the rules of Ethernet and the baggage it carries.

edit: your point about "people don't tend to gently caress it up" holds true as well since most FC deployments have two separate fabrics and it's not uncommon to change one and wait 24 hours before changing the other. So if you do gently caress it up you're generally okay as long as you follow the process.

1000101 fucked around with this message at 06:12 on Sep 11, 2012

1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

NippleFloss posted:

The question is how many thousands of dollars that fifth or sixth 9 of uptime is worth, and that is a business decision, not a technical one.

Or when we factor reality in, that 3rd or 4th nine!


1000101
May 14, 2003

BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY BIRTHDAY FRUITCAKE!

evil_bunnY posted:


code:
Nex-One# sho vpc
Legend:
(*) - local vPC is down, forwarding via vPC peer-link

vPC domain id : 1
Peer status : peer adjacency formed ok
vPC keep-alive status : peer is alive
Configuration consistency status: success
Per-vlan consistency status : success
Type-2 consistency status : success
vPC role : primary
Number of vPCs configured : 7
Peer Gateway : Disabled
Dual-active excluded VLANs : -
Graceful Consistency Check : Enabled

vPC Peer-link status
---------------------------------------------------------------------
id Port Status Active vlans
-- ---- ------ --------------------------------------------------
1 Po100 up 1,3-4,50,730-749,920

vPC status
----------------------------------------------------------------------------
id Port Status Consistency Reason Active vlans
------ ----------- ------ ----------- -------------------------- -----------
1 Po1 up success success 731,738-739
2 Po2 up success success 731,738-739
11 Po11 down* Not Consistency Check Not -
Applicable Performed
12 Po12 down* Not Consistency Check Not -
Applicable Performed
13 Po13 down* Not Consistency Check Not -
Applicable Performed
14 Po14 down* Not Consistency Check Not -
Applicable Performed
15 Po15 down* Not Consistency Check Not -
Applicable Performed

I'm looking into the encapsulation now.

What does:
show vpc consistency-parameters interface po 11

show?

Also have you shut/no shut the interfaces in recent memory?
