|
Syano posted:That post scares the hell out of me. We are virtualizing our entire network at the moment all on top of a Dell MD3200i. Now when I say our entire network I am really just talking 15ish servers, give or take. But hearing a story of a SAN go tits up even with 'redundant' options gives me nightmares.

Lots of people use great resiliency features as an excuse to be lazy and stupid with everything else. I can't even wait to see the havoc that Exchange 2010 wreaks on the most retarded of administrators.

H110Hawk posted:Redundant is another word for money spent to watch multiple things fail simultaneously.
|
# ? Sep 21, 2010 13:10 |
|
Misogynist posted:I can't even wait to see the havoc that Exchange 2010 wreaks on the most retarded of administrators.
|
# ? Sep 21, 2010 13:21 |
|
adorai posted:I hope you aren't implying that lazy admins will use replication as an excuse to not backup their data.
|
# ? Sep 21, 2010 13:31 |
|
Misogynist posted:SANs are as redundant as you design them to be. If you put your backups in the same shelf as your production data, you're very likely to be hosed over by your own ignorance at some point no matter how good your hardware is.

Well, lucky for me I spin my backups to a direct-attached disk array on my backup server. Plus each VM host actually has enough local storage to run a VM or two on its own, without the SAN, if it had to. Still scares me to death though.

EDIT: You know what is funny. Since we are actively building this environment I have been keeping my eyes glued to this thread. One of the things I saw a couple pages ago was someone mentioning how a network card could go bad. It made me stop and completely redesign my hosts to have 3 NICs minimum, all connected to different layer 2 devices.

Syano fucked around with this message at 13:45 on Sep 21, 2010 |
# ? Sep 21, 2010 13:35 |
|
Syano posted:It made me stop and completely redesign my hosts to have 3 NICs minimum, all connected to different layer 2 devices.
|
# ? Sep 21, 2010 14:15 |
|
Well, what I have is 3 dual-port NICs for a total of 6 gigabit ports: 4 for iSCSI and 2 for guest network traffic. I have been keeping a close eye on network usage and so far it doesn't appear bandwidth is going to be a problem. I could be wrong though.
|
# ? Sep 21, 2010 14:32 |
|
Is there anywhere I can find, in plain English, the real-life implications, causes, and resolutions of common Brocade port errors like swFCPortTooManyRdys and swFCPortRxBadOs?
|
# ? Sep 21, 2010 15:56 |
|
Is anyone out there looking at 10GbE?
|
# ? Sep 21, 2010 21:44 |
|
Misogynist posted:Is there anywhere I can find, in plain English, the real-life implications, causes, and resolutions of common Brocade port errors like swFCPortTooManyRdys and swFCPortRxBadOs?

Brocade usually says to replace it or upgrade the firmware.
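When I'm chasing these across a pile of ports, the only thing that actually tells you anything is whether a counter is still climbing, so I diff two polls of the error counters. A rough sketch -- the `port N: name=value` layout here is my own simplification, real `porterrshow` output is columnar and varies by FOS version:

```python
# Sketch: flag ports whose error counters grew between two polls of a
# Brocade switch. Input format is assumed/simplified, not real FOS output.

def parse_counters(dump):
    """Parse lines like 'port 3: too_many_rdys=7 rx_bad_os=1912' into a dict."""
    ports = {}
    for line in dump.strip().splitlines():
        head, _, rest = line.partition(":")
        port = int(head.split()[1])
        ports[port] = {}
        for field in rest.split():
            name, _, value = field.partition("=")
            ports[port][name] = int(value)
    return ports

def deltas(before, after, threshold=0):
    """Return {port: {counter: increase}} for counters that grew past threshold."""
    out = {}
    for port, counters in after.items():
        grown = {name: val - before.get(port, {}).get(name, 0)
                 for name, val in counters.items()}
        grown = {name: d for name, d in grown.items() if d > threshold}
        if grown:
            out[port] = grown
    return out
```

A counter that keeps moving between polls is the interesting one; a big static number may be ancient history from before the last time stats were cleared.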
|
# ? Sep 21, 2010 21:55 |
|
Cultural Imperial posted:Is anyone out there looking at 10GbE?

Sure. Simple answer: most people do not need 10GbE beyond their core network infrastructure, if even there. Give us a bit of description of where and how you are considering implementing it, and you'll probably get a less general answer.
|
# ? Sep 21, 2010 22:16 |
|
Cultural Imperial posted:Is anyone out there looking at 10GbE?

We moved to 10GbE to replace our aging FC infrastructure. Any more specific questions?
|
# ? Sep 21, 2010 22:30 |
|
ghostinmyshell posted:Brocade usually says replace or upgrade the firmware

Cultural Imperial posted:Is anyone out there looking at 10GbE?

Vulture Culture fucked around with this message at 22:43 on Sep 21, 2010 |
# ? Sep 21, 2010 22:33 |
|
Cultural Imperial posted:Is anyone out there looking at 10GbE?

Already using it (in limited cases) along with 8Gbit FC.
|
# ? Sep 21, 2010 22:41 |
|
TobyObi posted:Already using it (in limited cases) along with 8Gbit FC.

Using 10GbE (over fiber though) with NetApps and Equallogic. Works just peachy. Also using 10GbE from our VMware boxes to save on the number of cables from blade enclosures.

On Exchange 2010, we are going the no-backup route. However, we'll have 3 DAG copies, including 1 lagged DAG over WAN to a remote datacenter, and we have redundant message archiving infrastructure as well. MS is not doing backups on Exchange 2010 either, and they don't even have lagged copies.
|
# ? Sep 22, 2010 02:04 |
|
oblomov posted:On Exchange 2010, we are going to go no backup route. However, we'll have 3 DAG copies, including 1 lagged DAG over wan to remote datacenter, and we have redundant message archiving infrastructure as well. MS is also not doing backups on Exchange 2010 as well, and they don't even have lagged copies.
|
# ? Sep 22, 2010 02:22 |
|
Misogynist posted:I can sort of understand not keeping backups if you're going with the lagged copies, but running without offline backups is insanely risky in the event that a sysadmin goes rogue and tries to trash the whole environment.

Well, yes, that's always a risk. We have gone through some risk/mitigation exercises and we shall see what turns out. There is still a chance we'll do backups.

Now, the more interesting problem with Exchange is that right now we are on physical boxes with local DAS storage and may be going virtual for Exchange 2010. Doing the SAN calcs for that may prove interesting. I figure with the new and improved I/O handling in 2010, we could get away with running this off a SATA SAN, either NetApp or Equallogic. We'll see how our testing goes. I am not sure that virtual will turn out to be cheaper, considering SAN costs, increased license costs (smaller virtual servers vs. larger hardware ones) and VMware costs, but we'll be doing the number crunching. Anyway, enough derailing the thread.
|
# ? Sep 22, 2010 02:34 |
|
oblomov posted:I figure with new and improved I/O handling for 2010, we could get away running this off SATA SAN, either NetApp or Equallogic. We'll see how our testing goes. I am not sure that virtuals will turn out to be cheaper, considering SAN costs, increased license costs (smaller virtual servers vs. larger hardware ones) and vmware costs, but we'll be doing the number crunching. Anyway, enough derailing the thread.

Right now we're running on FC because we had literally terabytes of spare FC capacity just sitting here, but I don't really see any compelling reason why we couldn't run on SATA with the I/O numbers we're pulling off the SAN.
|
# ? Sep 22, 2010 12:52 |
|
This customer has implemented a shiny new V-Max behind SVC for their VMware environment.

I like SVC. I like V-Max. SVC + V-Max...
|
# ? Sep 22, 2010 13:25 |
|
Reading all these posts about people aggregating 6+ ethernet ports together, I was curious if anyone had thought about using 10GbE instead.
|
# ? Sep 22, 2010 19:42 |
|
Cultural Imperial posted:Reading all these posts about people aggregating 6+ ethernet ports together, I was curious if anyone had thought about using 10GbE instead.
|
# ? Sep 22, 2010 20:19 |
|
Cultural Imperial posted:Reading all these posts about people aggregating 6+ ethernet ports together, I was curious if anyone had thought about using 10GbE instead.
|
# ? Sep 22, 2010 22:20 |
|
adorai posted:It's not just about speed, it's also about redundancy and network segregation. I am not going to run iSCSI, management, and guest networking on one link, even if it is 10GbE, because I don't want a vmotion to suck up all of my iscsi bandwidth.

http://blog.aarondelp.com/2010/09/keeping-vmotion-tiger-in-10gb-cage-part.html
http://www.mseanmcgee.com/2010/09/great-minds-think-alike-%E2%80%93-cisco-and-vmware-agree-on-sharing-vs-limiting/
http://bradhedlund.com/2010/09/15/vmware-10ge-qos-designs-cisco-ucs-nexus/

Putting aside the concerns about link redundancy, it certainly seems in the virtualization spirit to consolidate links and let the bandwidth be partitioned out as needed, letting software handle the boundaries dynamically, rather than installing dramatically underutilized hardware to enforce resource boundaries.
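The sharing argument in those links boils down to weighted fair shares under contention: busy classes split the link in proportion to their shares, and an idle class donates its slice instead of wasting it. A toy model of that (class names, share values, and the function are all made up for illustration, this isn't any vendor's API):

```python
# Toy NIOC-style partitioning: under contention each traffic class gets
# bandwidth proportional to its shares; anything a class doesn't demand
# is redistributed to the classes that are still hungry.

def allocate(link_gbps, classes):
    """classes: {name: (shares, demand_gbps)} -> {name: allocated_gbps}."""
    alloc = {name: 0.0 for name in classes}
    remaining = link_gbps
    unsatisfied = dict(classes)
    while remaining > 1e-9 and unsatisfied:
        total = sum(shares for shares, _ in unsatisfied.values())
        budget = remaining
        still_hungry = {}
        for name, (shares, demand) in unsatisfied.items():
            fair = budget * shares / total          # proportional slice
            take = min(fair, demand - alloc[name])  # don't exceed demand
            alloc[name] += take
            remaining -= take
            if alloc[name] < demand - 1e-9:
                still_hungry[name] = (shares, demand)
        if len(still_hungry) == len(unsatisfied):
            break  # nobody got satisfied this pass; link is fully carved up
        unsatisfied = still_hungry
    return alloc
```

With a 10G link, iSCSI at 50 shares, VMotion at 20, and VM traffic at 30 but only demanding 2Gbps, the VM leftover flows to the two busy classes in share proportion -- which is the "let software handle the boundaries" point, versus nailing fixed gigabit links to each role.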
|
# ? Sep 22, 2010 23:16 |
|
pelle posted:We are looking into getting new storage. We are a small vfx studio with 15-20 artists (depending on freelancers / project) and about 15 render nodes. We are hoping to double that within the coming year. We do most work in Maya, Renderman, Nuke and have over the last year done a lot of fluid simulations resulting in rather big data sets.

[throughput graph: full 973x754 image]

We're in crunch mode as you can see. At times the Isilons can easily saturate our switches. They're impressive in their performance. I suppose you pay for that.

edit: whoops, for what you're looking at getting this is a better comparison - this is our nearline (35 nodes with slower disks etc.):

[throughput graph: full 986x829 image]

We don't use it for production, as you can tell vs. the above graph, but it is the 'maximum' view of cluster throughput and CPU usage.

Djimi fucked around with this message at 00:05 on Sep 23, 2010 |
# ? Sep 22, 2010 23:32 |
|
Misogynist posted:Putting aside the concerns about link redundancy, it certainly seems in the virtualization spirit to consolidate links and let the bandwidth be partitioned out as needed, letting software handle the boundaries dynamically, rather than installing dramatically underutilized hardware to enforce resource boundaries.
|
# ? Sep 22, 2010 23:47 |
|
Misogynist posted:We ended up going fully-virtual for our Exchange 2010 environment -- it really does gently caress-all in terms of CPU utilization and with the disk I/O reductions being what they are there's really no good reason not to consolidate it into the same VM hosting environment as everything else anymore. We just made sure to set up our DRS/HA hostgroups in 4.1 to keep the servers apart from one another and set our resource pools to prioritize Exchange appropriately. I think we're using 16GB RAM per mailbox server, which means Exchange takes up about 1/3 of the memory on our ESXi environment.

Yeah, that's what we were thinking (minus the SAN). The thing is that with say 1500-2000 users per virtual mailbox server, that's a lot of VMs for DAGs in each datacenter, especially with multiple copies. We'll eat up a lot of ESX "bandwidth", and if you count SAN cost and VMware cost, the savings are not really there. MS now allows you to have multiple roles on your DAG nodes, so you can get your CPU utilization even higher. Plus, half the point is gone if not using DRS/vMotion anyway.
|
# ? Sep 23, 2010 05:10 |
|
adorai posted:I certainly agree with the idea, and as an individual system administrator I am all for combining the links, but it's hard to go against such an easy to implement best practice. However, I am very glad you posted the links, because we are rebuilding our entire VMware infrastructure over the next few weeks so we'll certainly be able to consider doing so.

Well, you would have multiple 10GbE links per server, so you should still have MPIO. Here is the thing: look at switch/datacenter/cabling costs, and 10GbE starts making sense. Our 2U VMware servers each used to have 8 cables (including the Dell DRAC) and now we have 3. It's similar with our NFS/iSCSI storage. You would be surprised how much cabling, patch panels and all that stuff costs, and how much pain in the rear it takes to run say 100 cables from a blade enclosure. We are going all 10GbE for our new VMware and storage infrastructure, and the cost analysis makes sense.
|
# ? Sep 23, 2010 05:16 |
|
oblomov posted:Well, you would have multiple 10GB links per server so you should still have MPIO. Here is the thing. Look at switch/datacenter/cabling costs, and 10GB starts making sense. Our 2U VMware servers each used to have 8 cables (including Dell drac) and now we have 3. It's similar with our NFS/iSCSI storage. You would be surprised how much cabling, patch panels and all that stuff costs and how much pain in the rear it takes to run say 100 cables from a blade enclosure.

Yeah, the ability to carve up those 10GbE pipes and deal with less cabling made it worthwhile for us (as well as being a nice chance to bail on FC).
|
# ? Sep 23, 2010 14:32 |
|
oblomov posted:Our 2U VMware servers each used to have 8 cables (including Dell drac) and now we have 3.

Is that 3 10GbE interfaces? I'm curious about how people are using 10GbE with ESX.
|
# ? Sep 24, 2010 07:34 |
|
Cultural Imperial posted:Is that 3 10GbE interfaces? I'm curious about how people are using 10GbE with esx.

Given that remote access cards only support standard Ethernet, I'm going to guess that he's running 2x 10GbE for data and 1x 100Mb Ethernet for the remote access. I've not seen anyone plan for more than 2x 10GbE into a single server.

Tell you what though, 10GbE upsets the VMware health check utility - it starts complaining that all your services are on a single switch.
|
# ? Sep 24, 2010 08:29 |
|
From the earlier discussions on snapshots I got my rear end in gear for testing out snapdrive on our 6040 NetApp. I mounted a qtree over NFS on, say, /mnt/goon and did:

touch /mnt/goon/somefile (snaps poo poo themselves if I restore an empty share)
snapdrive snap create -snapname start -fs /mnt/goon

This gets me a snap with the share almost empty. Then I created 100k empty files with some bash loops. To get rid of all the files and get a clean share:

snapdrive snap restore -snapname start

I started this when I left work and it was still running 14 hours later. Sounds a tad bit slow?
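For anyone wanting to reproduce the churn half of this, it's nothing fancy -- here's the same thing as a sketch, scaled down to 1k files. Paths are stand-ins for my setup, and the snapdrive call (same flags as above) obviously needs snapdrive on the box:

```python
# Sketch of the test: create a pile of empty files on the mount, then time
# how long the snapdrive rollback takes. MOUNT is a stand-in path; the real
# test ran against /mnt/goon with 100k files.
import os
import subprocess
import time

MOUNT = "/tmp/goon_demo"
N_FILES = 1000  # the real test used 100k

def churn(mount, n):
    """Equivalent of the bash touch loop: n empty files on the share."""
    os.makedirs(mount, exist_ok=True)
    for i in range(n):
        open(os.path.join(mount, f"file{i:06d}"), "a").close()

def timed_restore(snapname):
    """Roll back to the snapshot and report elapsed seconds."""
    start = time.monotonic()
    subprocess.run(["snapdrive", "snap", "restore", "-snapname", snapname],
                   check=True)
    return time.monotonic() - start

churn(MOUNT, N_FILES)
# elapsed = timed_restore("start")   # run this on a host with snapdrive
```

Even the scaled-down churn finishes in a second or two locally, which makes the 14-hour restore look even worse by comparison.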
|
# ? Sep 24, 2010 12:00 |
|
conntrack posted:To get rid of all the files an get a clean share:
|
# ? Sep 24, 2010 13:17 |
|
adorai posted:Is there something preventing the volume from unmounting? Snapdrive will unmount the volume before rolling it back i believe, and if there is a file open the unmount may fail.

It got unmounted OK, double-checked in the logs. The restore deleted about 60% of the files before I called it quits for taking too long and remounted it. A 1k test I ran before the larger test took about 5 minutes but completed OK.
|
# ? Sep 24, 2010 13:46 |
|
Mausi posted:Given that remote access cards only support standard ethernet, I'm going to guess that he's running 2x 10GbE for data, and 1x 100Mb ethernet for the remote access.

Any server HP sells that's larger than a DL380 has 4 10GbE links. I've got a few customers using all four, though probably not to saturation. It's more for fault tolerance. We generally present each 10 gig link as 4x 2.5 gig links to VMware.
|
# ? Sep 24, 2010 16:55 |
|
Nomex posted:Any server HP sells that's larger than a DL380 has 4 10GbE links. I've got a few customers using all four, though probably not to saturation. It's more for fault tolerance. We generally present each 10 gig link as 4 2.5 gig links to VMWare.

I'm curious, I've never worked with 10GbE - do you run into queue depth issues like this? I'm assuming it would split the queue to appropriate levels?
|
# ? Sep 24, 2010 18:02 |
|
adorai posted:I certainly agree with the idea, and as an individual system administrator I am all for combining the links, but it's hard to go against such an easy to implement best practice. However, I am very glad you posted the links, because we are rebuilding our entire VMware infrastructure over the next few weeks so we'll certainly be able to consider doing so.

One thing you can do is set up your vSwitch uplinks as active/standby so your VMs and VMotion traffic all hang out on one 10 gig link while iSCSI uses the other. The only point at which they share is going to be when a NIC/path fails.

I think with 10GigE that best practice will gradually phase out and be replaced with logical separation and traffic shaping. It takes a LOT of work to saturate a 10 gig link.
|
# ? Sep 24, 2010 18:44 |
|
conntrack posted:It got unmounted ok, double checked in the logs. The restore deleted about 60% of the files before i called it quits for taking to long and remounted it.

What version of ONTAP and snapdrive are you using?

edit: Try your test again but with the following snapdrive restore command:

snapdrive snap restore /mnt/goon -snapname goon_snapshot -vbsr preview

https://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb47140

The default behaviour for snapdrive restores is single-file snap restore. The -vbsr flag tells snapdrive to do a volume-based snap restore.

namaste friends fucked around with this message at 23:18 on Sep 24, 2010 |
# ? Sep 24, 2010 22:59 |
|
1000101 posted:I think with 10gigE that best practice will gradually phase out and be replaced with logical separation and traffic shaping. It takes a LOT of work to saturate a 10 gig link.
|
# ? Sep 26, 2010 22:34 |
|
Nebulis01 posted:I'm curious, I've never worked with 10GE but do you run in the queue depth issues like this? I'm assuming it would split the queue to appropriate levels?

I haven't had any problems.
|
# ? Sep 29, 2010 14:50 |
|
We're seeing around 45MBps per NIC to our iSCSI Equallogic SAN from our database server. We had a consultant come in and check out one of our products, and he is stating that ~45MBps is around the maximum of 1Gbps iSCSI connections, due to overhead of the protocol. How true is this? How much better is 10GbE in this regard? I imagine it's not literally 10x. He was really pushing FC.
|
# ? Oct 12, 2010 00:28 |
|
three posted:We're seeing around 45MBps per NIC to our iSCSI Equallogic SAN from our database server. We had a consultant come in and check out one of our products, and he is stating that ~45MBps is around the maximum of 1Gbps iSCSI connections, due to overhead of the protocol. How true is this?

In terms of 10 gigabit: I imagine this is the area where you will find performance relative to wire speed is very, very heavily vendor-dependent (and obviously heavily dependent on your backend spindles and layout as well). I'm under the impression that iSCSI at 10 gigabit is still rather CPU-intensive, and this is one circumstance where you might want to consider one of the new Broadcom iSCSI 10GbE HBAs. Then again, if you need 10 gigabits of iSCSI throughput on a single box, you probably have somewhere in the vicinity of 32 cores anyway.

The real sell of 10 gigabit for VMware environments at this point is reduced cabling -- use two 10 gigabit uplinks as your network connections to each VMware box and use the 1000V's QoS features to sort out and prioritize management, VMotion/FT, and VM traffic. It's still pretty rare to see a system that will suck down all that iSCSI bandwidth. Remember that most disk throughput is measured in IOPS, not gigabits.

Vulture Culture fucked around with this message at 00:59 on Oct 12, 2010 |
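As for the consultant's claim that ~45MBps is the protocol ceiling of gigabit iSCSI: the framing math doesn't support that. A back-of-envelope with standard 1500-byte frames, pessimistically charging a full 48-byte iSCSI header against every TCP segment:

```python
# Back-of-envelope: protocol ceiling of iSCSI over 1Gbps Ethernet with a
# standard 1500-byte MTU. Worst-case assumption: every TCP segment carries
# a full 48-byte iSCSI basic header (real PDUs span many segments, so the
# true overhead is lower and the ceiling slightly higher).
LINK_BPS = 1_000_000_000
MTU = 1500
ETH_OVERHEAD = 8 + 14 + 4 + 12   # preamble + header + FCS + inter-frame gap
IP_TCP = 20 + 20                 # IPv4 + TCP, no options
ISCSI_BHS = 48                   # iSCSI basic header segment

payload = MTU - IP_TCP - ISCSI_BHS        # SCSI data bytes per frame: 1412
efficiency = payload / (MTU + ETH_OVERHEAD)
max_mbytes_per_sec = LINK_BPS / 8 * efficiency / 1e6

print(round(efficiency, 3), round(max_mbytes_per_sec, 1))  # -> 0.918 114.8
```

So even worst-case framing leaves a ceiling around 115MB/s per gigabit link, before jumbo frames. If you're capped at 45MBps per NIC, the bottleneck is somewhere else in the path (queue depth, array layout, host config), not iSCSI overhead.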
# ? Oct 12, 2010 00:48 |