|
chutwig posted:I got stuck with trying to sort out the FCoE plans of somebody who had left my last company. He had armed all the servers with Intel X520-DA2s. VMware and bare-metal Linux both used software FCoE initiators with these cards and were completely unable to keep up with even modest traffic, so VMs would constantly kernel panic when FCoE on the hypervisor poo poo the bed and all their backing datastores dropped off. I ordered a couple of Emulex OneConnect cards that obviated the need for the software FCoE crap and they worked really well, right up until one of the cards crashed due to faulty firmware. At that point I'd had enough of trying to salvage the FCoE poo poo, sent everything back and bought a pair of MDSes and some Qlogic FC HBAs, and never thought about my SAN crashing again.
|
# ? Jul 3, 2015 20:32 |
|
Vulture Culture posted:I see "DevOps" being used to mean "tool engineer," and boy oh boy, everyone's sick of hearing me go down this road again. Fencing is a quadrilateral and STONITH is a rhombus. Fencing is good. Self-fencing is good. STONITH is not a synonym for either of these things. Yeah, I'm not gonna invest tons of man-hours attempting to correct a once-in-a-few-weeks issue that is solved by rebooting, if it's not impacting my customers at all.
|
# ? Jul 3, 2015 20:37 |
|
Ahdinko posted:Yes thank you VMware, I was curious about the throughput and performance of NSX, it is good to know that Bridge gets 1 more receive than VXLAN, this chart has been very useful for me. A few years ago the performance team at VMware did a comparison of PCoIP vs. Blast (which is really just beefed-up VNC now) and they did this same type of thing: they made everything a unitless number. We asked for actual FPS information and they wouldn't give us those numbers, not even internally. They don't like to publish real numbers with performance results, because then they will be held to a standard of meeting those numbers. It is all marketing fluff.
|
# ? Jul 3, 2015 20:38 |
|
I've got a legacy application that needs XP. It's hard to find XP keys now, so I was thinking of using nested virtualization: ESXi -> Win7 -> XP. I don't know if that'll work. I also worry that if I convert the XP Mode machine into a VMware VM it'll lose its licensing. What's the right way to do this?
|
# ? Jul 3, 2015 21:41 |
|
Nested Virt is supported, so it should...
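If it helps, exposing hardware virtualization to the guest is a one-line setting. A minimal sketch, assuming vSphere 5.1 or later (on 5.0 it was a host-wide vhv.allow entry in /etc/vmware/config instead):

    # In the Windows 7 VM's .vmx file, edited while the VM is powered off.
    # Passes VT-x/AMD-V through to the guest so XP Mode's Virtual PC can use it.
    vhv.enable = "TRUE"

Same effect as ticking "Expose hardware assisted virtualization to the guest OS" under the VM's CPU settings in the web client.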
|
# ? Jul 3, 2015 23:10 |
|
Ahdinko posted:Yes thank you VMware, I was curious about the throughput and performance of NSX, it is good to know that Bridge gets 1 more receive than VXLAN, this chart has been very useful for me. That should be in gigabits per second. Essentially, if a hardware bridge can do ~19 Gbps (split between transmit and receive) on a 10GbE port, then it can do only slightly less using VXLAN.
|
# ? Jul 4, 2015 00:16 |
|
Our new CommVault Snap Backup system went haywire over the weekend when some datastores ran out of space, and actually managed to hit the maximum VM snapshot hierarchy depth in vSphere.
|
# ? Jul 6, 2015 05:51 |
|
Is LACP usage still only officially supported with the vSphere Enterprise license? It's been a while since I fudged around with it, but I recall that the free (and lower) licensed tiers never really worked "right" or something. Or maybe I'm just retarded. Wicaeed fucked around with this message at 17:04 on Jul 7, 2015 |
# ? Jul 7, 2015 16:54 |
|
Wicaeed posted:Is LACP usage still only officially supported with the vSphere Enterprise license? You need the Enterprise Plus license.
|
# ? Jul 7, 2015 17:12 |
|
Wicaeed posted:Is LACP usage still only officially supported with the vSphere Enterprise license? Real 802.3ad LACP link aggregation only works if you're using distributed vSwitches, which are only available on Enterprise Plus. Most people don't need true LACP link aggregation anyway; the standard vSwitch active/active NIC teaming is simpler to configure. I'm not a networking expert, but I think the only benefit LACP gives you is better load balancing on the trunk, since it's considered one logical "link" and you let the magic of the protocol balance across the physical links for you.
|
# ? Jul 7, 2015 17:14 |
|
Cidrick posted:Real 802.3ad LACP link aggregation only works if you're using distributed vSwitches, which are only available on Enterprise Plus. The LACP specification doesn't define any hashing or traffic distribution algorithms. These are entirely vendor-dependent on both the switch and operating system sides, and you still need to configure them to distribute traffic using a method that's appropriate for your environment (MAC hash, IP hash, etc.). As a result, it only very rarely gives you better load balancing on an aggregate. What LACP does give you is signaling: each side of the pair can be alerted when a physical port is going down and make topology changes automatically, rather than having to detect (often incorrectly) the down port on the other end. The other real benefit of LACP is that you don't end up with configuration mismatches between the two sides -- for example, a different number of ports in the aggregate, or a condition where the aggregate isn't brought up correctly on the other end of the link.
|
# ? Jul 7, 2015 18:20 |
|
LACP adds some complexity but it may give you better link utilization. Just make sure you're still connecting to 2 physical switches and that they support MLAG/vPC/VSS (or whatever your switch vendor provides) so you can have a switch die and not take down your ESXi hosts. If you're still running on Cisco Catalyst without VSS, then just avoid LACP and load balance on originating port ID, or use load-based teaming.

quote:The LACP specification doesn't define any hashing or traffic distribution algorithms. These are entirely vendor-dependent on both the switch and operating system sides, and you still need to configure them to distribute traffic using a method that's appropriate for your environment (MAC hash, IP hash, etc.). As a result, it only very rarely gives you better load balancing on an aggregate. What LACP does give you is signaling: each side of the pair can be alerted when a physical port is going down and make topology changes automatically, rather than having to detect (often incorrectly) the down port on the other end.

While this is all technically true (the best kind of true), there are good odds your dvSwitch and the upstream switch can now agree on a pretty specific hash that'll most likely give you pretty solid distribution of flows across the bundle's member links. As of 5.5 you can load balance based on any of these hashes (which are supported by every Cisco switch I've touched):

- Destination IP address and TCP/UDP port
- Destination IP address, TCP/UDP port and VLAN
- Source IP address and TCP/UDP port
- Source IP address, TCP/UDP port and VLAN
- Source and destination IP address and TCP/UDP port
- Source and destination IP address, TCP/UDP port and VLAN

I pruned out everything but the most specific stuff. It's come a long way from src-dst-ip. That said, if I'm using 10GbE I still don't bother with LACP, since it's pretty hard to make even an ESXi host chew up a pair of 10 gig interfaces long enough for anyone to notice. 1000101 fucked around with this message at 18:30 on Jul 7, 2015 |
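To make that concrete, here's a hedged sketch of the switch-side counterpart on a Catalyst-style IOS box (interface names, VLANs, and the hash choice are placeholders for illustration; check which load-balance methods your platform actually supports):

    ! LACP bundle facing the ESXi host's vDS LAG
    interface Port-channel10
     description ESXi-host-LAG
     switchport mode trunk
     switchport trunk allowed vlan 100,200
    !
    interface range TenGigabitEthernet1/0/1 - 2
     switchport mode trunk
     switchport trunk allowed vlan 100,200
     ! "mode active" = LACP; "mode on" would be a static bundle
     channel-group 10 mode active
    !
    ! Global hash setting; pick one that complements the hash on the vDS LAG
    port-channel load-balance src-dst-ip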
# ? Jul 7, 2015 18:24 |
|
Vulture Culture posted:I'm going to switch to the term "aggregate" since "trunk" is overloaded and has other specific meanings in a VLAN context. Yeah, our Jr. networking guy keeps doing this; it's actually a bitch to troubleshoot, since it doesn't always show up until you vMotion a VM from one host to another when it's the only VM on the VLAN, or something stupid. Ugh.
|
# ? Jul 7, 2015 19:03 |
|
Wait so is the general rule on the Cisco-switch side to not use port-channels?
|
# ? Jul 7, 2015 19:06 |
|
PernixData is making a free version of their RAM/SSD acceleration software for VMware hosts. http://www.pernixdata.com/declare-your-storage-independence It only does read acceleration and support is through the community forum, but the read caching is distributed and fault-tolerant. The paid enterprise version gets you write caching and some other stuff. It operates on a block level instead of per-VMDK like VMware's SSD acceleration does, so you basically build a big collective pool of redundant SSD (or ramdisk) between all your hosts, and the software takes care of caching hot blocks and intercepting disk requests so they are satisfied either by local SSD or SSD on a peer host instead of going back to your NAS.
|
# ? Jul 7, 2015 20:10 |
|
BangersInMyKnickers posted:PernixData is making a free version of their RAM/SSD acceleration software for VMware hosts. Lol, it's SAP HANA for your entire infrastructure. Sounds entirely unscalable, and clicking through their website I still can't find hard numbers on performance. I can see a potential use case of squeezing all possible performance out of a small SMB setup, but beyond that the gains are doubtful, especially since they talk about caching between host DAS.
|
# ? Jul 7, 2015 20:37 |
|
One of the major hosting providers in the area runs it for all their production VDI. For bursty workloads it seems to hold up fine, for what they let me touch: hundreds of sessions running with zero measurable storage latency, backed against the shittiest 12-spindle SATA Dell storage appliance they could buy. What makes you think it isn't scalable? The redundancy is against two hosts (or more, it's configurable) in the pool, not every single host. And I can't say I'm surprised they don't publish performance numbers; it's completely based on how much money you are willing to throw at your SSD, be it SATA, SAS, PCIe, or even a PCIe ramdisk.
|
# ? Jul 7, 2015 20:59 |
|
Lol come on, I've literally never heard of PernixData until now, and their model of distributed cache over DAS on multiple hosts is entirely unscalable, as your interconnect requirements will scale linearly with workload and exponentially with the number of hosts. Also, you can throw as much money as you want at storage, but the bottleneck is always going to be interconnect, and hey, you can only install so many 10Gb quad-port adapters in a host.
|
# ? Jul 7, 2015 21:21 |
|
BangersInMyKnickers posted:One of the major hosting providers in the area runs it for all their production VDI. This is one of the niches that local RAM/SSD-based caching tends to play best in, since on VDI you're normally using internal disks or lovely DAS, and host memory is cheap. As soon as you're running on a SAN, just let the SAN take care of caching for you. -edit- See also CBRC, Infinio, ILIO. ragzilla fucked around with this message at 21:26 on Jul 7, 2015 |
# ? Jul 7, 2015 21:22 |
|
cheese-cube posted:Lol come on, I've literally never heard of PernixData until now, and their model of distributed cache over DAS on multiple hosts is entirely unscalable, as your interconnect requirements will scale linearly with workload and exponentially with the number of hosts. Also, you can throw as much money as you want at storage, but the bottleneck is always going to be interconnect, and hey, you can only install so many 10Gb quad-port adapters in a host.

Yeah, you're still misunderstanding how it works under the hood. If it's servicing a read request, it will attempt to pull out of the local cache, so ideally you're at PCIe bus speed and latency and the network interface doesn't get touched. If the data isn't in the local host's cache, then it looks for it on an adjacent host which may have populated those blocks and pulls it host-to-host over the SAN, so again you're talking microsecond latencies without touching the uplink to your storage appliance. If neither of those is available, then the request gets processed normally. I don't know how much money they are throwing around where you are for storage, but flash pool/cache expansion on our NetApp is an expensive proposition.

On write, it commits to the local cache disk and (by default) one other host's cache, and those systems become authoritative for those blocks to the cluster while they are waiting to commit back to the SAN. If one of the hosts up and dies before the write cache has committed back to the SAN, then the other takes over for it.

Again, I'm not seeing how there is a scaling issue, especially on storage fabric bandwidth. In the environment I got to see, they were satisfying 70% of read requests directly from the PCIe bus, which cut down on SAN traffic considerably, and the host-to-host traffic was going over LAG interconnects on the storage switches that were barely being used anyhow.
|
# ? Jul 7, 2015 21:45 |
|
ragzilla posted:This is one of the niches that local RAM/SSD-based caching tends to play best in, since on VDI you're normally using internal disks or lovely DAS, and host memory is cheap. As soon as you're running on a SAN, just let the SAN take care of caching for you. It would have been incredibly useful for me about 2 years ago, when management kept loving me over for storage funding and I was stuck on year 8 of a FAS3020c that was falling over with 100ms storage latency every hour or so from bursty workloads. $10k for some PCIe cards is a drat sight easier to swallow than $200k to rip and replace end-of-life units, and I doubt I am the only person in the world who found themselves in that boat. It could also be helpful in a situation where you need to normalize storage latency because you're forced to run VDI and virtual servers against the same storage appliance.
|
# ? Jul 7, 2015 21:50 |
|
BangersInMyKnickers posted:Yeah, you're still misunderstanding how it works under the hood. If it's servicing a read request, it will attempt to pull out of the local cache, so ideally you're at PCIe bus speed and latency and the network interface doesn't get touched. If the data isn't in the local host's cache, then it looks for it on an adjacent host which may have populated those blocks and pulls it host-to-host over the SAN, so again you're talking microsecond latencies without touching the uplink to your storage appliance. If neither of those is available, then the request gets processed normally. I don't know how much money they are throwing around where you are for storage, but flash pool/cache expansion on our NetApp is an expensive proposition.

You're assuming an entirely unsaturated interconnect, and you're assuming the best-case scenario for cache distribution, wherein the nearest neighbour has the data. Just to clarify, I'm not directly making GBS threads on this system; it can and will work for small, contained workloads spread statically across a number of hosts. However, my contention is that this system is not scalable. This again comes back to my point about interconnect, which drives home a simple truth: cache ceases to be useful when its access time exceeds that of the targeted data. Scaling out a system where an intermediate storage-layer cache is bound to DAS, with interconnect demands that increase exponentially with the number of compute nodes added to said system, is insane!
|
# ? Jul 7, 2015 21:58 |
|
cheese-cube posted:Lol come on, I've literally never heard of PernixData until now and their model of distributed cache over DAS on multiple hosts is entirely unscalable as your interconnect requirements will scale linearly with workload and exponentially with the number of hosts. Also you can throw as much money you want at storage but the bottleneck is always going to be interconnect and hey you can only install so many 10Gb quad-port adapters in a host They certainly send me enough marketing emails. I'm sure they'd be happy to add you to their list.
|
# ? Jul 7, 2015 22:09 |
|
cheese-cube posted:You're assuming an entirely unsaturated interconnect, and you're assuming the best-case scenario for cache distribution, wherein the nearest neighbour has the data. Just to clarify, I'm not directly making GBS threads on this system; it can and will work for small, contained workloads spread statically across a number of hosts. However, my contention is that this system is not scalable. This again comes back to my point about interconnect, which drives home a simple truth: cache ceases to be useful when its access time exceeds that of the targeted data. Scaling out a system where an intermediate storage-layer cache is bound to DAS, with interconnect demands that increase exponentially with the number of compute nodes added to said system, is insane!

The vMotion network is the interconnect. Most shops can't saturate a single 10GbE link with storage IO, so if you've got two per host for vMotion you're never going to saturate it with cache IO. VMware clusters aren't really hugely scalable either. They can get very big, sure, but the largest container for a single workload is a single host, so there isn't a ton of benefit to building very large clusters. Also, a bunch of systems currently work this way, like XtremIO, ScaleIO, VSAN, SimpliVity, and Nutanix. Distributed caching is pretty common, and there are more than enough fast, low-latency interconnect options out there to make it feasible at all but the most massive scales, and those tend to use sharding or some other form of partitioning. YOLOsubmarine fucked around with this message at 23:13 on Jul 7, 2015 |
# ? Jul 7, 2015 23:07 |
|
cheese-cube posted:Wait so is the general rule on the Cisco-switch side to not use port-channels? It depends! I've done it both ways depending on what my virtualization hosts are running, what workloads are running on them and what my upstream switches are doing. 10GbE is a lot of bandwidth. If I'm on 1 gig I might consider it if my upstream switches support some form of MLAG.
|
# ? Jul 8, 2015 00:35 |
|
Erwin posted:They certainly send me enough marketing emails. I'm sure they'd be happy to add you to their list. A couple of the popular VMware bloggers went over to Pernix a while ago too.
|
# ? Jul 8, 2015 01:06 |
|
I'm having trouble getting nested ESXi hosts to work in two networks, one being the labeled VM network (with management port), and the second being a labeled iSCSI network on a different vSwitch. Here is a screenshot of the current setup: 10.10.10.10 is the physical/top-level host, and the screenshot shows the networking setup. vSwitch0 has the labeled VM network and the management port and uses both physical NICs in teaming. vSwitch1 has the labeled iSCSI network, also has a port in the iSCSI subnet, and uses no physical NICs (because they aren't needed, as all the networking should stay within the vSwitches).

The 4 VMs are two nested ESXi hosts, one vCenter VM, and one FreeNAS with iSCSI. Each VM has two NICs, one in the labeled VM network and one in the labeled iSCSI network. The vCenter VM doesn't really need to be in the iSCSI network, but I have one NIC in there for troubleshooting and pinging other VMs. The vCenter VM can ping the FreeNAS and vice versa.

The problem is that I don't know how to set up the nested ESXi host networking. Here is a screenshot of how I envisioned it should look (.13 is esxi1 and .14 is esxi2): Like I said, the vCenter and FreeNAS VMs can ping each other on either subnet, so how come I'm having problems with the ESXi VMs?

edit: I can ping the management port on the nested ESXi VMs, just not the iSCSI port.

edit2: Solved. Apparently I need to enable promiscuous mode on the physical host's vSwitches, though I don't really know why. I saw it in this article: http://www.vladan.fr/nested-esxi-6-in-a-lab/ and it appears to have solved my networking woes. If someone wants to explain why this is required, I'm all ears. kiwid fucked around with this message at 03:33 on Jul 9, 2015 |
# ? Jul 9, 2015 01:56 |
|
Enable MAC change, forged transmits, and promiscuous mode for any vSwitch or port group that a nested ESXi host is using. That should sort you out. (Make sure you do all 3, or your guests may not have connectivity.)

edit: It happens because the nested ESXi server's vNIC is going to need access to traffic that isn't destined for the MAC address in its .vmx file. Whenever you create a vmk (say, for management or NFS), it's going to be another virtual interface that gets its own MAC, which is going to differ from the underlying "physical" MAC. 1000101 fucked around with this message at 04:23 on Jul 9, 2015 |
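If you'd rather script it than click through the UI, here's a sketch from the ESXi shell for a standard vSwitch (the vSwitch name is a placeholder; dvSwitch port groups get the equivalent toggles in the web client):

    # On the physical host, loosen the security policy on the vSwitch
    # carrying the nested ESXi VMs:
    esxcli network vswitch standard policy security set \
        --vswitch-name=vSwitch1 \
        --allow-promiscuous=true \
        --allow-mac-change=true \
        --allow-forged-transmits=true

    # Confirm the change took:
    esxcli network vswitch standard policy security get --vswitch-name=vSwitch1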
# ? Jul 9, 2015 04:21 |
|
What are you guys' opinions on Nutanix?
|
# ? Jul 11, 2015 02:33 |
|
I've got a VCSA 6 server at home that I'm deploying some test VMs on. I need to get access to this from work for reasons, just so I can tweak VMs, etc. I've forwarded https://mydynds.url.com:44481 to https://vcsa.localdomain:443 and while the initial connection works fine (I get the screen prompting me to click here to access the Web UI), once I click through it tries to redirect me to https://vcsa.localdomain:443/something/something, which I obviously won't be able to resolve externally. Is there any way to tell VCSA to not force a URL redirect and just use the URL I came in on? Yes, I realize that this is ridiculously insecure, but I don't really have the cycles to set up a VPN home right now and I was hoping this would be a quick workaround. This isn't a long-term solution, but I'd like to get this working today, if it's at all possible. So far my googling hasn't come up with much.
|
# ? Jul 13, 2015 17:09 |
|
Martytoof posted:I've got a VCSA6 server at home that I'm deploying some test VMs on. What happens if you just hit the IP address instead?
|
# ? Jul 13, 2015 17:17 |
|
Martytoof posted:I've got a VCSA6 server at home that I'm deploying some test VMs on. Set up a rewriting (rewriting probably optional) reverse proxy
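Roughly this shape, as a hedged nginx sketch (hostnames/ports lifted from your post; cert paths are placeholders, and anything with absolute URLs baked into the page body may still need sub_filter on top of the header rewrite):

    server {
        listen 44481 ssl;
        server_name mydynds.url.com;
        ssl_certificate     /etc/nginx/ssl/proxy.crt;  # placeholder
        ssl_certificate_key /etc/nginx/ssl/proxy.key;  # placeholder

        location / {
            # hand everything to the VCSA
            proxy_pass https://vcsa.localdomain:443;
            proxy_set_header Host vcsa.localdomain;
            # rewrite the Location: headers that cause the external redirect
            proxy_redirect https://vcsa.localdomain:443/ https://mydynds.url.com:44481/;
        }
    }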
|
# ? Jul 13, 2015 17:22 |
|
kiwid posted:What happens if you just hit the IP address instead? Still tries to redirect to https://vcsa.localdomain:443 :[ evol262 posted:Set up a rewriting (rewriting probably optional) reverse proxy To be honest, if I have to set up a proxy I'll just go to the trouble of setting up the VPN instead; I was hoping there was some XML thing I could change to "yes" in one of the arcane configs on the VCSA to prevent a redirect, but it looks like it wasn't meant to be :\
|
# ? Jul 13, 2015 17:22 |
|
Looks like I actually tracked down the NTFS corruption issue I mentioned a few months ago. It was a VMware bug on 6.0, tracked down via error messages in my logs:

Creating cbt node 7cb4c6-cbt failed with error Cannot allocate memory (0xbad0014, Out of memory).
Could not attach vmkernel change tracker: ESXi tracking filter failed (0x143c). Disk will be opened, but change tracking info will be invalidated.

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2114076

Those errors occurred at the exact minute that the first NTFS corruption showed up. Looks like CBT (Changed Block Tracking) was borked in the initial releases of 6.0. So if I had sVMotioned these on the 5.1 hosts, I would have had no problems. It was 6.0 that caused it all along.
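If anyone else hits this before patching, the usual workaround is resetting CBT on the affected VMs. A hedged PowerCLI sketch (the VM name is a placeholder; a CBT change only takes effect across a snapshot create/delete or a power cycle, hence the throwaway snapshots):

    # Turn Changed Block Tracking off on the VM
    $vm = Get-VM -Name "AffectedVM"
    $spec = New-Object VMware.Vim.VirtualMachineConfigSpec
    $spec.ChangeTrackingEnabled = $false
    $vm.ExtensionData.ReconfigVM($spec)

    # Cycle a snapshot so the change sticks, then turn CBT back on and repeat
    New-Snapshot -VM $vm -Name "cbt-reset" | Remove-Snapshot -Confirm:$false
    $spec.ChangeTrackingEnabled = $true
    $vm.ExtensionData.ReconfigVM($spec)
    New-Snapshot -VM $vm -Name "cbt-reset2" | Remove-Snapshot -Confirm:$false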
|
# ? Jul 13, 2015 21:50 |
|
Is it recommended that we have at least one physical domain controller? Also, do you guys virtualize vCenter or does that sit on a physical box as well?
|
# ? Jul 14, 2015 15:19 |
|
kiwid posted:Is it recommended that we have at least one physical domain controller? All DCs and vCenter virtual. Been doing this for like 5 years with no issues. If you have a huge cluster, maybe pin your vCenter server to a host or two so if it goes down, you can find it faster.
|
# ? Jul 14, 2015 16:27 |
|
Moey posted:All DCs and vCenter virtual. Been doing this for like 5 years with no issues. Are you doing CPU reservations or putting the DC in a high-CPU-priority resource pool? I'm running a fairly small domain with one DC VM and one physical (the physical one holds all the roles), but I was considering going virtual on both once I have a second cluster at the DR site running, so I'm active-active instead of active-standby. Clock drift is always a concern, but the virtual DC is hanging at 0.9% CPU latency, so that's probably not an issue.
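(For reference, the reservation itself is a one-liner; a hedged PowerCLI sketch, where the VM name and MHz figure are placeholders:)

    # Guarantee the DC some CPU so it can't be starved under host contention
    Get-VM -Name "DC01" | Get-VMResourceConfiguration |
        Set-VMResourceConfiguration -CpuReservationMhz 1000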
|
# ? Jul 14, 2015 18:20 |
|
BangersInMyKnickers posted:Are you doing CPU reservations or putting the DC in a high-CPU-priority resource pool? I'm running a fairly small domain with one DC VM and one physical (the physical one holds all the roles), but I was considering going virtual on both once I have a second cluster at the DR site running, so I'm active-active instead of active-standby. Clock drift is always a concern, but the virtual DC is hanging at 0.9% CPU latency, so that's probably not an issue. Nope. While it wouldn't be a bad idea, I have lots of headroom resource-wise in my current and past environments, so it has not been an issue. DCs really just loaf along all day. I run AD DS, DNS, DHCP, and NAP on my DCs. 1 vCPU and 4 GB memory, Server 2012.
|
# ? Jul 14, 2015 18:46 |
|
I run (well, ran, I just changed jobs) at least one physical box for any super-critical service: DNS, LDAP, DHCP, etc. But that's because I was forced by management to use OpenStack + KVM as our virtual environment, and the entire thing would not infrequently tip over. Let me tell you how fun it is to troubleshoot an outage when you can't loving resolve hostnames and have to dig fallback credentials out of the password manager to log in because LDAP is down. On a properly licensed and configured VMware cluster, I wouldn't have any reservations.
|
# ? Jul 14, 2015 21:42 |
|
DCs in OpenStack just seem like such a bad idea. Nova doesn't even set up libvirt auth or anything. Why don't they run DNS/LDAP/DHCP/etc. in normal VMs (on KVM if they want) with some kind of HA manager instead of dumping it in OpenStack?
|
# ? Jul 14, 2015 21:53 |