|
And then not getting updates? Sounds good mate.
|
# ? Aug 30, 2012 22:34 |
|
|
# ? May 15, 2024 03:56 |
|
evil_bunnY posted:And then not getting updates? Sounds good mate. Yeah this is the real gotcha. I've never had to actually call VMware for help. I want to get us onto 5.x but my current company let their support lapse years ago and the cost to renew with back payments is hilarious. A sales rep did the numbers both ways and it's literally cheaper to buy new licenses and support contracts than to renew the ones we have. That probably changes once you have a good enough relationship with VMware, but we don't, so welp.
|
# ? Aug 30, 2012 22:52 |
|
evil_bunnY posted:And then not getting updates? Sounds good mate. This. Unless you want to run ESXi 5.1 in 2016.
|
# ? Aug 30, 2012 22:53 |
|
evil_bunnY posted:And then not getting updates? Sounds good mate. That's kind of what I thought about going the no support route, but any thoughts on Production vs Basic?
|
# ? Aug 30, 2012 23:18 |
|
FISHMANPET posted:And if it ends up being that we need two identical servers then we can buy another server, I just need to know if that's necessary or not. quote:Host CPU is incompatible with the virtual machine's requirements Erwin fucked around with this message at 23:37 on Aug 30, 2012 |
# ? Aug 30, 2012 23:24 |
|
FISHMANPET posted:That's kind of what I thought about going the no support route, but any thoughts on Production vs Basic? How OK is management with having to wait 3 days for VMware to even start looking at an issue that crops up on Friday afternoon? The marginal price increase is so negligible that I wouldn't bother downgrading.
|
# ? Aug 30, 2012 23:36 |
|
Misogynist posted:It's still marked Technical Preview as of VMworld 2012 though, so we'll probably be waiting for a 6.0 release before it's prime time unless you know something I don't. You're right, probably 6.0. Gelsinger spoke of the tick-tock methodology Intel uses, so 6.0 could be next year some time. That's not really that long. VV
|
# ? Aug 30, 2012 23:42 |
|
Anything worth seeing at the VMWorld expo?
|
# ? Aug 31, 2012 01:04 |
|
Docjowles posted:How OK is management with having to wait 3 days for VMware to even start looking at an issue that crops up on Friday afternoon? The marginal price increase is so negligible that I wouldn't bother downgrading. I once had them tell me that my issue could only be resolved by rebooting my entire vSphere cluster in the middle of the day. Thanks for nothing, fuckwits.
|
# ? Aug 31, 2012 02:27 |
|
Misogynist posted:On the flip side, even if support does look at it within the guaranteed turnaround window, you'll have figured out the issue and solved it much less disruptively than they would have by the time they get back to you with an L1 form letter response. Quality dramatically drops based on where you're routed. I have had good experiences with the Canadians reps, but the Indian reps are usually very weak and extremely slow.
|
# ? Aug 31, 2012 02:32 |
|
PARALLELS 8 RUNS WINDOWS 8 AND DIRECTX 10 As a long-time Parallels user, I just got the preview of Parallels 8. I'm blown away Let goons rejoice: Parallels 8 now supports Windows 8 from Bootcamp (which, btw, works flawlessly despite Bootcamp "supporting" only Windows 7) and DirectX 10 (again, unofficially, but it appears to work great). On a Retina MacBook Pro with Parallels 8, I'm getting ~40fps in Bioshock 2 running on a Windows 8 Consumer Preview VM from Bootcamp. Batman Arkham City is working, too (albeit in DirectX10, not in DirectX 11). I should add that the screenshot below is on an external display. This freaking amazing Geforce 650M is also running my not-pictured retina display and a second external screen. ninja edit: City, not Asylum, which I haven't tried yet Sparkyhodgo fucked around with this message at 18:07 on Sep 1, 2012 |
# ? Sep 1, 2012 17:51 |
|
CrazyLittle posted:Anything worth seeing at the VMWorld expo? Also I got caught in a straight shill for Samsung session. It was about enabling Byod for your workforce and what it really was was a Samsung guy sitting on stage telling us everything they've been doing for byod. I sat through it for 35 minutes and bailed
|
# ? Sep 1, 2012 18:26 |
|
Yeah, I'm on the ISP / provider side, and we're not really big enough to get into the whole "hosted cloud" market when every bigger player's already stuck in that race to the bottom. I didn't see a whole lot of stuff there that would help me directly other than some of the virtualization focused hardware.
|
# ? Sep 1, 2012 22:34 |
|
I just love it when ESXi becomes catatonic, but still appears up to HA. I can ping the host, but can't do poo poo: VMs aren't responding and management requests from vCenter does nothing. Is there a way to tell vCenter to drop a host that is misbehaving like this, without having to do a 2am cab ride to reboot it?
|
# ? Sep 2, 2012 11:13 |
|
I'd check out your logs and see if you can figure out what's going on. If the host is talking to the HA master and it's writing to heartbeat datastores (if you're on 5) HA's not going to know any better. I don't think even VM HA is going to do anything if the VMs wouldn't respond to a restart request anyways.
|
# ? Sep 2, 2012 12:26 |
|
Mierdaan posted:I'd check out your logs and see if you can figure out what's going on. If the host is talking to the HA master and it's writing to heartbeat datastores (if you're on 5) HA's not going to know any better. I don't think even VM HA is going to do anything if the VMs wouldn't respond to a restart request anyways. Vulture Culture fucked around with this message at 18:12 on Sep 2, 2012 |
# ? Sep 2, 2012 18:10 |
|
Mierdaan posted:I'd check out your logs and see if you can figure out what's going on. If the host is talking to the HA master and it's writing to heartbeat datastores (if you're on 5) HA's not going to know any better. I don't think even VM HA is going to do anything if the VMs wouldn't respond to a restart request anyways. Nope, it's 4.1. As soon as I rebooted via iLO the VMs started up on other hosts. I don't have logs on the ESXi since I neglected to configure a persistent scratch storage and/or set up a syslog target. Spent part of my sunday doing exactly that. Is it worth even talking to support if I don't have logs from the hypervisor, or should I just upgrade to U3 that just came out and see if it happens again? This is the second time in a couple weeks this has happened (huge load spike, this specific machine goes catatonic and needs some iLO love).
|
# ? Sep 2, 2012 19:29 |
|
luminalflux posted:Is it worth even talking to support if I don't have logs from the hypervisor, or should I just upgrade to U3 that just came out and see if it happens again? This is the second time in a couple weeks this has happened (huge load spike, this specific machine goes catatonic and needs some iLO love).
|
# ? Sep 2, 2012 19:43 |
|
Misogynist posted:I'm noticing you're not following through with "set up a syslog server" as part of the action plan you're describing. Sorry if I was unclear - I set up both persistent scratch storage AND a syslog server, so I have logs in both places (verified that they're being written in both places as well). And I more or less only deal with Unix servers.
|
# ? Sep 2, 2012 19:53 |
|
Not much point unless you have log bundles. And they really need to be captured within minutes after the problem occurs - from the host and vCenter Server. You might have to do this stuff from the direct console or SSH, if enabled. Here are some questions you would be asked / things you could have tried: 1) Were the VMs online and responsive (yet stuck on that box - try ping, ssh, rdp, vnc, etc - reach out to known services in the VMs)? 2) Did the box ping? 3) Can vCenter server still ping and resolve that host by FQDN and hostname? 4) What does the console say? 5) Are you able to log into the console/iLO and access the Direct Console User Interface? 6) Are you able to hit alt+F12 and see the vmkernel logs scrolling anything? 7) Are you able to log into technical support mode / local shell? (Alt+F1, but it also has to be enabled in the DCUI). Basically if any of the above is true, the host isn't hung, crashed, or down. But the host management agents are not responding to vCenter Server. If a host ever appears as Not Responding, it's because VC thinks it is, but it might not actually be down. With that said, there's a lot that could be going wrong, so it warrants checking into. But before restarting management agents or hosts, try to collect log bundles (from the console, if you must), or you'll lose too much information about root-cause. Even with persistent log storage and syslog. The bundles contain stuff about connection counts, memory use, etc. that won't be in actual logging. I'm not really sure what it could be, though. A Not Responding state has too many possible causes.
|
# ? Sep 2, 2012 19:55 |
|
quote:1) Were the VMs online and responsive (yet stuck on that box - try ping, ssh, rdp, vnc, etc - reach out to known services in the VMs)? As for console access, I don't have the relevant licenses for text-console (only serial port access).
|
# ? Sep 2, 2012 20:05 |
|
What kind of storage are the VMs on - does the host go over FC or Ethernet (iSCSI, NFS, FCoE)? It sounds like it might have lost storage access at the time. I suppose it can happen during heavy load if there are driver/firmware/overheating (earlier 10Gb cards) problems, for instance. In the case of Ethernet, despite having teaming set up, your link state might not actually go down, so failover does not take place. Beacon probing would get around that (also a way to overcome driver/firmware-related outages, when using different physical NICs in the team - it kind of compliments that best practice). You can try to compare the time that Nagios reported outages with the time that the host stopped responding to VC - probably very close. As you likely read earlier, unfortunately HA (4.x and earlier) doesn't do shared datastore monitoring, so server ping responsiveness is all that matters to the other monitoring cluster members. Fault Domain Manager (FDM, or HA for 5.x and up) takes storage access into consideration as well and can let a server isolate itself if it must... and is configured to. Anyway storage loss, if it really occurred, will be one of those things that can (most likely will) make the server misbehave and not respond to any management requests, including direct agent login. Those worlds/processes are waiting on blocking I/O to complete against whatever resources that are "missing" or unreachable. I suppose if you set up the boxes to panic on NMI, then do just that, support can use that information and probably have a better chance of figuring out root-cause than with logging. Might need to give the server some time to dump before you reboot it again (an hour is safe), since you can't see the console say "Disk dump successful." Once it starts PSODing, then the VMs should HA over to another box as normal, too.
|
# ? Sep 3, 2012 04:47 |
|
Storage is on iSCSI, and the other 3 ESXis could reach the datastores. Everything's on GbE with NIC teaming into 2 switches. So it could very well be the NIC - both iSCSI links are on the on-board NIC, whereas the management and production networks are split over the on-board Broadcom and an Intel in the PCI slot.
|
# ? Sep 3, 2012 09:41 |
|
As long as we're on iSCSI storage problems, here's one I'm trying to run down. We have 3 cluster hosts on 5.0 U1, and a Compellent SAN hosting 7 volumes presented to all three hosts. I configured the hosts for round-robin MPIO, though we do only have one switch between hosts and storage. Currently each Compellent volume shows up as having 6 paths even though there's two VM kernel ports and two NICs on each Compellent controller, so I assume this is because of iSCSI virtual port redirection. Each vmknic is set to have one active, one unused uplink: 99% of the time, this hums along happily. However, sometimes in the middle of the night (normally 1-3am so firmly in our backup window) we get some of these messages in the log code:
code:
Mierdaan fucked around with this message at 03:20 on Sep 5, 2012 |
# ? Sep 3, 2012 15:24 |
|
Mostly work with Hyper-V but am currently working on upgrading some old VMWare Hosts to new Hardware. If we Own Essentials Plus 5.0 and are under current maintenance will we be able to upgrade to 5.1 without any additional costs? Currently running 2 Hosts in Development with plans to migrate VMs off of 2 Hosts (1 ESXi 3.5, 1 ESX 3.5). Migrated a few VMs already succsfully using the Converter, not sure if the aforementioned hardware version increase in 5.1 would affect anything migration wise? Or the VMs will run on the older hardware version just fine. Thanks.
|
# ? Sep 3, 2012 18:08 |
|
Mierdaan posted:As long as we're on iSCSI storage problems, here's one I'm trying to run down. Let's talk offline and fine if we're using any similar hardware for the hosts, switching ect ect Also I found it very interesting at the VMWorld storage best practices session that both the Netapp guy and the EMC guy said to turn off jumbo frames cause they just cause more problems then they are worth
|
# ? Sep 3, 2012 19:24 |
|
luminalflux posted:Storage is on iSCSI, and the other 3 ESXis could reach the datastores. Everything's on GbE with NIC teaming into 2 switches. So it could very well be the NIC - both iSCSI links are on the on-board NIC, whereas the management and production networks are split over the on-board Broadcom and an Intel in the PCI slot. Ideally you would have some redundancy which utilizes separate adapters (multi-port cards don't quite account for all NIC-related issues like firmware barfing). And if you can get two different drivers to be involved, that covers the driver side of failures as well. Off the top of my head, you could consider turning off interrupt remapping (introduced in 4.1), in case the server isn't handling it properly. There is a pretty specific problem signature to match with that, but you'd need intact server logs from quite some time before, then at the time of failure. I've seen cases where it took a week or so *after* the sign of interrupt remapping causing a problem, before the actual visible symptoms came up once. However it's usually an immediate effect (especially for storage traffic). Mierdaan posted:As long as we're on iSCSI storage problems, here's one I'm trying to run down. I can't say for sure, but it looks like from your log snippets, it was Target 3's paths (T3) being marked offline. Perhaps the load is bringing the port down (to ESX's perspective, at least). I guess you'd have to take it forward with performance troubleshooting and see if device latency is climbing, you're seeing lots of pause frames (well, is flow control on?), that kind of stuff. Only other thing is that Round Robin isn't truly MPIO - it puts chunks of I/O down each path sequentially - but it's still a good pathing policy to go with, regardless of what kind of array (active-active, or active-passive; ESX will only use the active paths and understands what is standby). Does Compellent recommend using round-robin? Try to see if any particular target ID is going down consistently during each occurrence. If it's different each time, maybe too many hosts are loading the same path (coincidentally/by-chance) on a given night - this depends on VM loads, I imagine, so DRS could shift the loads around. Perhaps Compellent might also recommend a smaller round robin I/O count before swapping paths. EMC, for instance, recommended a value of "1" for specific EMC storage configurations and it had a marked performance improvement for one troubled environment. Kachunkachunk fucked around with this message at 20:29 on Sep 3, 2012 |
# ? Sep 3, 2012 20:26 |
|
Kachunkachunk posted:Alrighty, so perhaps a remote syslog server would successfully receive logs during a recurrence. Are you able to set up iSCSI so it uses different physical adapters? Yep, that's the plan. Someone else set up these NICs up, I have no idea why iSCSI isn't split over the different NICs like the other nets (each ESXi has an on-board 4-port Broadcom and a 4-port intel in the PCI slot)
|
# ? Sep 3, 2012 21:36 |
|
@kachunkachunk: First off I gotta nitpick: RR is truly MPIO in that it defines multiple paths between a target and initiator. I agree with you in that it doesn't let you aggregate bandwidth between paths, but saying it's not truly MPIO is kinda misleading. Compellent does recommend RR as the MP algorithm of choice for their arrays, but I haven't found any guidance as to how many ops before a RR change they recommend; I've left it at the default right now. The main VM that should be impacted in our backup window is our file server VM, and at least last night I verified that that VM wasn't running on the host the Compellent lost paths to. The problem has moved around a few times though, so it's possible that DRS could be affecting it; it's too bad DRS doesn't have some sort of setting for "Aggressive but notify me". I'm pondering setting up some alerts for that so I can tell when DRS is taking actions; I think there's a "hot migrating" event I can alert on, right? FWIW, it does seem to be one target that is going offline during these events. Other hosts that are mapped to the same volume, presumably (though I can't back this up) with VMs on the same LUN aren't dropping off though, so I'm not really sure it's a performance issue or it'd affect more than one host as latency for all hosts climbed. Is this the issue that you were talking about that involved changing the OperationLimit to a single-digit number? Looks like that was cleared up by 4.0 U5, so I'd hope whatever bug was cleared up in that update didn't persist into 5.0 U1.
|
# ? Sep 3, 2012 21:42 |
|
It is odd that you have 6 paths. How are the ports on the Compellent set up? Are they in a single fault domain given that they're plugged into the same switch? Are you in virtual ports mode? I would think you should have two paths if in legacy mode or four if in virtual ports mode on the Compellent as long as you have dual controllers. Six seems like there's a misconfiguration somewhere. edit: I am pretty sure the vmknics should be on a seperate vswitch. Example of what a correct config looks like: This is from a branch office we have with a single-controller Compellent plugged into a single switch with broadcom hardware-assisted iscsi. Looks similar to your setup. KS fucked around with this message at 01:11 on Sep 4, 2012 |
# ? Sep 4, 2012 00:55 |
|
We're in virtual port mode, but best practices say you can use separate vSwitches or one - it shouldn't make a difference. Will post more tomorrow from work, on phone now...
|
# ? Sep 4, 2012 01:10 |
|
edit: actually thinking this through, the above means that the two-NIC to one vswitch setup you posted is definitely incorrect. They can't be set up to fail over between physical NICs if they're correctly bound to one physical NIC. I'd guess this is the source of your problem. Link KS fucked around with this message at 01:37 on Sep 4, 2012 |
# ? Sep 4, 2012 01:15 |
|
Mierdaan posted:We're in virtual port mode, but best practices say you can use separate vSwitches or one - it shouldn't make a difference. Will post more tomorrow from work, on phone now... Very interested to hear what you figure out. Please share when you get to a resolution.
|
# ? Sep 4, 2012 01:19 |
|
KS posted:
You're not thinking of this correctly. The vmknics are both bound to the software iscsi initiator, and then each vmknic uses a single uplink as its active uplink - the other is marked inactive, so there's no failover happening at the physical uplink level ever. The software iscsi initiator sees those two vmknics as potential paths to storage, and then can round-robin between the two of them. Whether this happens within one vswitch or across multiple vswitches shouldn't be relevent - see the Fourth Topic in VirtualGeek's MultiVendor iSCSI blog entry. Also, what you're remembering about not being able to bind multiple vmknics to the software iscsi initiator via the GUI isn't true in 5.0 anymore, see this post.
|
# ? Sep 4, 2012 02:32 |
|
KS posted:It is odd that you have 6 paths. How are the ports on the Compellent set up? Are they in a single fault domain given that they're plugged into the same switch? Are you in virtual ports mode? Missed this reply somehow so I did a more detailed post of the Compellent side of things over in the Enterprise Storage thread.
|
# ? Sep 4, 2012 03:25 |
|
Okay, the 6 path thing was a total non-issue. VMware support pinned it on the fact that I had volumes mapped to the hosts at the time I bound the vmknics to the iSCSI initiator, and those two paths that already existed (from vmhba34 to the two SAN ports) don't disappear until you reboot the host. I popped a host into maintenance, rebooted, and voila - 4 paths. Also, they say there's absolutely no issue using one vSwitch as I've done, as long as each vmknic has one active uplink only.
|
# ? Sep 4, 2012 19:20 |
|
Anyone use Xangati? I've recently migrated my entire environment from Xenserver over to Vmware Enterprise (VSPP). I have Veeam One, haven't gotten to far with messing around with it as it's only been a few days, but I just ran across Xangati and it looks pretty damned nice.
|
# ? Sep 4, 2012 20:30 |
|
I'm currently installing a vCenter server for my 2 node Enterprise cluster and I'm loving terrified/in awe that I'm actually installing it on a real thing, not just reading about it.
|
# ? Sep 4, 2012 23:31 |
|
So I just found out from reading this ArsTechnica article that Hyper-V Server 2012 has arrived. I've downloaded it and have it running in VMware now, but I haven't played with it extensively yet. I've had a physical Hyper-V Server 2008 R2 going for a while now so I don't have a problem navigating through it. I'm curious to know if anyone else has dabbled in this yet. I haven't looked it up yet but I'm curious to know if you can access the new features of Hyper-V Server 2012 through the Hyper-V Manager in Windows 7, or if you absolutely have to be using Windows 8. Is there an updated RSAT from Microsoft for manipulating the new features from Windows 7? Has anyone actually had a chance to play with the new Hyper-V Server and its features?
|
# ? Sep 4, 2012 23:38 |
|
|
# ? May 15, 2024 03:56 |
|
COCKMOUTH.GIF posted:Is there an updated RSAT from Microsoft for manipulating the new features from Windows 7?
|
# ? Sep 4, 2012 23:53 |