Potato Salad
Oct 23, 2014

nobody cares


Crosspost from Pissing Me Off: Myself

We have some seminar / training room VDI pools in View and some temp employee / home access VDI pools.

Because we want user desktop / documents folder redirection to our storage filer on the employee pools but not the training pools, we deployed the AD OU for our VDI computer accounts with Group Policy loopback processing enabled. There's a very obvious edge case I didn't account for that leaves certain temp employees without redirected folders. How to lay out the OU structure for our VDI machines more elegantly is obvious now, but drat it, I need to apply the fix laaaaaaaaaaate after hours.


Potato Salad
Oct 23, 2014

nobody cares


evol262 posted:

I'm sure the new SAN will help when the storage admins don't bother to configure multiple pathing, because bonds are better somehow.

Stop talking about the inadequacies of my storage infrastructure :smith:






I'm a grown adult. Why am I giggling uncontrollably when I read this aloud?

Potato Salad
Oct 23, 2014

nobody cares


Docjowles posted:

God dammit, I am never going to be able to read/say this as anything but "peen FS" from now on. Thanks a lot, buddy :argh:

I was doing "penifs" but now I can't even use the acronym. Thanks, right back at ya :argh:

Potato Salad
Oct 23, 2014

nobody cares


Absolute zero, as an actual requirement, comes with a massive budget behind it for new talent and consultants.

You do have a massive budget, right? :corsair:

Potato Salad
Oct 23, 2014

nobody cares


devmd01 posted:

What is my most effective method of migrating guests across the MPLS, since we are also re-IPing everything anyway? Weekend downtime is acceptable; most guest storage LUNs are 2TB.

What would you use and why?


How far, physically?

Last time I moved a datacenter, I rented a cheap-o turnkey NAS, had poo poo slowly duplicate for a few days ahead of time, and drove it fifty miles. Had I wrecked, no production stuff would have been harmed -- it was a well-insured rental device. Beat the hell out of using their lovely ISP.
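For a sense of scale, here's a minimal sketch of the sneakernet math; the LUN count and link speed below are assumptions of mine, not devmd01's actual numbers.

```python
# Hypothetical figures: ten 2 TB guest LUNs versus an assumed 200 Mbps of
# usable MPLS bandwidth, compared with loading a rented NAS into a car.
LUNS = 10
LUN_TB = 2
LINK_MBPS = 200

total_bits = LUNS * LUN_TB * 10**12 * 8
hours_on_the_wire = total_bits / (LINK_MBPS * 10**6) / 3600
print(f"~{hours_on_the_wire:.0f} hours on the wire vs. roughly an hour of driving")
# -> ~222 hours, i.e. well over a week of saturating the link
```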

Potato Salad
Oct 23, 2014

nobody cares


Yes. I do this at home. It's just a datastore.

Edit - here's a guide that looks accurate after a quick read-over. http://www.virten.net/2015/10/usb-devices-as-vmfs-datastore-in-vsphere-esxi-6-0/

Potato Salad fucked around with this message at 04:29 on Jun 1, 2016

Potato Salad
Oct 23, 2014

nobody cares


Dedupe does very little for SOHO from a cost-benefit perspective.

Potato Salad
Oct 23, 2014

nobody cares


I thought just by subject matter that I was in the enterprise storage thread. Apologies.

Dude with the money to burn: your system plan is unbalanced. Sell the second 1080, get a 950 PRO or another NVMe drive plus a turnkey SSD-backed SOHO NAS, and enjoy the fastest storage you can buy short of going always-on RAM disk.

Potato Salad
Oct 23, 2014

nobody cares


There was a post here, but it was needlessly confrontational. What people do with their money is none of my business.

Potato Salad fucked around with this message at 13:34 on Jun 13, 2016

Potato Salad
Oct 23, 2014

nobody cares


Look up "VMware EVC Mode." You can tell vSphere what generation of CPU features to present to the guests in a cluster, which lets you mix hosts by masking everything down to the oldest common CPU generation.
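If you want to check what a cluster is currently masking down to, here's a minimal pyVmomi sketch; the vCenter hostname and credentials are placeholders, and it assumes the pyvmomi package is installed.

```python
# List the EVC mode key currently active on each cluster (None means EVC is off).
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab convenience; verify certs in production
si = SmartConnect(host="vcenter.example.local",
                  user="administrator@vsphere.local",
                  pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.ClusterComputeResource], True)
    for cluster in view.view:
        print(f"{cluster.name}: EVC mode = {cluster.summary.currentEVCModeKey}")
finally:
    Disconnect(si)
```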

Potato Salad
Oct 23, 2014

nobody cares


Definitely don't overspec VMs for class labs. That might give customers a good impression!

Potato Salad
Oct 23, 2014

nobody cares


Your IT department is incompetent if they can't give you a slice of their storage pie. Lack of credentials on the single log-generating machine is a bullshit excuse. There are a handful of options at multiple layers of the network to make it happen.

Potato Salad
Oct 23, 2014

nobody cares


Keyed SSH access, or a firewall on a throwaway VM fronting the storage pool that's open only to that one server: there are plenty of ways to provide that storage slice with logical isolation from the rest of your storage while staying compliant with NIST 800-53 or -171 for your DoD work.
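As a concrete example of the keyed-SSH option, here's a minimal sketch with paramiko; the hostname, account, and paths are placeholders I made up.

```python
# Push logs from the single log-generating box to a dedicated share over
# key-only SFTP, so nothing else on the filer is exposed to that server.
import paramiko

client = paramiko.SSHClient()
client.load_system_host_keys()
client.set_missing_host_key_policy(paramiko.RejectPolicy())  # no blind host-key trust
client.connect("logshare.example.local", username="logdrop",
               key_filename="/etc/logdrop/id_ed25519")       # key auth, no password
try:
    sftp = client.open_sftp()
    sftp.put("/var/log/app/audit.log", "/exports/audit/audit.log")
    sftp.close()
finally:
    client.close()
```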

Potato Salad
Oct 23, 2014

nobody cares


Nope, you can use anything.

Potato Salad
Oct 23, 2014

nobody cares


DevNull posted:

Never buy an AMD

Whatchu talking about, Zen will implement generations of generations of features all in one go without any disappoi-- :suicide101:

Potato Salad
Oct 23, 2014

nobody cares


I'm in the same place with an Oracle E-Biz system and users who (1) won't let me offload decades-old data to a non-production DB, (2) can't write reports that show any mindfulness about efficiency, and (3) won't put those reports in our digital document / OCR system, and instead re-queue and re-print them every time someone needs to see <report result>.

Someone, please for the love of God, make an affordable turnkey SSD storage filer. You are going to have to eventually.

Potato Salad
Oct 23, 2014

nobody cares


Trading?

Potato Salad
Oct 23, 2014

nobody cares


High frequency trading is absolutely critically dependent on timing.

Potato Salad
Oct 23, 2014

nobody cares


I once couldn't get a guy on Spiceworks to understand why, while analysts and portfolio managers can work from a remote office, the trading systems themselves need to be near the exchange itself as a basic barrier to entry for HF trading. His management was riding him for not providing an uplink to the exchange that worked faster than the speed of light would permit from the remote office they wanted to move their trading infrastructure to :psylon:

I don't know what a company was doing with some fool who couldn't see that, hey, the speed of light was his limit... there are high-six-figure-salaried experts on this kind of network architecture for a reason.
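To put numbers on that hard floor, a quick sketch assuming light in fiber travels at roughly two-thirds of c (about 200,000 km/s):

```python
# Best-case round-trip time over fiber, ignoring switching and serialization delay.
C_FIBER_KM_PER_MS = 200.0  # ~200,000 km/s expressed as km per millisecond

def min_round_trip_ms(distance_km: float) -> float:
    return 2 * distance_km / C_FIBER_KM_PER_MS

for km in (10, 100, 1000):
    print(f"{km:>5} km -> at least {min_round_trip_ms(km):.2f} ms round trip")
# No network gear, however expensive, gets you under these numbers.
```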

Potato Salad
Oct 23, 2014

nobody cares


Before some exchanges (or was it the SEC? I can't remember) implemented rules requiring enough fiber in your uplink to impose a minimum latency to the exchange, traders were successful only when they literally had the closest office.

Potato Salad
Oct 23, 2014

nobody cares


mayodreams posted:

Two days later I was fired on day 90 of my 90 day probationary period because 'we are professionals, so we don't need checklists' and 'documentation is pointless because it is out of date as soon as it is written'.

And that ended the worst experience of my professional career.

> professionals
> documentation is pointless

:bahgawd:

Potato Salad
Oct 23, 2014

nobody cares


Specs for the hosts? What storage is this running on?

Potato Salad
Oct 23, 2014

nobody cares


If you have the cash, buy enough 2TB Samsung 850 PRO SSDs to handle your storage issues. 100 VMs on magnetic disk? What are the VM OSes? I can't imagine trying to run that kind of demand on a single spinning disk. This is coming off as incredibly seat-of-the-pants hobo dev.

Potato Salad
Oct 23, 2014

nobody cares


The two VMDK errors you're seeing are what happens when I try to do too much with lovely storage and elements of the storage stack start timing out or silently and gracelessly crashing. 100 VMs on a single 4TB archival spinning hard drive is way off the edge of the map.

Potato Salad
Oct 23, 2014

nobody cares


There's a guy in another thread talking about some developers being, to be frank, uneducated but argumentative in their expectations for a barebones ESXi setup, and I can't help but wonder.

Maybe the support ticket with VMware will find some bug or misconfiguration (which is your fault if you were the guys who did the updates and then the rollbacks), but I can already feel the sigh of the guy who gets on the phone with your IT team tomorrow and hears that, yep, it's another shop with devs who think virtualization is magic and haven't sat down and worked out what 100 VMs booting simultaneously on a single high-capacity drive, or a small array of drives on a cheap-rear end controller, means in terms of seek time alone.
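For anyone following along, a back-of-envelope sketch of that seek-time point; the per-disk IOPS and per-boot I/O counts are assumptions, not measurements from their environment.

```python
# Why 100 guests booting at once on one spindle falls over: a 7200 RPM drive
# delivers on the order of ~100 random IOPS, and an OS boot is mostly random reads.
DISK_RANDOM_IOPS = 100        # optimistic for a single 7200 RPM drive
IOS_PER_GUEST_BOOT = 20_000   # assumed small random I/Os per guest OS boot
GUESTS = 100

total_ios = IOS_PER_GUEST_BOOT * GUESTS
minutes = total_ios / DISK_RANDOM_IOPS / 60
print(f"~{minutes:.0f} minutes of queued seek time")  # ~333 minutes of pure queueing
```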

"But it worked before!"

Potato Salad fucked around with this message at 23:55 on Aug 18, 2016

Potato Salad
Oct 23, 2014

nobody cares


I'm coming off harsher than I mean to. Boot storms are not inconveniences, they are problems.

Potato Salad
Oct 23, 2014

nobody cares


Winkle-Daddy posted:

And why you think we're running on a single spinning disk, I don't know; it's a RAID array.

Do you happen to have the model of the RAID controller? Budget and even mid-range servers tend to ship with the lowest or next-to-lowest controller available for that generation unless you ask for something better during purchasing. The H710 that most sales reps (in my experience) include in quotes for 12th-gen PowerEdge systems is weaker in a parity array than some high-end enterprise single disks in the real world.

Also, just because there's a RAID controller doesn't mean seek time is eliminated as an issue. Striping can speed up straight sequential reads, sure, but OS booting is rarely sequential. It's random as hell, and striping really doesn't help when a disk still has to move from sector to sector physically. What striping/mirroring scheme are you using, and what's the storage setup inside your guests, so we can figure out influences like dedupe? Is the RAID controller in write-through or write-back mode? Storage is complex.
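To put rough numbers on the striping point, a sketch using an assumed per-spindle figure and the classic four-I/O RAID5 write penalty:

```python
# Striping scales random reads roughly with spindle count, but parity writes don't.
SPINDLE_RANDOM_IOPS = 100   # assumed for a single 7200 RPM disk
DISKS = 4

raid5_read_iops = SPINDLE_RANDOM_IOPS * DISKS        # reads spread across members
raid5_write_iops = SPINDLE_RANDOM_IOPS * DISKS / 4   # read-modify-write penalty
print(f"~{raid5_read_iops} random read IOPS, ~{raid5_write_iops:.0f} random write IOPS")
# Either way, a small spinning array is nowhere near what a boot storm asks for.
```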

What are the OSes of the guests, and what build of ESXi 5.5 did you roll back to? The same as before? If I'm coming off like a Tier 1 call-center guy right now, it's because you may not have had the personal experience of seeing how deep the virtualized-storage rabbit hole can go. Details are going to be important when you're maxing the system out for hours on end and expecting the storage stack not to collapse and start crashing.

Potato Salad fucked around with this message at 14:26 on Aug 19, 2016

Potato Salad
Oct 23, 2014

nobody cares


The bit about "it's not working like it was before" doesn't hold in a high-demand situation where you're up against the capabilities of the underlying hardware. Are there pending operations the guest OSes are trying to run at boot, for example? Are the guests trying to run a bunch of health checks at boot because they all crashed on the upgraded hypervisors before your rollback?

The system as it is right now is not congruent with the way it was before. If you were using an out-of-the-box ESXi 5 image before and reverted back to that image, then what's still around right now that hasn't reverted? Perhaps the configuration of the hypervisors, but also perhaps the state of your guests.

edit: aaaaand I was a Tier 1 guy at the beginning of my career in a high-performance computing environment for climate modeling researchers, so I'm aware of the friction that can arise between developers and hardware architects when the hardware is intentionally being pushed to its limit. Every single detail and element in the storage stack is absolutely essential to understand and draw out so that all the (proverbially) moving parts are visible.

Spending a few thousand bucks on enterprise-grade SSDs and removing parity calculations from your RAID controller's workload may save you a lot of trouble in the long run if upgrades and changes are things you like to do in this environment and you want them to go more smoothly. A Samsung 850 PRO at 1TB will last about six years if written from empty to *full* every single day with a workload incurring a write amplification factor of three. If you mostly read from these devices and write less than 80GB per day on average (which is more than likely if your disks are only 40GB in size), a $300 1TB 850 EVO would carry you through to its 5-year warranty and probably well beyond.

Any chance you or your IT team have metrics on the steady-state I/O of your testing and on boot I/O? If not, something like PRTG is free for 100 sensors (more than enough to monitor two ESXi hypervisors).
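A quick sanity check on the 850 PRO endurance figure above; the program/erase cycle count is my assumption, while the capacity, write amplification factor, and one-full-drive-write-per-day workload come from the paragraph it checks.

```python
# NAND write budget divided by daily NAND writes gives the drive's expected lifetime.
CAPACITY_TB = 1.0
PE_CYCLES = 6000                 # assumed endurance for the 850 PRO's MLC NAND
WAF = 3.0                        # write amplification factor from the post
HOST_WRITES_TB_PER_DAY = 1.0     # written from empty to full every day

nand_budget_tb = CAPACITY_TB * PE_CYCLES
days = nand_budget_tb / (HOST_WRITES_TB_PER_DAY * WAF)
print(f"~{days / 365:.1f} years")   # ~5.5 years, in the same ballpark as six
```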

That is, assuming it's not just a simple hurf-durf configuration error someone made somewhere :)

Potato Salad fucked around with this message at 15:15 on Aug 19, 2016

Potato Salad
Oct 23, 2014

nobody cares


Just mulling it over off the top of my head, as your IT guy I'd probably boot the VMs up sequentially in small batches, let them run for a bit, then shut them down, and watch whether their I/O is atypical. Paravirtualized devices could have changed on the upgrade and the OSes may need a clean reboot to pick them up, or the drivers for your underlying hardware changed with the upgrade and would likewise require an OS reboot, or at least result in a longer one, depending on your guest OS. There's a lot that goes on when you upgrade ESXi generations.



Perhaps all the guests need is a clean reboot in small batches and then they'll be good to go as before.
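A minimal pyVmomi sketch of that small-batch approach; the host, credentials, batch size, and soak time are placeholders of mine, and it assumes the pyvmomi package is available.

```python
# Power the guests on a few at a time so datastore latency can be watched between waves.
import ssl
import time
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

BATCH_SIZE = 5
SOAK_SECONDS = 300  # let each batch settle before starting the next

ctx = ssl._create_unverified_context()
si = SmartConnect(host="esxi.example.local", user="root", pwd="password", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    off = [vm for vm in view.view
           if vm.runtime.powerState == vim.VirtualMachinePowerState.poweredOff]
    for i in range(0, len(off), BATCH_SIZE):
        batch = off[i:i + BATCH_SIZE]
        for task in [vm.PowerOnVM_Task() for vm in batch]:
            WaitForTask(task)
        print(f"batch {i // BATCH_SIZE + 1} powered on: {[vm.name for vm in batch]}")
        time.sleep(SOAK_SECONDS)  # watch esxtop / datastore latency here
finally:
    Disconnect(si)
```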

edit: consider, for example, which guest OS you have and whether a HAL switch or change is something that will happen at boot after the upgrade. It may be that this particular boot storm timing out your VMs is an especially nasty one that needs hand-holding: https://en.wikipedia.org/wiki/Hardware_abstraction#In_Operating_Systems

Potato Salad fucked around with this message at 14:32 on Aug 19, 2016

Potato Salad
Oct 23, 2014

nobody cares


As your IT guy, I'd probably also get a quote for a few $300 1TB 850 EVO SSDs and suggest that the endurance of these TLC drives may well suit your test environment if you're writing less than 80GB/day to each one on average (which works out to 560GB per week per SSD, or 56GB per VM per weekly test if that's anywhere close to your use case).

I also need to apologize for what has been a departure from the normally-chill tone of this thread.

Potato Salad fucked around with this message at 15:23 on Aug 19, 2016

Potato Salad
Oct 23, 2014

nobody cares


What the gently caress kind of VAR tries to sell MS licenses for *nix systems?

Potato Salad
Oct 23, 2014

nobody cares


VMworld is meh. Woohoo, html5 next generation, yay everyone loves the host web client.

Still waiting for 'em to price their poo poo more competitively.

Potato Salad
Oct 23, 2014

nobody cares


I think containerization is the thing for you if you're running a bunch of scripts and homebrew apps on a tight memory budget. The memory overhead of a little Ubuntu Server VM may be relatively small, but if you're running 6-8 apps on that machine, you'll get into the red pretty quickly. You also seem to be maxing memory out sometimes, particularly in buffering. Not knowing what you're running on the machine, I'd guess the buffering is write activity on your SMB / zpool setup. I'm guessing the zpool is backed by spinning disks?

Is the write performance of your zpool ever a problem? A roommate uses ZFS / FreeNAS to serve media and all of our centralized backups in addition to a pair of jails. It's a 16GB machine, and when we're doing big writes, our monitoring VM sees the storage stack gobble up huge amounts of RAM.

I mention this because you may notice occasional performance degradation on your SMB/zpool storage service if my assumption about what's occasionally eating your RAM is correct. That would make precisely divvying up memory among your guest VMs a rather crucial task. Got any monitoring on the memory consumption of each of your services?
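If there's no monitoring in place yet, here's a quick sketch of a check I'd run on a FreeBSD/FreeNAS box; it reads the standard ZFS arcstats sysctl, and the GiB conversion is just for readability.

```python
# Show how much RAM the ZFS ARC is holding right now.
import subprocess

def arc_size_gib() -> float:
    out = subprocess.check_output(
        ["sysctl", "-n", "kstat.zfs.misc.arcstats.size"], text=True)
    return int(out.strip()) / 2**30

print(f"ARC currently holds {arc_size_gib():.1f} GiB")
```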

Potato Salad
Oct 23, 2014

nobody cares


"Frank, dude, Frank, have you seen the new hard drives? Holy poo poo one WDRed can hold 16gb"

"poo poo, Sam, pair of those in raid and she's good to go."

Potato Salad
Oct 23, 2014

nobody cares


That's an awful lot of energy draw though. On the flip side are single-socket boards with processors like the E3-1265L V3 running only 40-ish watts idle (I may or may not be partial to my supermicro uATX + 1265L v3 VMware lab).

Potato Salad
Oct 23, 2014

nobody cares


drat your math! :bahgawd:

Potato Salad
Oct 23, 2014

nobody cares


Just wait until you try to configure hardware passthrough in the web console :unsmigghh:

Or manage serial devices.

Potato Salad
Oct 23, 2014

nobody cares


It's fairly clear uncommon functions have not been fully implemented and tested yet.

I'd rather it was in this state than not out at all, though. Works fine for day to day management.

Potato Salad
Oct 23, 2014

nobody cares


Nvidia GTX-series cards actively resist virtualization. You have to either pull serious trickery or buy a Quadro.

What about eGPU?

External Thunderbolt 3 GPU enclosure + passthrough...? Forget the use case; would anything prevent using, say, a GTX 1060 over TB3 on a Windows VM?


Potato Salad
Oct 23, 2014

nobody cares


My Google-fu fails me at the moment, but I'll keep trying:

Does vSAN All-Flash require two distinct capacity and caching tiers, or can you get away with just a capacity tier?
