|
I'm on call every other week, for an AIX environment. Also de-facto person to call for a handful of applications. I just always carry my phone on me and don't let it change my life plans. I get called maybe once every other month. Out of curiosity, how many people use VMware level snapshots for backups, tied with Commvault or Veeam, along with SQL servers? We put them in years ago for an environment of maybe a bit over 1200 VMs. It's nice to have quick restores in the event of some kind of failure...but.... Combined with storage autotiering pools and a heterogeneous system environment, irregular workloads on lots of systems running all different operating systems...it wreaks havoc. VMware snapshots can stun a VM if the deltas are growing at a rate faster than the disk can handle. SQL quiesces databases, stops IO. All for completely random intervals, sometimes problematic, sometimes not, on 100s of machines. Sometimes the intervals are long enough to knock applications offline. You can tweak configuration, VMware stun settings, no-quiesce policies, etc. Is it really worth it?
|
# ? Nov 4, 2015 15:38 |
|
adorai posted:I am on call 24/7, 365 days per year for escalations. I am on call for direct end user support 4 weeks out of the year (once per quarter). Other than saturday mornings during my 4 weeks, when we have branches open, I think I average about 2 calls per year. I can live with that. Yeah I could definitely live with that (and what most other people are describing). I think my first company was just run like poo poo. Guess this isn't something I need to be overly concerned with; as with all things work-related, it all depends on the employer. Appreciate all the responses.
|
# ? Nov 4, 2015 15:52 |
|
At my last job, we ostensibly had an on-call rotation, but I took it from my team 100% of the time because I think I only got paged twice in two years with anything important.
|
# ? Nov 4, 2015 15:56 |
|
SSH IT ZOMBIE posted:VMware level snapshots for backups, tied with Commvault or Veeam, along with SQL servers? <rant about snapshots are not backups> Snaps are convenient, but don't give point-in-time recovery for a database nor single-database restore, and are almost totally unusable with "tightly-coupled" VMs (guest clusters, mirroring, replication, etc). If you have some other backup system doing its own database-level backups, they can easily stomp on each other's toes and wreck your ability to restore. Hilariously, I've seen Commvault detect that Veeam was affecting the log chain and auto-escalate log backups to Fulls, with the result of it taking 144 Full backups per day -- this had been going on for over two years when I walked in the door at that place and was just subtle enough that no one noticed because it never reported a job error. </rant> You're spot on that you have to be concerned about the disk performance, because snapshot cleanup has to perform a ton of IO and isn't always very good at filling but not overloading the IO system. Autotiering can be a huge negative here because the snapshot data that has to be merged at cleanup has by definition only been touched once at most and probably over a week ago, so it's been pushed down to the slowest media. Autotiering may kick in and escalate that data back to a fast tier, except that it's also still only going to be read once for merging and then deleted, so it has pushed useful data out of fast tiers for no value.
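Since that kind of thing never shows up as a job error, about the only way to catch it is to count jobs. A rough sketch, assuming you can export job history as (timestamp, type) records; the record shape, the function name, and the one-full-per-day threshold are made up for illustration and are not anything Commvault or Veeam actually emits:

```python
from collections import Counter
from datetime import datetime, timedelta


def days_with_too_many_fulls(jobs, max_fulls_per_day=1):
    """Return {date: count} for days with more full backups than expected.

    `jobs` is an iterable of (timestamp, backup_type) tuples. This shape is
    hypothetical; adapt it to whatever your backup tool can export.
    """
    fulls = Counter(
        ts.date() for ts, backup_type in jobs if backup_type == "full"
    )
    return {day: n for day, n in fulls.items() if n > max_fulls_per_day}


# A full every 10 minutes works out to 24 * 60 / 10 = 144 fulls per day.
start = datetime(2015, 11, 3)
bad_day = [(start + timedelta(minutes=10 * i), "full") for i in range(144)]
good_day = [
    (datetime(2015, 11, 4, 1, 0), "full"),
    (datetime(2015, 11, 4, 9, 0), "log"),
]
flagged = days_with_too_many_fulls(bad_day + good_day)
```

Here `flagged` only contains the day with the runaway fulls; the normal day with one full and one log backup is left out.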
|
# ? Nov 4, 2015 16:47 |
|
Vulture Culture posted:At my last job, we ostensibly had an on-call rotation, but I took it from my team 100% of the time because I think I only got paged twice in two years with anything important. Yeah, I do this. In 18 months, I have received about 3 after hours calls. One was literally mid-move as I was carrying a mattress upstairs, that sucked, but otherwise it's been pretty easy and it's just not worth bothering with an official rotation. Yet.
|
# ? Nov 4, 2015 17:21 |
|
CLAM DOWN posted:Man, the average age of my department has to be 40+, the idea of a videogame would probably frighten them. I'm 40 and played Q2 with my cube neighbors over IPX at a Fortune 500 company. And we had a dedicated gaming LAN behind a locked door where we played all the early Half-Life mods like CS and TF.
|
# ? Nov 4, 2015 17:27 |
|
keseph posted:<rant about snapshots are not backups> Veeam and Commvault don't create snapshot backups, they create snapshots and use those to create backups written to secondary storage. Both can also be configured to create log chain backups along with application consistent snapshots of a database to provide point in time recovery. It's also silly to say that if you can't roll through the log chain it isn't a backup when many people run their database in simple recovery mode anyway. There are also array level tools like SnapManager or RecoverPoint that can be used to create array snapshots coupled with log chain backups for recovery. Snapshots are backups. They are a recoverable, independent copy of the data. They *probably* shouldn't be your only copy of those backups, but coupled with array based replication they can make an excellent strategy for creating quick, low impact backups with short windows. VMware snapshots are bad for the reasons listed, but traditional agent based backup can also lead to serious issues when the array gets overloaded with sequential IO from 100 VMs all kicking off their backups at the same time. VVOLs will provide some relief since they offload snapshot operations to the array, and depending on your array manufacturer Veeam and Commvault can also use array snapshots to make VMware snapshots short lived enough to avoid VM stun issues. All of that said, plenty of people still use Veeam and Commvault and NetBackup and Barracuda and all of the other VADP based backup tools without serious issue. Also, auto-tiering is a pretty stupid storage technology but it can generally be configured to avoid collecting data during backup windows. Likewise, most hybrid storage vendors do workload detection to avoid caching big sequential work. YOLOsubmarine fucked around with this message at 17:44 on Nov 4, 2015 |
# ? Nov 4, 2015 17:41 |
|
ratbert90 posted:People here asked if I wanted the credentials to throw slack/work email on my phone. First question from me was "Sure, the company will pay for my phone plan yes?"
|
# ? Nov 4, 2015 19:08 |
|
Collateral Damage posted:My boss asked if I wanted an iPhone so I could have a work number and read work email out of office hours. I asked him how much I'd be compensated since my contract doesn't include regular on call. I wouldn't have a work phone at all if they weren't compensating 100% tbh, and paying for any effort I put in outside of normal hours.
|
# ? Nov 4, 2015 19:09 |
|
I meant compensation as in extra pay for practically being on call. I turned that offer down.
|
# ? Nov 4, 2015 20:50 |
|
SSH IT ZOMBIE posted:Out of curiosity, how many people use VMware level snapshots for backups, tied with Commvault or Veeam, along with SQL servers? We put them in years ago for an environment of maybe a bit over 1200 VMs. It's nice to have quick restores in the event of some kind of failure...but.... We use our integrated SAN tools to back up all of our SQL and Exchange databases, which are stored on iSCSI LUNs dedicated to those servers. These are application consistent snapshots. We regularly test these. keseph posted:<rant about snapshots are not backups>
|
# ? Nov 4, 2015 23:51 |
|
evol262 posted:Seconding this. Past the mid-level roles, you get called less and less (other than DBAs). Escalated to, maybe, but that's rare. This was kind of my point. Mid level roles that are on call take the brunt of it, yes, but farther up the chain is more likely to be on call (for escalations, but that is still on call).
|
# ? Nov 5, 2015 00:59 |
|
That's not on call. On call is "Make yourself available during these hours. Be sober, in the area, and ready to drive to the data center in case something happens that we need to deal with." Escalating to a senior level resource in the event something critical explodes is just life in general, regardless of industry, position, or field of employment.
|
# ? Nov 5, 2015 01:17 |
|
adorai posted:We use VMware snapshots only as manual backups. We use our SAN to snap the VMware datastores at regular intervals. These backups are CRASH consistent. We are comfortable with them, because the drawbacks are pretty minor. We'd need to re-lay out our storage to do array level snapshots. Maybe we really need to do that. We have datastores and VMDKs provisioned out according to performance needs across multiple arrays, not in a way that lets us easily identify what LUNs need to be snapped to grab all of the drives for a given VM. Our storage guys have talked about it, maybe it is the better way to do it. Our backup solution isn't just snapshots for obvious reasons - the snapshots get backed up in enterprise backup software then released, but we're doing it at the VMware layer and not the SAN.
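For what it's worth, the "which LUNs do I need to snap for this VM" question is mostly a lookup problem if you can get an inventory out. A minimal sketch, assuming a hypothetical export of (VM, datastore) pairs with one row per VMDK and a datastore-to-LUN map where each datastore sits on exactly one LUN; the names and identifiers below are invented, and this is not a real VMware or array API:

```python
from collections import defaultdict


def luns_per_vm(vmdk_inventory, datastore_to_lun):
    """Map each VM name to the set of LUNs backing any of its VMDKs.

    vmdk_inventory: iterable of (vm_name, datastore_name), one per VMDK.
    datastore_to_lun: dict of datastore_name -> LUN identifier.
    Both inputs are hypothetical exports, assumed for illustration.
    """
    result = defaultdict(set)
    for vm_name, datastore in vmdk_inventory:
        result[vm_name].add(datastore_to_lun[datastore])
    return dict(result)


# Made-up inventory: sql01 has disks on two arrays, web01 on one.
inventory = [
    ("sql01", "ds_fast_01"),
    ("sql01", "ds_slow_07"),
    ("web01", "ds_slow_07"),
]
lun_map = {"ds_fast_01": "array1:lun12", "ds_slow_07": "array2:lun3"}
groups = luns_per_vm(inventory, lun_map)
```

Every LUN in a VM's set would have to be snapped together to get all of that VM's drives at the same point in time, which is the part that gets awkward when the set spans multiple arrays.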
|
# ? Nov 5, 2015 01:44 |
|
SSH IT ZOMBIE posted:We'd need to re-lay out our storage to do array level snapshots. Maybe we really need to do that. Your array manufacturer probably has a plugin that will handle creating a consistency group snapshot for VMs that have VMDKs across multiple LUNs.
|
# ? Nov 5, 2015 01:55 |
|
adorai posted:Array level snapshots ARE backups, as long as you understand the pitfalls with your snapshot strategy and you replicate or otherwise move or copy them to other storage. An array level or hypervisor level snapshot is not a backup; the copy to another device is the backup. The copy survives a software failure like a .vmx corruption or a controller or major physical failure on the storage device, and if "replicate" means storage firmware replicating the snapshot to an identical device then you have to think long and hard about how you protect yourself from firmware bugs. A backup isn't a backup if it shares a single point of failure with the live data, be it the disks or the software on top instantly replicating the damage over to the copy. Maybe I've just had awful luck seeing a bad SAN controller firmware update nuke half the LUNs on a 55TB device and synchronously mirror the nuke out to the failover DC, and also seen a .vmx corruption hang a VM host and completely ruin all the attached .vmdks and their VM snapshots. But poo poo happens sometimes, just ask codespaces.com how useful their backups were when an admin got owned and deleted both the hosted VMs and the online backups.
|
# ? Nov 5, 2015 04:20 |
|
keseph posted:Maybe I've just had awful luck seeing a bad SAN controller firmware update nuke half the LUNs on a 55TB device and synchronously mirror the nuke out to the failover DC, and also seen a .vmx corruption hang a VM host and completely ruin all the attached .vmdks and their VM snapshots. But poo poo happens sometimes, just ask codespaces.com how useful their backups were when an admin got owned and deleted both the hosted VMs and the online backups. Truth. Anything important needs a copy elsewhere, a real backup. You aren't alone. SAN, OS based, and VMware based snapshots will all occasionally but very rarely screw up in a catastrophic way. It is not due to user error, there are occasionally legitimate bugs. Specifically VMware snapshots have always irked me as they opt to write changed blocks to delta files instead of moving original data on write to delta files. It allows for quick snapshot rollback, but, if for some reason VMware cannot consolidate, you are looking at data loss. A datastore filling up is incredibly dangerous and must be avoided at all costs.
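To make the delta-file point concrete, here is a toy Python model of the two approaches. It is a sketch under heavy assumptions: the class names are made up, blocks are just dict entries, writes only hit blocks that already exist, and none of this is how VMware or any array actually implements snapshots. It only shows why rollback is cheap when new writes are redirected to a delta, and why the newest data then lives only in that delta.

```python
class RedirectOnWriteDisk:
    """Toy model: after a snapshot, writes land in a delta; base is frozen."""

    def __init__(self, blocks):
        self.base = dict(blocks)
        self.delta = None  # None means no snapshot is open

    def snapshot(self):
        self.delta = {}

    def write(self, block, data):
        if self.delta is not None:
            self.delta[block] = data  # newest data exists only in the delta
        else:
            self.base[block] = data

    def read(self, block):
        if self.delta is not None and block in self.delta:
            return self.delta[block]
        return self.base[block]

    def revert(self):
        self.delta = None  # cheap: throw the delta away

    def consolidate(self):
        if self.delta is not None:
            self.base.update(self.delta)  # the IO-heavy merge
            self.delta = None


class CopyOnWriteDisk:
    """Toy model: writes go in place; the original block is saved first."""

    def __init__(self, blocks):
        self.live = dict(blocks)
        self.saved = None  # None means no snapshot is open

    def snapshot(self):
        self.saved = {}

    def write(self, block, data):
        if self.saved is not None and block not in self.saved:
            self.saved[block] = self.live[block]  # keep the original once
        self.live[block] = data

    def read(self, block):
        return self.live[block]

    def revert(self):
        if self.saved is not None:
            self.live.update(self.saved)  # must copy originals back
            self.saved = None

    def delete_snapshot(self):
        self.saved = None  # cheap: current data is already in place
```

In the first model, anything written since the snapshot is gone if the delta is lost before `consolidate` finishes; in the second, losing the saved blocks only costs you the snapshot.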
|
# ? Nov 5, 2015 04:54 |
|
keseph posted:An array level or hypervisor level snapshot is not a backup; the copy to another device is the backup. The copy survives a software failure like a .vmx corruption or a controller or major physical failure on the storage device, and if "replicate" means storage firmware replicating the snapshot to an identical device then you have to think long and hard about how you protect yourself from firmware bugs. A backup isn't a backup if it shares a single point of failure with the live data, be it the disks or the software on top instantly replicating the damage over to the copy. They are backups. Whether they are reliable enough to serve as your only backup is a business and technology decision that will depend on a number of things, but they are most certainly backups in the sense that they provide a copy of data as it existed at a previous time. Array level snapshots protect against local file deletion, vmx corruption, guest filesystem corruption, and failed application installs. They provide a restorable copy for about 99% of restore requests. If it wasn't a backup we wouldn't talk about restoring from them. VMware snapshots are a less reliable form of backup. Setting up a SQL job to dump your database to another disk on the same server is an even less reliable form, but that is also still a backup, and something that you'll still find SQL admins doing. More importantly, most modern arrays have a variety of data protection features like linked checksums, misplaced write detection, regular parity scrubs, battery backed journals, and "always" consistent filesystems with atomic updates. You're far less likely to suffer corruption there than you are on, say, a tape sitting on a shelf somewhere. Likewise, the replication features rely on the filesystem being consistent so an inconsistency will break replication. And the target array can generally be configured to maintain a number of older snapshots to allow rollback on the secondary if necessary. 
I'm not sure what a malicious admin has to do with array snapshots. A malicious admin can also delete your backup catalog or "lose" the encryption keys you used to encrypt your data at rest on your backup target, or simply re-configure all of your backup jobs to age out all backups immediately. Not giving the same person rights to delete both production data and backup data is the only real protection against that, but that doesn't preclude using array features as your first backup in the chain, nor does it preclude array based replication. The fundamental misconception here seems to be that some people think that backup and DR are the same thing. Having multiple backup copies, including off-site copies, may be one facet of your DR plan, but not every backup needs to be capable of surviving a disaster (your array getting trashed due to a firmware upgrade failing is a disaster and should be covered under your DR plan). Your offline backups written to a Data Domain that sits next to your production SAN aren't useful for DR when the building floods or burns down, but they are still backups. And your synchronously replicated database that sits 200km away isn't a backup because you will never recover aged data from it, but it's probably very useful for DR. Backup is not DR, DR is not backup.
|
# ? Nov 5, 2015 06:57 |
|
Got the message yesterday that 2 out of the 4 positions in our support group have to be cut. The group consists of our manager, who I take for granted will not be cut, me and two coworkers. So the three of us now have to compete for one position, where we have to explain why we are the right person for the remaining job, and try indirectly to explain why the other two are the wrong person. So suddenly I find myself in a loving episode of The Apprentice, where "You're Fired" is a real thing if you gently caress up in the boardroom! I have already lawyered up.
|
# ? Nov 5, 2015 08:27 |
|
What are you getting a lawyer for?
|
# ? Nov 5, 2015 10:06 |
|
psydude posted:That's not on call. On call is "Make yourself available during these hours. Be sober, in the area, and ready to drive to the data center in case something happens that we need to deal with." Escalating to a senior level resource in the event something critical explodes is just life in general, regardless of industry, position, or field of employment. Do companies really require people to be sober while on call if they could still work?
|
# ? Nov 5, 2015 10:40 |
|
lampey posted:Do companies really require people to be sober while on call if they could still work? How would you get to work say in the middle of the night while on call if you're not sober?
|
# ? Nov 5, 2015 10:47 |
|
Taxi.
|
# ? Nov 5, 2015 10:52 |
|
CLAM DOWN posted:How would you get to work say in the middle of the night while on call if you're not sober? VPN.
|
# ? Nov 5, 2015 11:01 |
|
Dark Helmut posted:I'm 40 and played Q2 with my cube neighbors over IPX at a Fortune 500 company. And we had a dedicated gaming LAN behind a locked door where we played all the early halflife mods like CS and TF. Yeah - bear in mind a 40 year old now was born in 1975 and has therefore quite possibly been playing video games since they were a toddler.
|
# ? Nov 5, 2015 12:54 |
|
luminalflux posted:VPN. This guy gets it.
|
# ? Nov 5, 2015 13:45 |
|
Tab8715 posted:What are you getting a lawyer for? In America you can get a lawyer for anything and everything. Might as well exercise the right.
|
# ? Nov 5, 2015 14:29 |
|
I want to build a remote control datacenter robot so I can replace failed hardware from my couch.
|
# ? Nov 5, 2015 14:31 |
|
lampey posted:Do companies really require people to be sober while on call if they could still work? If your concern is "I work on critical 24x7 production systems better when I'm drunk" then I guess that explains a lot about some places I've worked in the past
|
# ? Nov 5, 2015 15:32 |
|
Collateral Damage posted:I want to build a remote control datacenter robot so I can replace failed hardware from my couch. "remote hands"
|
# ? Nov 5, 2015 15:33 |
|
evobatman posted:Got the message yesterday that 2 out of the 4 positions in our support group have to be cut. The group consists of our manager, who I take for granted will not be cut, me and two coworkers. So the three of us now have to compete for one position, where we have to explain why we are the right person for the remaining job, and try indirectly to explain why the other two are the wrong person. So suddenly I find myself in a loving episode of The Apprentice, where "You're Fired" is a real thing if you gently caress up in the boardroom! Unless your manager has a lot of other responsibilities, I don't think you can assume they won't get cut. A manager with a single direct report usually leaves HR scratching their heads and wondering why that manager's boss couldn't manage the direct report themselves.
|
# ? Nov 5, 2015 16:59 |
|
evobatman posted:Got the message yesterday that 2 out of the 4 positions in our support group have to be cut. The group consists of our manager, who I take for granted will not be cut, me and two coworkers. So the three of us now have to compete for one position, where we have to explain why we are the right person for the remaining job, and try indirectly to explain why the other two are the wrong person. So suddenly I find myself in a loving episode of The Apprentice, where "You're Fired" is a real thing if you gently caress up in the boardroom! I don't see a "win" coming out of this for anybody. If you're the one they keep, you're inheriting the workload of 2 people on top of what you already do, and since you were "saved" by management you'll be indebted to do all the work without complaint or additional compensation. I wouldn't bother with a lawyer, I'd take their announcement as a fine opportunity to start job searching before you're unemployed.
|
# ? Nov 5, 2015 17:51 |
|
Judge Schnoopy posted:I don't see a "win" coming out of this for anybody. If you're the one they keep, you're inheriting the workload of 2 people on top of what you already do, and since you were "saved" by management you'll be indebted to do all the work without complaint or additional compensation. I agree with this, you should just job hunt and, even if you win that position, leave the company ASAP. That's a sinking ship and you need to bail out.
|
# ? Nov 5, 2015 17:56 |
|
Why would you hire a lawyer in that situation anyways? God, I wish taxis weren't poo poo here and that would actually be an option vvv Uber will probably never operate in my province, welp CLAM DOWN fucked around with this message at 18:13 on Nov 5, 2015 |
# ? Nov 5, 2015 17:58 |
|
CLAM DOWN posted:How would you get to work say in the middle of the night while on call if you're not sober? Thanks to Uber, I don't have to even call or speak to someone!
|
# ? Nov 5, 2015 18:03 |
|
CLAM DOWN posted:Why would you hire a lawyer in that situation anyways? Yeah unless there's some relevant info you're leaving out (you can prove you're being fired over something that's a protected class, basically) I don't see how getting laid off is something to get a lawyer involved for. I mean, it sucks, but there's not much to be done about it. I'll dogpile with everyone else and say this is a sign that it's time to polish up the resume and YOTJ out of there. Even if you are one of the "lucky" survivors.
|
# ? Nov 5, 2015 18:38 |
|
4 people -> 1 person is a "living envy the dead" situation. Also, shame time: look at all you people not wanting SA upgrade to
|
# ? Nov 5, 2015 19:50 |
|
Bhodi posted:Also, shame time: look at all you people not wanting SA upgrade to Cloud is cool but bad practices on physical hardware only get worse on someone else's platform
|
# ? Nov 5, 2015 19:57 |
|
Need to get some cloud up in this bitch.
|
# ? Nov 5, 2015 20:15 |
|
|
# ? May 27, 2024 03:42 |
|
Vulture Culture posted:More because of complaints like "we had MySQL corruption because our host rebooted the wrong server" where the filesystem crash-consistency issues are not likely to get better on AWS or wherever Bhodi fucked around with this message at 21:24 on Nov 5, 2015 |
# ? Nov 5, 2015 21:21 |