SSH IT ZOMBIE
Apr 19, 2003
No more blinkies! Yay!
College Slice
I'm on call every other week, for an AIX environment. Also de-facto person to call for a handful of applications. I just always carry my phone on me and don't let it change my life plans. I get called maybe once every other month.

Out of curiosity, how many people use VMware-level snapshots for backups, tied with Commvault or Veeam, along with SQL servers? We put them in years ago for an environment of maybe a bit over 1200 VMs. It's nice to have quick restores in the event of some kind of failure...but....

Combined with storage autotiering pools and a heterogeneous system environment, irregular workloads on lots of systems running all different operating systems...it wreaks havoc.
VMware snapshots can stun a VM if the deltas are growing at a rate faster than the disk can handle.
SQL quiesces databases, stops IO. All for completely random intervals, sometimes problematic, sometimes not, on 100s of machines. Sometimes the intervals are long enough to knock applications offline.

You can tweak configuration, VMware stun settings, no-quiesce policies, etc.

Is it really worth it?

Japanese Dating Sim
Nov 12, 2003

hehe
Lipstick Apathy

adorai posted:

I am on call 24/7, 365 days per year for escalations. I am on call for direct end user support 4 weeks out of the year (once per quarter). Other than saturday mornings during my 4 weeks, when we have branches open, I think I average about 2 calls per year. I can live with that.

Yeah I could definitely live with that (and what most other people are describing). I think my first company was just run like poo poo.

Guess this isn't something I need to be overly concerned with; as with all things work-related, it all depends on the employer.

Appreciate all the responses.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.
At my last job, we ostensibly had an on-call rotation, but I took it from my team 100% of the time because I think I only got paged twice in two years with anything important.

keseph
Oct 21, 2010

beep bawk boop bawk

SSH IT ZOMBIE posted:

VMware-level snapshots for backups, tied with Commvault or Veeam, along with SQL servers?
...
Is it really worth it?

<rant about snapshots are not backups>
Snaps are convenient, but don't give point-in-time recovery for a database nor single-database restore, and are almost totally unusable with "tightly-coupled" VMs (guest clusters, mirroring, replication, etc).
If you have some other backup system doing its own database-level backups, they can easily stomp on each other's toes and wreck your ability to restore. Hilariously, I've seen Commvault detect that Veeam was affecting the log chain and auto-escalate log backups to Fulls, with the result of it taking 144 Full backups per day -- this had been going on for over two years when I walked in the door at that place and was just subtle enough that no one noticed because it never reported a job error.
</rant>
You're spot on that you have to be concerned about the disk performance, because snapshot cleanup has to perform a ton of IO and isn't always very good at filling but not overloading the IO system. Autotiering can be a huge negative here because the snapshot data that has to be merged at cleanup has by definition only been touched once at most, and probably over a week ago, so it's been pushed down to the slowest media. Autotiering may kick in and escalate that data back to a fast tier, except that it's also still only going to be read once for merging and then deleted, so it has pushed useful data out of fast tiers for no value.
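
For scale, 144 full backups per day works out to one every ten minutes, which would fit a log backup schedule where every run was being promoted to a full (the schedule itself is an assumption; the post only gives the daily count). A quick Python check of the arithmetic, with a made-up function name:

```python
from datetime import timedelta


def backup_interval(jobs_per_day: int) -> timedelta:
    """Average spacing between jobs if they are spread evenly over a day."""
    return timedelta(days=1) / jobs_per_day


interval = backup_interval(144)
print(interval)  # 0:10:00
```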

AlternateAccount
Apr 25, 2005
FYGM

Vulture Culture posted:

At my last job, we ostensibly had an on-call rotation, but I took it from my team 100% of the time because I think I only got paged twice in two years with anything important.

Yeah, I do this. In 18 months, I have received about 3 after hours calls. One was literally mid-move as I was carrying a mattress upstairs, that sucked, but otherwise it's been pretty easy and it's just not worth bothering with an official rotation. Yet.

Dark Helmut
Jul 24, 2004

All growns up

CLAM DOWN posted:

Man, the average age of my department has to be 40+, the idea of a videogame would probably frighten them.

I'm 40 and played Q2 with my cube neighbors over IPX at a Fortune 500 company. And we had a dedicated gaming LAN behind a locked door where we played all the early halflife mods like CS and TF. :smug:

YOLOsubmarine
Oct 19, 2004

When asked which Pokemon he evolved into, Kamara pauses.

"Motherfucking, what's that big dragon shit? That orange motherfucker. Charizard."

keseph posted:

<rant about snapshots are not backups>
Snaps are convenient, but don't give point-in-time recovery for a database nor single-database restore, and are almost totally unusable with "tightly-coupled" VMs (guest clusters, mirroring, replication, etc).

Veeam and Commvault don't create snapshot backups; they create snapshots and use those to create backups written to secondary storage. Both can also be configured to create log chain backups along with application-consistent snapshots of a database to provide point-in-time recovery. It's also silly to say that if you can't roll through the log chain it isn't a backup when many people run their databases in simple recovery mode anyway.

There are also array level tools like SnapManager or Recoverpoint that can be used to create array snapshots coupled with log chain backups for recovery.

Snapshots are backups. They are a recoverable, independent copy of the data. They *probably* shouldn't be your only copy of those backups, but coupled with array based replication they can make an excellent strategy for creating quick, low impact backups with short windows.

VMware snapshots are bad for the reasons listed, but traditional agent-based backup can also lead to serious issues when the array gets overloaded with sequential IO from 100 VMs all kicking off their backups at the same time. VVOLs will provide some relief since they offload snapshot operations to the array, and depending on your array manufacturer, Veeam and Commvault can also use array snapshots to make VMware snapshots short-lived enough to avoid VM stun issues.

All of that said, plenty of people still use Veeam and Commvault and NetBackup and Barracuda and all of the other VADP-based backup tools without serious issue.

Also, auto-tiering is a pretty stupid storage technology but it can generally be configured to avoid collecting data during backup windows. Likewise, most hybrid storage vendors do workload detection to avoid caching big sequential work.
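
To make the "snapshot plus log chain" idea concrete, here is a minimal Python sketch of how a restore to a given point in time gets planned. It assumes an unbroken log chain and treats each backup as just a timestamp. `plan_restore` and its arguments are hypothetical names for illustration, not anything Veeam, Commvault, or SQL Server exposes.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def plan_restore(
    fulls: List[datetime], logs: List[datetime], target: datetime
) -> Optional[Tuple[datetime, List[datetime], datetime]]:
    """Pick the base backup and the log backups needed to reach `target`.

    Returns (base, logs_to_replay, reachable_point), or None when there is
    no base backup (full or app-consistent snapshot) at or before the target.
    """
    candidates = [t for t in fulls if t <= target]
    if not candidates:
        return None
    base = max(candidates)  # newest base that is not past the target

    replay: List[datetime] = []
    reachable = base
    for t in sorted(logs):
        if t <= base:
            continue  # older than the base, not needed
        replay.append(t)
        reachable = min(t, target)  # replay stops at the target time
        if t >= target:
            break  # this log backup covers the target
    return base, replay, reachable


fulls = [datetime(2015, 11, 1), datetime(2015, 11, 2)]
logs = [datetime(2015, 11, 2, h) for h in (6, 12, 18)]
target = datetime(2015, 11, 2, 9)

base, replay, reachable = plan_restore(fulls, logs, target)
# Simple recovery mode: no log backups, so you land on the base itself.
simple = plan_restore(fulls, [], target)
```

With log backups the 09:00 target is reachable by replaying the 06:00 and 12:00 logs on top of the midnight base; without them the restore lands on midnight, which is still a restore.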

YOLOsubmarine fucked around with this message at 17:44 on Nov 4, 2015

Collateral Damage
Jun 13, 2009

ratbert90 posted:

People here asked if I wanted the credentials to throw slack/work email on my phone. First question from me was "Sure, the company will pay for my phone plan yes?"
My boss asked if I wanted an iPhone so I could have a work number and read work email out of office hours. I asked him how much I'd be compensated since my contract doesn't include regular on call.

Kashuno
Oct 9, 2012

Where the hell is my SWORD?
Grimey Drawer

Collateral Damage posted:

My boss asked if I wanted an iPhone so I could have a work number and read work email out of office hours. I asked him how much I'd be compensated since my contract doesn't include regular on call.

I wouldn't have a work phone at all if they weren't compensating 100% tbh, and paying for any effort I put in outside of normal hours.

Collateral Damage
Jun 13, 2009

I meant compensation as in extra pay for practically being on call.

I turned that offer down.

adorai
Nov 2, 2002

10/27/04 Never forget
Grimey Drawer

SSH IT ZOMBIE posted:

Out of curiosity, how many people use VMware-level snapshots for backups, tied with Commvault or Veeam, along with SQL servers? We put them in years ago for an environment of maybe a bit over 1200 VMs. It's nice to have quick restores in the event of some kind of failure...but....
We use VMware snapshots only as manual backups. We use our SAN to snap the VMware datastores at regular intervals. These backups are CRASH consistent. We are comfortable with them, because the drawbacks are pretty minor.

We use our integrated SAN tools to back up all of our SQL and Exchange databases, which are stored on iSCSI LUNs dedicated to those servers. These are application-consistent snapshots. We regularly test these.

keseph posted:

<rant about snapshots are not backups>
Snaps are convenient, but don't give point-in-time recovery for a database nor single-database restore, and are almost totally unusable with "tightly-coupled" VMs (guest clusters, mirroring, replication, etc).
Array level snapshots ARE backups, as long as you understand the pitfalls with your snapshot strategy and you replicate or otherwise move or copy them to other storage. Most people don't actually need point in time recovery or single database restore, they are ok with just mounting the old database files (so long as the DB had committed all writes at the time of the snap).

RFC2324
Jun 7, 2012

http 418

evol262 posted:

Seconding this. Past the mid-level roles, you get called less and less (other than DBAs). Escalated to, maybe, but that's rare.

This was kind of my point. Mid-level roles that are on call take the brunt of it, yes, but people farther up the chain are more likely to be on call (for escalations, but that is still on call).

psydude
Apr 1, 2008

That's not on call. On call is "Make yourself available during these hours. Be sober, in the area, and ready to drive to the data center in case something happens that we need to deal with." Escalating to a senior level resource in the event something critical explodes is just life in general, regardless of industry, position, or field of employment.

SSH IT ZOMBIE
Apr 19, 2003
No more blinkies! Yay!
College Slice

adorai posted:

We use VMware snapshots only as manual backups. We use our SAN to snap the VMware datastores at regular intervals. These backups are CRASH consistent. We are comfortable with them, because the drawbacks are pretty minor.


We'd need to re-lay out our storage to do array level snapshots. Maybe we really need to do that.
We have datastores and VMDKs provisioned out according to performance needs across multiple arrays, not in a way that lets us easily identify what LUNs need to be snapped to grab all of the drives for a given VM. Our storage guys have talked about it, maybe it is the better way to do it.

Our backup solution isn't just snapshots for obvious reasons - the snapshots get backed up in enterprise backup software then released, but we're doing it at the VMware layer and not the SAN.

YOLOsubmarine
Oct 19, 2004

When asked which Pokemon he evolved into, Kamara pauses.

"Motherfucking, what's that big dragon shit? That orange motherfucker. Charizard."

SSH IT ZOMBIE posted:

We'd need to re-lay out our storage to do array level snapshots. Maybe we really need to do that.
We have datastores and VMDKs provisioned out according to performance needs across multiple arrays, not in a way that lets us easily identify what LUNs need to be snapped to grab all of the drives for a given VM. Our storage guys have talked about it, maybe it is the better way to do it.

Our backup solution isn't just snapshots for obvious reasons - the snapshots get backed up in enterprise backup software then released, but we're doing it at the VMware layer and not the SAN.

Your array manufacturer probably has a plugin that will handle creating a consistency group snapshot for VMs that have VMDKs across multiple LUNs.
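
If you want to sanity-check the layout before trusting a plugin, working out which LUNs back a given VM is just a lookup over inventory data. A rough Python sketch follows; the VM names, datastore names, and `array:lun` labels are all invented, and a real version would pull them from your own inventory tooling.

```python
from typing import Dict, List, Set

# Hypothetical inventory: the datastore behind each of a VM's VMDKs,
# and the LUN backing each datastore.
VM_DISKS: Dict[str, List[str]] = {
    "sql01": ["ds-fast-01", "ds-fast-02", "ds-bulk-01"],
    "web01": ["ds-bulk-01"],
}
DATASTORE_LUN: Dict[str, str] = {
    "ds-fast-01": "arrayA:lun12",
    "ds-fast-02": "arrayB:lun03",
    "ds-bulk-01": "arrayA:lun40",
}


def luns_for_vm(vm: str) -> Set[str]:
    """Every LUN that must be snapped together to capture all of a VM's disks."""
    return {DATASTORE_LUN[ds] for ds in VM_DISKS[vm]}


def arrays_for_vm(vm: str) -> Set[str]:
    """Arrays involved; more than one means a single array cannot snap the VM alone."""
    return {lun.split(":")[0] for lun in luns_for_vm(vm)}


for vm in sorted(VM_DISKS):
    print(vm, sorted(luns_for_vm(vm)), sorted(arrays_for_vm(vm)))
```

Any VM that shows up with more than one array is the one to worry about, since a consistency group on one array may not be able to cover disks living on another.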

keseph
Oct 21, 2010

beep bawk boop bawk

adorai posted:

Array level snapshots ARE backups, as long as you understand the pitfalls with your snapshot strategy and you replicate or otherwise move or copy them to other storage.

An array level or hypervisor level snapshot is not a backup; the copy to another device is the backup. The copy survives a software failure like a .vmx corruption or a controller or major physical failure on the storage device, and if "replicate" means storage firmware replicating the snapshot to an identical device then you have to think long and hard about how you protect yourself from firmware bugs. A backup isn't a backup if it shares a single point of failure with the live data, be it the disks or the software on top instantly replicating the damage over to the copy.

Maybe I've just had awful luck seeing a bad SAN controller firmware update nuke half the LUNs on a 55TB device and synchronously mirror the nuke out to the failover DC, and also seen a .vmx corruption hang a VM host and completely ruin all the attached .vmdks and their VM snapshots. But poo poo happens sometimes, just ask codespaces.com how useful their backups were when an admin got owned and deleted both the hosted VMs and the online backups.

SSH IT ZOMBIE
Apr 19, 2003
No more blinkies! Yay!
College Slice

keseph posted:

Maybe I've just had awful luck seeing a bad SAN controller firmware update nuke half the LUNs on a 55TB device and synchronously mirror the nuke out to the failover DC, and also seen a .vmx corruption hang a VM host and completely ruin all the attached .vmdks and their VM snapshots. But poo poo happens sometimes, just ask codespaces.com how useful their backups were when an admin got owned and deleted both the hosted VMs and the online backups.

Truth. Anything important needs a copy elsewhere, a real backup. You aren't alone. SAN, OS-based, and VMware-based snapshots will all occasionally but very rarely screw up in a catastrophic way. It is not due to user error; there are occasionally legitimate bugs.

Specifically, VMware snapshots have always irked me as they opt to write changed blocks to delta files instead of moving original data on write to delta files. It allows for quick snapshot rollback, but if for some reason VMware cannot consolidate, you are looking at data loss.

A datastore filling up is incredibly dangerous and must be avoided at all costs.
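
A toy Python model of the behaviour described above: once a snapshot exists the base disk is frozen and every new write lands in a delta. This is a deliberately simplified sketch (one snapshot, blocks as dictionary entries, invented class name), not how VMware implements it internally.

```python
from typing import Dict, Optional


class RedirectOnWriteDisk:
    """After snapshot(), the base is frozen and writes go to a delta."""

    def __init__(self) -> None:
        self.base: Dict[int, bytes] = {}
        self.delta: Optional[Dict[int, bytes]] = None  # None = no snapshot

    def snapshot(self) -> None:
        self.delta = {}

    def write(self, block: int, data: bytes) -> None:
        target = self.base if self.delta is None else self.delta
        target[block] = data

    def read(self, block: int) -> Optional[bytes]:
        # Newest data wins: check the delta first, then fall back to the base.
        if self.delta is not None and block in self.delta:
            return self.delta[block]
        return self.base.get(block)

    def revert(self) -> None:
        """Roll back: just throw the delta away, which is why it is fast."""
        self.delta = None

    def consolidate(self) -> None:
        """Delete the snapshot: merge every delta block back into the base."""
        if self.delta is not None:
            self.base.update(self.delta)
            self.delta = None


disk = RedirectOnWriteDisk()
disk.write(0, b"old")
disk.snapshot()
disk.write(0, b"new")
current = disk.read(0)        # served from the delta
frozen = disk.base[0]         # base still holds the pre-snapshot data
disk.revert()
after_revert = disk.read(0)   # everything written since the snapshot is gone
```

Losing or failing to merge the delta looks exactly like `revert()`: the only place the current data lives is the delta, so the VM falls back to whatever the base held at snapshot time.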

YOLOsubmarine
Oct 19, 2004

When asked which Pokemon he evolved into, Kamara pauses.

"Motherfucking, what's that big dragon shit? That orange motherfucker. Charizard."

keseph posted:

An array level or hypervisor level snapshot is not a backup; the copy to another device is the backup. The copy survives a software failure like a .vmx corruption or a controller or major physical failure on the storage device, and if "replicate" means storage firmware replicating the snapshot to an identical device then you have to think long and hard about how you protect yourself from firmware bugs. A backup isn't a backup if it shares a single point of failure with the live data, be it the disks or the software on top instantly replicating the damage over to the copy.

Maybe I've just had awful luck seeing a bad SAN controller firmware update nuke half the LUNs on a 55TB device and synchronously mirror the nuke out to the failover DC, and also seen a .vmx corruption hang a VM host and completely ruin all the attached .vmdks and their VM snapshots. But poo poo happens sometimes, just ask codespaces.com how useful their backups were when an admin got owned and deleted both the hosted VMs and the online backups.

They are backups. Whether they are reliable enough to serve as your only backup is a business and technology decision that will depend on a number of things, but they are most certainly backups in the sense that they provide a copy of data as it existed at a previous time. Array level snapshots protect against local file deletion, vmx corruption, guest filesystem corruption, and failed application installs. They provide a restorable copy for about 99% of restore requests. If it wasn't a backup we wouldn't talk about restoring from them. VMware snapshots are a less reliable form of backup. Setting up a SQL job to dump your database to another disk on the same server is an even less reliable form, but that is also still a backup, and something that you'll still find SQL admins doing.

More importantly, most modern arrays have a variety of data protection features like linked checksums, misplaced write detection, regular parity scrubs, battery backed journals, and "always" consistent filesystems with atomic updates. You're far less likely to suffer corruption there than you are on, say, a tape sitting on a shelf somewhere. Likewise, the replication features rely on the filesystem being consistent so an inconsistency will break replication. And the target array can generally be configured to maintain a number of older snapshots to allow rollback on the secondary if necessary.

I'm not sure what a malicious admin has to do with array snapshots. A malicious admin can also delete your backup catalog or "lose" the encryption keys you used to encrypt your data at rest on your backup target, or simply re-configure all of your backup jobs to age out all backups immediately. Not giving the same person rights to delete both production data and backup data is the only real protection against that, but that doesn't preclude using array features as your first backup in the chain, nor does it preclude array-based replication.

The fundamental misconception here seems to be that some people think that backup and DR are the same thing. Having multiple backup copies, including off-site copies, may be one facet of your DR plan, but not every backup needs to be capable of surviving a disaster (your array getting trashed due to a firmware upgrade failing is a disaster and should be covered under your DR plan). Your offline backups written to a Data Domain that sits next to your production SAN aren't useful for DR when the building floods or burns down, but they are still backups. And your synchronously replicated database that sits 200km away isn't a backup because you will never recover aged data from it, but it's probably very useful for DR. Backup is not DR, DR is not backup.
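
Put as two independent questions per copy, the distinction looks something like this Python sketch. The field names and the two-flag model are my own simplification of the paragraph above, not an industry definition.

```python
from dataclasses import dataclass


@dataclass
class DataCopy:
    name: str
    keeps_old_versions: bool  # can you recover data as it was at an earlier time?
    separate_site: bool       # does it survive losing the production site?

    @property
    def is_backup(self) -> bool:
        return self.keeps_old_versions

    @property
    def useful_for_dr(self) -> bool:
        return self.separate_site


copies = [
    DataCopy("Data Domain next to the production SAN", True, False),
    DataCopy("Synchronous replica 200km away", False, True),
    DataCopy("Array snapshots replicated off-site with retention", True, True),
]

for c in copies:
    print(f"{c.name}: backup={c.is_backup}, DR={c.useful_for_dr}")
```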

evobatman
Jul 30, 2006

it means nothing, but says everything!
Pillbug
Got the message yesterday that 2 out of the 4 positions in our support group have to be cut. The group consists of our manager, who I take for granted will not be cut, me, and two coworkers. So the three of us now have to compete for one position, where we have to explain why we are the right person for the remaining job, and try indirectly to explain why the other two are the wrong people. So suddenly I find myself in a loving episode of The Apprentice, where "You're Fired" is a real thing if you gently caress up in the boardroom!

I have already lawyered up.

Gucci Loafers
May 20, 2006

Ask yourself, do you really want to talk to pair of really nice gaudy shoes?


What are you getting a lawyer for?

lampey
Mar 27, 2012

psydude posted:

That's not on call. On call is "Make yourself available during these hours. Be sober, in the area, and ready to drive to the data center in case something happens that we need to deal with." Escalating to a senior level resource in the event something critical explodes is just life in general, regardless of industry, position, or field of employment.

Do companies really require people to be sober while on call if they could still work?

CLAM DOWN
Feb 13, 2007




lampey posted:

Do companies really require people to be sober while on call if they could still work?

How would you get to work say in the middle of the night while on call if you're not sober?

Collateral Damage
Jun 13, 2009

Taxi.

luminalflux
May 27, 2005



CLAM DOWN posted:

How would you get to work say in the middle of the night while on call if you're not sober?

VPN.

feedmegin
Jul 30, 2008

Dark Helmut posted:

I'm 40 and played Q2 with my cube neighbors over IPX at a Fortune 500 company. And we had a dedicated gaming LAN behind a locked door where we played all the early halflife mods like CS and TF. :smug:

Yeah - bear in mind a 40 year old now was born in 1975 and has therefore quite possibly been playing video games since they were a toddler. :corsair:

mayodreams
Jul 4, 2003


Hello darkness,
my old friend

This guy gets it.

bobmarleysghost
Mar 7, 2006



Tab8715 posted:

What are you getting a lawyer for?

In America you can get a lawyer for anything and everything. Might as well exercise the right.

Collateral Damage
Jun 13, 2009

I want to build a remote control datacenter robot so I can replace failed hardware from my couch.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

lampey posted:

Do companies really require people to be sober while on call if they could still work?
With most companies it's "don't be drunk while on-call."

If your concern is "I work on critical 24x7 production systems better when I'm drunk" then I guess that explains a lot about some places I've worked in the past.

Internet Explorer
Jun 1, 2005





Collateral Damage posted:

I want to build a remote control datacenter robot so I can replace failed hardware from my couch.

"remote hands"

Zorak of Michigan
Jun 10, 2006

evobatman posted:

Got the message yesterday that 2 out of the 4 positions in our support group have to be cut. The group consists of our manager, who I take for granted will not be cut, me, and two coworkers. So the three of us now have to compete for one position, where we have to explain why we are the right person for the remaining job, and try indirectly to explain why the other two are the wrong people. So suddenly I find myself in a loving episode of The Apprentice, where "You're Fired" is a real thing if you gently caress up in the boardroom!

Unless your manager has a lot of other responsibilities, I don't think you can assume they won't get cut. A manager with a single direct report usually leaves HR scratching their heads and wondering why that manager's boss couldn't manage the direct report themselves.

Judge Schnoopy
Nov 2, 2005

dont even TRY it, pal

evobatman posted:

Got the message yesterday that 2 out of the 4 positions in our support group have to be cut. The group consists of our manager, who I take for granted will not be cut, me, and two coworkers. So the three of us now have to compete for one position, where we have to explain why we are the right person for the remaining job, and try indirectly to explain why the other two are the wrong people. So suddenly I find myself in a loving episode of The Apprentice, where "You're Fired" is a real thing if you gently caress up in the boardroom!

I don't see a "win" coming out of this for anybody. If you're the one they keep, you're inheriting the workload of 2 people on top of what you already do, and since you were "saved" by management you'll be indebted to do all the work without complaint or additional compensation.

I wouldn't bother with a lawyer, I'd take their announcement as a fine opportunity to start job searching before you're unemployed.

Langolas
Feb 12, 2011

My mustache makes me sexy, not the hat

Judge Schnoopy posted:

I don't see a "win" coming out of this for anybody. If you're the one they keep, you're inheriting the workload of 2 people on top of what you already do, and since you were "saved" by management you'll be indebted to do all the work without complaint or additional compensation.

I wouldn't bother with a lawyer, I'd take their announcement as a fine opportunity to start job searching before you're unemployed.


I agree with this. You should just job hunt and, even if you win that position, leave the company ASAP. That's a sinking ship and you need to bail out.

CLAM DOWN
Feb 13, 2007




Why would you hire a lawyer in that situation anyways?


God, I wish taxis weren't poo poo here and that would actually be an option :smith:


vvv Uber will probably never operate in my province, welp

CLAM DOWN fucked around with this message at 18:13 on Nov 5, 2015

Kashuno
Oct 9, 2012

Where the hell is my SWORD?
Grimey Drawer

CLAM DOWN posted:

How would you get to work say in the middle of the night while on call if you're not sober?

Thanks to Uber, I don't have to even call or speak to someone!

Docjowles
Apr 9, 2009

CLAM DOWN posted:

Why would you hire a lawyer in that situation anyways?

Yeah unless there's some relevant info you're leaving out (you can prove you're being fired over something that's a protected class, basically) I don't see how getting laid off is something to get a lawyer involved for. I mean, it sucks, but there's not much to be done about it.

I'll dogpile with everyone else and say this is a sign that it's time to polish up the resume and YOTJ out of there. Even if you are one of the "lucky" survivors.

Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug
4 people -> 1 person is a "living envy the dead" situation.

Also, shame time: look at all you people not wanting SA upgrade to :yayclod:

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Bhodi posted:

Also, shame time: look at all you people not wanting SA upgrade to :yayclod:
More because of complaints like "we had MySQL corruption because our host rebooted the wrong server" where the filesystem crash-consistency issues are not likely to get better on AWS or wherever


Cloud is cool but bad practices on physical hardware only get worse on someone else's platform

psydude
Apr 1, 2008

Need to get some cloud up in this bitch.

Bhodi
Dec 9, 2007

Oh, it's just a cat.
Pillbug

Vulture Culture posted:

More because of complaints like "we had MySQL corruption because our host rebooted the wrong server" where the filesystem crash-consistency issues are not likely to get better on AWS or wherever


Cloud is cool but bad practices on physical hardware only get worse on someone else's platform
There are no idiot remote hands in AWS, only your own fuckups. Remember when the datacenter went underwater and SA was down for a week or whatever? Again, not an issue in the clod. SA is a prime candidate: a small business with two or three servers that doesn't want to have to staff an actual IT department / person.

Bhodi fucked around with this message at 21:24 on Nov 5, 2015
