MrMoo
Sep 14, 2000

Appropriate, 2016 and vendors still pushing out data in page terminal format:

Sinestro
Oct 31, 2010

The perfect day needs the perfect set of wheels.

Storysmith posted:

Having missed the original freakout, which person was throwing their cake into the mud? I think the fact that people within the project felt free to react with bannings and locking of the bug is a pretty big sign of a problem there.

I feel like a lot of nerds got into computers because we were bullied underdogs and then never internally recalibrated to the fact that almost every one of us makes more than the median salary and has a poo poo ton of opportunity ahead of us, and are no longer underdogs.

I understand them banning the people involved and locking the bug, because the response to the initial rejection wasn't to say "well, let's look at changing it universally in the next major revision" or "that's a bad thing", it was to scream at people and call them names for not agreeing immediately. "Throwing the cake in the mud" was describing the process of "They disagreed with me? I'm gonna quit forever!", which has happened many, many times for even less legitimate reasons, and it's a shame no matter what.

"They won't bend to a partial renaming of a variable that would make stuff inconsistent because I feel like it" is a bad reason to quit a community forever, and honestly if you were the type to do that, you probably shouldn't be in the kinds of leadership roles within the community that she was in no matter what, if it's just your personal feelings driving something, because like my original rant said, through about six and a half layers of exaggeration and fake rage on top of real frustration, a tiny symbolic change like that that isn't even really an issue unless you're intentionally trying to look for things to get offended over is not going to exactly change the world. Considering that the post was coming from a place of frustration about how I feel that every one of these happenings makes me less credible as a minority engineer (objectively, I'd be less likely to put a fellow woman or LGBT engineer into a leadership position within a project I ran if they had a connection with those sorts of politics because I don't want to depend on someone who could just decide that a tiny issue is worth quitting and running away from a project) and a general distaste for symbolic changes, you're pretty much making the point for me. It's generally a stupid point when people try to say that changes shouldn't be made just because there are worse problems in the world, but this is so loving inconsequential either way, especially when there are much, much more legitimate issues within the programming community that need addressed before something like that should even be on the radar. 
The value in terms of changing the world that partially changing a vaguely sexual but entirely non-gender specific double entendre variable name would have is so low that it should be decided based on the same metric as any other variable renaming patch, and I haven't read very much of the R codebase or done a ton of research into this issue, but one of the main objections to merging the patch is that there's still a lot of other places where similar things (give.head shows up a lot) couldn't be changed, so it'd be making the code less consistent and harder to read to half-make a symbolic change, and that's not good engineering. I don't think that NASA is going to change the shape of a rocket to make it less phallic, and that's about the level that the change is at.

fritz
Jul 26, 2003

Sinestro posted:

I understand them banning the people involved and locking the bug, because the response to the initial rejection wasn't to say "well, let's look at changing it universally in the next major revision" or "that's a bad thing", it was to scream at people and call them names for not agreeing immediately. "Throwing the cake in the mud" was describing the process of "They disagreed with me? I'm gonna quit forever!", which has happened many, many times for even less legitimate reasons, and it's a shame no matter what.

The banning was part of the initial rejection tho.

Sinestro
Oct 31, 2010

The perfect day needs the perfect set of wheels.
They're not in the right either. Everyone is a piece of poo poo!

sarehu
Apr 20, 2007

(call/cc call/cc)

Storysmith posted:

I think the fact that people within the project felt free to react with bannings and locking of the bug is a pretty big sign of a problem there.

It was clear from the go that the dude was some posturing asshat. Obviously they were fine with the code how it was at some point because that's how they wrote it, now some bro comes in saying he knows better and you should get with the program and he's gonna waste your time with his bullshit, yeah the answer is to tell him to gently caress right off.

I have no idea how you would be confused by this.

TooMuchAbstraction
Oct 14, 2012

I spent four years making
Waves of Steel
Hell yes I'm going to turn my avatar into an ad for it.
Fun Shoe

Sinestro posted:

They're not in the right either. Everyone is a piece of poo poo!

Finally, something we can all agree on.

kloa
Feb 14, 2007



Sorry, I meant if I go to run my Visual Studio project and something in the code doesn't match the SP, there's no debugging inside of VS.

Jabor
Jul 16, 2010

#1 Loser at SpaceChem

Sinestro posted:

They're not in the right either. Everyone is a piece of poo poo!

Yeah, but you're still saying that quitting the developer community in response to being unilaterally banned from the developer community is somehow you being the one throwing a hissy fit instead of the other way around.

Mostly it just reads like you have no loving idea what went on (and either haven't read or just plain ignored what other people have said about the facts of the matter), but you sure do have a whole lot of preconceived notions about it that you'd really like to be true.

sarehu
Apr 20, 2007

(call/cc call/cc)
The dude was editing comments on his blog posting that he didn't like into childish personal attacks, I think the banninators had the right idea. Edit: Re vvvv: And no actually that includes reasonable people commenting politely.

And no Jabor you are the one who doesn't know what's going on. Stop being so naive and learn to figure out who you shouldn't take at face value.

sarehu fucked around with this message at 07:56 on Feb 18, 2016

Knyteguy
Jul 6, 2005

YES to love
NO to shirts


Toilet Rascal
C# code:
       // Return true if strIn is in valid e-mail format.
       try {
          return Regex.IsMatch(strIn,
                @"^(?("")("".+?(?<!\\)""@)|(([0-9a-z]((\.(?!\.))|[-!#\$%&'\*\+/=\?\^`\{\}\|~\w])*)(?<=[0-9a-z])@))" +
                @"(?(\[)(\[(\d{1,3}\.){3}\d{1,3}\])|(([0-9a-z][-\w]*[0-9a-z]*\.)+[a-z0-9][\-a-z0-9]{0,22}[a-z0-9]))$",
                RegexOptions.IgnoreCase, TimeSpan.FromMilliseconds(250));
       }
https://msdn.microsoft.com/en-us/library/01escwtf(v=vs.110).aspx

necrotic
Aug 2, 2005
I owe my brother big time for this!

Knyteguy posted:

C# code:
       // Return true if strIn is in valid e-mail format.
       try {
          return Regex.IsMatch(strIn,
                @"^(?("")("".+?(?<!\\)""@)|(([0-9a-z]((\.(?!\.))|[-!#\$%&'\*\+/=\?\^`\{\}\|~\w])*)(?<=[0-9a-z])@))" +
                @"(?(\[)(\[(\d{1,3}\.){3}\d{1,3}\])|(([0-9a-z][-\w]*[0-9a-z]*\.)+[a-z0-9][\-a-z0-9]{0,22}[a-z0-9]))$",
                RegexOptions.IgnoreCase, TimeSpan.FromMilliseconds(250));
       }
https://msdn.microsoft.com/en-us/library/01escwtf(v=vs.110).aspx

http://emailregex.com/

Subjunctive
Sep 12, 2006

✨sparkle and shine✨

MrMoo posted:

Appropriate, 2016 and vendors still pushing out data in page terminal format:



I got a FOIA request back from DHS, and it was a PDF of single-character runs of text in the format of a series of 200 terminal screen captures. I've never written that much code for an immigration application before.

sarehu
Apr 20, 2007

(call/cc call/cc)
Is there some common terminal interaction library that people commonly use? Like, the way people do GUI scripting? At My First Internship it would have been really useful for doing some stuff.

necrotic
Aug 2, 2005
I owe my brother big time for this!
tmux has a bunch of scripting capabilities if that's what you mean.

Flobbster
Feb 17, 2005

"Cadet Kirk, after the way you cheated on the Kobayashi Maru test I oughta punch you in tha face!"

This is a fake site, right? Because the Swift one using NSPredicate is the most roundabout and dumb way of matching a regex.

necrotic
Aug 2, 2005
I owe my brother big time for this!
I have no idea but it's probably real since anyone regex matching emails is probably a horror of their own.

MrMoo
Sep 14, 2000

sarehu posted:

Is there some common terminal interaction library that people commonly use? Like, the way people do GUI scripting? At My First Internship it would have been really useful for doing some stuff.

libexpect as an API version of expect?

sarehu
Apr 20, 2007

(call/cc call/cc)

necrotic posted:

I have no idea but it's probably real since anyone regex matching emails is probably a horror of their own.

What are you supposed to do then? Like say you have a big-rear end text file and you want to scrape emails. Or you're scraping address books on an Exchange server and you need to see which emails are really valid email addresses and which are garbage. Or you need to see which are equivalent, e.g. "John Smith"@foo.com might be equivalent to john.smith@foo.com for your purposes.

necrotic
Aug 2, 2005
I owe my brother big time for this!

sarehu posted:

What are you supposed to do then? Like say you have a big-rear end text file and you want to scrape emails. Or you're scraping address books on an Exchange server and you need to see which emails are really valid email addresses and which are garbage. Or you need to see which are equivalent, e.g. "John Smith"@foo.com might be equivalent to john.smith@foo.com for your purposes.

Hence "probably" and not "definitely". But you don't use it to validate that the email is valid, you use it to find them in that scenario and that's it. Email addresses can be a large range of values that are not simply (somedotsandcharsornumbers)@(somecool.domain.maybe), you can quote the target and have spaces even (not that it's exactly popular, but it is a valid email address).

You send something to the email if you want to know it's a real email address.

And if you're just scraping for emails to spam, who cares if it's valid or not, just include it in your awful list.
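
To make the "find them, don't validate them" idea concrete, here is a minimal Python sketch. The pattern and the function name are assumptions made up for illustration; it is deliberately loose and makes no attempt to follow any RFC, so it will miss quoted local parts and will happily return addresses that do not exist.

```python
import re

# Deliberately loose pattern (an assumption for illustration, not an RFC parser):
# a run of characters without whitespace, @, angle brackets or quotes,
# then an @, then a dotted host name.
LOOSE_EMAIL = re.compile(r"[^\s@<>\"]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def find_email_candidates(text):
    """Return substrings that look like email addresses, in order of appearance."""
    return LOOSE_EMAIL.findall(text)


sample = "Contact john.smith@foo.com or <jane@bar.example.org>, not @nobody or foo@."
print(find_email_candidates(sample))
```

Anything this returns is only a candidate; whether mail to it actually gets delivered is a separate question that a pattern cannot answer.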

sarehu
Apr 20, 2007

(call/cc call/cc)

necrotic posted:

Hence "probably" and not "definitely". But you don't use it to validate that the email is valid, you use it to find them in that scenario and that's it.

You may also need to recognize which entries are valid email addresses, and which are definitely invalid, not even attempting to be an email address. You might get some wrong (because it's supposed to be jon.smith, not john.smith) but recognizing valid/invalid email addresses accurately and productively is definitely a thing, and regexes are by far the best tool for the job. Besides hard AI.

Incidentally, the ones you see on that website are all no good. The RFCs don't matter. Hell, it doesn't even follow the RFC it says it's following, and as a result misses valid email addresses like ones with comments. And such email addresses do exist in the wild.

necrotic posted:

You send something to the email if you want to know it's a real email address.
No, sorry, your client marketing analytics solution can't start sending validation emails to customers' clients.

necrotic
Aug 2, 2005
I owe my brother big time for this!
And that's a very specific use case where, of course, yes, you use regex. A very large portion of the people attempting to regex match emails are doing it for registration forms or some other pointless bullshit.

necrotic
Aug 2, 2005
I owe my brother big time for this!

sarehu posted:

No, sorry, your client marketing analytics solution can't start sending validation emails to customers' clients.

Then stop scraping emails jesus loving christ.

Cuntpunch
Oct 3, 2003

A monkey in a long line of kings

kloa posted:

Sorry, I meant if I go to run my Visual Studio project and something in the code doesn't match the SP, there's no debugging inside of VS.

Not only that, but in an enterprise environment the following is not guaranteed to be true for the .NET developers(it certainly isn't true for me):

https://msdn.microsoft.com/en-us/library/cc646024(v=SQL.100).aspx posted:

SQL Server Management Studio must be running under a Windows account that is a member of the sysadmin fixed server role.

sarehu
Apr 20, 2007

(call/cc call/cc)

necrotic posted:

Then stop scraping emails jesus loving christ.

Yes don't scrape emails from your own Exchange server. Good luck with that pal.

necrotic
Aug 2, 2005
I owe my brother big time for this!

sarehu posted:

Yes don't scrape emails from your own Exchange server. Good luck with that pal.

I honestly don't know in what situation you would scrape emails from an Exchange server. It's an email server so why not look at the To/Cc/Bcc/etc... headers?

ErIog
Jul 11, 2001

:nsacloud:

sarehu posted:

No, sorry, your client marketing analytics solution can't start sending validation emails to customers' clients.

The point is that you can't know from a simple regex if an e-mail address is valid or invalid. So the fact that it passes through a validation regex is functionally meaningless since even if it fits the RFC then it still could be a simple typo like "jon" instead of "john."

So while you might fend off the most extreme kind of data entry mistakes your "validation" isn't going to catch the most common and mundane kinds of e-mail address typos.

So you could spend a lot of time writing a regex to try to validate an e-mail address against the RFC, but it's a pretty big waste of time and will just create headaches later when it turns out you accidentally marked valid addresses as invalid.

ErIog fucked around with this message at 03:51 on Feb 18, 2016

TooMuchAbstraction
Oct 14, 2012

I spent four years making
Waves of Steel
Hell yes I'm going to turn my avatar into an ad for it.
Fun Shoe

ErIog posted:

So you could spend a lot of time writing a regex to try to validate an e-mail address against the RFC, but it's a pretty big waste of time and will just create headaches later when it turns out you accidentally marked valid addresses as invalid.

For marketing purposes, I fail to see why a slightly-imperfect filter is so awful. False negatives (valid addresses marked invalid) will not hear from us; false positives ("wrong numbers") will either bounce or accidentally contact the wrong person. But either way as long as the failure rate is low, we'll end up with a collection of addresses that are largely valid, which is a hell of a lot better than a "try the address and see if it delivers" approach.

I...don't really get why this is controversial?

ShoulderDaemon
Oct 9, 2003
support goon fund
Taco Defender

TooMuchAbstraction posted:

For marketing purposes, I fail to see why a slightly-imperfect filter is so awful. False negatives (valid addresses marked invalid) will not hear from us; false positives ("wrong numbers") will either bounce or accidentally contact the wrong person. But either way as long as the failure rate is low, we'll end up with a collection of addresses that are largely valid, which is a hell of a lot better than a "try the address and see if it delivers" approach.

I...don't really get why this is controversial?

I think most of us are assuming that you are vastly overestimating the cost of a false positive. Email bounces and misdeliveries are cheap. Your marketing coverage will be better (zero percent false negative rate!) if you just don't do any validation, and your costs will be very similar - there's no downside, a nontrivial potential upside (assuming you believe your marketing works), and it's less work for you!

Fundamentally, the validity of an email address carries no meaningful information up until the point at which you are actually about to send an email; at that point, the easiest and most accurate way to determine validity is to simply send the email. Doing anything else just means that you'll piss off people whose email addresses are incorrectly rejected, at little-to-zero gain.

Dessert Rose
May 17, 2004

awoken in control of a lucid deep dream...

sarehu posted:

The dude was editing comments on his blog posting that he didn't like into childish personal attacks, I think the banninators had the right idea.

Yeah, childish personal attacks like "Otters are super loving cute you guys".

I took a look at the other, unedited, posts of one of the more prolific posters that was affected by these edits and I can't see how that could have been anything but an improvement.

He can do whatever he wants with comments on his own blog, those posters don't have some sort of right to have their hateful garbage immortalized on his site.

sarehu
Apr 20, 2007

(call/cc call/cc)

necrotic posted:

I honestly don't know in what situation you would scrape emails from an Exchange server. It's an email server so why not look at the To/Cc/Bcc/etc... headers?

Also their address books, the content of their emails, and other things that aren't email (and things that aren't Exchange, too). For analytics.

(Also, IIRC even some of the data you find in "To" fields might be really sketchy, if it's Exchange.)

ErIog posted:

So while you might fend off the most extreme kind of data entry mistakes your "validation" isn't going to catch the most common and mundane kinds of e-mail address typos.

Actually it catches and prevents a whole lot of useless garbage data.

And it's generally possible to make a regex that works well and has a negligible error rate. For when you need to do that sort of thing.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe
Basically email addresses are horrible, dealing with email is horrible, anyone who thinks they've found a simple easy way to deal with email is probably horrible.

Cuntpunch
Oct 3, 2003

A monkey in a long line of kings
Regex talk? Poor use of regex talk? Someone here *must* have come across "let's parse/validate HTML with regex" recently...right?

kloa
Feb 14, 2007


Cuntpunch posted:

Regex talk? Poor use of regex talk? Someone here *must* have come across "let's parse/validate HTML with regex" recently...right?

:catstare: pray for that person

gonadic io
Feb 16, 2011

>>=
I recently parsed our logs for an XML request (between Envelope tags) that contained a particular 16-digit ID between Id tags.

Then I stripped the indentation and newlines using another regex and had bash generate SQL queries to insert the XML into the db (these we ran manually). Thought of this thread while I did it but it was a 1-time thing (backfilling the rows in our db that were generated before we started saving the response XML automatically) and the format was very fixed.
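
For anyone curious what that kind of one-off looks like, here is a rough Python sketch of the steps described above (the post used bash, so this is a swapped-in language). The tag layout, the `responses` table, and the `response_xml` and `request_id` columns are all assumptions for illustration; none of them come from the post. Building SQL by string formatting is only tolerable here because it is a one-time job over trusted, fixed-format logs whose output is reviewed and run by hand.

```python
import re

# Tag names, table name and column names are hypothetical.
ENVELOPE = re.compile(r"<Envelope>.*?</Envelope>", re.DOTALL)
ID_TAG = re.compile(r"<Id>(\d{16})</Id>")


def backfill_statements(log_text, wanted_id):
    """Build one SQL statement per logged envelope that carries wanted_id."""
    statements = []
    for envelope in ENVELOPE.findall(log_text):
        match = ID_TAG.search(envelope)
        if not match or match.group(1) != wanted_id:
            continue
        # Strip the indentation and newlines that sit between tags.
        flat = re.sub(r">\s+<", "><", envelope.strip())
        # Double single quotes so the XML survives inside a SQL string literal.
        escaped = flat.replace("'", "''")
        statements.append(
            f"UPDATE responses SET response_xml = '{escaped}' "
            f"WHERE request_id = '{wanted_id}';"
        )
    return statements
```

This only holds up because the log format is rigid; anything with nested or attribute-bearing Envelope tags would need a real XML parser.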

HappyHippo
Nov 19, 2003
Do you have an Air Miles Card?

Hammerite posted:

Basically email addresses are horrible, dealing with email is horrible, anyone who thinks they've found a simple easy way to deal with email is probably horrible.

Whether they work or not, I don't think many of those email regexes could be accused of being simple.

Munkeymon
Aug 14, 2003

Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.



ShoulderDaemon posted:

I think most of us are assuming that you are vastly overestimating the cost of a false positive. Email bounces and misdeliveries are cheap. Your marketing coverage will be better (zero percent false negative rate!) if you just don't do any validation, and your costs will be very similar - there's no downside, a nontrivial potential upside (assuming you believe your marketing works), and it's less work for you!

Fundamentally, the validity of an email address carries no meaningful information up until the point at which you are actually about to send an email; at that point, the easiest and most accurate way to determine validity is to simply send the email. Doing anything else just means that you'll piss off people whose email addresses are incorrectly rejected, at little-to-zero gain.

What if I consider excluding people who care deeply enough about whether forms will accept their technically-RFC-compliant :fishmech: addresses that they'll give up and not use my service rather than excluding the comment or whatever from their email a net gain?

feedmegin
Jul 30, 2008

kloa posted:

:catstare: pray for that person

This may be :thejoke: but I was assuming a reference to this - http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags

leper khan
Dec 28, 2010
Honest to god thinks Half Life 2 is a bad game. But at least he likes Monster Hunter.

Munkeymon posted:

What if I consider excluding people who care deeply enough about whether forms will accept their technically-RFC-compliant :fishmech: addresses that they'll give up and not use my service rather than excluding the comment or whatever from their email a net gain?

I'm totally ok with that as long as you aren't a utility provider. I had a /real/ fun time recovering my password when the electric company's sign up email validation presumably used the correct email validation regex and their recovery page did not.

I may have asked their support to leave detailed messages for their engineering staff about how to fix their poo poo and why they should use an RFC compliant regex.

Impotence
Nov 8, 2010
Lipstick Apathy

sarehu posted:

And it's generally possible to make a regex that works well and has a negligible error rate. For when you need to do that sort of thing.

I usually just use one of those libraries that just do everything for you like not care too much past an @ existing, autosuggesting common typos for major webmail provider domains, and checking for a valid MX record on the host
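
A minimal Python sketch of the first two checks described above (an @ exists, and typo suggestions for big webmail domains). The function name, the domain list, and the 0.8 similarity cutoff are assumptions for illustration and are not taken from any particular library. The MX lookup is left out because it needs a DNS resolver, which the Python standard library does not provide.

```python
import difflib

# Hypothetical short list; a real library would ship a much longer one.
COMMON_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "outlook.com"]


def check_address(address):
    """Return (plausible, suggestion).

    plausible  -- True if there is an @ with something on both sides.
    suggestion -- a corrected address if the domain looks like a typo of a
                  common provider, otherwise None.
    """
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        return False, None
    domain = domain.lower()
    if domain in COMMON_DOMAINS:
        return True, None
    # Fuzzy-match the domain against the known providers.
    close = difflib.get_close_matches(domain, COMMON_DOMAINS, n=1, cutoff=0.8)
    suggestion = f"{local}@{close[0]}" if close else None
    return True, suggestion
```

The suggestion is meant to be shown to the user ("did you mean ...?") and not applied silently, since an unusual domain may be perfectly correct.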

Series DD Funding
Nov 25, 2014

by exmarx
An RFC-compliant regex doesn't exist because comments can nest infinitely :fishmech:
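
To illustrate the nesting point: balanced parentheses of arbitrary depth are beyond a classic regular expression, but a depth counter handles them in a few lines. The Python sketch below is a simplified assumption of how comment stripping could work; it ignores quoted strings and backslash escapes, both of which a real address parser would also have to deal with. (Some regex engines offer recursion extensions that can match nested brackets, but that goes beyond regular expressions in the formal sense.)

```python
def strip_comments(text):
    """Remove parenthesised comments, which may nest, using a depth counter."""
    out = []
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                raise ValueError("unbalanced ')'")
            depth -= 1
        elif depth == 0:
            # Only keep characters that are outside every comment.
            out.append(ch)
    if depth != 0:
        raise ValueError("unbalanced '('")
    return "".join(out)


print(strip_comments("john(a (nested (deeply)) comment)@example.com"))
```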
