|
Nevett posted:This is in C#, in an ASP.NET project I've inherited. Apart from missing a "break", given a sane implementation of substring and string comparison (e.g. substring doesn't allocate a new string), this isn't really much different from the normal approach, is it?
|
# ? Aug 31, 2010 13:33 |
|
Well, here's the method that uses it:code:
|
# ? Aug 31, 2010 13:42 |
|
markerstore posted:Apart from missing a "break", given a sane implementation of substring and string comparison (e.g. substring doesn't allocate a new string), this isn't really much different from the normal approach, is it? Disregarding the stupidity of the actual usage scenario, these would've been saner implementations in my opinion: code:
This is how I'd have done it to begin with: code:
|
# ? Aug 31, 2010 14:21 |
|
Regexes are actually a pretty good idea for this simple validation. Honestly, return str =~ /^\d+$/ is a much clearer and more legible way of expressing "this should consist solely of digits", and return str =~ m!^(\d{2})/(\d{2})/\d{4}$! and int($1) < 32 and int($2) < 13 is a much better way of expressing that CheckDate thingy. They're still simpler and clearer even after translating them to a language that isn't so straightforward with regards to regex matching. Of course, TryParse is really the better way of handling it. You can even hand back the parsed result through an out parameter, since you know the calling code is probably immediately going to parse it once you tell it it's good.
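A minimal sketch of both ideas, in Python rather than the thread's C# or Perl. The helper names `is_all_digits` and `try_parse_int` are made up for illustration, and the tuple return is only a stand-in for a C#-style out parameter:

```python
import re

# One or more ASCII digits; fullmatch() requires the whole string to match,
# so no explicit ^ and $ anchors are needed.
DIGITS_ONLY = re.compile(r"[0-9]+")


def is_all_digits(text):
    """Return True if text consists solely of ASCII digits."""
    return DIGITS_ONLY.fullmatch(text) is not None


def try_parse_int(text):
    """Hypothetical TryParse-style helper: return (ok, value)."""
    if not is_all_digits(text):
        return False, 0
    return True, int(text)
```

The caller gets the validation result and the parsed value in one call, so it never has to parse the same string a second time.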
|
# ? Aug 31, 2010 14:26 |
|
SirViver posted:Substring does allocate a new string. You can read the characters directly from the string, though, making substring completely unnecessary for reading single characters. That said, how would you implement substring without allocating a new string? He may be referring to Java, where substring uses the parent's char array. http://www.javamex.com/tutorials/memory/string_memory_usage.shtml
|
# ? Aug 31, 2010 14:34 |
|
Nevett posted:This is in C#, in an ASP.NET project I've inherited. Run this thread through there and see how long it takes.
|
# ? Aug 31, 2010 15:24 |
|
Geekner posted:He may be referring to Java, where substring uses the parent's char array. http://www.javamex.com/tutorials/memory/string_memory_usage.shtml Jabor posted:Regexes are actually a pretty good idea for this simple validation.
|
# ? Aug 31, 2010 15:37 |
|
markerstore posted:Apart from missing a "break", given a sane implementation of substring and string comparison (e.g. substring doesn't allocate a new string), this isn't really much different from the normal approach, is it? Though, indeed, SirViver is almost sure to be on the mark: since ASCII 0-9 are contiguous, the normal approach is surely to check something like if (a[i] >= '0' && a[i] <= '9') { ... }. But I disagree on the regex point; I mean, /-?\d+(\.\d+)?/ isn't the goriest regex if you wanted to do it right, anyway.
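For comparison, here are both approaches from this post as a rough Python sketch. The function names are invented for illustration, and the regex is applied with `fullmatch` so that the whole string has to be a number (the pattern as posted has no anchors):

```python
import re


def is_ascii_digits(text):
    """Range check that relies on '0'..'9' being contiguous code points."""
    return len(text) > 0 and all("0" <= ch <= "9" for ch in text)


# Optional minus sign, integer part, optional fractional part.
# re.ASCII keeps \d restricted to 0-9.
NUMBER = re.compile(r"-?\d+(\.\d+)?", re.ASCII)


def is_number(text):
    """Return True if text is a (possibly negative) decimal number."""
    return NUMBER.fullmatch(text) is not None
```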
|
# ? Aug 31, 2010 16:00 |
|
SirViver posted:Substring does allocate a new string. You can read the characters directly from the string, though, making substring completely unnecessary for reading single characters. That said, how would you implement substring without allocating a new string? Strings are immutable in C# like in Java, why would it allocate a new string?
|
# ? Aug 31, 2010 16:28 |
|
b0lt posted:Strings are immutable in C# like in Java, why would it allocate a new string? Because otherwise the substring has to keep a ref to the old string, which keeps the whole parent alive; depending on the relative sizes of the strings, that could be bad.
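The trade-off can be sketched in Python, with the caveat that this is only an analogy: it uses `bytes` and `memoryview` rather than C# or Java strings, and the buffer size is arbitrary. A view shares the parent's storage and holds a reference to it, while a slice is an independent copy:

```python
# Roughly 1 MiB parent buffer (256 * 4096 bytes).
parent = bytes(range(256)) * 4096

# A memoryview slice shares the parent's storage: no bytes are copied,
# but the view keeps the whole parent alive for as long as it exists.
view = memoryview(parent)[10:14]

# Slicing the bytes object itself allocates an independent 4-byte copy.
copy = parent[10:14]

shares_parent = view.obj is parent   # the view still points at the parent
same_content = bytes(view) == copy   # both expose the same four bytes
```

Holding on to `view` pins the full megabyte in memory; holding on to `copy` pins four bytes.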
|
# ? Aug 31, 2010 16:33 |
|
SirViver posted:Maybe you're right, but to be honest, personally I avoid regexes as much as possible. Unless you're well versed in regex syntax and use them constantly they tend to become maintenance nightmares quickly. For people who learn regex only to solve a problem and immediately forget how it works afterwards (like me) they also end up completely indecipherable, regardless how simple. Then do something like this: code:
Of course, in the case of dates, they're not the best tool for the job, I was just using this as an example. Dates should always be parsed/tested using whatever calendar facilities your framework provides.
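To illustrate the point about dates, here is a small Python sketch. It assumes a dd/mm/yyyy layout, and both function names are hypothetical. The regex check follows the shape-and-range test quoted earlier in the thread; the second function leaves the decision to the standard library's calendar logic:

```python
import re
from datetime import datetime

DATE_SHAPE = re.compile(r"(\d{2})/(\d{2})/\d{4}", re.ASCII)


def regex_check_date(text):
    """Shape-and-range check: two-digit day < 32 and two-digit month < 13."""
    m = DATE_SHAPE.fullmatch(text)
    return m is not None and int(m.group(1)) < 32 and int(m.group(2)) < 13


def calendar_check_date(text):
    """Let the calendar code decide whether the date really exists."""
    try:
        datetime.strptime(text, "%d/%m/%Y")
    except ValueError:
        return False
    return True
```

The regex version accepts impossible dates such as 31/02/2010 and 00/00/2010; the calendar version rejects them.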
|
# ? Aug 31, 2010 16:56 |
|
code:
|
# ? Aug 31, 2010 17:24 |
|
SirViver posted:Substring does allocate a new string. You can read the characters directly from the string, though, making substring completely unnecessary for reading single characters. That said, how would you implement substring without allocating a new string?
|
# ? Aug 31, 2010 17:33 |
|
No Pants posted:Char.IsNumber(Char)
|
# ? Aug 31, 2010 19:46 |
|
Darth Nemesis posted:That's going to match things like ⅞ and ௰. Do you really want it to accept all of these? Edit: That actually makes things worse. Carry on. No Pants fucked around with this message at 21:49 on Aug 31, 2010 |
# ? Aug 31, 2010 21:16 |
|
Lexical Unit posted:Note that it's completely possible and straightforward to use non-static methods for events. You tend to see stuff like this a lot when someone couldn't figure out the syntax for taking a member pointer. Which is not completely unreasonable.
|
# ? Aug 31, 2010 22:53 |
|
No Pants posted:Fine. ch <= '\x00ff' && char.IsDigit(ch), then. http://dotnetpad.net/ViewPaste/v5vLYwZDTUStflE3PsCHbA
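The same pitfall exists outside .NET. As a rough Python analogue (not a statement about how `Char.IsNumber` or `char.IsDigit` classify any particular character), the built-in Unicode-aware checks accept far more than 0-9, and an explicit range check is the simplest way to stay ASCII-only. The helper name `is_ascii_digit` is made up for this sketch:

```python
import re

# An ASCII digit, an Arabic-Indic digit, a vulgar fraction, the Tamil number ten.
samples = ["7", "٣", "⅞", "௰"]


def is_ascii_digit(ch):
    """Accept only the ten characters '0' through '9'."""
    return "0" <= ch <= "9"


# For each sample: (isnumeric, isdecimal, ASCII-only range check).
table = {ch: (ch.isnumeric(), ch.isdecimal(), is_ascii_digit(ch)) for ch in samples}

# \d is Unicode-aware for str patterns unless re.ASCII is given.
unicode_match = re.fullmatch(r"\d+", "٣") is not None
ascii_match = re.fullmatch(r"\d+", "٣", re.ASCII) is not None
```

Only "7" passes all three checks; every sample passes `isnumeric`.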
|
# ? Aug 31, 2010 22:56 |
|
Smugdog Millionaire posted:http://dotnetpad.net/ViewPaste/v5vLYwZDTUStflE3PsCHbA http://dotnetpad.net/ViewPaste/Lp6md7ZXDkmeyXlb-uNjmA
|
# ? Aug 31, 2010 23:11 |
|
This, this here, this is what a coding horror looks like. Macros and functions renamed, and several hundred lines of sub-horror snipped out (largely to protect certain secrets, and partly to actually make it a little clearer, if there is such a thing). This is a chunk of functionality programmed entirely by macros. The general pattern is (and there are a number of different implementation files): code:
code:
And there's bonus horror! A lot of this code is handling semaphores badly. The implementation code handles semaphores badly. Debugging this kind of code is near enough impossible. Oh, in case you're wondering. Yes, there is a reason it is written this way. And, no, it isn't a good reason.
|
# ? Aug 31, 2010 23:29 |
|
SirViver posted:Maybe you're right, but to be honest, personally I avoid regexes as much as possible. Unless you're well versed in regex syntax and use them constantly they tend to become maintenance nightmares quickly. For people who learn regex only to solve a problem and immediately forget how it works afterwards (like me) they also end up completely indecipherable, regardless how simple. But yes, for very simple cases that are unlikely to need correction and if using static precompiled ones (for better performance) they should be acceptable. My regex-allergic coworkers will write hundreds of lines of tedious, impenetrable if soup to grab some information from a coded string and I'll come along and make a regular expression that's less fragile* and does the same job - often more. Of course, it depends on what you do all day, but they're super loving helpful if you get the hang of them. *As in, written so they won't break if there's an extra character anywhere in the string. This is important here because when their spaghetti code can't handle a tiny variation in the input, it costs someone else (who is already really loving busy) time to sort it out and if that happens every day, then someone is wasting tons of time dealing with it.
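As a sketch of what "less fragile" can mean in practice, here is a Python example. The record layout `<SKU>-<qty>x@<price>` is invented purely for illustration and has nothing to do with the poster's actual data. The pattern tolerates stray whitespace around each separator, which fixed-index slicing would not:

```python
import re

# Hypothetical coded string, e.g. "AB123-4x@9.99": SKU, quantity, unit price.
RECORD = re.compile(
    r"\s*(?P<sku>[A-Z]{2}\d{3})\s*-\s*(?P<qty>\d+)\s*x\s*@\s*(?P<price>\d+\.\d{2})\s*",
    re.ASCII,
)


def parse_record(text):
    """Return (sku, qty, price), or None if the text does not fit the layout."""
    m = RECORD.fullmatch(text)
    if m is None:
        return None
    return m.group("sku"), int(m.group("qty")), float(m.group("price"))
```

Input that does not fit yields `None` instead of silently misaligned fields, so the failure is visible at the point of parsing.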
|
# ? Aug 31, 2010 23:33 |
|
I figured it would be unfair not to give an example of one implementation. This is a genuine reduction of the C file included 3 times in the 'client' example. code:
code:
|
# ? Aug 31, 2010 23:49 |
|
Flobbster posted:Then do something like this: That said, I have to admit that I very rarely run into the need to parse strings anyway. Maybe if I had to do a lot of that my opinion would be different vv
|
# ? Sep 1, 2010 09:34 |
|
SirViver posted:That said, I have to admit that I very rarely run into the need to parse strings anyway. Maybe if I had to do a lot of that my opinion would be different vv Yeah, maybe you'd learn to regex. They are not some deep mystery only comprehensible by the gods, they're the most trivial way of expressing a string pattern.
|
# ? Sep 1, 2010 09:51 |
|
SirViver posted:It didn't even occur to me to not do that to begin with . Still, I'm wary of them and don't use them unless the parsing complexity really requires a regex. If the code is ten times longer than a regex but easier to read and maintain for me and more importantly my coworkers, I'll prefer the non-regex solution any day. One thing I've learned during my years is that writing "smart" or compact code at the expense of clarity does not help at all maintaining it in the long run. Regexes, unless clearly the better solution (I'm not arguing against their use in general, just using them where there is no distinct advantage), make the code look a lot more hostile, even if the parsing that is being done is very simple. Your point makes sense, but I'm not sure it's applicable here. At some point brevity becomes its own simplicity. A single line regex with maybe a comment wrapped in a well-named function that calls the regex is probably a whole lot easier for other programmers to follow than an if..then soup spaghetti which can mask subtle bugs with the validation that will be hard to track down. The limits of the regex are almost always explicitly spelled out in the regex, and a regex is very simple to pull out to test separately if it does become an issue. It's all about shades of grey here, but most regexes aren't really that complicated. They're handy. ErIog fucked around with this message at 14:17 on Sep 1, 2010 |
# ? Sep 1, 2010 14:14 |
|
porkfactor posted:This is a genuine reduction of the C file included 3 times in the 'client' example. It's like they saw "x-macros" and the original Bourne shell's code and thought "those guys didn't go nearly far enough."
|
# ? Sep 1, 2010 14:44 |
|
ErIog posted:Your point makes sense, but I'm not sure it's applicable here. At some point brevity becomes its own simplicity. A single line regex with maybe a comment wrapped in a well-named function that calls the regex is probably a whole lot easier for other programmers to follow than an if..then soup spaghetti which can mask subtle bugs with the validation that will be hard to track down. The limits of the regex are almost always explicitly spelled out in the regex, and a regex is very simple to pull out to test separately if it does become an issue. Regexes rule! I totally built an HTML parsing library using them!
|
# ? Sep 1, 2010 15:41 |
|
Zombywuf posted:Yeah, maybe you'd learn to regex. They are not some deep mystery only comprehensible by the gods, they're the most trivial way of expressing a string pattern. Yeah, this. You really should learn them, and they're really not that hard. They have applications other than programming too.
|
# ? Sep 1, 2010 15:59 |
|
Wheany posted:Yeah, this. You really should learn them, and they're really not that hard. They have applications other than programming too. Hey baby, s/your pants//g gently caress off, nerd
|
# ? Sep 1, 2010 22:12 |
|
TRex EaterofCars posted:Hey baby, s/your pants//g Is it bad that this would work on me?
|
# ? Sep 1, 2010 22:56 |
|
quiggy posted:Is it bad that this would work on me? The best chat-up lines are the insanely geeky ones because anybody they work on is an insane geek.
|
# ? Sep 1, 2010 23:21 |
|
porkfactor posted:Oh, in case you're wondering. Yes, there is a reason it is written this way. And, no, it isn't a good reason. Please tell me because it's generated from some weird-rear end-language to C parser.
|
# ? Sep 1, 2010 23:22 |
|
NotShadowStar posted:Please tell me because it's generated from some weird-rear end-language to C parser. Oh, if only. That would even make sense. Apparently the author thought that writing like this would be an awesome way to prevent someone from reverse engineering the code. There are times I swear some of the crap I have to work with is some kind of elaborate practical joke.
|
# ? Sep 1, 2010 23:54 |
|
Captain Capacitor posted:Regexes rule! I totally built an HTML parsing library using them! Best stackoverflow post: quote:You can't parse [X]HTML with regex. Because HTML can't be parsed by regex. Regex is not a tool that can be used to correctly parse HTML. As I have answered in HTML-and-regex questions here so many times before, the use of regex will not allow you to consume HTML. Regular expressions are a tool that is insufficiently sophisticated to understand the constructs employed by HTML. HTML is not a regular language and hence cannot be parsed by regular expressions. Regex queries are not equipped to break down HTML into its meaningful parts. so many times but it is not getting to me. Even enhanced irregular regular expressions as used by Perl are not up to the task of parsing HTML. You will never make me crack. HTML is a language of sufficient complexity that it cannot be parsed by regular expressions. Even Jon Skeet cannot parse HTML using regular expressions. Every time you attempt to parse HTML with regular expressions, the unholy child weeps the blood of virgins, and Russian hackers pwn your webapp. Parsing HTML with regex summons tainted souls into the realm of the living. HTML and regex go together like love, marriage, and ritual infanticide. The <center> cannot hold it is too late. The force of regex and HTML together in the same conceptual space will destroy your mind like so much watery putty. If you parse HTML with regex you are giving in to Them and their blasphemous ways which doom us all to inhuman toil for the One whose Name cannot be expressed in the Basic Multilingual Plane, he comes. HTML-plus-regexp will liquify the nerves of the sentient whilst you observe, your psyche withering in the onslaught of horror. 
Rege̿̔̉x-based HTML parsers are the cancer that is killing StackOverflow it is too late it is too late we cannot be saved the trangession of a chi͡ld ensures regex will consume all living tissue (except for HTML which it cannot, as previously prophesied) dear lord help us how can anyone survive this scourge using regex to parse HTML has doomed humanity to an eternity of dread torture and security holes using regex as a tool to process HTML establishes a breach between this world and the dread realm of c͒ͪo͛ͫrrupt entities (like SGML entities, but more corrupt) a mere glimpse of the world of regex parsers for HTML will instantly transport a programmer's consciousness into a world of ceaseless screaming, he comes, the pestilent slithy regex-infection will devour your HTML parser, application and existence for all time like Visual Basic only worse he comes he comes do not fight he com̡e̶s, ̕h̵is un̨ho͞ly radiańcé destro҉ying all enli̍̈́̂̈́ghtenment, HTML tags lea͠ki̧n͘g fr̶ǫm ̡yo͟ur eye͢s̸ ̛l̕ik͏e liquid pain, the song of re̸gular expression parsing will extinguish the voices of mortal man from the sphere I can see it can you see ̲͚̖͔̙î̩́t̲͎̩̱͔́̋̀ it is beautiful the final snuffing of the lies of Man ALL IS LOŚ͖̩͇̗̪̏̈́T ALL IS LOST the pon̷y he comes he c̶̮omes he comes the ichor permeates all MY FACE MY FACE ᵒh god no NO NOO̼OO NΘ stop the an*̶͑̾̾̅ͫ͏̙̤g͇̫͛͆̾ͫ̑͆l͖͉̗̩̳̟̍ͫͥͨe̠̅s ͎a̧͈͖r̽̾̈́͒͑e not rè̑ͧ̌aͨl̘̝̙̃ͤ͂̾̆ ZA̡͊͠͝LGΌ ISͮ̂҉̯͈͕̹̘̱ TO͇̹̺ͅƝ̴ȳ̳ TH̘Ë͖́̉ ͠P̯͍̭O̚N̐Y̡ H̸̡̪̯ͨ͊̽̅̾̎Ȩ̬̩̾͛ͪ̈́̀́͘ ̶̧̨̱̹̭̯ͧ̾ͬC̷̙̲̝͖ͭ̏ͥͮ͟Oͮ͏̮̪̝͍M̲̖͊̒ͪͩͬ̚̚͜Ȇ̴̟̟͙̞ͩ͌͝S̨̥̫͎̭ͯ̿̔̀ͅ
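For the record, the boring alternative is short. A minimal Python sketch using the standard library's `html.parser` to collect link targets; the class and function names are made up for illustration:

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.hrefs.append(value)


def collect_links(markup):
    """Return all href values found in <a> tags, in document order."""
    parser = LinkCollector()
    parser.feed(markup)
    parser.close()
    return parser.hrefs
```

Nesting, mixed-case tags, either quote style and commented-out markup are handled by the parser, which are exactly the cases where a hand-rolled regex tends to fall over.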
|
# ? Sep 2, 2010 01:48 |
|
TRex EaterofCars posted:Hey baby, s/your pants//g Well, I couldn't actually think of that many examples, and even those are close to programming. Anyway: learn regular expressions, they're really loving useful and not hard at all. (try http://www.weitz.de/regex-coach/ )
|
# ? Sep 2, 2010 11:07 |
|
spinflip posted:Best stackoverflow post: I was hoping someone was going to post that. I love working with regexes in Python, especially taking advantage of the string concatenation. code:
|
# ? Sep 2, 2010 14:49 |
|
code:
|
# ? Sep 2, 2010 15:25 |
|
CHRISTS FOR SALE posted:
I'm not sure hitting paste twice is really a coding horror, especially since it doesn't affect functionality or introduce any display bugs.
|
# ? Sep 2, 2010 16:17 |
|
Captain Capacitor posted:I was hoping someone was going to post that. In Perl, you can do this a bit easier by using the /x switch on the regex; since Python regexes are PCRE to my knowledge, it's likely the /x switch is implemented and you don't have to simulate it yourself. Cf. code:
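For what it's worth, Python's `re` module is its own engine rather than PCRE, but it does have an equivalent of /x: the `re.VERBOSE` flag (also spelled `re.X`, or `(?x)` inline). A minimal sketch, reusing the number pattern from earlier in the thread:

```python
import re

NUMBER = re.compile(
    r"""
    -?              # optional leading minus sign
    \d+             # integer part: one or more digits
    (?: \. \d+ )?   # optional fractional part
    """,
    re.VERBOSE | re.ASCII,
)
```

In verbose mode, unescaped whitespace in the pattern is ignored and `#` starts a comment, so the pattern can be laid out and annotated without string concatenation.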
|
# ? Sep 2, 2010 17:32 |
|
Lumpy posted:I'm not sure hitting paste twice is really a coding horror, especially since it doesn't affect functionality or introduce any display bugs.
|
# ? Sep 2, 2010 18:27 |
|
Captain Capacitor posted:I was hoping someone was going to post that. I'm sure you probably already know this, but it's good practice to use r"raw strings" for regexps in Python so you don't have to double up backslashes (not that you had to here, of course).
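A quick sketch of why it matters, using `\b` as the example. In an ordinary Python string literal `\b` is a backspace character; in a raw string it reaches the regex engine intact as a word boundary:

```python
import re

plain = "\bcat\b"    # "\b" is turned into a backspace (0x08) by the string literal
raw = r"\bcat\b"     # backslashes survive, so the regex engine sees word boundaries

plain_hit = re.search(plain, "a cat sat") is not None
raw_hit = re.search(raw, "a cat sat") is not None
```

Here `plain_hit` is False and `raw_hit` is True: the non-raw pattern is looking for literal backspace characters around "cat".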
|
# ? Sep 2, 2010 19:17 |