|
Could you explain what exactly (_.strip() for _ in s.split(',')) is doing?
|
# ? May 10, 2014 20:44 |
|
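For every item ("_", in this case) in the iterable (s.split(',')), call that item's .strip() method and yield the stripped result. Strictly speaking, the parentheses make this a generator expression rather than a tuple, so nothing is computed until you iterate over it. Ultimately, passing it to tuple() gives you something like this: ("115 Berry Lane", "Plano", "TX 24134") For more explanation: https://docs.python.org/2/tutorial/datastructures.html#list-comprehensions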
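A quick way to see what the expression produces is to run it on a sample string. The address below is an assumption, chosen only to match the example result quoted in this post.

```python
# Sample input (assumed for illustration; not from the original question).
s = "115 Berry Lane , Plano,  TX 24134 "

# Parentheses around a comprehension create a generator, not a tuple.
stripped = (_.strip() for _ in s.split(','))
kind = type(stripped).__name__

# Consuming the generator yields the stripped pieces, in order.
as_tuple = tuple(stripped)

# A generator can only be consumed once; a second pass finds nothing left.
second_pass = tuple(stripped)

print(kind)
print(as_tuple)
print(second_pass)
```

If you need the values more than once, build a list or tuple up front instead of keeping the generator around.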
|
# ? May 10, 2014 20:54 |
|
Wouldn't it make sense to use regex in this case? I guess splitting strings would work for such a simple case, but it's always worth considering.
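For comparison, a regex version of the same split could look roughly like this. It is only a sketch: the sample string is assumed, and the pattern is one of several that would do the job.

```python
import re

# Sample input (assumed for illustration).
s = "115 Berry Lane , Plano,  TX 24134 "

# Split on a comma together with any whitespace around it.
# The outer strip() handles leading/trailing whitespace on the whole string.
parts = re.split(r"\s*,\s*", s.strip())

print(parts)
```

The result is the same list of fields the split-and-strip version produces, so for input this simple the choice is mostly about readability.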
|
# ? May 10, 2014 20:54 |
|
I was always told that it's better to use Python methods rather than regex, but I don't really know why
|
# ? May 11, 2014 01:29 |
|
QuarkJets posted: I was always told that it's better to use Python methods rather than regex, but I don't really know why
I've been told that regexps can be extremely computationally expensive. I think it just boils down to using the right tool for the right job.
|
# ? May 11, 2014 01:50 |
|
Cultural Imperial posted: I've been told that regexps can be extremely computationally expensive. I think it just boils down to using the right tool for the right job.
I've also heard people say to avoid them because they can be difficult for other people to read.
|
# ? May 11, 2014 01:54 |
|
BannedNewbie posted: I've also heard people say to avoid them because they can be difficult for other people to read
I don't find that's much of a problem with all the web-based regexp testers out there.
|
# ? May 11, 2014 02:02 |
|
Cultural Imperial posted: I don't find that's much of a problem with all the web based regexp testers out there.
Yeah, but if they're just Python methods then it won't be necessary to use a web-based regexp tester. And that doesn't help those of us who have to develop in an environment without internet.
|
# ? May 12, 2014 06:46 |
|
QuarkJets posted: Yeah, but if they're just Python methods then it won't be necessary to use a web-based regexp tester
Hmmm. Can a Python method search for a Windows SID? Like a string with the form S-1-.....
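To make the comparison concrete, here is a rough sketch of both approaches. It assumes a SID looks like "S-1-" followed by two or more dash-separated numbers, which is a simplification, and the sample text and the helper name looks_like_sid are made up for illustration.

```python
import re

# Assumed shape: S-1-<authority>-<sub-authority>[-<sub-authority>...]
SID_PATTERN = re.compile(r"S-1-\d+(?:-\d+)+")

# Made-up sample text containing one SID-like string.
text = "Owner: S-1-5-21-3623811015-3361044348-30300820-1013 (local admin)"

# The regex can find a SID anywhere inside a larger string.
match = SID_PATTERN.search(text)
found = match.group(0) if match else None


def looks_like_sid(candidate):
    """String-method check for a single, already isolated candidate."""
    parts = candidate.split("-")
    return (
        len(parts) >= 4
        and parts[0] == "S"
        and parts[1] == "1"
        and all(p.isdigit() for p in parts[2:])
    )


print(found)
print(looks_like_sid(found))
```

String methods are fine for validating a candidate you have already isolated; pulling the SID out of surrounding text is where the regex is the shorter route.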
|
# ? May 12, 2014 07:19 |
|
"Don't use regular expressions because they're hard to read" is such a terrible piece of advice that I can only assume it was cargo-culted from a Slashdot comment from 2003. Regexes are fine, and if they get complicated you can pretty them up with Python's support for verbose regexes via re.VERBOSE.
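As an illustration of what re.VERBOSE buys you, here is a small sketch. The pattern and the address string are assumptions picked to fit the earlier address example, not anything from a real codebase.

```python
import re

# With re.VERBOSE, whitespace in the pattern is ignored and "#" starts a
# comment, so each piece of the regex can be documented inline.
STATE_ZIP = re.compile(
    r"""
    (?P<state>[A-Z]{2})   # two-letter state code
    \s+                   # whitespace separator
    (?P<zip>\d{5})        # five-digit ZIP code
    """,
    re.VERBOSE,
)

m = STATE_ZIP.search("115 Berry Lane, Plano, TX 24134")
state = m.group("state")
zip_code = m.group("zip")

print(state, zip_code)
```

Named groups pair well with verbose mode, since the call site then reads as m.group("zip") rather than m.group(2).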
|
# ? May 12, 2014 08:50 |
|
QuarkJets posted: And that doesn't help those of us who have to develop in an environment without internet
I don't think I would accept a job offer from anywhere where this was the case. It just sounds miserable.
|
# ? May 12, 2014 17:30 |
|
QuarkJets posted: And that doesn't help those of us who have to develop in an environment without internet
I'd LOVE to know what this company is, please tell us!!
|
# ? May 12, 2014 17:33 |
|
edit Nm answered my own question
the fucked around with this message at 17:57 on May 12, 2014 |
# ? May 12, 2014 17:55 |
|
e;fb by original poster
|
# ? May 12, 2014 17:58 |
|
the posted: Why isn't this working?
str.strip just strips characters off the beginning and end of the string. See: https://docs.python.org/2/library/stdtypes.html#str.strip Try str.replace, or using re.sub for more complicated cases.
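The difference is easy to see side by side. The input strings below are made up for illustration, since the original snippet isn't shown.

```python
import re

# Made-up input with commas at the ends and in the middle.
raw = ",1,234,567,"

# strip() only removes the given characters from both ends.
ends_only = raw.strip(",")

# replace() removes every occurrence, wherever it appears.
everywhere = raw.replace(",", "")

# re.sub() handles cases a single replace() can't, such as removing
# commas and any kind of whitespace in one pass.
cleaned = re.sub(r"[,\s]", "", " 1,234 567 ")

print(ends_only)
print(everywhere)
print(cleaned)
```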
|
# ? May 12, 2014 17:59 |
|
ahmeni posted: "Don't use regular expressions because they're hard to read" is such a terrible piece of advice that I can only assume it was cargo-culted from a Slashdot comment from 2003. Regexes are fine, and if they get complicated you can pretty them up with Python's support for verbose regexes via re.VERBOSE.
Like many sorts of general advice it breaks down in all sorts of situations. There are many times where using a string method or two makes your intent clearer than a regex. There are many times where you've got a dozen lines of string methods to do something you could do in a simple regex.
String methods are generally faster than a regex. I think there are times when a regex is as fast (I can't say I've really run into any situations where a regex is faster) but I can't come up with any examples off the top of my head. There are even more times where the speed doesn't even matter and you should go with the method that makes your intent the clearest.
Example of string method being 10x faster (compiling the regex in this case makes hardly any difference): http://pastebin.com/7Jybgfid
Someone is welcome to come up with an example of the reverse. I'm sure such an example exists.
Thermopyle fucked around with this message at 19:03 on May 12, 2014 |
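If you want to measure this yourself, a comparison along these lines is a reasonable starting point. This is not the code from the pastebin link; the text, the search term, and the function names are all assumptions, and the numbers will vary by machine and Python version.

```python
import re
import timeit

# Made-up haystack: a sentence repeated to give the search some work to do.
text = "the quick brown fox jumps over the lazy dog " * 100

pattern = re.compile("lazy")


def with_method():
    # Plain substring test using the "in" operator.
    return "lazy" in text


def with_regex():
    # Same question asked through a precompiled regex.
    return pattern.search(text) is not None


method_time = timeit.timeit(with_method, number=10000)
regex_time = timeit.timeit(with_regex, number=10000)

print("string method: %.4f s" % method_time)
print("regex:         %.4f s" % regex_time)
```

Both functions return the same answer, so whatever difference shows up is purely the cost of the mechanism.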
# ? May 12, 2014 19:01 |
|
Basically it all adds up to "use regular expressions when they are the most suitable tool for the job, and don't use them when they are not". Which applies equally to any tool at your disposal.
|
# ? May 12, 2014 19:11 |
|
I'm trying to use cssselect to strip the state URLs off this page. I tried doing a straight selection of tables, but that didn't work. Then I tried going in via the div tag and then to the table, but that isn't grabbing anything either.
edit: I also used this page as a test run for the CSS selector. I pasted in the entire source code from the page I want, and I wrote 'table' in the second box, and it perfectly selected the states. So why the hell isn't *my* code working?
Second edit: Figured out the problem. They have a <p> tag on Line 50 of their page that is supposed to be a </p> tag. This breaks lxml, I think?
the fucked around with this message at 20:54 on May 12, 2014 |
# ? May 12, 2014 19:22 |
|
Thermopyle posted: Example of string method being 10x faster (compiling the regex in this case makes hardly any difference):
For reference, Python always compiles the regex. Around 2.2 it got an internal cache, so there's no reason to use re.compile manually.
|
# ? May 12, 2014 20:06 |
|
Suspicious Dish posted: For reference, Python always compiles the regex. Around 2.2 it got an internal cache, so there's no reason to use re.compile manually.
Oh, sweet. I think I might have known that at some point because years ago I used to compile them all the time and now I never do.
|
# ? May 12, 2014 20:21 |
|
Suspicious Dish posted: For reference, Python always compiles the regex. Around 2.2 it got an internal cache, so there's no reason to use re.compile manually.
Explicitly compiling long-lived regex objects can still be beneficial and there isn't really any downside to doing so.
|
# ? May 12, 2014 21:00 |
|
Plorkyeran posted: Explicitly compiling long-lived regex objects can still be beneficial
How?
|
# ? May 12, 2014 21:44 |
|
Thermopyle posted: How?
Only if you can do it before the regex is actually needed at runtime. Like just compile it on initialization and then you don't have to pay the one-time compile cost on first use.
|
# ? May 12, 2014 21:54 |
|
Thermopyle posted: How?
The cache is not infinite in size (unbounded caches are also known as "memory leaks"), so regex objects that should be long-lived can get bumped from the cache by large numbers of one-shot regexes. Not something that's going to be an issue very often, but more often than never. Ignoring performance, I also think that precompiling hardcoded regexes results in trivially more readable code.
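A sketch of the kind of layout this describes: the hardcoded pattern is compiled once at module level and given a name, and the functions that use it just refer to that name. DATE_RE and parse_date are hypothetical names made up for this example.

```python
import re

# Compiled once at import time, so the cost is paid up front and the
# pattern object lives as long as the module does, regardless of the cache.
DATE_RE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def parse_date(text):
    """Return (year, month, day) for the first ISO-style date, or None."""
    m = DATE_RE.search(text)
    if m is None:
        return None
    return tuple(int(part) for part in m.groups())


print(parse_date("released 2014-05-12"))
print(parse_date("no date in here"))
```

The name at the top of the file also documents what the pattern is for, which is most of the readability gain.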
|
# ? May 12, 2014 22:24 |
|
You can increase the cache size if you're going over 100 regexes (like a big Django site might).
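One way this has been done in CPython is by changing a private module attribute, roughly as sketched below. Treat this as an assumption about an implementation detail: re._MAXCACHE is undocumented, its default has changed between versions, and it could be renamed or removed.

```python
import re

# Private, undocumented CPython attribute: the number of compiled patterns
# the re module keeps in its internal cache.
original_limit = re._MAXCACHE

# Raise the limit (1000 is an arbitrary value chosen for illustration).
re._MAXCACHE = max(original_limit, 1000)

# re.purge() is the documented way to clear the cache entirely.
re.purge()

print(original_limit, re._MAXCACHE)
```

Holding explicitly compiled pattern objects in your own module-level names avoids depending on the cache size at all.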
|
# ? May 12, 2014 22:25 |
|
BeefofAges posted: I don't think I would accept a job offer from anywhere where this was the case. It just sounds miserable.
It's not bad at all, and really not that rare, although I wouldn't say that it's common. Usually these kinds of environments will have Internet-connected computers sitting next to development computers, so it's not like you're screwed if you need to Google something. You won't be able to easily copy example code verbatim, however, which is probably actually a good thing.
|
# ? May 12, 2014 23:54 |
|
more like dICK posted:You can increase the cache size if you're going over 100 regexes (like a big Django site might).
|
# ? May 12, 2014 23:57 |
|
So last night I made a thing, potentially dumb, potentially not. https://bitbucket.org/MalucoMarinero/cellacceptance Takes a running instance of LibreOffice, loads a spreadsheet and then inputs data and reads results. It's a way you can run generative tests on complex calculations, and test for correctness using a client-vetted spreadsheet. Bit of a Rube Goldberg machine but maybe it'll help someone. Babby's first Python module so potentially badly packaged.
|
# ? May 13, 2014 01:27 |
|
Looking for regex help! As a non-capturing group in a larger regex, I'm trying to match any number of spaces, or a dash.
(?:\s*|\-) This matches any number of spaces, but returns None for all subsequent matching groups if it finds a -.
(?:\s+|\-) This matches 1 or more spaces or a -. I.e. it does what I think it should do, but I really need to match 0 or more spaces.
(?:\s*|\-) When in doubt, try a question mark. Nope - this cuts the results short no matter what it finds.
Edit: figured out a solution: (?:\-|\s*) I.e. do the - first... Magic.*?
Dominoes fucked around with this message at 10:24 on May 15, 2014 |
# ? May 15, 2014 10:07 |
|
I know you said you've solved it, but I'm still curious why you need to match zero or more spaces.
|
# ? May 15, 2014 11:34 |
|
Dominoes posted: Looking for regex help! As a non-capturing group in a larger regex, I'm trying to match any number of spaces, or a dash.
I don't believe the hyphen needs escaping in this context. (?:-|\s*)
|
# ? May 15, 2014 11:57 |
|
Dominoes posted: Looking for regex help! As a non-capturing group in a larger regex, I'm trying to match any number of spaces, or a dash.
(?:\s*|\-) is ambiguous because even when there's a hyphen there are also zero spaces in front of it. If you're sure that there should never be both spaces and a hyphen, you can use a negative lookahead assertion to only match spaces not followed by a hyphen, (?:\s*(?!-)|-) Or else it may be simpler to just match spaces and then an optional hyphen, (?:\s*-?)
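The behaviour is easier to see in a small experiment. The surrounding pattern here is hypothetical (the full regex wasn't posted): a run of digits, the separator group, then an optional second run of digits. The optional trailing group matters, because it lets the overall match succeed even when the separator matched nothing.

```python
import re

# Hypothetical larger regex: digits, separator group, OPTIONAL digits.
spaces_first = re.compile(r"(\d+)(?:\s*|-)(\d+)?")       # original attempt
dash_first = re.compile(r"(\d+)(?:-|\s*)(\d+)?")         # dash alternative first
lookahead = re.compile(r"(\d+)(?:\s*(?!-)|-)(\d+)?")     # spaces not followed by a dash
optional_dash = re.compile(r"(\d+)(?:\s*-?)(\d+)?")      # spaces, then an optional dash

samples = ("12-34", "12  34")

results = {}
for name, pattern in [
    ("spaces_first", spaces_first),
    ("dash_first", dash_first),
    ("lookahead", lookahead),
    ("optional_dash", optional_dash),
]:
    # Record the captured groups for each sample string.
    results[name] = [pattern.match(text).groups() for text in samples]
    print(name, results[name])
```

With spaces first, \s* happily matches zero characters in front of the dash, the optional group then matches nothing, and the regex engine has no reason to go back and try the dash alternative. The other three variants all consume the dash before the second group is attempted.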
|
# ? May 15, 2014 12:20 |
|
So I went from numba 0.11.1 to 0.12.1, and now inplace subtraction is no longer allowed. I wonder if inplace addition with a negative number is allowed? I haven't tried it. After fixing that, two functions that are quite similar in how they work have diverged in performance. One that used to be slow has gotten a lot faster (10 s -> 2 s), and one that used to be quite fast is now a lot slower (0.25 s -> 15 s). I was looking at numbapro, but this kind of stuff... I'll have to play with it though, as I imagine there's some way I'm writing my code that is hurting its performance. The only issue is that some of the users are on 0.11 and some are on 0.12. Ugh.
|
# ? May 15, 2014 13:36 |
|
ohgodwhat posted: So I went from numba 0.11.1 to 0.12.1, and now inplace subtraction is no longer allowed. I wonder if inplace addition with a negative number is allowed? I haven't tried it.
I would definitely encourage you to share some specifics on the GitHub issue tracker: https://github.com/numba/numba
|
# ? May 15, 2014 21:19 |
|
sharktamer posted: I know you said you've solved it, but I'm still curious why you need to match zero or more spaces.
qntm posted: I don't believe the hyphen needs escaping in this context. (?:-|\s*)
suffix posted: (?:\s*|\-) is ambiguous because even when there's a hyphen there are also zero spaces in front of it.
|
# ? May 15, 2014 22:34 |
|
What do you guys use for templating? Jinja2?
|
# ? May 15, 2014 23:07 |
|
Cultural Imperial posted: What do you guys use for templating? Jinja2?
I typically use Mako, but I would be interested to hear what other people prefer.
|
# ? May 16, 2014 00:05 |
|
I also prefer Mako, although I consider Jinja perfectly acceptable as an alternative. I haven't had a chance to use it, but Plim looks pretty cool, if you want to go down the compile-to-HTML route.
|
# ? May 16, 2014 03:18 |
|
Cultural Imperial posted: What do you guys use for templating? Jinja2?
I use Jinja2, but I'm using Flask, so.
|
# ? May 16, 2014 04:11 |
|
Total programming rookie, working on an assignment in Jython and I'm trying to implement input validation for two things at once, kinda stumped. Basically I'm drawing a 'room' on an image and then requesting an input location to place a light inside the room. Between taking inputs and placing the light I want to check that the x/y coords are within the image and that the light is actually placed inside the room I've drawn. I can validate both of these independently just fine but can't work out a good economy for doing both appropriately, since my logic so far lets the user pass the first check, then enter a value that fails the first check while passing the second, and still continue. I know how to wing it but I'd be repeating a lot of code and that doesn't seem right to me. Could anyone hook me up with some reading material or just general advice on input validation economy? Struggling with this got me really interested in best practices for things like this.
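One common way to avoid both the repetition and the "passes one check, then slips past the other" problem is to put each rule in its own small function, combine them in a single validator, and run every candidate through all of the rules each time. The sketch below assumes the room is an axis-aligned rectangle given as (left, top, right, bottom); all names are made up for illustration, and the list of attempts stands in for whatever input prompt the assignment uses.

```python
def in_image(x, y, width, height):
    """True if (x, y) is a valid pixel coordinate in a width x height image."""
    return 0 <= x < width and 0 <= y < height


def in_room(x, y, room):
    """True if (x, y) lies strictly inside the room (left, top, right, bottom)."""
    left, top, right, bottom = room
    return left < x < right and top < y < bottom


def validation_error(x, y, width, height, room):
    """Run every check on the same point; return a message, or None if all pass."""
    if not in_image(x, y, width, height):
        return "point is outside the image"
    if not in_room(x, y, room):
        return "point is outside the room"
    return None


def first_valid_position(candidates, width, height, room):
    """Return the first candidate that passes every check, or None."""
    for x, y in candidates:
        error = validation_error(x, y, width, height, room)
        if error is None:
            return (x, y)
        print("Rejected (%d, %d): %s" % (x, y, error))
    return None


# Stand-in for repeated user input: two bad attempts, then a good one.
attempts = [(500, 10), (5, 5), (60, 40)]
chosen = first_valid_position(attempts, width=200, height=100, room=(20, 20, 180, 80))
print(chosen)
```

Because a new attempt always goes back through validation_error from the top, a value can never be accepted on the strength of a check that an earlier attempt passed. In an interactive version the for loop would become a while loop around the input prompt.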
|
# ? May 16, 2014 08:59 |