|
Ok, cool. So, I'm importing a list of every single zipcode in the US and its corresponding latitude/longitude and state abbreviation. That seems like a prime candidate for a dictionary sorted by zipcode, right?
|
# ? Jul 11, 2014 16:41 |
|
If you're going to be retrieving information later by zipcode, then yes. If you're going to be retrieving information by state, then use state for the keys. Etc.
|
# ? Jul 11, 2014 16:49 |
|
SurgicalOntologist posted:If you're going to be retrieving information later by zipcode, then yes. If you're going to be retrieving information by state, then use state for the keys. Etc. It's going to be by zipcode, because I need to map the zip codes I retrieved in the previous posts with the ones I'd import from this list.
|
# ? Jul 11, 2014 16:51 |
|
the posted:It's going to be by zipcode, because I need to map the zip codes I retrieved in the previous posts with the ones I'd import from this list. Dude, if you build a dictionary by zipcode, you're not going to have to do any searching to look up an entry. e: This is also one of my favorite things about Python, defaultdict: http://ianloic.com/2011/06/16/pythons-collections-defaultdict/ e2: oh wow I just stumbled across this tutorial on comprehensions: http://tech.pro/tutorial/1554/four-tricks-for-comprehensions-in-python namaste friends fucked around with this message at 17:04 on Jul 11, 2014 |
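A minimal sketch of what that could look like, assuming the imported rows are (zipcode, latitude, longitude, state) tuples. The sample rows and the names `by_zip` and `zips_by_state` are made up for illustration. Zipcodes are kept as strings so leading zeros survive.

```python
from collections import defaultdict

# Hypothetical rows: (zipcode, latitude, longitude, state)
rows = [
    ("30301", 33.749, -84.388, "GA"),
    ("10001", 40.751, -73.997, "NY"),
    ("10002", 40.716, -73.986, "NY"),
]

# Key by zipcode when lookups are by zipcode: no searching needed later.
by_zip = {zipcode: (lat, lon, state) for zipcode, lat, lon, state in rows}

# Group by state with defaultdict: a missing key starts out as an empty list.
zips_by_state = defaultdict(list)
for zipcode, lat, lon, state in rows:
    zips_by_state[state].append(zipcode)

print(by_zip["10001"])
print(zips_by_state["NY"])
```

Use `by_zip.get(zipcode)` if some of the zipcodes you retrieved earlier might not be in the imported list; it returns None instead of raising KeyError.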
# ? Jul 11, 2014 16:58 |
|
I have another OCD best-practices question about library design. Is it bad form to have two functions with the same name in different modules? I have pyphase.discrete.relative_phase and pyphase.continuous.relative_phase. Is that okay? They're not drop-in replacements for each other, because the discrete version also returns the indices at which it's reporting phases, although for symmetry I could make the continuous version return all the indices.
|
# ? Jul 11, 2014 17:18 |
|
the posted:I'm using Beatbox to query a database in Salesforce. I'm grabbing a list of Account zip code fields. They get returned as type 'instance' and I convert them to a string before doing anything with them. the, when you initialize beatbox, are you using beatbox.Client() or beatbox.PythonClient()? I remember looking at the code for beatbox when you were posting about having to wrap everything in str() and shuddering -- as SurgicalOntologist has suggested, the API is godawful. However, it looks like that's because the default Client API is just a wrapper around some horrifying XML thing that dumps out results in a list that you then have to wrap in str(), which is part of what's causing your encoding issues. beatbox includes PythonClient, which claims to turn "the returned objects into proper Python data types. e.g. integer fields return integers" and appears to return a list of dictionaries (their example): Python code:
|
# ? Jul 11, 2014 17:35 |
|
Huh, I've been using Client. Interesting.
|
# ? Jul 11, 2014 17:38 |
|
the posted:Huh, I've been using Client. Interesting. Switch and your code should be MUCH easier to work with. You'll have to change the parts of your code where you were using list indexing to access a record's field, but result['FirstName'] will be a lot easier to read and work with than remembering that result[3] is supposed to be FirstName. It looks like PythonClient also supports accessing the dictionary in dot-notation, meaning result.FirstName would also work.
|
# ? Jul 11, 2014 17:50 |
|
Appending onto my earlier posts, has anyone used Basemap within matplotlib for map projections before? I've got a list here of ~95,000 latitude/longitude coordinates, and I'd like to at least put them as dots on a map, and ideally have some way to measure density by increasing dot size or something. For example the docs list a variety of mapping projections, but I am not really sure which one I need to use.
|
# ? Jul 11, 2014 18:20 |
|
BabyFur Denny posted:Ok, thanks for that, and I think I made some progress in understanding the whole thing. For any of the tkinter widgets that have a "command" option, you have to pass a function *name*. You can't pass a call to a function, because, well, that calls your function immediately rather than waiting for a click. So always do this: Button(master, command=self.some_func) Never do this: Button(master, command=self.some_func()) That means you cannot easily pass arguments to the function, but there are ways around this. One is to make stuff that needs to be shared across methods into attributes of App, and use tkinter variables, so they are available everywhere. Here's how you could do that for your code, although this is not the most elegant solution to this particular problem. code:
code:
code:
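The code from this post didn't survive, so here is a separate sketch of the same idea rather than the original. It shows two other common ways to get arguments into a callback, functools.partial and a lambda. The names `record_click`, `on_save`, `on_quit` and `build_ui` are made up, and `build_ui` is deliberately never called so the example runs without a display.

```python
import functools

clicks = []

def record_click(label):
    """Callback that remembers which button was pressed."""
    clicks.append(label)

# Both of these are callables; neither one runs record_click yet.
on_save = functools.partial(record_click, "save")
on_quit = lambda: record_click("quit")

def build_ui():
    """Build the window (not called here, so no display is needed)."""
    import tkinter as tk  # the module is named Tkinter on Python 2
    root = tk.Tk()
    # Pass the callable itself; writing on_save() here would run it right away.
    tk.Button(root, text="Save", command=on_save).pack()
    tk.Button(root, text="Quit", command=on_quit).pack()
    return root

# Simulate two clicks without opening a window.
on_save()
on_quit()
print(clicks)
```

To see the real buttons, call build_ui().mainloop() on a machine with a display.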
|
# ? Jul 11, 2014 19:19 |
|
SurgicalOntologist posted:I have another OCD best-practices question about library design. Yes, it's totally okay so long as you keep your namespace clean when working with both functions. Someone doing something like this would run into issues: Python code:
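The example from this post is missing, so here is the same kind of clash shown with two standard-library modules instead of pyphase (which readers may not have installed). math and cmath both define sqrt, much like the two relative_phase functions.

```python
from math import sqrt
from cmath import sqrt  # silently replaces math.sqrt under the same name

# The later import wins, so even a positive real input gives a complex result.
print(sqrt(4))

# Importing the modules instead keeps both functions available and unambiguous.
import math
import cmath

real_root = math.sqrt(4)
complex_root = cmath.sqrt(-4)
print(real_root, complex_root)
```

With pyphase the equivalent would presumably be importing the discrete and continuous modules and calling relative_phase through the module name.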
|
# ? Jul 11, 2014 19:28 |
|
|
# ? Jul 11, 2014 19:45 |
|
Nicely done! so, that's the coordinates associated with each zip code right? I guess that's the geographical center?
|
# ? Jul 11, 2014 19:49 |
|
Thermopyle posted:Nicely done! Yeah, I ended up centering the map on the geographical center of the US, and I guessed the borders based on looking at a map. If anyone is interested, this tutorial explains it very well. code:
|
# ? Jul 11, 2014 19:56 |
|
So, again, there's probably a faster and more "pythonic" way to do this. Basically, in the mapping I'm doing, if there is a duplicate zip code (like two businesses in the same city) it will just overwrite the same dot. To account for this, I've decided to "nudge" the coordinate slightly in a random direction, so that the points won't overwrite each other. The issue I'm running into is that the loop I use to check for the duplicates is rather lengthy, and I'm looking for a faster way. Right now the list I'm working with is 18,660 locations, so the loop below is running 348,195,600 times. Not ideal. code:
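The loop itself isn't preserved here. As a rough sketch of one way to avoid comparing every point against every other point: keep a set of coordinates already seen, so each point is checked once. The function name `nudge_duplicates`, the `max_offset` default and the sample points are all assumptions for illustration.

```python
import random

def nudge_duplicates(points, max_offset=0.01, seed=None):
    """Return a copy of points where repeats of a coordinate are shifted slightly."""
    rng = random.Random(seed)
    seen = set()
    result = []
    for lat, lon in points:
        if (lat, lon) in seen:
            # Already plotted once: move this copy a little in a random direction.
            lat += rng.uniform(-max_offset, max_offset)
            lon += rng.uniform(-max_offset, max_offset)
        else:
            seen.add((lat, lon))
        result.append((lat, lon))
    return result

points = [(33.749, -84.388), (40.751, -73.997), (33.749, -84.388)]
nudged = nudge_duplicates(points, seed=0)
print(nudged)
```

Set membership tests take roughly constant time, so this is one pass over the 18,660 locations instead of 18,660 squared comparisons.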
|
# ? Jul 11, 2014 21:20 |
|
the posted:So, again, there's probably a faster and more "pythonic" way to do this. Basically, in the mapping I'm doing, if there is a duplicate zip code (like two businesses in the same city) it will just overwrite the same dot. To account for this, I've decided to "nudge" the coordinate slightly in a random direction, so that the points won't overwrite each other. This icount and jcount business isn't needed; Python gives you enumerate() for this. Also, part of the if-else code is duplicated between the two cases. code:
Hammerite fucked around with this message at 21:38 on Jul 11, 2014 |
# ? Jul 11, 2014 21:32 |
|
I'd recommend just adding some random jitter to every point. A lot of high-level plotting libraries have that functionality, it's the standard way to solve the problem of overlapping points. Checking which points are overlapping is overkill, IMO. You can do this without a loop with numpy, or use a list comprehension to add a random number to everything. Numpy will be very fast if speed is the issue here. An alternative/additional thing to do is to set the alpha level (transparency) so it gets darker where there are more points. SurgicalOntologist fucked around with this message at 21:39 on Jul 11, 2014 |
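A minimal sketch of the list-comprehension version, using only the standard library. The `jitter` helper, its `scale` default and the sample coordinates are made up for illustration; pick a scale that is small relative to the map.

```python
import random

def jitter(values, scale=0.05, seed=None):
    """Add a small uniform random offset to every value."""
    rng = random.Random(seed)
    return [v + rng.uniform(-scale, scale) for v in values]

lats = [33.749, 33.749, 40.751]
lons = [-84.388, -84.388, -73.997]

# Every point moves a little, so duplicates no longer sit on top of each other.
jittered_lats = jitter(lats, seed=1)
jittered_lons = jitter(lons, seed=2)
print(list(zip(jittered_lats, jittered_lons)))
```

With numpy the same thing is a single array addition, which should be noticeably faster for large lists.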
# ? Jul 11, 2014 21:36 |
|
Hammerite posted:Do you really mean to append each latitude and each longitude n times, where n is the number of points in final_map_list? That's what you appear to be doing, Oh, oops. Yeah I just want to replace/fix it if it's a duplicate. Thanks for that catch. SurgicalOntologist posted:I'd recommend just adding some random jitter to every point. A lot of high-level plotting libraries have that functionality, it's the standard way to solve the problem of overlapping points. Checking which points are overlapping is overkill, IMO. You know what, that's a better idea. Instead of trying to find the duplicates, I could just move every single point just slightly, and that would solve it. Thanks for that great suggestion. the fucked around with this message at 23:08 on Jul 11, 2014 |
# ? Jul 11, 2014 23:01 |
|
I am parsing a text file whose information is organized on several levels: lines, semicolons, commas. I want to put each line as an element in a list, and each line should be a list of all elements separated by semicolons, and those elements are again lists of all elements separated by commas. e.g.: a;b;c,d e;f;g,h will become [[a,b,[c,d]],[e,f,[g,h]]] Additionally I want to clean up the elements (strip spaces). My code works but it looks convoluted and ugly. I am sure it can be improved but my attempts only resulted in red error messages: code:
|
# ? Jul 12, 2014 12:13 |
|
Might be too dense for some tastes, but here is a way. code:
code:
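The code from this post is missing, so this is a stand-in sketch rather than the original answer. It assumes the input is a single string and that fields without a comma should stay plain strings, which is how the example in the question reads. `parse_field` and `parse` are made-up names.

```python
def parse_field(field):
    """Split one semicolon-delimited field on commas and strip spaces."""
    parts = [part.strip() for part in field.split(",")]
    # Keep a plain string when there was no comma, as in the a;b;c,d example.
    return parts if len(parts) > 1 else parts[0]

def parse(text):
    """Turn 'a;b;c,d' style lines into nested lists, skipping blank lines."""
    return [
        [parse_field(field) for field in line.split(";")]
        for line in text.splitlines()
        if line.strip()
    ]

sample = "a; b ;c, d\ne;f;g,h\n"
parsed = parse(sample)
print(parsed)
```

If you would rather have every field be a list, even single-element ones, return `parts` unconditionally from parse_field.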
|
# ? Jul 12, 2014 14:53 |
|
Does matplotlib work in python 3.4? The website seems to suggest it only supports 3.3.
|
# ? Jul 12, 2014 19:05 |
|
This isn't strictly Python-related, but I'm curious about it for a dumb little script I wrote: is there any way to negate a regular expression? Basically, "anything that does not match this pattern". I ended up doing this, and I get the feeling there's a better way of doing it Python code:
|
# ? Jul 12, 2014 21:06 |
|
Carrier posted:Does matplotlib work in python 3.4? The website seems to suggest it only supports 3.3. Yes
|
# ? Jul 12, 2014 21:15 |
|
null gallagher posted:This isn't strictly Python-related, but I'm curious about it for a dumb little script I wrote: is there any way to negate a regular expression? Basically, "anything that does not match this pattern". Negative lookahead assertion. code:
code:
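Neither snippet survived here, and the original pattern isn't shown, so this sketch uses a made-up one: keep every filename that does not start with tmp_. It shows both the negative lookahead and the plain "not a match" alternative.

```python
import re

names = ["tmp_a.txt", "report.txt", "tmp_b.log", "notes.md"]

# Negative lookahead: succeed at the start only if "tmp_" does NOT follow.
not_tmp = re.compile(r"^(?!tmp_).*$")
kept_lookahead = [name for name in names if not_tmp.match(name)]

# Often simpler: match the thing you don't want and negate the result in Python.
is_tmp = re.compile(r"tmp_")
kept_negated = [name for name in names if not is_tmp.match(name)]

print(kept_lookahead)
print(kept_negated)
```

The lookahead form is mostly useful when the pattern has to be handed to something that only accepts a regex, such as a config option or a grep-style tool.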
|
# ? Jul 12, 2014 21:25 |
|
Neat, I didn't remember negative lookahead from my compilers textbook's regex section. Both of those look a little cleaner but still readable, thanks!
|
# ? Jul 12, 2014 21:58 |
|
In all the commotion of SciPy I forgot to come here and mention that we released Bokeh 0.5. Lots of good new stuff: log axes, minor ticks, tighter pandas integration, easier embedding, super high level bokeh.charts interface, and maybe most importantly, widgets! You can also see my SciPy talk on Bokeh (and any of the SciPy talks, Nick Coghlan gave a great keynote this year) already on YouTube: https://www.youtube.com/watch?v=B9NpLOyp-dI
|
# ? Jul 12, 2014 23:02 |
|
BigRedDot posted:In all the commotion of SciPy I forgot to come here an mention that we released Bokeh 0.5. This looks like something I might need to learn at some point. I'm doing a lot of stuff that would probably benefit from such a package.
|
# ? Jul 12, 2014 23:09 |
|
the posted:This looks like something I might need to learn at some point. I'm doing a lot of stuff that would probably benefit from such a package. Well we have a mailing list and a GitHub, or feel free to PM me, always happy to help anyone use Bokeh!
|
# ? Jul 12, 2014 23:19 |
|
Is there an API out there where I can query by name and get out a listed address? I guess Yelp could do that, but I'll mainly be doing things like municipalities and facilities, so I don't think there would be many listings.
|
# ? Jul 14, 2014 19:20 |
|
Is there a simple way to use shutil.rmtree in a non-blocking way? I have a function where one of the steps is basically just removing a temporary directory, but sometimes the directory contains some really huge files that take a while to delete. I could use subprocess I guess, but that's clunky.
|
# ? Jul 15, 2014 09:40 |
|
QuarkJets posted:Is there a simple way to use shutil.rmtree in a non-blocking way? I have a function where for one of the steps I'm basically just removing a temporary directory, but sometimes the directory contains some really huge files that take awhile to delete. I could use subprocess I guess, but that's clunky It's either that, or a thread. code:
Frankly, I'd see if you can move to a file system that doesn't take so long to remove files.
|
# ? Jul 15, 2014 10:15 |
|
QuarkJets posted:Is there a simple way to use shutil.rmtree in a non-blocking way? I have a function where for one of the steps I'm basically just removing a temporary directory, but sometimes the directory contains some really huge files that take awhile to delete. I could use subprocess I guess, but that's clunky The simplest way to run anything in a non-blocking way is to start it in a thread. Python code:
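The snippet is missing from this post, so here is a minimal sketch of the thread approach using only the standard library. The helper name `remove_tree_in_background` and the scratch directory are made up for the demonstration.

```python
import os
import shutil
import tempfile
import threading

def remove_tree_in_background(path):
    """Start shutil.rmtree(path) in a worker thread and return the thread."""
    worker = threading.Thread(
        target=shutil.rmtree,
        args=(path,),
        kwargs={"ignore_errors": True},
    )
    worker.start()
    return worker

# Demonstration with a throwaway directory containing one file.
scratch = tempfile.mkdtemp()
with open(os.path.join(scratch, "big.bin"), "wb") as handle:
    handle.write(b"\0" * 1024)

worker = remove_tree_in_background(scratch)
# ... the rest of the function can carry on here while the delete runs ...
worker.join()  # only needed if you must know the delete has finished
print(os.path.exists(scratch))
```

Note that a normal (non-daemon) thread keeps the interpreter alive until the delete finishes, which is probably what you want for cleanup.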
|
# ? Jul 15, 2014 10:27 |
|
Looks like python-twitter doesn't support using the streaming API. I'm looking to grab as many possible tweets as I can about a subject, I want to build a database of tweets over the course of the past few years to look at trends. Are there any modules that support connecting to the API in this way? the fucked around with this message at 17:31 on Jul 16, 2014 |
# ? Jul 16, 2014 17:11 |
|
the posted:Looks like python-twitter doesn't support using the streaming API. I'm looking to grab as many possible tweets as I can about a subject, I want to build a database of tweets over the course of the past few years to look at trends. You edited it out just as I hit quote, but I just wanted to mention that when you're trying to figure out a dense data structure like that, the pprint module is a godsend.
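For anyone who hasn't used it, a quick sketch. The `response` dictionary below is invented and only loosely shaped like an API result; it is not real Twitter output.

```python
from pprint import pformat, pprint

# Made-up nested structure standing in for a parsed API response.
response = {
    "statuses": [
        {"id": 1, "user": {"screen_name": "example_user", "followers": 10}, "text": "first tweet"},
        {"id": 2, "user": {"screen_name": "another_user", "followers": 25}, "text": "second tweet"},
    ],
    "search_metadata": {"count": 2, "query": "python"},
}

# pprint breaks long structures across lines and indents the nesting.
pprint(response)

# pformat returns the same text as a string, handy for logging.
formatted = pformat(response)
```

Compare that with a bare print(response), which puts everything on one long line.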
|
# ? Jul 16, 2014 17:32 |
|
Sorry, thanks. I'll look into that. Also, in doing research online, it looks like there isn't really a way to get long-term search result data from Twitter? It seems like this is something that they don't allow with the public API.
|
# ? Jul 16, 2014 17:35 |
|
the posted:Sorry, thanks. I'll look into that. Nope, the most you can get is 7 days through the search API. Selling data was the major source of revenue for Twitter until their ad platform started taking off. Historical data is a giant pain in the rear end to get, and expensive. Datasift will do it, but you need a subscription. Gnip will do one-off data, but you are starting at $500+, and if you are looking at years of data even for a narrow search it's going up quickly from there. Try tweepy if you are using the streaming API.
|
# ? Jul 16, 2014 19:19 |
|
onionradish posted:the, when you initialize beatbox, are you using beatbox.Client() or beatbox.PythonClient()? FYI I just tried this and it's popping an error. Ideas? code:
|
# ? Jul 16, 2014 20:40 |
|
e: disregard
|
# ? Jul 16, 2014 20:44 |
|
the posted:FYI I just tried this and it's popping an error. Ideas? Are you getting that error on the svc = beatbox.PythonClient() line or somewhere else? If you do import beatbox; dir(beatbox) does PythonClient show up in the list? If not, see if your version is the same as the PyPI version. I'm assuming you retyped the code; the first colon looks like a typo here: for i in query[sf.records:]:
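The same check can be wrapped in a small helper. This sketch runs it against the standard-library collections module, since beatbox may not be installed where you try it; `module_has` is a made-up name.

```python
import importlib

def module_has(module_name, attribute):
    """Import a module by name and report whether it exposes the attribute."""
    module = importlib.import_module(module_name)
    return attribute in dir(module)

# Stand-in check against a module everyone has.
print(module_has("collections", "defaultdict"))
print(module_has("collections", "PythonClient"))
```

For the actual problem you would call module_has("beatbox", "PythonClient") in the environment where the error appears.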
|
# ? Jul 16, 2014 21:08 |
|
onionradish posted:Are you getting that error on the svc = beatbox.PythonClient() line or somewhere else? If you do import beatbox; dir(beatbox) does PythonClient show up in the list? If not, see if your version is the same as the PyPi version. Looks like it isn't. Reading that page it looks like this is an updated version of the one I was using. Thanks for the heads up.
|
# ? Jul 16, 2014 21:17 |