 
  • Locked thread
Sock on a Fish
Jul 17, 2004

What if that thing I said?
I've got a threading/scope question.

I just changed a script that deploys instances of a webapp to be multithreaded using the threading module. Initially this was much easier than I thought it would be -- I just took everything within the for loop that processes new deployment requests, stuck it in a threading.Thread class, and then just called that instead. Everything works, except that log entries from all of the spawned threads are written to every log for each deployment.

I realized that this is because the logger is a global, and each thread is adding its own log file as a handler.

I originally did this for the convenience of not having to pass a logger object to every function that I wanted to make log entries in, but now it's leading to this fun behavior. Is there any way to have a global-like variable that exists only within a single thread?

edit: I just modified the script so that the logger is no longer global and is instead passed to every function and am still seeing this behavior. I don't get it. Do functions share scope between threads?

Sock on a Fish fucked around with this message at 01:17 on Nov 19, 2009


king_kilr
May 25, 2007
You're looking for thread locals: http://docs.python.org/library/threading.html#threading.local
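A minimal sketch of thread locals in action (names here are invented for illustration): every thread touches the same `threading.local()` object, but each one only ever sees the attributes it set itself.

```python
import threading

tls = threading.local()  # one object, but a separate attribute namespace per thread
seen = {}

def worker(label):
    tls.label = label        # visible only to this thread
    # each thread reads back the value it set, never another thread's
    seen[label] = tls.label

threads = [threading.Thread(target=worker, args=("thread-%d" % i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```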

Sock on a Fish
Jul 17, 2004

What if that thing I said?

king_kilr posted:

You're looking for thread locals: http://docs.python.org/library/threading.html#threading.local

Huh, so if I spawn two threads that use the same function they could end up sharing variables willy nilly unless I explicitly declare thread locals?

That's gonna be a lot of work. Is there no way to just set all variables to be thread locals? I don't have a need for sharing data between threads in this script.

king_kilr
May 25, 2007
No, there is no way to make all variables thread locals, also only globals are shared.
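To illustrate that point with a toy sketch (not from the thread): function locals live on each thread's own stack, so they're automatically per-thread; only module-level globals are shared and need coordination.

```python
import threading

total = 0                     # a global: shared by every thread
lock = threading.Lock()

def worker():
    global total
    subtotal = 0              # a local: private to this thread, no locking needed
    for _ in range(1000):
        subtotal += 1
    with lock:                # only the shared global needs protection
        total += subtotal

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(total)
```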

Modern Pragmatist
Aug 20, 2008

VirtuaSpy posted:

I even got the chance to talk to one of the high up MATLAB developers as to why I prefer the Python setup versus a commercial package like MATLAB.

I hope one of your reasons that you mentioned was that python is free.

VirtuaSpy posted:

I know Sage includes other packages useful for numerical and math work, but are there any math/Python/science folks that can chime in on if I should consider using Sage full time versus the setup I have now?

There is nothing wrong with your setup. I do scientific research and use notepad or a very basic text editor to do all of my coding. IMHO, the only reason packages like Sage exist is to help individuals who know how to "program" in MATLAB ease into Python. If you already know Python, and understand how to install any required modules, then the only thing Sage adds is the convenience of having the typical packages pre-installed.

Disclaimer: This represents my opinion. Feel free to debunk.

BeefofAges
Jun 5, 2004

Cry 'Havoc!', and let slip the cows of war.

Sock on a Fish posted:

I've got a threading/scope question.

I just changed a script that deploys instances of a webapp to be multithreaded using the threading module. Initially this was much easier than I thought it would be -- I just took everything within the for loop that processes new deployment requests, stuck it in a threading.Thread class, and then just called that instead. Everything works, except that log entries from all of the spawned threads are written to every log for each deployment.

I realized that this is because the logger is a global, and each thread is adding its own log file as a handler.

I originally did this for the convenience of not having to pass a logger object to every function that I wanted to make log entries in, but now it's leading to this fun behavior. Is there any way to have a global-like variable that exists only within a single thread?

edit: I just modified the script so that the logger is no longer global and is instead passed to every function and am still seeing this behavior. I don't get it. Do functions share scope between threads?

I think what you're looking for is to turn off hierarchical logging. Try disabling propagate: http://docs.python.org/library/logging.html#logging.Logger.propagate

Sock on a Fish
Jul 17, 2004

What if that thing I said?

BeefofAges posted:

I think what you're looking for is to turn off hierarchical logging. Try disabling propagate: http://docs.python.org/library/logging.html#logging.Logger.propagate

Here we go:

quote:

getLogger() returns a reference to a logger instance with the specified name if it is provided, or root if not. The names are period-separated hierarchical structures. Multiple calls to getLogger() with the same name will return a reference to the same logger object. Loggers that are further down in the hierarchical list are children of loggers higher up in the list. For example, given a logger with a name of foo, loggers with names of foo.bar, foo.bar.baz, and foo.bam are all children of foo. Child loggers propagate messages up to their parent loggers. Because of this, it is unnecessary to define and configure all the loggers an application uses. It is sufficient to configure a top-level logger and create child loggers as needed.

The call was always getLogger('logger'). Making the string unique solved the problem.
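A sketch of that fix in modern syntax (function and deployment names invented): a unique name per deployment means getLogger() stops handing every thread the same Logger object, and setting propagate = False keeps records out of any shared ancestor handlers.

```python
import logging
from io import StringIO

def make_deploy_logger(deployment_id, stream):
    # a unique name per deployment => a distinct Logger object per deployment
    log = logging.getLogger("deploy.%s" % deployment_id)
    log.setLevel(logging.INFO)
    log.propagate = False            # don't bubble records up to parent/root loggers
    log.addHandler(logging.StreamHandler(stream))
    return log

buf_a, buf_b = StringIO(), StringIO()
make_deploy_logger("app-a", buf_a).info("deploying app-a")
make_deploy_logger("app-b", buf_b).info("deploying app-b")
```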

nonathlon
Jul 9, 2004
And yet, somehow, now it's my fault ...
Having a setuptools problem that maybe someone has seen before:

So I'm developing a package. To keep things clean, I'm using virtualenv with no-site-packages as a sandbox. Now, I go to do an install of my package so I can test it, and use "python setup.py develop" to do a development install. However, when I try to import the package from within the virtualenv Python, it reports "not found".

Obvious problems or solutions:

* Am I installing and calling the right Python? Yes.
* Does using distribute instead of setuptools help? Nope.
* Did this previously work? Yes.
* Are errors reported by "python setup.py develop"? No.
* Manually editing the easy_install.pth file makes things work.

It looks to me as if the develop install is not doing its job, silently failing. Technical details: Python 2.6, OS X 10.6, latest stable versions of setuptools, virtualenv, distribute.

good jovi
Dec 11, 2000

'm pro-dickgirl, and I VOTE!

outlier posted:

Having a setuptools problem that maybe someone has seen before:

So I'm developing a package. To keep things clean, I'm using virtualenv with no-site-packages as a sandbox. Now, I go to do an install of my package so I can test it, and use "python setup.py develop" to do a development install. However, when I try to import the package from within the virtualenv Python, it reports "not found".

Obvious problems or solutions:

* Am I installing and calling the right Python? Yes.
* Does using distribute instead of setuptools help? Nope.
* Did this previously work? Yes.
* Are errors reported by "python setup.py develop"? No.
* Manually editing the easy_install.pth file makes things work.

It looks to me as if the develop install is not doing its job, silently failing. Technical details: Python 2.6, OS X 10.6, latest stable versions of setuptools, virtualenv, distribute.

What does sys.path say? Does it work when you do python setup.py install?

Avenging Dentist
Oct 1, 2005

oh my god is that a circular saw that does not go in my mouth aaaaagh

outlier posted:

Now, I go to do a install of my package so I can test it and use "python setup.py develop" to do a development install.

I'm going to "answer" your question with another question: what benefit does this serve over just checking the code out from SVN? (Bonus points if you know how it affects modules written in C.)


(Also, doesn't Distribute replace setuptools? Why do you have both?)

bitprophet
Jul 22, 2004
Taco Defender

Avenging Dentist posted:

I'm going to "answer" your question with another question: what benefit does this serve over just checking the code out from SVN? (Bonus points if you know how it affects modules written in C.)


(Also, doesn't Distribute replace setuptools? Why do you have both?)

It (setup.py develop) lets you keep the source checkout itself outside of your PYTHONPATH and links it there instead. It also runs install hooks like updating command-line scripts and, presumably, rebuilding C extensions. So it does everything setup.py install does, but with links or similar instead of copying. (Thus, for pure Python stuff, typically you run setup.py develop just once and then hack away on your checkout, only re-running it when you need to eg update a CLI script due to a version # change, or etc.)

(And to my knowledge, Distribute replaces/enhances setuptools, but is starting to move away from it, so it's probably beneficial to have both if you're just testing out Distribute experimentally, or if there's something in the latest Distribute that behaves differently from setuptools.)
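The develop mechanism described above can be sketched in miniature (paths and names invented): the net effect of the .egg-link/.pth machinery is a sys.path entry pointing at your checkout, so edits there are picked up without reinstalling.

```python
import os
import sys
import tempfile

checkout = tempfile.mkdtemp()         # stand-in for your source checkout
with open(os.path.join(checkout, "mypackage.py"), "w") as f:
    f.write("VERSION = '0.1'\n")

# "setup.py develop" writes an .egg-link / .pth entry into site-packages;
# the net effect is roughly equivalent to this sys.path entry:
sys.path.insert(0, checkout)

import mypackage
print(mypackage.VERSION)
```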

nonathlon
Jul 9, 2004
And yet, somehow, now it's my fault ...

Sailor_Spoon posted:

What does sys.path say? Does it work when you do python setup.py install?

Good idea. Using "setup.py develop", the "installed" module is not importable and does not appear on sys.path. When using "setup.py install", it behaves as expected - the package is importable and the egg appears on sys.path. So it specifically affects "develop" - but I've done this exact thing on this exact machine before and it worked fine.

To add to the mystery - it's the same version of setuptools installed in working & non-working environments.

bitprophet posted:

Many wise words

What he said.

wins32767
Mar 16, 2007

Is something actually getting written to your virtual-env site-packages when running develop? I'd assume yes for install since it works.

nonathlon
Jul 9, 2004
And yet, somehow, now it's my fault ...

wins32767 posted:

Is something actually getting written to your virtual-env site-packages when running develop? I'd assume yes for install since it works.

When running develop, a .egg-link file is created but the easy_install.pth file is untouched.

On a hunch I ran "setup.py develop" with sudo. The package installs correctly. A quick look at the permissions of the site-packages directory doesn't show anything untoward, but there was a bit of stray metadata.

wins32767
Mar 16, 2007

The reason I asked is that I just had a similar problem trying to get a coworker set up on Ubuntu. In my case, it gave the standard easy_install (really setuptools, I guess) "Couldn't write test file to <virtual_env_path>". I didn't check to see if it wrote anything, but we resolved it by using sudo.

nonathlon
Jul 9, 2004
And yet, somehow, now it's my fault ...

wins32767 posted:

The reason I asked is that I just had a similar problem trying to get a coworker set up on Ubuntu. In my case, it gave the standard easy_install (really setuptools, I guess) "Couldn't write test file to <virtual_env_path>". I didn't check to see if it wrote anything, but we resolved it by using sudo.

Hmmm. I definitely didn't see any error message, but sudo does seem to fix it. Many a problem in virtual_env?

nbv4
Aug 21, 2002

by Duchess Gummybuns
http://beta.flightlogg.in/nbv4/maps.html

If you go to that page, there are supposed to be three images at the bottom. Most likely only two will show up, and one of the ones that does show up will have messed-up overlaid text. If you copy the URL of each image individually, they all work perfectly. It's just when they get displayed on the page together that they get messed up. Does anyone know what could be causing this? The site is run on Django 1.1.1, and the images are made with matplotlib 0.99.4 and the Basemap add-on.

The only thing I can think of is that matplotlib/Basemap is not thread-safe, and the webserver (Apache with the worker MPM) is trying to render the three images at the same time in three different threads, and the threads are messing each other up. Could this be what's happening? Is there any way to fix this?

tripwire
Nov 19, 2004

        ghost flow
I've never encountered anything like that with matplotlib before, and it's impossible to say more about what's happening without knowing the basics of how your code works.

As far as I know, you save images with matplotlib by calling the savefig method of a figure object, or you use a StringIO object to save to a string and then use PIL or some other library to process and finally encode the image.

How are you generating and saving your images?

nbv4
Aug 21, 2002

by Duchess Gummybuns

tripwire posted:

I've never encountered anything like that with matplotlib before, and it's impossible to say more about what's happening without knowing the basics of how your code works.

As far as I know, you save images with matplotlib by calling the savefig method of a figure object, or you use a StringIO object to save to a string and then use PIL or some other library to process and finally encode the image.

How are you generating and saving your images?

here's the code:
http://github.com/nbv4/flightloggin/blob/master/maps/states.py

once the figure is created, this code gets executed:

code:
   response=HttpResponse(content_type=self.mime)
   fig.savefig(response,
               format=self.extension,
               bbox_inches="tight",
               pad_inches=.05,
               edgecolor="white")
   return response
Also, this used to work perfectly fine. Sometime around the time I switched from prefork to worker is when this problem arose.

tripwire
Nov 19, 2004

        ghost flow
I was kind of stumped for a while, because I was thinking that the only way that text should get written 3 times is if (for example) Rhode Island shows up multiple times in the dictionary of states. Which would be impossible if it's a dictionary.

Reading mailing lists, it looks like pyplot, which you use for its figure class, isn't in fact thread-safe (though the rest of matplotlib is).


Check out the code here: http://old.nabble.com/font-troubles-td16601826.html
It looks like the guy got around it by importing other matplotlib modules directly: Figure, which lives in matplotlib.figure, and FigureCanvasAgg, which comes from matplotlib.backends.backend_agg.
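That workaround sketched out (output path invented): skip pyplot entirely and build the figure with the object-oriented API, which holds no global "current figure" state, so each thread can do this independently.

```python
import os
import tempfile

from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg

# no pyplot, no shared global state -- safe to run per request/thread
fig = Figure()
FigureCanvasAgg(fig)                 # attaches an Agg (raster) canvas to the figure
ax = fig.add_subplot(111)
ax.plot([1, 2, 3], [1, 4, 9])

out_path = os.path.join(tempfile.mkdtemp(), "plot.png")
fig.savefig(out_path)
```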

The Red Baron
Jan 10, 2005

Got a question for the 'in-depth' people here. Have there been any serious attempts at adding inline monomorphic/polymorphic caching to (C)Python? Eliminating dictionary lookups should give a non-negligible speedup to attribute accesses as well as offer the possibility for attribute packing a-la Google's V8 JS engine. I know the Unladen Swallow guys are considering it, but I would presume that lies quite far ahead.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Which is better?

code:
def save(obj):
    shelf[obj.name] = obj
or...
code:
class obj():
    def save(self):
        shelf[self.name] = self

Avenging Dentist
Oct 1, 2005

oh my god is that a circular saw that does not go in my mouth aaaaagh

The Red Baron posted:

Got a question for the 'in-depth' people here. Have there been any serious attempts at adding inline monomorphic/polymorphic caching to (C)Python? Eliminating dictionary lookups should give a non-negligible speedup to attribute accesses as well as offer the possibility for attribute packing a-la Google's V8 JS engine. I know the Unladen Swallow guys are considering it, but I would presume that lies quite far ahead.

CPython is one of the least optimization-friendly projects I've seen. Most of the Python community seems to think that CPython is already fast enough, even though the performance is often downright sad. As much as I like writing stuff in Python, I always feel the pressure to implement my stuff using the Python C API so that it isn't slow as molasses.

tehk
Mar 10, 2006

[-4] Flaw: Heart Broken - Tehk is extremely lonely. The Gay Empire's ultimate weapon finds it hard to have time for love.

Thermopyle posted:

Which is better?

code:
def save(obj):
    shelf[obj.name] = obj
or...
code:
class obj():
    def save(self):
        shelf[self.name] = self
It depends on what shelf is and how you are using it. If save is not manipulating its own object, I would not make it a method of obj; however, if it is some object-relational thing, then the second approach makes sense. You have a third option of adding save to a subclass of whatever type (dict) shelf is. That seems the least magical for when you are reading the code later.

tehk fucked around with this message at 21:06 on Nov 28, 2009
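The third option above, sketched with a plain dict standing in for the shelf (all class names invented):

```python
class Shelf(dict):
    """A dict-like store that knows how to file objects away by their name."""
    def save(self, obj):
        self[obj.name] = obj

class Deployment(object):
    def __init__(self, name):
        self.name = name

shelf = Shelf()
shelf.save(Deployment("web01"))   # the store, not the object, owns "how to save"
```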

The Red Baron
Jan 10, 2005

Avenging Dentist posted:

CPython is one of the least optimization-friendly projects I've seen. Most of the Python community seems to think that CPython is already fast enough, even though the performance is often downright sad. As much as I like writing stuff in Python, I always feel the pressure to implement my stuff using the Python C API so that it isn't slow as molasses.

Yeah that's in many ways the impression I've gotten as well. As well as the "the GIL is a feature for enforcing proper multiprogramming" crowd :v: The reason I'm asking is because I want to Do A Thing with PIC and runtime type specialization for my baby python LLVM implementation, but I'll probably try to use V8's approach as inspiration.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

tehk posted:

It depends on what shelf is, and how you are using it. If save is not manipulating its own object I would not make it a method of obj, however if it is some object relational thing then the second approach makes sense. You have a third option of adding save to a subclass of what ever type(dict) shelf is. That seems the least magical for when you are reading the code later.

Shelf is a shelve. I'm storing my object on disk.

Save isn't manipulating its own object per se...it's storing the object in the shelve.

deedee megadoodoo
Sep 28, 2000
Two roads diverged in a wood, and I, I took the one to Flavortown, and that has made all the difference.


Thermopyle posted:

Shelf is a shelve. I'm storing my object on disk.

Save isn't manipulating its own object per se...it's storing the object in the shelve.

I'm confused why you would want to store the object in the shelf. It really doesn't make much sense to me.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

HatfulOfHollow posted:

I'm confused why you would want to store the object in the shelf. It really doesn't make much sense to me.

That's how the shelve module works.

quote:

A “shelf” is a persistent, dictionary-like object. The difference with “dbm” databases is that the values (not the keys!) in a shelf can be essentially arbitrary Python objects — anything that the pickle module can handle. This includes most class instances, recursive data types, and objects containing lots of shared sub-objects. The keys are ordinary strings.

The question boils down to: Is it better to have a function insert an object into a shelf, or is it better to have an object put itself (via a method) on the shelf?

Thermopyle fucked around with this message at 23:57 on Nov 28, 2009
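For the record, storing an object in a shelf looks like this (a toy sketch, names invented); the value just has to be something the pickle module can handle:

```python
import os
import shelve
import tempfile

class Deployment(object):
    def __init__(self, name):
        self.name = name

path = os.path.join(tempfile.mkdtemp(), "store")

shelf = shelve.open(path)
shelf["web01"] = Deployment("web01")   # value is pickled to disk
shelf.close()

shelf = shelve.open(path)              # reopen: the object comes back
restored = shelf["web01"]
shelf.close()
print(restored.name)
```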

nonathlon
Jul 9, 2004
And yet, somehow, now it's my fault ...

Thermopyle posted:

Which is better?

code:
def save(obj):
    shelf[obj.name] = obj
or...
code:
class obj():
    def save(self):
        shelf[self.name] = self

A lot depends on context but, I'd tilt towards 1. Reasons:

* It's shorter
* Anything can get stored in the shelf, as long as it has a "name" attribute.
* Making it an obj method doesn't encapsulate or hide any great amount of detail

I can only think of one counter-reason:

* If you later want to store a different type of object and wish to store it in a different way, this could be done by having a different "save" method implementation in the new class.

None of these are great, strong killer reasons, but "shorter and simpler" is the best. Go with that.

king_kilr
May 25, 2007

The Red Baron posted:

Got a question for the 'in-depth' people here. Have there been any serious attempts at adding inline monomorphic/polymorphic caching to (C)Python? Eliminating dictionary lookups should give a non-negligible speedup to attribute accesses as well as offer the possibility for attribute packing a-la Google's V8 JS engine. I know the Unladen Swallow guys are considering it, but I would presume that lies quite far ahead.

Yes, I attempted to implement a PIC in both raw CPython and Unladen Swallow. The performance gain, where it existed, was negligible, probably because the type cache implemented in Python 2.6 already gets you a long way there. Reid Kleckner has been working on a patch for Unladen Swallow to remove unnecessary lookups on type objects at monomorphic attribute sites: http://codereview.appspot.com/157130/show . He and I have also been discussing a few other optimizations that can be done (such as shadow classes, an idea from PyPy and V8).

The Red Baron
Jan 10, 2005

king_kilr posted:

Yes, I attempted to implement a PIC in both raw CPython and Unladen Swallow. The performance gain, where it existed, was negligible, probably because the type cache implemented in Python 2.6 already gets you a long way there. Reid Kleckner has been working on a patch for Unladen Swallow to remove unnecessary lookups on type objects at monomorphic attribute sites: http://codereview.appspot.com/157130/show . He and I have also been discussing a few other optimizations that can be done (such as shadow classes, an idea from PyPy and V8).

Very interesting, I must admit I wasn't fully aware of the type cache until now but it seems like a nice optimization. I'm currently trying to figure out how shadow types can be "easily" implemented for Python and thus far I'm leaning against simply having a second shadow type pointer in the base object class, with the shadow types containing a hashmap of 'attribute name'->'object field index or transition', the latter depending on whether the attribute is present in the shadow type (getattr) or is a known transition (setattr). Any tales of experiences with this would be gladly accepted. I suspect an issue will be balancing the sheer number of such types and their attribute entries in a long-running app, particularly for the initial hidden type.

king_kilr
May 25, 2007

The Red Baron posted:

Very interesting, I must admit I wasn't fully aware of the type cache until now but it seems like a nice optimization. I'm currently trying to figure out how shadow types can be "easily" implemented for Python and thus far I'm leaning against simply having a second shadow type pointer in the base object class, with the shadow types containing a hashmap of 'attribute name'->'object field index or transition', the latter depending on whether the attribute is present in the shadow type (getattr) or is a known transition (setattr). Any tales of experiences with this would be gladly accepted. I suspect an issue will be balancing the sheer number of such types and their attribute entries in a long-running app, particularly for the initial hidden type.

http://groups.google.com/group/unladen-swallow/browse_thread/thread/c61055121cb4cca6/27557a340cd811fa describes Reid's approach to it. If you're interested in this, you should hop into #unladenswallow on irc.oftc.net, as that's where people discuss this stuff (as well as on the mailing list).

deedee megadoodoo
Sep 28, 2000
Two roads diverged in a wood, and I, I took the one to Flavortown, and that has made all the difference.


Thermopyle posted:

That's how the shelve module works.

No I mean, why do you need a save method? Why can't you just do

code:
shelve[name]=obj
wherever you need to update? It just seems kind of unnecessary.

m0nk3yz
Mar 13, 2002

Behold the power of cheese!

Avenging Dentist posted:

CPython is one of the least optimization-friendly projects I've seen. Most of the Python community seems to think that CPython is already fast enough, even though the performance is often downright sad. As much as I like writing stuff in Python, I always feel the pressure to implement my stuff using the Python C API so that it isn't slow as molasses.

Bullshit. There's not a single person who is anti optimizations so long as it doesn't sacrifice the maintainability or readability of the code.

The Red Baron
Jan 10, 2005

m0nk3yz posted:

Bullshit. There's not a single person who is anti optimizations so long as it doesn't sacrifice the maintainability or readability of the code.

I think it's more a matter of CPython not being able to escape from the needs of legacy support for C-extensions that are built with certain assumptions in mind (i.e. the main reason the GIL still exists) that might preclude certain optimizations. I'm fairly certain Unladen Swallow would see higher performance results if they didn't bother with essentially emulating the python stack machine in a register-based system due to a need for switching between interpretation and native code execution. These are all uneducated observations I have made, though, so I may be wrong.

king_kilr
May 25, 2007

The Red Baron posted:

I think it's more a matter of CPython not being able to escape from the needs of legacy support for C-extensions that are built with certain assumptions in mind (i.e. the main reason the GIL still exists) that might preclude certain optimizations. I'm fairly certain Unladen Swallow would see higher performance results if they didn't bother with essentially emulating the python stack machine in a register-based system due to a need for switching between interpretation and native code execution. These are all uneducated observations I have made, though, so I may be wrong.

The reason Unladen Swallow needs to switch between the JIT and the interpreter is because that's how you get good performance. Read Urs Hölzle's PhD thesis: it clearly outlines that the best performance comes from having a series of bail conditions and one super-optimized codepath, rather than a series of "if guard: fast_path() else: slow_path()". In fact that inhibits optimizations: a fast path might be something like "perform a direct call to a C function", and not knowing we took the fast path means we can't know the return type for that opcode, which inhibits further dataflow optimizations.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

HatfulOfHollow posted:

No I mean, why do you need a save method? Why can't you just do

code:
shelve[name]=obj
wherever you need to update? It just seems kind of unnecessary.

It seems like it's the most maintainable. Say I want to switch to MySQL, or some fancy other method of persistence in the future.

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

Thermopyle posted:

It seems like it's the most maintainable. Say I want to switch to MySQL, or some fancy other method of persistence in the future.
Then you'll have to change __setitem__ instead of save.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Plorkyeran posted:

Then you'll have to change __setitem__ instead of save.

Isn't it as wide as it is tall? I mean, the dict-like nature seen in "save" is just an artifact of using shelve.

Forget about shelve. Pretend the contents of "save" is code for writing results to a MySQL database.

The question is more about coding style and less about shelve.
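That maintainability argument, sketched (all names invented): if callers only ever go through save(), the storage backend behind it can change without touching them.

```python
class ShelfStore(object):
    """Persistence facade: callers call save(), never touch the backend."""
    def __init__(self):
        self._data = {}          # stand-in for a shelve, a MySQL table, etc.

    def save(self, obj):
        # swapping in a different backend only changes this method
        self._data[obj.name] = obj

class Thing(object):
    def __init__(self, name):
        self.name = name

store = ShelfStore()
store.save(Thing("a"))
```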


Avenging Dentist
Oct 1, 2005

oh my god is that a circular saw that does not go in my mouth aaaaagh

m0nk3yz posted:

Bullshit. There's not a single person who is anti optimizations so long as it doesn't sacrifice the maintainability or readability of the code.

There's a world of difference between "we do not accept optimization patches" and "we do not devote our effort to making optimizations". If, however, the primary project you work with is Python, then I can see why you'd think they're not optimization-unfriendly.
