|
Has anybody here used threading.local? I can (and probably will) do some experiments before getting into it, but I'm trying to understand its caveats and gotchas. I am in a situation where I need to pass in some context to a callback. The system that uses that callback is stateless and doesn't need to worry about thread safety. However, the callback I'm using is contextual so I need different state between threads. I'm trying to figure out how I might manage that state. My understanding is threading.local() just gives me some handle to shove poo poo on a per-thread basis. I first need to figure out if threading.local() will give a unique handle each time that I have to then juggle or if it gives me the same handle each time for the same thread. If the former, then I guess I have to stash it somehow and resolve it on a per-thread basis. At that point, I don't know if it matters that I use it in the first place, so I was hoping it was the latter. The other issue then is cleaning up afterwards. I guess I need to make sure I delete the stuff so I don't leave it assigned to the thread for eternity.
|
# ? Nov 3, 2020 19:53 |
|
|
# ? May 15, 2024 01:11 |
|
Rocko Bonaparte posted:Has anybody here used threading.local? I can (and probably will) do some experiments before getting into it, but I'm trying to understand its caveats and gotchas. I am in a situation where I need to pass in some context to a callback. The system that uses that callback is stateless and doesn't need to worry about thread safety. However, the callback I'm using is contextual so I need different state between threads. I'm trying to figure out how I might manage that state. Threading.local creates an object whenever you call it, so you should assign that object to a variable and then store local data as attributes on it. That object will persist as long as you need it, but calling threading.local again will just give you a new object.
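A quick sketch of how that plays out in practice (the names here are illustrative, not from the original post): one module-level `threading.local()` handle, with each thread stashing its own attributes on it. Attributes die with their thread, so there is normally nothing to clean up by hand.

```python
import threading

# One shared handle, created once at module level. Every thread uses the
# same object, but each thread sees only the attributes it set itself.
ctx = threading.local()

results = {}

def worker(name):
    ctx.value = name              # per-thread attribute on the shared handle
    # ... the stateless system's callback would read ctx.value here ...
    results[name] = ctx.value

threads = [threading.Thread(target=worker, args=(f"thread-{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each thread saw only its own value; the main thread, which never assigned
# ctx.value, has no such attribute at all.
print(results)
```

Since the main thread never set `ctx.value`, `hasattr(ctx, "value")` is false there even after the workers ran — each thread really does get an independent namespace on the same handle.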
|
# ? Nov 3, 2020 22:21 |
|
Rookie question: How do I find/read built-in function implementations in Python? E.g. I have Python 3.9 and I'd like to see how they implemented the max() or any() function. I've been poking around github and Google but haven't had any luck.
|
# ? Nov 5, 2020 00:58 |
|
Hughmoris posted:Rookie question: Depends on which ones you mean. max and min and other builtins and language-provided functions specifically are in C, but the standard library modules are by and large in the Lib directory of the CPython repo.
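You can also ask `inspect` where something lives; for the C-implemented builtins it simply raises `TypeError`, since there's no Python source to show (a rough sketch):

```python
import inspect
import json

# Pure-Python stdlib modules report a source file on disk.
print(inspect.getsourcefile(json))

# C-implemented builtins like max() have no Python source;
# inspect raises TypeError for them. Their implementation lives in
# Python/bltinmodule.c in the CPython repository.
try:
    inspect.getsource(max)
except TypeError:
    print("max is implemented in C (see Python/bltinmodule.c in CPython)")
```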
|
# ? Nov 5, 2020 01:15 |
|
Phobeste posted:Depends on which ones you mean. max and min and other builtins and language-provided functions specifically are in C, but the standard library modules are by and large in the Lib directory of the CPython repo. Thanks!
|
# ? Nov 5, 2020 01:19 |
And if you use an IDE like PyCharm etc, you can cmd-click on any function and it'll take you right to its definition.
|
|
# ? Nov 5, 2020 01:20 |
|
I have a Python script that I'd like to redirect stdin to stdout in, like you can do in the shell. This is how I'm using the script:code:
code:
Going further by removing the explicit for-loop and using iterators yields roughly the same speeds and CPU utilization: code:
code:
Given that I just want to redirect stdin to stdout, I figure I need to swap their file descriptors using something like os.dup2() instead of using Python to explicitly iterate. Anyone know how I'd go about that? salisbury shake fucked around with this message at 22:00 on Nov 9, 2020 |
# ? Nov 9, 2020 21:52 |
Why not use subprocess.PIPE directly, no need to use shell then
|
|
# ? Nov 10, 2020 00:26 |
|
So I'm trying to install pygame because the book I'm using has a project which uses it. I'm using Anaconda. I created a 3.8 environment because pygame apparently doesn't work in 3.9 yet. I used pip to install. Here's the console: (base) C:\Users\User>pip install pygame Collecting pygame Using cached pygame-2.0.0-cp38-cp38-win_amd64.whl (5.1 MB) Installing collected packages: pygame Successfully installed pygame-2.0.0 Yet it still doesn't recognise in Spyder when I try to import pygame - just 'no such module exists'. I had previous issues but thought it was due to my messy installation on my external HDD so cleared the room on my main drive and did it all to the letter of the book - still no good. Any tips?
|
# ? Nov 10, 2020 01:25 |
Have you added that base venv interpreter to spyder?
|
|
# ? Nov 10, 2020 02:02 |
|
Bundy posted:Have you added that base venv interpreter to spyder? I have no idea what this is, but I’ll get on it if that seems necessary?
|
# ? Nov 10, 2020 02:05 |
Jakabite posted:I have no idea what this is, but I’ll get on it if that seems necessary? Phone posting but in your paste earlier you had (base) in front of your prompt for pip, suggesting an activated venv. If spyder isn't aware of that venv it won't find the module you installed in it. One reason I switched to pycharm was less faff managing venvs.
|
|
# ? Nov 10, 2020 02:15 |
|
Ah yeah I had just activated one for the previous version of python. Overall for what I need Anaconda is seeming like more hassle than it’s worth. Might be time to switch.
|
# ? Nov 10, 2020 02:45 |
|
salisbury shake posted:I have a Python script that I'd like to redirect stdin to stdout in, like you can do in the shell. This is how I'm using the script: I’m not super advanced with Linux or python. I know my way around both though... and I’m curious, why would you want to do this A) in python B) at all Reading stdin and pushing to stdout that is. Isn’t that like, typing on a screen? Excuse me if this is a really dumb question... I mean no offense, I simply don’t know
|
# ? Nov 10, 2020 14:57 |
|
For multiprocessing pools, should I be spawning one any time I want to async-map over an iterator, or can I just create a single pool for the arena that I am working with and then summon it by reference any time I need to go over an iterator? I'm not planning to actually asynchronously run anything, I just want to parallelize some iterative computations. Also the documentation makes OMINOUS ALLUSIONS to the fact that you can't rely on garbage collection for pools and you have to close them manually. So if I store a pool, do I like... have to actually write a destructor for whatever object encloses it telling it to terminate the pool?
|
# ? Nov 10, 2020 21:17 |
|
Mirconium posted:For multiprocessing pools, should I be spawning one any time I want to async-map over an iterator, or can I just create a single pool for the arena that I am working with and then summon it by reference any time I need to go over an iterator? I'm not planning to actually asynchronously run anything, I just want to parallelize some iterative computations. You have to close the pool one way or another; it won't happen automatically if it is an attribute of an object that gets garbage collected. The easiest way is to open pools with context managers and the with keyword. You can re-use a pool like you mention to save spinning up new processes, but you need to keep track of it and ensure you close it eventually. If your parallel processes take long enough, just re-make the pool each time; but if spinning up a new pool takes enough time, it is worth it to keep it around, especially if you use a lengthy pool initializer function. You could create the pool in an outside context with a context manager and have the entire lifecycle of your object happen inside of that, I suppose.
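A minimal sketch of the with-statement pattern described above (the worker function is made up for the example):

```python
from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == "__main__":
    # The with-block guarantees the pool's worker processes get cleaned up
    # when the block exits, even if the map call raises.
    with Pool(processes=4) as pool:
        results = pool.map(square, range(10))
    print(results)
```

The `if __name__ == "__main__"` guard matters on platforms that use the spawn start method, where worker processes re-import the module.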
|
# ? Nov 10, 2020 21:45 |
|
namlosh posted:I’m not super advanced with Linux or python. I know my way around both though... and I’m curious, why would you want to do this No offense taken. I built a tool like the pipe viewer (pv) tool that gives you insight into shell pipelines. You use pv like this: code:
The faster I can redirect stdin to stdout, the better, which is why I figure I can swap file descriptors using os.dup2() because that's what Bash essentially does behind the scenes when you do I/O redirections. salisbury shake fucked around with this message at 21:51 on Nov 10, 2020 |
# ? Nov 10, 2020 21:49 |
|
OnceIWasAnOstrich posted:You have to close the pool one way or another, it won't happen if it is an attribute of an object that gets garbage collected. The easiest way is to open pools with context managers and the with keyword. So what about the destructor strategy? Like if I just add pool.terminate() to __del__ are there potential issues with that? (I guess potentially if __del__ doesn't get called, which I assume can arise from crashes or Exceptions or something?)
|
# ? Nov 10, 2020 22:27 |
|
Going through an openpyxl tutorial and have come across something I don't even know how to google. I'm just opening an excel file and putting the rows into JSON. If the last print statement is indented, I get everything just like I want. code:
|
# ? Nov 11, 2020 02:02 |
|
salisbury shake posted:No offense taken. I built a tool like the pipe viewer (pv) tool that gives you insight into shell pipelines. Awesome! I get it now... makes total sense, thanks for taking the time to explain it to me. ^^^^^^^^^^^ the post above smells like a closure issue, maybe?
|
# ? Nov 11, 2020 02:29 |
bigperm posted:Going through an openpyxl tutorial and have come across something I don't even know how to google. I'm not entirely sure, but I'd guess that the rows are getting assigned the same id. Try indexing with an int and incrementing it for your dictionary key unless the id() function is important for some other reason.
|
|
# ? Nov 11, 2020 02:37 |
|
a foolish pianist posted:I'm not entirely sure, but I'd guess that the rows are getting assigned the same id. Try indexing with an int and incrementing it for your dictionary key unless the id() function is important for some other reason. That was it. Thank you.
|
# ? Nov 11, 2020 02:49 |
|
salisbury shake posted:The above is not only slow, it maxes out my CPU at only something like 150MB/s. I would not be surprised if you are just running into Python being slow as a limit. For each line you are doing:
- Look up what 'next' means on the stdin.buffer object
- Call it, which reads data into an internal buffer
- Allocate a new Python object and copy the bytes into it (I think you are at least skipping the bytes -> UTF-8 conversions the way you are doing it)
- Look up what 'write' means
- Call it, copying the data into another internal buffer (occasionally flushing that buffer to the OS)
- Deallocate the Python object
- Loop

The shell pipe redirections are just renaming things so that the output of the first command and the input of the second are the same thing. You can't insert yourself in the middle of that because there is no middle; they're one file with two names. If you want to insert your program into a pipeline and don't care about anything besides counting, the fastest way is to use os.open() / os.read() / os.write() with a decent-sized buffer to skip as much Python as possible. You'll still be allocating and deallocating Python objects, copying data, and doing dictionary lookups for every chunk, so it'll be slower than a C version.

Mirconium posted:For multiprocessing pools, should I be spawning one any time I want to async-map over an iterator, or can I just create a single pool for the arena that I am working with and then summon it by reference any time I need to go over an iterator? I'm not planning to actually asynchronously run anything, I just want to parallelize some iterative computations.

Be aware that by default, Python's multiprocessing on Unix does invalid things. It will fork() to create copies of the existing process, then try to use them as pool workers, which violates POSIX. It will usually happen to work for most common libc implementations, as long as absolutely everything in the process is single-threaded.
multiprocessing.set_start_method('spawn') will fix it.
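For what it's worth, a sketch of the kind of raw-descriptor copy loop described above — `copy_fd` and the 64 KiB buffer size are illustrative choices, not anything from the original posts:

```python
import os

CHUNK = 64 * 1024  # 64 KiB read buffer; tune to taste

def copy_fd(in_fd, out_fd):
    """Shuttle bytes from in_fd to out_fd in large chunks, returning the total
    byte count. In a real pipeline you'd call copy_fd(0, 1) to sit between
    stdin and stdout with as little Python in the way as possible."""
    total = 0
    while True:
        chunk = os.read(in_fd, CHUNK)
        if not chunk:          # empty bytes object means EOF
            return total
        os.write(out_fd, chunk)
        total += len(chunk)
```

The byte count is exactly the kind of side-channel statistic a pv-style tool wants, gathered without decoding or line-splitting anything.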
|
# ? Nov 11, 2020 04:19 |
|
Mirconium posted:So what about the destructor strategy? Like if I just add pool.terminate() to __del__ are there potential issues with that? (I guess potentially if __del__ doesn't get called, which I assume can arise from crashes or Exceptions or something?) A destructor would be fine if the object gets garbage collected properly. If you have a crash at the wrong time, your pool will hang around afterward regardless of any of this (you will just be slightly more likely to hit that if the pool is alive for the entire script lifetime instead of just during computation). I don't remember clearly what guarantees CPython has wrt garbage collection and exceptions, but you would still probably want to wrap your function with a context manager and use __exit__ instead, or use a try/finally block. OnceIWasAnOstrich fucked around with this message at 15:31 on Nov 11, 2020 |
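A sketch of the `__exit__`-instead-of-`__del__` idea — the `Computation` class and worker function here are hypothetical stand-ins for whatever object encloses the pool:

```python
from multiprocessing import Pool

def _double(x):           # must be module-level so workers can pickle it
    return 2 * x

class Computation:
    """Hypothetical wrapper that owns a pool for its whole lifetime."""

    def __enter__(self):
        self._pool = Pool(processes=2)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs deterministically when the with-block exits, even on an
        # exception — unlike __del__, which may never be called.
        self._pool.terminate()
        self._pool.join()
        return False      # don't swallow exceptions from the with-body

    def run(self, xs):
        return self._pool.map(_double, xs)

if __name__ == "__main__":
    with Computation() as comp:
        out = comp.run(range(5))
    print(out)
```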
# ? Nov 11, 2020 15:27 |
|
Pandas question: how can I split a column filled with strings into multiple rows by character count? I've got a dataframe that needs to get exported into an xls to be used as an import into an ancient system that limits cell character count to 100 characters. Since the strings are sentences, I'd prefer to split by the whitespace right before hitting 100 characters, but I haven't found a solution. It doesn't need to be efficient, the dataframe is only ~200 rows and probably somewhere around 1000 rows or less after it's been split. Basic idea: code:
code:
|
# ? Nov 11, 2020 16:11 |
|
Qtamo posted:Pandas question: how can I split a column filled with strings into multiple rows by character count? I've got a dataframe that needs to get exported into an xls to be used as an import into an ancient system that limits cell character count to 100 characters. Since the strings are sentences, I'd prefer to split by the whitespace right before hitting 100 characters, but I haven't found a solution. It doesn't need to be efficient, the dataframe is only ~200 rows and probably somewhere around 1000 rows or less after it's been split. At the first entry for Jimmy, split the string into a list of however many 100-char strings, and append a dictionary with {"Name": "Jimmy", "String": [first 100-char string]} to row_list. Then loop over the remaining strings, appending a dictionary to row_list with {"String": [next 100-char string]} until the strings are exhausted. Finally, make a df with DataFrame(row_list, columns=['Name', 'String']). Should work fine. Looping over dfs isn't efficient but as you said, it's only a few hundred lines so whatever. Edit: like this. code:
Zugzwang fucked around with this message at 16:59 on Nov 11, 2020 |
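For comparison, a sketch of the same row_list idea using `textwrap.wrap`, which already breaks on whitespace just under the width limit (the sample data is made up):

```python
import textwrap

import pandas as pd

df = pd.DataFrame({
    "Name": ["Jimmy"],
    "String": ["one long sentence " * 10],   # ~180 characters, needs splitting
})

row_list = []
for _, row in df.iterrows():
    # textwrap.wrap splits on whitespace so no piece exceeds 100 characters.
    pieces = textwrap.wrap(row["String"], width=100)
    row_list.append({"Name": row["Name"], "String": pieces[0]})
    for piece in pieces[1:]:
        # Name only on the first row for each original entry.
        row_list.append({"Name": "", "String": piece})

out = pd.DataFrame(row_list, columns=["Name", "String"])
print(out)
```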
# ? Nov 11, 2020 16:25 |
|
I’m trying to learn Python for data journalism, and I was wondering if anyone had any resource suggestions. I’m looking for a basic curriculum that I can follow, test my working knowledge, and track my progress. I’ve been dabbling in python4everybody, but I was wondering if there were other suggestions.
|
# ? Nov 11, 2020 18:27 |
|
Head Bee Guy posted:I’m trying to learn Python for data journalism, and I was wondering if anyone had any resource suggestions. I’m looking for a basic curriculum that I can follow, test my working knowledge, and track my progress. I don't have a singular resource but I have some tips: It seems like your goal would be to make great visualizations of data. I'd pick a project I want to write about and can get data for, and I'd start googling how to do what I want. As a hard-learned tip, even 3 years in, I STILL find matplotlib cumbersome. Many python libraries are wrappers on it. Here's some of the options you might want to get familiar with. If your goal is to make web apps for journalism, the easiest way if you're just starting out and only know python is Plotly Dash IMO. Bonus that it is built by the same people who make plotly so it does that stuff very easily.
|
# ? Nov 11, 2020 23:55 |
|
CarForumPoster posted:I dont have a singular resource but I have some tips: People with more experience can chime in, but as someone who self-taught a lot of programming, yeah, pick a project first, it's SO much easier to motivate yourself to learn if you have a limited set of objectives that you want to achieve for reasons beyond "I want to know a thing". Trying to digest the entire python syntax and standard library is going to be largely beside the point for you, because a lot of it is built to do programmer things instead of data processing things. Also python visualizations are all awful, especially for making actual nice looking plots that do unusual formatting, as presumably would be needed in data journalism, DOUBLE especially for making them HTML-friendly. I personally have had good luck with auto-generating javascript and html5. It's an added layer of learning curve, but when you get good at python, remember that as an option.
|
# ? Nov 12, 2020 23:29 |
|
Mirconium posted:Also python visualizations are all awful, especially for making actual nice looking plots that do unusual formatting, as presumably would be needed in data journalism, DOUBLE especially for making them HTML-friendly. I personally have had good luck with auto-generating javascript and html5. It's an added layer of learning curve, but when you get good at python, remember that as an option. I can't really agree with this, although default matplotlib and some of its wrappers can be awful especially if you want non-raster renderers. There are plenty of nice HTML-friendly ways to do very nice visualizations in Python including but not limited to Plotly and Bokeh. Rolling your own Javascript and HTML generation seems like an amazing amount of work for something that is probably going to be uglier and way harder to use than the Python plotly.js interface. Dash/Plotly is a great resource for data journalism-type stuff where you want fancy/nice/interactable/web-friendly visualizations.
|
# ? Nov 12, 2020 23:42 |
|
Foxfire_ posted:If you want to insert your program into a pipeline and don't care about anything besides counting, the fastest way will be to use os.open() / os.read() / os.write() and a decent sized buffer size to skip as much python as possible. You'll still be allocating&deallocating python objects, copying data, and doing dictionary lookups for every chunk though, so it'll be slower than a C one Thanks, I tried this out and hit 3GB/s with a 64KB buffer. code:
code:
|
# ? Nov 13, 2020 06:01 |
|
Zugzwang posted:One simple way would be to use df.iterrows(), and fill a list (let's call it row_list) with each row you want into a dictionary with "Name and "String" keys. Thanks for this. I'd read the warning in the pandas docs about not modifying something I'm iterating over and for some reason it didn't occur to me to just throw the stuff into a new dataframe, so I avoided iterrows altogether
|
# ? Nov 13, 2020 10:07 |
|
Qtamo posted:Thanks for this. I'd read the warning in the pandas docs about not modifying something I'm iterating over and for some reason it didn't occur to me to just throw the stuff into a new dataframe, so I avoided iterrows altogether
|
# ? Nov 13, 2020 18:25 |
Qtamo posted:Pandas question: how can I split a column filled with strings into multiple rows by character count? I've got a dataframe that needs to get exported into an xls to be used as an import into an ancient system that limits cell character count to 100 characters. Since the strings are sentences, I'd prefer to split by the whitespace right before hitting 100 characters, but I haven't found a solution. It doesn't need to be efficient, the dataframe is only ~200 rows and probably somewhere around 1000 rows or less after it's been split. Stupid simple version: code:
|
|
# ? Nov 13, 2020 19:31 |
|
I'd recommend the grouper iterator but god drat it's annoying itertools has a "recipes" section in the documentation instead of just putting code in the library where it'd be useful
|
# ? Nov 13, 2020 19:43 |
|
more-itertools makes the grouper recipe available.
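For reference, the grouper recipe from the itertools docs is only a few lines, which makes it all the more annoying to keep re-copying:

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks or blocks, padding the last one.
    grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx  (itertools docs recipe)"""
    args = [iter(iterable)] * n   # n references to the SAME iterator
    return zip_longest(*args, fillvalue=fillvalue)

print(list(grouper("ABCDEFG", 3, "x")))
```

The trick is that all n slots of `zip_longest` pull from the same iterator, so each output tuple advances it n steps.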
|
# ? Nov 13, 2020 19:52 |
|
Zoracle Zed posted:I'd recommend the grouper iterator but god drat it's annoying itertools has a "recipes" section in the documentation instead of just putting code in the library where it'd be useful The number of times I have had to Google and copy-paste the exact same function off of either Stackoverflow or the itertools doc, depending on which one shows up first, is just so frustrating. Put it in the drat library already. I don't need the best way to write that memorized, taking up space in my brain. Who do we need to bother to make this happen? I know I can install more-itertools or whatever but I don't want a whole extra dependency for something that is an incredibly common and simple need that would fit well in itertools.
|
# ? Nov 13, 2020 19:53 |
|
Thanks for all the suggestions - I ended up making my own solution before reading up on the replies since my last post, so here's a horrible version building on Zugzwang's initial reply and some applied stackoverflow (warning, this might be really ugly/stupid):code:
|
# ? Nov 16, 2020 12:01 |
|
I'm reading up on coroutines but getting tripped up by what the current preferred method of doing things is. Various older tutorials are full of examples like Python code:
Does anyone have a good, up to date tutorial on this stuff?
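For reference, the currently preferred spelling (Python 3.7+) drops the old `@asyncio.coroutine` / `yield from` style entirely — that decorator was deprecated and later removed in 3.11. A minimal sketch:

```python
import asyncio

# Modern style: plain `async def` coroutines, awaited with `await`,
# and a single entry point driven by asyncio.run().

async def fetch(name, delay):
    await asyncio.sleep(delay)      # stand-in for real async I/O
    return f"{name} done"

async def main():
    # gather schedules both coroutines concurrently on one event loop
    # and returns their results in argument order.
    return await asyncio.gather(fetch("a", 0.01), fetch("b", 0.01))

results = asyncio.run(main())
print(results)
```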
|
# ? Nov 16, 2020 14:21 |
|
|
|
So I've been following a tutorial for making a basic space invaders type game with pygame. I decided to also start doing my own thing in parallel to exercise the old creative muscles too - it's all going fine so far except for 'blitting' the ship. Everything, as far as I can tell, is the same as the other project which does work. I can post the whole code if that would help but for now I'll stick to the line giving me trouble: self.screen.blit(self.image, self.rect) This is the only line in the blitme() method of the Ship class. I've tried setting a breakpoint when the method is called, and it seems to call fine from the main code. self.screen's properties correspond to the surface that the whole game runs on, self.image has the correct properties that it should have after loading the image I'm using for the ship, and self.rect also has the correct properties of the rect of the ship image. I even tried changing the x and y attributes of the rect to (100, 100) to absolutely ensure it was within the borders of the screen. When I run the program, however, it doesn't blit the ship. It doesn't throw any errors or anything, the ship just never appears. I've tried changing the image used for the ship to something much bigger and more obvious too. Nothing. Any ideas? E: self.screen.fill is also not doing anything. It's just showing a black screen no matter what values I pass. When I print self.screen it does show up as a surface with the expected dimensions though. It's like anything to do with self.screen just... isn't happening? E2: Forgot to add 'pygame.display.flip()' in my update_screen method. Leaving this here as a lesson to similar fools. Jakabite fucked around with this message at 22:17 on Nov 16, 2020 |
# ? Nov 16, 2020 20:46 |