|
I'm banging my head against the wall on a numpy/pandas float issue. I have a DataFrame of simulation results (or observations, doesn't matter). There is a floating point column "time", that would be convenient as an index but because floating-point indexes are difficult I am using timestep (integer) as the index and keeping time as a regular column. However, a common operation is looking up the index at a particular time. Because time is a float it can't be done the easy way (this is the same reason not to use time as the index). Anyways, here is my attempt:Python code:
Here are my test: Python code:
Unfortunately, it fails consistently at time step 9: code:
In short, I can't figure out a good way to reliably get the closest index with floats. I really don't want to do something like argmin(abs(time_series - time_values)) for a frequently used lookup function, but is that the only way? SurgicalOntologist fucked around with this message at 17:47 on Nov 16, 2017 |
# ? Nov 16, 2017 17:44 |
|
|
# ? May 28, 2024 15:40 |
|
SurgicalOntologist posted:In short, I can't figure out a good way to reliably get the closest index with floats. I really don't want to do something like argmin(abs(time_series - time_values)) for a frequently used lookup function, but is that the only way? Searchsorted doesn't look for the index of the nearest-to-target value, it looks for the index where you could insert the target value while keeping the original vector sorted. If searchsorted returns idx for target it doesn't tell you if target is closer to values[idx-1] or values[idx], you need to check that separately. Here's a solution from stackoverflow: code:
|
# ? Nov 16, 2017 22:05 |
|
Want to get all your existing code typed for mypy? https://github.com/dropbox/pyannotate
|
# ? Nov 20, 2017 02:17 |
|
I wrote a small data processing script a while back and since then I have updated pandas. Now the script does not work. Short of creating a venv for a single script, is there any way of telling my system to use a specific version of pandas when running the script?
|
# ? Nov 20, 2017 16:59 |
|
Jose Cuervo posted:I wrote a small data processing script a while back and since then I have updated pandas. Now the script does not work. Short of creating a venv for a single script, is there any way of telling my system to use a specific version of pandas when running the script? Why not just use a virtualenv? virtualenvwrapper makes it a lot easier to work with virtualenv. I imagine this isn't going to be the last time you're going to have version issues so I'd suggest just making a virtualenv and setup a requirements.txt.
|
# ? Nov 20, 2017 20:38 |
|
Yeah your environment probably only has 1 version of pandas installed, so your script is going to use that version. If you want a different version then you would need to either downgrade pandas in that environment (which is the easier option) or create a new environment with the downgraded pandas and then either activate that environment whenever you want to use this script or modify your script's shebang to use the correct environment's python binary
|
# ? Nov 21, 2017 04:22 |
|
Another option is to fix the script to work with the updated pandas version.
|
# ? Nov 21, 2017 09:12 |
|
My use case is that I need to rerun a bit of analysis for a paper in response to a review that requires something slightly different or with a updated set of data. I don't think it is worth my time to fix the script that I might have written 6 months ago for a one time run, so it seems like the venv is the way to go.
|
# ? Nov 21, 2017 15:20 |
|
Do you use conda? Conda lets you downgrade packages really easily, then you can just upgrade again when you're done
|
# ? Nov 22, 2017 06:37 |
|
Conda is seriously so good.
|
# ? Nov 22, 2017 10:54 |
|
You can do the same on a vanilla install...
|
# ? Nov 22, 2017 16:02 |
|
I'm trying to comprehend numpy arrays - I have an assignment and the first question is to create a random vector of size 20 and sort it in descending order. This is what I came up with:code:
array([[-0.94139218, -0.70652483, -0.67840897, -0.67044282, -0.62539388, -0.61770677, -0.58816414, -0.46556941, -0.44944398, -0.4487512 , -0.43776743, -0.41519608, -0.39534896, -0.34280607, -0.23698099, -0.0829909 , -0.05634266, -0.05450404, -0.04979055, -0.02429839]]) I'm not sure about the parameters used for 'np,' though - in this case ((1,20)). So I think '1' means that is has one dimension, so it's a vector. The '20' seems to be the total size of the array. Then I see many arrays that have a third number, but I'm not sure what it means...the tutorials that I've seen so far refuse to stoop to my level. Can anyone elucidate?
|
# ? Nov 24, 2017 02:04 |
|
Seventh Arrow posted:I'm trying to comprehend numpy arrays - I have an assignment and the first question is to create a random vector of size 20 and sort it in descending order. This is what I came up with: np is the numpy module, importing it as np just gives you a shorthand way of accessing everything in numpy. The other way is to just "import numpy" and then you'd be calling numpy.random.random(). It's very common to see numpy aliased to np in this way, it's convenient and common practice What you're really doing is giving arguments to the random() function of the np.random module. The arguments you provided are a tuple: (1,20). The tuple you provide defines what shape the output array will have. You could have also provided (20,) and it would have still given you a 20-element array. If you wanted a 4x10 array (40 elements), you'd give it (4,10), ie: np.random.random((4,10)) QuarkJets fucked around with this message at 02:12 on Nov 24, 2017 |
# ? Nov 24, 2017 02:09 |
|
Ok great, thanks! I did know about the "np" thing but thanks anyhoo So in the next question it says to create a 5x5 array with 6's on the borders and 0's on the inside. So I guess I would use ((5,5)) for the tuple, yes? I'll have to look into arranging the numbers in such a specific fashion though. edit: wait, the ((5,5)) doesn't seem right Also, is that a real quote from Duck Dunn?
|
# ? Nov 24, 2017 02:26 |
|
For more information on random, you can try googling for the numpy random module: https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.random.html You can call any of the functions defined there by invoking np.random.whatever_function_you_want() with the proper arguments. The docs will describe what the arguments are and what they're used for. For instance, you invoked numpy.random.random, which according to the documentation returns a random array of floats. But what if you want integers? That page shows there's a randint function: https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.random.randint.html#numpy.random.randint You'd call it with np.random.randint
|
# ? Nov 24, 2017 02:28 |
|
Seventh Arrow posted:Ok great, thanks! I did know about the "np" thing but thanks anyhoo So in the next question it says to create a 5x5 array with 6's on the borders and 0's on the inside. So I guess I would use ((5,5)) for the tuple, yes? I'll have to look into arranging the numbers in such a specific fashion though. random.random((5,5)) would give you a 5x5 array of random floats. You don't really want a random array of values though; you may want to use something like np.zeros or np.ones
|
# ? Nov 24, 2017 02:29 |
|
I've started using pyCharm. I like all the little advice and hints but I don't understand this one: A quick googling suggest this isn't bad practice but I'm curious anyway. This is the code: Python code:
e: Thanks Loezi Loezi posted:I think it's just helpfully telling you that you won't be able to use Tkinter after the inner try/except block if the the try fails. Which is true, but also something you already knew? unpacked robinhood fucked around with this message at 17:24 on Nov 27, 2017 |
# ? Nov 24, 2017 10:53 |
|
Answered my own question, nm.
|
# ? Nov 24, 2017 11:30 |
|
unpacked robinhood posted:I've started using pyCharm. I think it's just helpfully telling you that you won't be able to use Tkinter after the inner try/except block if the the try fails. Which is true, but also something you already knew?
|
# ? Nov 24, 2017 11:38 |
|
Can anyone tell me why the upper bound of this while loop increases by thousands every iteration? As far as I can tell I'm not modifying the pix array at all so I'm not sure how elements are being added to it.code:
Danyull fucked around with this message at 06:08 on Nov 29, 2017 |
# ? Nov 29, 2017 06:03 |
|
As you say, from that code alone there is no reason len(pix) should increase. Absent any other information I'm going to guess the actual problem has to do with taking chunks of length L and incrementing your index by 1. If idx is 50 and rule.len is 100 and len(pix) is some large number, are you really intending to have chunks of pixels 50-149, 51-150, 52-151, etc? If that's not it either I'm overlooking something or you need to provide more information. If that is it, using names such as "chunk_length" instead of "l" or "position" instead of "idx" could help you avoid this kind of problem in the future. Having "i", "idx", "l", and "1" all in the same 10 lines of code is a bit gnarly. The loop appending pixel one at a time to chunk.ch also smells funny, it looks like a slice might be a good idea, but I don't know what pix and chunk.ch are exactly. breaks fucked around with this message at 07:48 on Nov 29, 2017 |
# ? Nov 29, 2017 07:33 |
|
The loop is part of a script that should be breaking images into chunks and sub-chunks of predetermined sizes, then randomly shuffling and permutating some of of them based on a chosen pattern. I'm using Pillow as a way to gather data about the image. The pix array is a list containing tuples that represent the RGB values for each pixel of the image in the form (RRR | GGG | BBB). So yes, it has the chance to be pretty large, but I am currently testing it on images that are only about 10x10 pixels in size and still having the problem. I actually tried slicing for that appending loop but I think I was doing it wrong because chunk.ch kept coming up as an empty array after trying to add pixels to it. Also sorry about the bad naming/syntax, this is my first real try at programming since high school.
|
# ? Nov 29, 2017 08:10 |
|
So from your description it sounds as if you do intend to increment your index by l (the length of the current chunk) instead of 1 (as in the number). If you fix that does the current problem persist?
|
# ? Nov 29, 2017 08:18 |
|
Danyull I'm not perfectly sure I get what you want, butcode:
|
# ? Nov 29, 2017 10:34 |
|
breaks posted:So from your description it sounds as if you do intend to increment your index by l (the length of the current chunk) instead of 1 (as in the number). If you fix that does the current problem persist? You're right, I should have used the length there instead of one, however the infinite loop problem still existed. I ended up fixing it actually, but I still don't know how. For some reason when it got to the point of creating a new chunk, the new chunk.ch would already be filled with the elements of pix. Appending the elements of pix onto the end of chunk.ch was for some reason also appending the elements onto pix again, causing it to double in size each time. The issue went away after adding in "chunk.ch = []" between "chunk = Chunk()" and the appending loop.
|
# ? Nov 29, 2017 20:43 |
|
Danyull posted:You're right, I should have used the length there instead of one, however the infinite loop problem still existed. I ended up fixing it actually, but I still don't know how. For some reason when it got to the point of creating a new chunk, the new chunk.ch would already be filled with the elements of pix. Appending the elements of pix onto the end of chunk.ch was for some reason also appending the elements onto pix again, causing it to double in size each time. The issue went away after adding in "chunk.ch = []" between "chunk = Chunk()" and the appending loop. code:
|
# ? Nov 29, 2017 20:49 |
|
I ended up doing:code:
e: and now fully fixed Danyull fucked around with this message at 21:47 on Nov 29, 2017 |
# ? Nov 29, 2017 21:30 |
|
"while" solutions usually feel rather un-pythonic to me. I think if you want to iterate over an iterable, you should iterate over the iterable! Counting indices (or however that thing is called) are alien and un-pthonic.
|
# ? Nov 29, 2017 22:13 |
|
At a guess something screwy was going on in your Chunk class, so when you constructed a new one you were actually reusing an array (maybe handing the same one out multiple times so they all saw every append), or doing something with your pix list or the original Image object, something like that Honestly in these situations the best thing is to run it through a debugger, set it to pause in a useful place so you can poke around at the current state of things, and you'll probably get an idea of where things are going wrong. Sometimes it's an obvious bug, other times it gives you a direction to investigate
|
# ? Nov 29, 2017 22:17 |
|
Cingulate posted:"while" solutions usually feel rather un-pythonic to me. I think if you want to iterate over an iterable, you should iterate over the iterable! Well guy hasn't been programming for a while! So long as it works that's the main thing, you can clean it up later oh wait now I remember, it does not have that and you have to write your own which is bad. Thanks python! baka kaba fucked around with this message at 22:25 on Nov 29, 2017 |
# ? Nov 29, 2017 22:20 |
|
conda build variants, how do they work? I have a Python package called package1 that I want to build for both Python 2.7 and 3.5. package1 has a build requirement for package2, which is also a Python package and should also be built for Python 2.7 and 3.5, and package2 is not a sub-package of package1. Inside each of my conda recipe folder, as explained on the build variant page, I have the following file: code:
If I instead do conda build package2 first then it will correctly build package2_py27 and package2_py35, and then if I build package1 it'll correctly build package1_py27 and package1_py35. Is there a way to make the first usage case work properly so that I don't have to worry about build order?
|
# ? Nov 29, 2017 23:23 |
|
It's probably not pythonic because I only knew C# and Java up until this point.baka kaba posted:At a guess something screwy was going on in your Chunk class, so when you constructed a new one you were actually reusing an array (maybe handing the same one out multiple times so they all saw every append) This looks to have been the problem. I screwed up the def __init__ for that class and it wasn't creating a new array for some reason. Fixed that and then I could remove the chunk.ch = [] line I had to throw in. If anyone wants to mess around with it you can get it here. The only package you need is Pillow. You just run shufflestream.py with an image named test.jpg in the same project/folder and it will pop out a new glitch.jpg. You can edit the constants/patterns at the beginning of the file for interesting results.
|
# ? Nov 30, 2017 00:54 |
|
Noice! It would be cool if you could supply a pattern that says which chunks get glitched (so you can have some clean parts and maybe make the shapes everyone loves) or make the randomisation function get more pronounced as it goes, so it gets more glitchy as you go down the image
|
# ? Nov 30, 2017 01:49 |
|
Is there a good way to step through Python code? I found a Python library to parse torrent names (https://github.com/divijbindlish/parse-torrent-name/blob/master/PTN/parse.py) and I can't quite figure out how it works.
|
# ? Nov 30, 2017 04:26 |
|
Hughmoris posted:Is there a good way to step through Python code? I found a Python library to parse torrent names (https://github.com/divijbindlish/parse-torrent-name/blob/master/PTN/parse.py) and I can't quite figure out how it works. Pycharm. Add the library to your project, set a breakpoint in it and run the debugger.
|
# ? Nov 30, 2017 04:40 |
|
PyCharm is great, but if you don't want to use it, you can use pdb or ipdb.
|
# ? Nov 30, 2017 05:04 |
|
pudb's my favorite if you're on unix
|
# ? Nov 30, 2017 06:33 |
|
How fraught is the idea of runtime module generation with Python? I'm working with protobuf and have had one too many subtle bugs that amounted to "you forgot to run the script to regenerate from the proto modules, idiot." My other code in compiled languages has it as part of the build process, making this near-impossible, and I'm toying with the idea of adding it to the Python module at load-time. I was thinking something along the lines of: code:
But I'm not sure how bad of an idea this is with Python. It looks like there are tools for finding the location of the current source or .pyc file, but I'm wondering if there are horrific code-lovecraftian angles I haven't considered here.
|
# ? Nov 30, 2017 09:52 |
|
Little things: Is there a pythonic one-liner to make natural langage enumerations ? For example I'd have: Python code:
Same with "you ordered 3 items" with the 's' added as needed ? It's two dead simple functions with conditionnals but I feel like there's an interesting and probably obtuse python trick for this.
|
# ? Nov 30, 2017 10:51 |
|
|
# ? May 28, 2024 15:40 |
|
unpacked robinhood posted:Little things: Python code:
EDIT: Shouldn't post at 3am. Misread what you asked, I would just do it with an if, elif, else block. Da Mott Man fucked around with this message at 12:51 on Nov 30, 2017 |
# ? Nov 30, 2017 12:43 |