|
Is this what you're looking for? You only wanted one element at a time from each of the later lists (list2, list3) added to each of the lists in list1, right? Not very pretty, but it seems to have gotten to what you're hinting at:code:
code:
Zugzwang fucked around with this message at 03:08 on Sep 24, 2020 |
# ¿ Sep 24, 2020 02:30 |
|
|
# ¿ Apr 29, 2024 06:26 |
|
Mirconium posted:Wait holy smokes when did Python get (I assume optional?) types? I've been living under a rock for a while, admittedly, but holy crap!
On the other hand, if you write performance-intensive functions in Cython, you can potentially get huge performance gains by doing nothing special except declaring types.
|
# ¿ Oct 17, 2020 00:10 |
|
Qtamo posted:Pandas question: how can I split a column filled with strings into multiple rows by character count? I've got a dataframe that needs to get exported into an xls to be used as an import into an ancient system that limits cell character count to 100 characters. Since the strings are sentences, I'd prefer to split by the whitespace right before hitting 100 characters, but I haven't found a solution. It doesn't need to be efficient, the dataframe is only ~200 rows and probably somewhere around 1000 rows or less after it's been split.
At the first entry for Jimmy, split the string into a list of however many 100-char strings, and append a dictionary with {"Name": "Jimmy", "String": [first 100-char string]} to row_list. Then loop over the remaining strings, appending a dictionary to row_list with {"String": [next 100-char string]} until the strings are exhausted. Finally, make a df with DataFrame(row_list, columns=['Name', 'String']). Should work fine. Looping over dfs isn't efficient but as you said, it's only a few hundred lines so whatever. Edit: like this. code:
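The original code block didn't survive, so here is a rough sketch of the row_list approach described above, not the exact code from the post. It assumes the input is a list of (name, text) pairs (a made-up shape for illustration), and it uses textwrap.wrap from the standard library to break on whitespace before the 100-character limit. split_rows and records are hypothetical names.

```python
import textwrap


def split_rows(records, width=100):
    """Split each (name, text) pair into rows of at most `width` characters."""
    row_list = []
    for name, text in records:
        # textwrap.wrap breaks on whitespace, so words stay whole unless a
        # single word is longer than `width`.
        chunks = textwrap.wrap(text, width=width) or [""]
        # The first chunk carries the name...
        row_list.append({"Name": name, "String": chunks[0]})
        # ...and the continuation rows only carry the next chunk.
        for chunk in chunks[1:]:
            row_list.append({"String": chunk})
    return row_list


# Hypothetical sample data: one long string and one short one.
records = [("Jimmy", "word " * 50), ("Ann", "short sentence")]
rows = split_rows(records)
```

Passing rows to DataFrame(rows, columns=['Name', 'String']) as described in the post should then leave the Name cell empty (NaN) on the continuation rows.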
Zugzwang fucked around with this message at 16:59 on Nov 11, 2020 |
# ¿ Nov 11, 2020 16:25 |
|
Qtamo posted:Thanks for this. I'd read the warning in the pandas docs about not modifying something I'm iterating over and for some reason it didn't occur to me to just throw the stuff into a new dataframe, so I avoided iterrows altogether
|
# ¿ Nov 13, 2020 18:25 |
|
My computer sure did some interesting stuff when I tried to create a ~5 GB string from a list and write it to a text file all in one go. Only made that mistake once.
|
# ¿ Dec 10, 2020 23:25 |
|
Do any of y'all have experience distributing packages with Cython? I've been Googling about this as much as I can and have also tried reverse-engineering the Cythonizing components of packages such as pandas, all to no avail. Right now my folder structure is: code:
setup.py currently looks like this: code:
I noticed that other Cython-using packages have a variety of helper files like MANIFEST.in in the main directory, but none of the helper files seem to mention Cython, so ¯\_(ツ)_/¯ I've also tried this with the built .pyd extension in the cy/ folder, and the imports within __init__.py just fail. At this point, I've spent way more time on this than I care to think about. Help me Python thread Kenobi, you're my only hope. Zugzwang fucked around with this message at 04:47 on Dec 31, 2020 |
# ¿ Dec 31, 2020 04:09 |
|
The file variable is calling read() and decode() methods. I’m not familiar with tablib either, but it sure looks like that line is converting the file into text and then using that to construct a Dataset. If the method you tried didn’t work to construct the DataFrame, you should still inspect the contents of the file variable, since I’m not sure where else the data would be coming from. pandas’s read_csv accepts a path or buffer with a read() method, so have you tried just chopping off read() and decode() from file and passing that into read_csv? Zugzwang fucked around with this message at 03:43 on Nov 8, 2022 |
# ¿ Nov 8, 2022 03:35 |
|
duck monster posted:Zed Shaw is a weird dude. Saw him ranting at someone on twitter the other day about why we shouldnt use django for webapps, when C++ is available and much faster.
|
# ¿ Nov 14, 2022 12:49 |
|
I did not know about cached_property. That looks awesome and I can’t wait to simplify some of my code with it.
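For anyone else who hadn't seen it, a minimal sketch of what functools.cached_property does. The Dataset class and its compute_calls counter are made up purely to show that the expensive part runs once per instance.

```python
from functools import cached_property


class Dataset:
    def __init__(self, values):
        self.values = values
        self.compute_calls = 0  # counts how often the expensive work runs

    @cached_property
    def total(self):
        # Runs on first access only; the result is then stored on the instance.
        self.compute_calls += 1
        return sum(self.values)


data = Dataset([1, 2, 3])
first = data.total   # computes the sum
second = data.total  # served from the cached value
```

This replaces the usual "check if self._total is None, compute it, return it" boilerplate.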
|
# ¿ Dec 3, 2022 04:26 |
|
+1 for ATBS. It’s not just a good intro to the language. It immediately shows you how to do truly useful things with it. (It’s also free!) Al’s follow-up book to that, Beyond the Basic Stuff With Python, is also great. It’s about how to put together nontrivial, maintainable programs once you know the ins and outs of the language.
|
# ¿ Dec 4, 2022 07:42 |
|
I just install everything into base and use only the packages that win the ensuing melee.
|
# ¿ Dec 6, 2022 03:36 |
|
Just eyeballing it, looks like the part I highlighted in this line is your issue:
profilePic = profileDiv.find_element(By.XPATH, ".//img[contains(@class, 'presence-entity__image')]").get_attribute('src') if profileDiv.find_element(By.XPATH, ".//img[contains(@class, 'presence-entity__image')]").get_attribute('src') else ""
The problem is that if it doesn’t find profilePic, you are still trying to call the get_attribute method on it, which will crash. Is your error message something like “NoneType has no method get_attribute”? Anyway, cut that method call and see if it works.
|
# ¿ Dec 12, 2022 20:56 |
|
You could also get an arbitrary number of floats by having the user specify something like a space-delimited string. For ex:code:
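Something along these lines, as a sketch. parse_floats is a made-up helper name, and in a real script the string would come from input() rather than a literal.

```python
def parse_floats(text):
    """Turn a whitespace-delimited string into a list of floats."""
    # str.split() with no argument splits on any run of whitespace.
    return [float(token) for token in text.split()]


# In practice: parse_floats(input("Enter numbers separated by spaces: "))
numbers = parse_floats("1.5 2 -3.25")
```

float() raises ValueError on anything it can't parse, so bad input fails loudly instead of slipping through.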
|
# ¿ Feb 14, 2023 23:32 |
|
Speaking of comprehensions, is there a Pythonic way to populate a list with elements from an iterator until the list hits a certain size? Like, the naive implementation (with the desired list size of 20 in this example) is:code:
I tried looking into some lesser-used functions in itertools but didn’t see anything that jumped out as being appropriate. Zugzwang fucked around with this message at 05:16 on Feb 15, 2023 |
# ¿ Feb 15, 2023 05:10 |
|
i vomit kittens posted:Is there a reason islice wouldn't work or did you just miss it when going through itertools? Zugzwang fucked around with this message at 06:10 on Feb 15, 2023 |
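For reference, a minimal sketch of the islice approach with the desired size of 20 from the question. The evens iterator is just a stand-in for whatever iterator is actually being consumed.

```python
from itertools import count, islice

# An endless iterator, standing in for the real data source.
evens = (n for n in count() if n % 2 == 0)

# Take at most 20 items, then stop pulling from the iterator.
first_twenty = list(islice(evens, 20))

# If the iterator runs out early, you simply get a shorter list.
short = list(islice(iter([1, 2, 3]), 20))
```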
# ¿ Feb 15, 2023 06:04 |
|
Thanks for the iterator input, goons. That was exactly what I needed. Simpler, less verbose/nested code and with better results
|
# ¿ Feb 16, 2023 02:56 |
|
Do you mean you want to make a list containing grade #n from every list? That can just be code:
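A sketch of that, assuming the grades live in a list of lists. all_grades and the sample numbers are invented for illustration.

```python
# Hypothetical data: one list of grades per student.
all_grades = [[90, 85, 70], [60, 75, 88], [100, 95, 80]]

n = 1  # zero-based position of the grade to pull from every list
nth_grades = [grades[n] for grades in all_grades]
```

This raises IndexError if any of the inner lists is shorter than n + 1.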
|
# ¿ Feb 23, 2023 19:00 |
|
duck monster posted:I've been using python since the 1990s.
Like, I recently discovered (through Trey Hunner’s newsletter) a way of avoiding a common for/break pattern. Let’s say you want to assign foo to the first item in an iterable that meets some criterion (such as the first even number), and if you don’t find it, then assign it to None. The verbose implementation is: code:
code:
Seems like Python always has a way of simplifying really ugly code.
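The two code blocks above got lost, so this is a reconstruction of the pattern rather than the original code. It assumes the shorter version is next() over a generator expression with a default, and the sample numbers are made up.

```python
numbers = [3, 7, 8, 11, 12]

# Verbose version: loop, test, assign, break.
foo = None
for n in numbers:
    if n % 2 == 0:
        foo = n
        break

# Compact version: next() returns the first item the generator yields,
# or the default (None here) if it yields nothing.
bar = next((n for n in numbers if n % 2 == 0), None)

# With no even number available, the default comes back instead.
missing = next((n for n in [1, 3, 5] if n % 2 == 0), None)
```

Leaving out the default makes next() raise StopIteration when nothing matches.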
|
# ¿ Mar 1, 2023 14:56 |
|
FISHMANPET posted:A friend pointed out to me that I don't need the default value in next, especially when the next step in my code (loading the URL I found) would fail if the default value happened, so I just took that out. It'll cause a StopIteration exception which I'm not handling, but what it really means is something is messed up with the project I'm reading from and I'll need to fix my code regardless.
Data Graham posted:Yeah, I hear the Kill Bill klaxon in my head whenever I see [0]
Zugzwang fucked around with this message at 06:17 on Mar 4, 2023 |
# ¿ Mar 4, 2023 06:08 |
|
“Beyond the Basic Stuff with Python” and “Practices of the Python Pro” are both great books that are about how to write good Python in general. Wes McKinney’s (creator of pandas) book is online for free too: https://wesmckinney.com/book/ Also, check out polars (the package) for data analysis. It’s a newer DataFrame library written in Rust. It isn’t a full replacement for everything pandas does, but in general, it’s comically faster. Zugzwang fucked around with this message at 23:26 on Mar 4, 2023 |
# ¿ Mar 4, 2023 23:13 |
|
I really like Al Sweigart’s book Automate the Boring Stuff with Python. It’s “here are the basics of Python, now here are a bunch of very useful things you can do with it.” It’s available for free at https://automatetheboringstuff.com For further education, the YouTube channel ArjanCodes is great.
|
# ¿ Mar 6, 2023 17:15 |
|
PySimpleGUI is legit. It’s basically ergonomic wrappers for the built-in Tkinter library. Not sure if it does everything you want, but it has a lot of demos and examples here: https://www.pysimplegui.org/en/latest/cookbook/
|
# ¿ Mar 9, 2023 00:09 |
|
How reliable are modern Python-to-exe packages? I recently learned that a research tool I’ve been working on for my job might eventually need to be deployed to outside users, probably closed source. And due to IP reasons on their end, it can’t be a web app hosted on AWS or whatever because their data/results can’t be on a system outside of their company at any time.
|
# ¿ Mar 17, 2023 00:09 |
|
StumblyWumbly posted:I've used PyInstaller, there are times when it is a massive pain but once you get it working it is fine. Things to be aware of:
CarForumPoster posted:You should never ever expect users to have a python install. That's asking for it-works-on-my-computer hell. PyInstaller has worked well for me.
QuarkJets posted:iirc Singularity containers are basically designed for this kind of situation, where you don't trust either the system owner or other users on the system. And you can build one from a docker container so you can get the best of both worlds
|
# ¿ Mar 17, 2023 03:41 |
|
Yeah, that makes sense. This is still hypothetical at this point, so I am mostly trying to figure out what paths I would take if we needed to go there. To be frank, I’m not even sure it’s feasible for this to be closed source, and I’d rather it not be. Will ultimately be up to management at my organization though.
|
# ¿ Mar 17, 2023 05:02 |
|
duck monster posted:Ah my naive child.
|
# ¿ Mar 22, 2023 11:17 |
|
Falcon2001 posted:Here's my smarmy one-line take on it, thanks to the Python standard library.
Anyway itertools owns and I am constantly trying to learn more about how to apply its dark magic.
|
# ¿ Apr 21, 2023 04:34 |
|
Seconding both ArjanCodes and mCoding. The former has a great deal of stuff on general software engineering, with examples implemented in Python. The latter is mostly but not exclusively Python (he also covers C++ stuff sometimes).
|
# ¿ May 15, 2023 01:07 |
|
"ChatGPT or idiot?" will now be a question I ask myself frequently. I recently read a blog post like that one that extolled C++ as a language known for being simple and easy to learn.
|
# ¿ May 16, 2023 13:40 |
|
IIRC polars doesn't have all the functions pandas does, though I'm not enough of an expert in either to go into detail. But yeah I've almost totally switched over to polars. The differences in ergonomics, memory usage, and especially speed vs pandas are just plain unfair.
|
# ¿ May 30, 2023 13:05 |
|
Yeah I really wish Python had something like "type hinting is purely optional, but they will be enforced if you specify them." As opposed to "the Python interpreter does not give the slightest gently caress about type hints." Julia has the former as part of its multiple dispatch model, but sadly, adoption of Julia continues to be pretty underwhelming.
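To make the complaint concrete, a tiny sketch. double is a made-up function, and the point is only that the hints are stored as metadata and never checked when the function runs.

```python
def double(x: int) -> int:
    return x * 2


# The hints say int, but nothing stops a str from going in (or coming out).
result = double("ab")

# The annotations are just data attached to the function object.
hints = double.__annotations__
```

A static checker such as mypy would flag the call, but that happens outside the interpreter.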
|
# ¿ Sep 3, 2023 02:22 |
|
Cursed comedy option: use for i, _ in enumerate(list), then access elements via the index
|
# ¿ Sep 5, 2023 18:16 |
|
Why write something in Python if C++ will do the job??
|
# ¿ Sep 19, 2023 16:25 |
|
I don't have time to import time from time
|
# ¿ Sep 27, 2023 00:25 |
|
I personally recommend PyInstrument as a profiler because it only shows you the things where your code spends substantial time. Its HTML output ability is nice. IIRC cProfile shows you every operation/call, many of which will be irrelevant to the program's speed of execution. Anyway, your question is pretty broad. Sometimes the issue comes from a bit of code that works fine when your dataset is small and not so fine when it isn't. I once had a small section of code that was consuming 25% of the program's runtime because it was repeatedly calling min() on a very large, growing set. min() on a set is O(n), so every single call scanned the entire set, and the set kept getting bigger. Tracking the minimum value through another means sped up that section over 500x, and that bit of code didn't even register as noteworthy in the profiler anymore.
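A toy version of that min() problem, to show the shape of the fix rather than the actual code. It assumes the set only ever grows (nothing is removed), which is what makes tracking the minimum in a variable safe. Both function names are made up.

```python
def slow_running_min(values):
    """Record the minimum after each insert by rescanning the whole set."""
    seen = set()
    mins = []
    for value in values:
        seen.add(value)
        mins.append(min(seen))  # O(len(seen)) scan on every iteration
    return mins


def fast_running_min(values):
    """Same result, but the minimum is tracked in a variable as items arrive."""
    seen = set()
    mins = []
    current = None
    for value in values:
        seen.add(value)
        if current is None or value < current:
            current = value  # O(1) update, valid because nothing is removed
        mins.append(current)
    return mins


sample = [5, 9, 3, 7, 3, 1, 8]
```

If items could also be removed from the set, a heap or sorted container would be the more likely fit.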
|
# ¿ Sep 30, 2023 21:47 |
|
I'd only use it if I expected a specific kind of error to arise that I knew was okay to ignore. One of the files I need to regularly read at work comes from a 3rd party and is essentially a giant LZMA2-compressed text file. Whenever I get to the end of it, it raises an EOFError due to a glitch in whatever they're using to compress it. (This error also shows up in 7-Zip; or rather, 7-Zip says "hey this file looks a bit odd" but still reads it okay.) I had to write a context manager specifically for ignoring EOFErrors in that dumb file type because otherwise my code would spend like 30 minutes rolling through it completely fine and then crash when it got to the very last line.
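Roughly what that looks like, as a sketch and not the actual work code. ignore_eof and broken_reader are made-up names, and the generator just fakes a file that raises EOFError after its last real line.

```python
from contextlib import contextmanager


@contextmanager
def ignore_eof():
    """Swallow EOFError, and only EOFError, raised inside the with block."""
    try:
        yield
    except EOFError:
        pass


def broken_reader():
    # Stand-in for the third-party file: good data, then a bogus EOF error.
    yield "line 1"
    yield "line 2"
    raise EOFError("compressed file ended before the end-of-stream marker")


lines = []
with ignore_eof():
    for line in broken_reader():
        lines.append(line)
```

contextlib.suppress(EOFError) from the standard library does the same job without a custom class or function; either way, every other exception type still propagates.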
|
# ¿ Oct 3, 2023 11:42 |
|
LLMs are useful when there's a lot of data ("how do I web scrape this page in Python") but decidedly less useful when there's not. It doesn't help at all that they don't actually understand anything, they just mush together stuff that is usually mushed together. Zugzwang fucked around with this message at 19:59 on Oct 6, 2023 |
# ¿ Oct 6, 2023 19:56 |
|
ComradePyro posted:succinct description of most reddit comments
|
# ¿ Oct 7, 2023 00:00 |
|
I am a mediocre programmer and only know how to commit sane amounts of crime
|
# ¿ Oct 7, 2023 23:54 |
|
Wes McKinney's (creator of pandas) book is available for free on his site: https://wesmckinney.com/book/
|
# ¿ Oct 18, 2023 20:53 |