SurgicalOntologist
Jun 17, 2004

I've got a little puzzle... we have a bunch of Python packages with a similar test setup: pytest with pytest-cov, pytest-flake8, pytest-docstyle, and pytest-mypy. We just copied this configuration into a new library, one that has zero actual tests at this point (for the purpose of getting at least the linting for now), and we got 49% coverage. What gives? Does one of these linters actually execute the code in a way that registers the coverage measurement? And if so why only 49%? We're all quite confused.


zhar
May 3, 2019

Very simple question, I have a bunch of files like 1.json, 2.json, and so on that I need to play with in that order.

I try to use this:
code:
for file in sorted(os.listdir()):
and it gives me the list in this order: 1.json, 101.json etc. How do I get it to sort in the correct order?

QuarkJets
Sep 8, 2008

You could zero pad all of the file names that start with a number prior to sorting

necrotic
Aug 2, 2005
I owe my brother big time for this!
If you can't rename them then a for i in range(len(file_list)) and constructing the filename from i+1 would be easy, assuming no numbers are skipped in the filenames.

a foolish pianist
May 6, 2007

(bi)cyclic mutation

code:
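import os

files = os.listdir(directory)
file_numbers = []
for file in files:
    # "12.json" -> 12
    file_numbers.append(int(file.split(".")[0]))

# walk the numbers in ascending order, skipping any gaps
for i in range(min(file_numbers), max(file_numbers) + 1):
    if i in file_numbers:
        do_thing("{}.json".format(i))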

zhar
May 3, 2019

Thanks, I ended up padding with zfill.
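
For reference, a minimal sketch of two ways to get that ordering without renaming anything on disk, assuming every name is a bare integer plus an extension like the 1.json, 2.json files described above. The helper name `numeric_order` and the sample list are made up for illustration.

```python
from pathlib import Path


def numeric_order(names):
    """Sort names like '12.json' by the integer in front of the extension."""
    return sorted(names, key=lambda name: int(Path(name).stem))


names = ["101.json", "2.json", "1.json", "10.json"]

# Option 1: compare the numeric value of the stem.
by_number = numeric_order(names)

# Option 2: compare zero-padded stems, which is what zfill does in effect.
by_padding = sorted(names, key=lambda name: Path(name).stem.zfill(6))

print(by_number)   # ['1.json', '2.json', '10.json', '101.json']
print(by_padding)  # ['1.json', '2.json', '10.json', '101.json']
```

The padding width of 6 is arbitrary; it only has to be at least as long as the longest number.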

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

https://www.vice.com/en_us/article/zmjwda/a-code-glitch-may-have-caused-errors-in-more-than-100-published-studies?utm_source=reddit.com

The original paper can be found here: https://pubs.acs.org/doi/10.1021/acs.orglett.9b03216

If you dig enough you can find out that they assumed glob.glob output was sorted.
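
A small sketch of the defensive version, assuming a script that needs its input files in a fixed order. `glob.glob` makes no promise about ordering, so the result is sorted explicitly; `list_inputs` and the `*.out` pattern are illustrative names, not taken from the paper's scripts.

```python
import glob
import os
import tempfile


def list_inputs(directory, pattern="*.out"):
    """Return matching paths in a deterministic (lexicographic) order."""
    return sorted(glob.glob(os.path.join(directory, pattern)))


# Demonstrate with throwaway files created in a deliberately scrambled order.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("b.out", "c.out", "a.out"):
        open(os.path.join(tmp, name), "w").close()
    ordered = [os.path.basename(path) for path in list_inputs(tmp)]

print(ordered)  # ['a.out', 'b.out', 'c.out']
```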

Malcolm XML
Aug 8, 2009

I always knew it would end like this.
Natural sort: https://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort
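
One common way to do a natural sort with only the standard library is to split each name into digit and non-digit runs and compare the digit runs as integers. This is a sketch of that idea, not necessarily the exact answer from the link; `natural_key` is a made-up name.

```python
import re


def natural_key(text):
    """Split 'file10.txt' into ['file', 10, '.txt'] so numbers compare as numbers."""
    return [
        int(part) if part.isdigit() else part.lower()
        for part in re.split(r"(\d+)", text)
    ]


names = ["file10.txt", "file2.txt", "File1.txt"]
print(sorted(names, key=natural_key))  # ['File1.txt', 'file2.txt', 'file10.txt']
```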

VictualSquid
Feb 29, 2012

Gently enveloping the target with indiscriminate love.
I am reading my way through the testing goat book right now.

At some point he recommends automating deployment using:
code:
run(f'python3.6 -m venv virtualenv')
But that will obviously fail on a modern system. I can just change it to 3.7, but that feels janky.

When I googled it it looks like there is no option or even variant to automatically create a venv with a python version that isn't already installed through the package manager.

To me that implies that everybody in the python world expects that their software will be utterly abandoned by all users by the time 3.8 becomes the default for new linux installs. Is that really a reasonable assumption in the python world?

Dominoes
Sep 20, 2007

I think the assumption is you must have the desired Py version installed. On Linux or Mac, this is straightforward to do by building from source; using the included instructions, it will install to `/usr/local/bin` by default, under the alias `python3.8` etc. You can then run either `python3.8 -m venv virtualenv` or `/usr/local/bin/python3.8 -m venv virtualenv`. (You may be able to find unofficial package-manager versions, but these are distro-specific and may or may not be a pain to install.) On Windows, the best way may be to use the installers; I don't think you'll get the aliases, but you can call the interpreter by its full path, e.g. `"C:\Program Files\Python38\python.exe"`.

quote:

When I googled it it looks like there is no option or even variant to automatically create a venv with a python version that isn't already installed through the package manager.
With a tool I created recently, you'd run `pyflow switch 3.8`, and `pyflow install` to download and switch to the new version. I think Conda may do this as well.

VictualSquid posted:

To me that implies that everybody in the python world expects that their software will be utterly abandoned by all users by the time 3.8 becomes the default for new linux installs.
Chances are, there will never be a single default across Linux distros. Based on history, 3.8 won't be the most popular version for a long time. I think you'll find there are many different workflows around, and using an automated venv-creator tied to a version/alias is one of many techniques. Code built for Python 3.6 etc. should work on 3.8... but not vice versa.
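
As a rough sketch of one such workflow: instead of hard-coding `python3.6` in the deploy step, try a list of acceptable interpreters and use the first one found on PATH. The function names and the candidate list are assumptions for illustration, not something from the book.

```python
import shutil
import subprocess


def pick_interpreter(candidates=("python3.8", "python3.7", "python3.6"),
                     which=shutil.which):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


def create_venv(target="virtualenv"):
    """Create a virtualenv with whichever acceptable interpreter is installed."""
    python = pick_interpreter()
    if python is None:
        raise RuntimeError("no suitable Python interpreter found on PATH")
    subprocess.run([python, "-m", "venv", target], check=True)
```

This still requires one of the listed versions to be installed already; it only avoids tying the script to a single alias.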

Dominoes fucked around with this message at 16:38 on Oct 13, 2019

Dominoes
Sep 20, 2007

I haven't dug into the original chem papers, but it's surprising this wasn't picked up earlier due to surprising or inconsistent results... it makes me wonder how much publication bias played in.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Yeah.

It was actually caught because of different results in the same lab on computers running different OS's.

CarForumPoster
Jun 26, 2013

⚡POWER⚡

Dominoes posted:

I think the assumption is you must have the desired Py version installed.

Yea, auto installing the latest python version for an environment of an otherwise known config seems like begging for problems.

QuarkJets
Sep 8, 2008

Dominoes posted:

I haven't dug into the original chem papers, but it's surprising this wasn't picked up earlier due to surprising or inconsistent results... it makes me wonder how much publication bias played in.

It's a small miracle that it was discovered at all

The fundamental issue was that whoever wrote the script assumed that glob provides sorted results, so the returned file ordering was OS-dependent. Someone who's not actively looking for bugs in the script likely wouldn't notice this assumption, and in fact, hundreds of researchers using the script didn't notice the issue for over 5 years; how often does a chemist perform code review for widely-used software that they didn't write? I doubt it ever happens.

And then this error produces output that still looks entirely reasonable; in the example provided, one OS produced a value of 173.2 while another produced 172.7. A lot of the studies suffering from the problem might not even be sensitive to a calculation bias this small. And many of those studies actually published correct results, because they were running on an OS where glob coincidentally provided results in the order that the script was expecting

We like to imagine that scientists are rigorously reverifying each others' work all the time, but the fact of the matter is that there's little fame or profit to be found in doing that kind of essential work, so this reverification process doesn't happen nearly as often as it needs to. And in this reverification study, if the student had been running on an OS that provides the same glob ordering as the original study then the issue would have remained undiscovered.

QuarkJets
Sep 8, 2008

CarForumPoster posted:

Yea, auto installing the latest python version for an environment of an otherwise known config seems like begging for problems.

It's kind of a ghetto way to do version control but yeah, it makes sense. If you wrote software for Python 3.7, you don't want someone to assume that it'll continue working fine under Python 4.5 in the year 2057

The March Hare
Oct 15, 2006

Je rêve d'un
Wayne's World 3
Buglord
So I've worked with Python GUI development before and packaging is such a nightmare.

https://build-system.fman.io/docs/

I just found this which lets you make a pyqt5 app and build it with `fbs freeze` or create an installer with `fbs installer` and ran through their little tutorial and everything worked flawlessly. Does anyone have any experience going beyond the tutorial with this thing? It seems way too good to be true.

cinci zoo sniper
Mar 15, 2013




3.8 is here.

punished milkman
Dec 5, 2018

would have won

The f-string debug unpacking shorthand or whatever it is seems neat I guess
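
For anyone who hasn't seen it, the feature is the `=` specifier: on Python 3.8+, `f"{expr=}"` expands to the expression text followed by its repr. A tiny sketch with made-up variables:

```python
count = 3
name = "goon"

# The text before '=' is echoed, then the repr of the value.
debug_count = f"{count=}"
debug_name = f"{name=}"

print(debug_count)  # count=3
print(debug_name)   # name='goon'
```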

Hed
Mar 31, 2004

Fun Shoe
The walrus operator looks neat but I'll have to practice with it to see where it's good and more importantly where it might impair readability.

SurgicalOntologist
Jun 17, 2004

Yeah, looks possible to overdo but I'll probably end up using it.

Cockmaster
Feb 24, 2002
I've been playing around with the multiprocessing library, and it looks like for what I was hoping to do with it (run a function which needs to execute on the order of 10,000 times, taking a total of roughly 1-2 seconds with a single thread), the overhead would use more time than I'd save.

Are there any other solutions for parallel processing in Python under Windows 10, ones that don't add major overhead? This function is probably too complex to be a good candidate for CUDA.

QuarkJets
Sep 8, 2008

Is that 1-2 seconds overall, or 1-2 seconds per function call? If the former, you can probably still get a small performance increase but it may not be noticeable. If the latter, multiprocessing is extremely well-suited to your problem

But... you should use concurrent.futures instead. It's basically a high-level wrapper for multiprocessing that was introduced in Python 3.2. It's multiprocessing (and also multithreading), but even easier to use.

If you're mostly dealing with primitives and arrays, an even better and way-faster option is to use Numba to compile your function. Numba comes with a bunch of parallelization features, or you can just turn off the GIL and use your own multithreading with concurrent.futures if that's what you prefer.
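
A minimal sketch of the concurrent.futures approach, assuming a cheap function called about 10,000 times. `work` is a stand-in for the real function, and `run_all` and the `executor_factory` parameter are names invented here. Passing a large `chunksize` to `map` batches the calls so each worker process receives many items per round trip, which is the usual way to cut the per-call overhead.

```python
from concurrent.futures import ProcessPoolExecutor


def work(x):
    """Stand-in for the real function; must be defined at module level to be picklable."""
    return x * x


def run_all(values, executor_factory=ProcessPoolExecutor, chunksize=1000):
    """Map work() over values using a pool, sending items to workers in batches."""
    with executor_factory() as pool:
        return list(pool.map(work, values, chunksize=chunksize))
```

On Windows the call has to sit under an `if __name__ == "__main__":` guard, because worker processes are spawned by re-importing the main module. Whether this beats a plain loop for only 1-2 seconds of total work is something to measure rather than assume.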

Cockmaster
Feb 24, 2002

QuarkJets posted:

Is that 1-2 seconds overall, or 1-2 seconds per function call? If the former, you can probably still get a small performance increase but it may not be noticeable. If the latter, multiprocessing is extremely well-suited to your problem

But... you should use concurrent.futures instead. It's basically a high-level wrapper for multiprocessing that was introduced in Python 3.2. It's multiprocessing (and also multithreading), but even easier to use.

If you're mostly dealing with primitives and arrays, an even better and way-faster option is to use Numba to compile your function. Numba comes with a bunch of parallelization features, or you can just turn off the GIL and use your own multithreading with concurrent.futures if that's what you prefer.

It's 1-2 seconds overall. If concurrent.futures is based on the multiprocessing library, wouldn't it have the same problem with overhead as using multiprocessing by itself?

It looks like Numba is my best option here (short of rewriting the function with Cython). Thank you.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Found a list of some use cases for the new walrus operator. I'm sure going forward we're all going to hear a million times "what's it good for?".

https://github.com/vlevieux/Walrus-Operator-Use-Cases

QuarkJets
Sep 8, 2008

Cockmaster posted:

It's 1-2 seconds overall. If concurrent.futures is based on the multiprocessing library, wouldn't it have the same problem with overhead as using multiprocessing by itself?

It looks like Numba is my best option here (short of rewriting the function with Cython). Thank you.

Yes, that's right. I'd advise using concurrent.futures if your problem were 1-2 seconds per function call * 10000 iterations, but it probably won't be good when each function call takes a fraction of a millisecond (1-2 seconds spread over 10000 calls).

Also yeah, Numba is great and I prefer it over Cython, both for ease of use and for performance. In fact if you can compile the function in nopython mode then it may run so fast that you don't even need to consider parallelism

Private Speech
Mar 30, 2011

I HAVE EVEN MORE WORTHLESS BEANIE BABIES IN MY COLLECTION THAN I HAVE WORTHLESS POSTS IN THE BEANIE BABY THREAD YET I STILL HAVE THE TEMERITY TO CRITICIZE OTHERS' COLLECTIONS

IF YOU SEE ME TALKING ABOUT BEANIE BABIES, PLEASE TELL ME TO

EAT. SHIT.


Thermopyle posted:

Found a list of some use cases for the new walrus operator. I'm sure going forward we're all going to hear a million times "what's it good for?".

https://github.com/vlevieux/Walrus-Operator-Use-Cases

The list comprehension stuff seems genuinely good, but the rest is a bit meh.
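
A sketch of the comprehension case, with a made-up `expensive` function: the walrus lets the filter and the output share one call instead of computing the value twice.

```python
def expensive(x):
    """Placeholder for a costly computation."""
    return x * x - 10


values = [1, 2, 3, 4, 5]

# Without :=, this would need expensive(x) in both the condition and the output.
results = [y for x in values if (y := expensive(x)) > 0]

print(results)  # [6, 15]
```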

punished milkman
Dec 5, 2018

would have won

Private Speech posted:

The list comprehension stuff seems genuinely good, but the rest is a bit meh.

I'm sure that this is at least partially because I haven't really used the walrus operator yet and I'll need time to adjust, but it seems so unintuitive and makes the code way less readable for me.

Sad Panda
Sep 22, 2004

I'm a Sad Panda.
If I understand it properly, my main use will be shortening...

code:
invalid = True
while invalid:
    ...  # do stuff, eventually setting invalid = False
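
A sketch of what the shortened form might look like for an input-validation loop, which is one guess at the "do stuff" above. `ask_for_number` and its `read` parameter are hypothetical names; `read` defaults to the built-in `input`, so the loop can be driven by canned answers when testing.

```python
def ask_for_number(read=input):
    """Keep reading until the reply is all digits, then return it as an int."""
    while not (answer := read()).isdigit():
        print(f"{answer!r} is not a number, try again")
    return int(answer)


# Drive the loop with canned replies instead of real keyboard input.
replies = iter(["abc", "", "42"])
number = ask_for_number(lambda: next(replies))
print(number)  # 42
```

The flag variable disappears because the assignment and the test happen in the loop condition itself.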

NinpoEspiritoSanto
Oct 22, 2013




Sad Panda posted:

If I understand it properly, my main use will be shortening...

code:
invalid = True
while invalid:
    ...  # do stuff, eventually setting invalid = False

This is my current takeaway as well, though I rarely use regex for anything so I'm sure the match shortcut might come in handy for those that do. Some other examples at the link Thermopyle shared might come in handy.
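
The regex shortcut mentioned here looks roughly like this. The pattern and the `extract_version` helper are made up for illustration.

```python
import re


def extract_version(text):
    """Return (major, minor) for the first 'N.M' found in text, or None."""
    # Assign the match and test it in one expression.
    if (match := re.search(r"(\d+)\.(\d+)", text)) is not None:
        return int(match.group(1)), int(match.group(2))
    return None


print(extract_version("python3.8 -m venv virtualenv"))  # (3, 8)
print(extract_version("no version here"))               # None
```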

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

punished milkman posted:

I'm sure that this is at least partially because I haven't really used the walrus operator yet and I'll need time to adjust, but it seems so unintuitive and makes the code way less readable for me.

I've used it in several languages and it's still unintuitive and hard to read.

I think it's probably going to be over-used.

punished milkman
Dec 5, 2018

would have won
There was some discussion a few pages back about the Obey the Testing Goat book for TDD with Python/Django. Just wanted to chime in and say I bought it because of whoever mentioned it and can vouch that it rules. If you're like me in that you've been doing lots of work with Django but have been neglecting doing adequate testing because it seems hard, overwhelming and maybe a waste of time, you should probably read this book.

Norton the First
Dec 4, 2018

by Fluffdaddy
Is this an OK place to ask for help with something?

I've started working with pandas dataframes/pivot tables at my job. I have a pivot table that looks like this:



How do I add subtotals for elements of the middle level, e.g., a subtotal listing under "B" that sums over elements 2-14? I've found solutions online, but I haven't fully understood the reasoning, and trying to apply the solutions blindly has given me different kinds of unsatisfactory results.

SurgicalOntologist
Jun 17, 2004

code:
df.groupby('Middle Level').sum()

One thing to get used to, if you're coming from Excel, is that you wouldn't put the subtotals "under" the raw data. I mean, you could probably do it, with the above line then assigning new rows, but it would be awkward. Better to use pandas to do the calculations, then if you need a pretty output at some point you would output to html or something.
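
If the Excel-style layout really is required, one rough way to interleave subtotal rows is to loop over the groups and append a summary row after each. This is only a sketch: the original table isn't shown, so the column names (`Top`, `Middle`, `Item`, `Value`) and the data are invented.

```python
import pandas as pd

df = pd.DataFrame({
    "Top": ["X", "X", "X", "X"],
    "Middle": ["A", "A", "B", "B"],
    "Item": [1, 2, 3, 4],
    "Value": [10, 20, 30, 40],
})

pieces = []
for (top, middle), group in df.groupby(["Top", "Middle"], sort=False):
    pieces.append(group)
    # One extra row per group holding that group's total.
    pieces.append(pd.DataFrame({
        "Top": [top],
        "Middle": [middle],
        "Item": ["Subtotal"],
        "Value": [group["Value"].sum()],
    }))

report = pd.concat(pieces, ignore_index=True)
subtotals = report.loc[report["Item"] == "Subtotal", "Value"].tolist()
print(report)
```

Mixing labels like "Subtotal" into a data column makes the frame awkward to compute on afterwards, so this is best kept as a final presentation step.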

Norton the First
Dec 4, 2018

by Fluffdaddy

SurgicalOntologist posted:

code:
df.groupby('Middle Level').sum()
One thing to get used to, if you're coming from Excel, is that you wouldn't put the subtotals "under" the raw data. I mean, you could probably do it, with the above line then assigning new rows, but it would be awkward. Better to use pandas to do the calculations, then if you need a pretty output at some point you would output to html or something.

Thanks for the reply. I suppose that (the prettiness) is the real question. For context, in this unit people have been taking raw data reports they get weekly, pasting them into a workbook, and constructing Excel pivot tables by hand for a whole host of tasks. The goal is to give them the same output they're used to programmatically.

Tayter Swift
Nov 18, 2002

Pillbug
Multilevel indices destroy my brain and I hates them. I like to convert stuff to tall format and operate that way, then pivot back to whatever form the output needs to be in. (I think that's called Tidy Data? Dunno)

Business
Feb 6, 2007

KICK BAMA KICK posted:

Reread some code I wrote before finishing Obey the Testing Goat (again, strong recommend, not sure anything has ever helped me as much as that) and realizing I had actually rolled my own extremely stupid version of unittest.mock to fake an external API, return some fake data, capture the arguments used, the whole nine yards.

Kinda addicting going back through my code and mocking out every last thing to create proper unit tests though I definitely see the concerns raised about mocks tying you to a particular implementation, sometimes to the point of tests almost looking like tautologous restatements of the code being tested.
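
For anyone following along, a minimal sketch of what unittest.mock gives you for the case described in the quote: faking an external API client, returning canned data, and capturing the arguments. `fetch_username` and the `/users/...` path are hypothetical, not from the book or anyone's project.

```python
from unittest import mock


def fetch_username(client, user_id):
    """Code under test: looks a user up through some external API client."""
    response = client.get(f"/users/{user_id}")
    return response["name"]


# The fake stands in for the real client and returns canned data.
fake_client = mock.Mock()
fake_client.get.return_value = {"name": "goon"}

result = fetch_username(fake_client, 42)

# The mock recorded how it was called, so the arguments can be checked.
fake_client.get.assert_called_once_with("/users/42")
print(result)  # goon
```

The last assertion is also where the coupling concern shows up: it pins the test to the exact call the implementation makes.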

jumping on the testing train myself now! any advice from windows vets on getting geckodriver/selenium working there? It seems annoying but I haven't messed with it too much yet

Business fucked around with this message at 15:39 on Oct 23, 2019

KICK BAMA KICK
Mar 2, 2009

Business posted:

jumping on the testing train myself now! any advice from windows vets on getting geckodriver/selenium working there? It seems annoying but I haven't messed with it too much yet
Other than the occasional version mismatch after Firefox auto-updates I didn't have any problem with geckodriver as long as I stuck the executable in the root of my project.

punished milkman
Dec 5, 2018

would have won

Business posted:

jumping on the testing train myself now! any advice from windows vets on getting geckodriver/selenium working there? It seems annoying but I haven't messed with it too much yet

I haven't used it on Windows but maybe check out webdriver_manager and see if it works for you (https://github.com/SergeyPirogov/webdriver_manager). Installs via pip and automatically downloads the latest version of the needed driver for whatever browser you're working with.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Business posted:

jumping on the testing train myself now! any advice from windows vets on getting geckodriver/selenium working there? It seems annoying but I haven't messed with it too much yet

You just download geckodriver and use it. Nothing special about windows.


Business
Feb 6, 2007

Thanks yeah I was overthinking it. Messed something up the first try and based on google got the impression it was more complicated than it actually was
