Python

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Python

Furism: Feb 21, 2006; Live long and headbang

Automate the Boring Stuff is nice but a little too simple. I'd like to know more about the proper coding conventions in the Python world. What would be the next book?

# ? Dec 15, 2018 18:36

bob dobbs is dead: Oct 8, 2017; I love peeps; Nap Ghost

Furism posted:

Automate the Boring Stuff is nice but a little too simple. I'd like to know more about the proper coding conventions in the Python world. What would be the next book?

"effective python"
"fluent python"

they're ostensibly advanced but you can just rtfm when you get a bit lost

# ? Dec 15, 2018 18:41

KICK BAMA KICK: Mar 2, 2009

KICK BAMA KICK posted:

e: Oh, query the database in the main thread and submit each row to a concurrent.futures.ProcessPoolExecutor? Then iterate through concurrent.futures.as_completed(those_futures) to get the max. That more on the right track?

ee: Yep, , huge thanks

So this didn't actually work -- probably obvious to everyone but me that I'm still loading all the data into memory as I submit those Futures. I got away with it on my first test (my database is like 85% of the size of the RAM on the target machine) but subsequent runs soft-locked the machine, mashing Ctrl-C might get control back a minute or two later. Finally sorted it out though -- realized I was dumb for using a database; each row was just a blob serializing an ndarray and a string identifying the thing that data's about, so I just dumped those into .npy files with the filename identifying them. Now I just pool.map(do_thing, Path('/data/those_files/').glob('*.npy')) and np.load the file there. Super simple code, slightly faster, uses maybe 10% of the memory it did before. This whips rear end!
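
The file-per-array approach described above can be sketched roughly as follows. This is only an illustration under assumptions: `score_file` is a hypothetical stand-in for `do_thing` (the post does not show it), "take the max" is borrowed from the earlier edit, and `concurrent.futures.ProcessPoolExecutor` is assumed to be the pool in question.

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

import numpy as np


def score_file(path):
    """Load one array and reduce it to a small result (hypothetical stand-in for do_thing)."""
    data = np.load(path)
    # The file name, minus the .npy suffix, identifies what the data is about.
    return Path(path).stem, float(data.max())


def best_in_directory(directory, workers=None):
    """Return the (name, score) pair with the highest score, or None if no files match."""
    paths = sorted(Path(directory).glob("*.npy"))
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # Only the paths are sent to the workers; each worker loads its own file.
        results = pool.map(score_file, paths)
        return max(results, key=lambda pair: pair[1], default=None)


if __name__ == "__main__":
    print(best_in_directory("/data/those_files/"))
```

Because each task carries just a path and returns a small tuple, the parent process never holds more than the list of file names and the per-file results.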

# ? Dec 17, 2018 04:16

CarForumPoster: Jun 26, 2013; ⚡POWER⚡

How do I do something to a range of columns in a df?

I have 10 columns named "DispoClass_#" with the # being 1-10. I want to set the ordinality of the categorical values they contain with .cat.set_categories

How do I select all 10? I need to do this with other things structured as "Name_#" so just writing them out isn't that desirable.

Something like this, but, ya know, works...

code:

df_raw[["DispoClass_1":"DispoClass_10"]].cat.set_categories(['High', 'Medium', 'Low'], ordered=True, inplace=True)

# ? Dec 17, 2018 16:16

Furism: Feb 21, 2006; Live long and headbang

So apparently I'm a dumbass because I can't seem to instantiate an object for a class I created. I read the documentation and I don't understand what I'm doing wrong

This is my class:

code:

class CfClient:
    __bearerToken = ""

    def __init__(self, userName, userPassword, controllerAddress):
        self.userName = userName
        self.userPassword = userPassword
        self.controllerAddress = controllerAddress

    def login(self):
        ## Do Login
        self.__bearerToken = 12345
        return True

And my main file:

code:

import lib.models.CfClient

cfClient = CfClient("Soandso",
                    "somePassword", "https://192.168.1.10")

When I do this, I'm getting an error saying "Undefined variable 'CfClient'"

I can't seem to understand the difference between my code and the documentation sample, except that my file is under two subdirectories (into which I dropped __init__.py files). What am I doing wrong?

# ? Dec 17, 2018 18:32

TheFluff: Dec 13, 2006; FRIENDS, LISTEN TO ME
I AM A SEAGULL
OF WEALTH AND TASTE

~~Either import with an alias:~~

Python code:

import lib.models.CfClient as CfClient # doesn't work, I'm dumb

Or refer to it with the full path:

Python code:

import lib.models

client = lib.models.CfClient("foo", "bar")

edit: no wait I'm dumb, that first one doesn't work. Just do

Python code:

from lib.models import CfClient

instead if that's what you want.

edit edit: The above assumes that "class CfClient: ..." is in a file called models.py in a directory called lib. If you instead have lib/models/CfClient.py which contains the class CfClient, then you need to tell Python about both the file and the class, like so:

Python code:

from lib.models.CfClient import CfClient

client = CfClient("foo", "bar")
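
To make the module-versus-class distinction concrete, here is a small self-contained sketch. It rebuilds the lib/models/CfClient.py layout in a temporary directory (the temporary directory and the `cf_module` name are assumptions for illustration only) and shows what each import form binds.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Recreate the layout from the thread in a temporary directory:
#   lib/__init__.py
#   lib/models/__init__.py
#   lib/models/CfClient.py   (defines class CfClient)
root = Path(tempfile.mkdtemp())
models_dir = root / "lib" / "models"
models_dir.mkdir(parents=True)
(root / "lib" / "__init__.py").write_text("")
(models_dir / "__init__.py").write_text("")
(models_dir / "CfClient.py").write_text(
    "class CfClient:\n"
    "    def __init__(self, userName, userPassword, controllerAddress):\n"
    "        self.userName = userName\n"
    "        self.userPassword = userPassword\n"
    "        self.controllerAddress = controllerAddress\n"
)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# A plain import binds only the top-level package name "lib";
# the bare name CfClient stays undefined, which is the error in the question.
import lib.models.CfClient
cf_module = lib.models.CfClient  # this is the module, i.e. the file CfClient.py

# Importing the class out of the module makes the short name usable.
from lib.models.CfClient import CfClient

client = CfClient("Soandso", "somePassword", "https://192.168.1.10")
print(type(cf_module).__name__, type(client).__name__)
```

With the plain import, the class would have to be spelled `lib.models.CfClient.CfClient(...)`: package, subpackage, module, then class.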

TheFluff fucked around with this message at 19:10 on Dec 17, 2018

# ? Dec 17, 2018 18:47

cinci zoo sniper: Mar 15, 2013

CarForumPoster posted:

How do I do something to a range of columns in a df?

I have 10 columns named "DispoClass_#" with the # being 1-10. I want to set the ordinality of the categorical values they contain with .cat.set_categories

How do I select all 10? I need to do this with other things structured as "Name_#" so just writing them out isn't that desirable.

Something like this, but, ya know, works...
code:
df_raw[["DispoClass_1":"DispoClass_10"]].cat.set_categories(['High', 'Medium', 'Low'], ordered=True, inplace=True)

You should use df.filter() for that.

Python code:

import pandas as pd

test = pd.DataFrame()
test["hello"] = [1, 2, 3]
test["a_01"] = ["foo", "bar", "baz"]
test["a_02"] = ["foo", "bar", "baz"]
test["a_03"] = ["foo", "bar", "baz"]

print(test)

target = test.filter(like='a_', axis=1)
test[target.columns] = target.apply(lambda x: x.str.capitalize())

print(test)
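
Applied to the original question, the same selection could look something like the sketch below. The frame contents and the `case_id` column are made up for illustration, and converting each column with an ordered `pd.CategoricalDtype` is a swapped-in alternative to `.cat.set_categories`, since the `.cat` accessor belongs to a single Series rather than to a DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for df_raw, with three DispoClass columns instead of ten.
df_raw = pd.DataFrame(
    {
        "case_id": [101, 102, 103],
        "DispoClass_1": ["High", "Low", "Medium"],
        "DispoClass_2": ["Low", "Low", "High"],
        "DispoClass_3": ["Medium", "High", "High"],
    }
)

# Same category list and order as in the question.
dispo_dtype = pd.CategoricalDtype(categories=["High", "Medium", "Low"], ordered=True)

# Select the columns by name fragment, then convert them one at a time.
dispo_columns = df_raw.filter(like="DispoClass_", axis=1).columns
for column in dispo_columns:
    df_raw[column] = df_raw[column].astype(dispo_dtype)

print(df_raw.dtypes)
```

Swapping the `like=` fragment for another prefix covers the other "Name_#" groups without writing the columns out.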

# ? Dec 17, 2018 19:03

Dr Subterfuge: Aug 31, 2005; TIME TO ROC N' ROLL

What about df.loc? It's at least advertised to do slicing with strings.

# ? Dec 17, 2018 19:19

CarForumPoster: Jun 26, 2013; ⚡POWER⚡

cinci zoo sniper posted:

You should use df.filter() for that.

Python code:

import pandas as pd

test = pd.DataFrame()
test["hello"] = [1, 2, 3]
test["a_01"] = ["foo", "bar", "baz"]
test["a_02"] = ["foo", "bar", "baz"]
test["a_03"] = ["foo", "bar", "baz"]

print(test)

target = test.filter(like='a_', axis=1)
test[target.columns] = target.apply(lambda x: x.str.capitalize())

print(test)

Much appreciated. Also appreciate you helping me last week.

Also I just now found out about : https://regex101.com/

Holy crap is that helpful! I have the worst time with regexes and I basically end up finding someone on Stack Overflow who wants the same thing and copying the answer.

# ? Dec 17, 2018 19:25

Furism: Feb 21, 2006; Live long and headbang

TheFluff posted:

If you instead have lib/models/CfClient.py which contains the class CfClient, then you need to tell Python about both the file and the class, like so:
Python code:
from lib.models.CfClient import CfClient

client = CfClient("foo", "bar")

That was it, thanks!

# ? Dec 17, 2018 19:45

cinci zoo sniper: Mar 15, 2013

Dr Subterfuge posted:

What about df.loc? It's at least advertised to do slicing with strings.

You can, but I don't think there are many good reasons to do so.

What you could do instead is something like this:

Python code:

target = test.columns[test.columns.str.contains(pat='a_')]
test[target] = test[target].apply(lambda x: x.str.capitalize())

# ? Dec 17, 2018 19:58

Dr Subterfuge: Aug 31, 2005; TIME TO ROC N' ROLL

What makes df.loc so much less desirable?

# ? Dec 17, 2018 21:29

cinci zoo sniper: Mar 15, 2013

Dr Subterfuge posted:

What makes df.loc so much less desirable?

For regex/-like subsetting it's just a question of code clarity. Functionally nothing will change at lower levels, I think. Like, what would be your proposed .loc example here?

# ? Dec 17, 2018 21:46

Furism: Feb 21, 2006; Live long and headbang

So I'm trying to POST a file along with a Bearer Token against a REST API. I get an error from the API telling me that "Request was not successfully validated against the schema." I'm trying to figure out what I did wrong because I think my request is correctly crafted:

code:

def uploadFile(self, file):
    # Open in binary mode; the with block closes the file once the request is done.
    with open(file, "rb") as ofile:
        files = {'file': ofile}
        response = requests.post(
            self.controllerAddress + '/files?type=multipart',
            headers={'Authorization': 'Bearer {0}'.format(self.__bearerToken)},
            files=files,
            verify=False,
        )
    print(response)

Normally I'd fire up Wireshark and look at what's actually sent, but the API is over HTTPS and I don't have the server's private key to decrypt. I enabled the logging module but it doesn't seem to be able to show POST requests:

code:

DEBUG:urllib3.connectionpool:Starting new HTTPS connection (1): 10.75.231.30:443
DEBUG:urllib3.connectionpool:https://10.75.231.30:443 "POST /api/v2/files?type=multipart HTTP/1.1" 400 71
<Response [400]>

What are my options to find out what's wrong with my request? Note that "file" in the method's signature is just a string that contains the full, absolute path to the actual file.

Sorry for the newbie questions

Furism fucked around with this message at 22:10 on Dec 17, 2018

# ? Dec 17, 2018 22:05

necrotic: Aug 2, 2005; I owe my brother big time for this!

That is the server telling you it will not accept the request for not matching whatever schema it expects. Inspecting the traffic wouldn't help you there. It does look like you're using requests correctly. Does the API also expect a body in the request instead of only the file part?

# ? Dec 17, 2018 22:21

SurgicalOntologist: Jun 17, 2004

CarForumPoster posted:

How do I do something to a range of columns in a df?

I have 10 columns named "DispoClass_#" with the # being 1-10. I want to set the ordinality of the categorical values they contain with .cat.set_categories

How do I select all 10? I need to do this with other things structured as "Name_#" so just writing them out isn't that desirable.

Something like this, but, ya know, works...
code:
df_raw[["DispoClass_1":"DispoClass_10"]].cat.set_categories(['High', 'Medium', 'Low'], ordered=True, inplace=True)

In general if you find yourself wanting to do computation on the names of columns (e.g. .str.startswith), you are doing something wrong and should reorganize your data. In your case I would create a dataframe with only the "DispoClass" data, with columns 1-10. Then you don't have to do any column subsetting when you want to do something to the DispoClass data only. You can still coordinate the data with columns from another dataframe if they share the same row labels (index).
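
One way that reorganization might look, as a rough sketch: the frame contents, the `case_id` index, and the `owner` column are invented for illustration, and the regex-based split is just one possible way to do it once, up front.

```python
import pandas as pd

# Hypothetical raw frame; case_id serves as the shared row label.
df_raw = pd.DataFrame(
    {
        "case_id": [101, 102, 103],
        "owner": ["ann", "bob", "cat"],
        "DispoClass_1": ["High", "Low", "Medium"],
        "DispoClass_2": ["Low", "Low", "High"],
        "DispoClass_3": ["Medium", "High", "High"],
    }
).set_index("case_id")

# Pull the numbered columns into their own frame and label them 1..n.
dispo_columns = df_raw.filter(regex=r"^DispoClass_\d+$").columns
dispo = df_raw[dispo_columns].rename(columns=lambda name: int(name.rsplit("_", 1)[1]))

# Everything else stays behind, on the same index.
rest = df_raw.drop(columns=dispo_columns)

# Whole-frame operations on the DispoClass data now need no column subsetting.
dispo = dispo.apply(lambda column: column.str.upper())

# Rows still line up through the shared index.
print(rest.loc[102, "owner"], dispo.loc[102, 1])
```

The column-name computation happens exactly once, at the split; after that every operation on `dispo` applies to the DispoClass data by construction.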

# ? Dec 17, 2018 22:40

Dr Subterfuge: Aug 31, 2005; TIME TO ROC N' ROLL

cinci zoo sniper posted:

For regex/-like subsetting it's just a question of code clarity. Functionally nothing will change at lower levels, I think. Like, what would be your proposed .loc example here?

This seems like it would work?

Python code:

test.loc[:, 'a_01':'a_03'].apply(do_stuff)

I'm pretty sure .loc returns a view? If it doesn't I don't understand anything.

# ? Dec 17, 2018 22:46

SurgicalOntologist: Jun 17, 2004

Indexing without the loc is just a shortcut that sometimes works. cinci zoo sniper's suggestion could easily have been

Python code:

target = test.columns[test.columns.str.contains(pat='a_')]
test.loc[:, target] = test.loc[:, target].apply(lambda x: x.str.capitalize())

The distinction that matters here is boolean indexing (based on a computation on the column labels) vs. slice indexing. I think slice indexing is more clear in this case but also relies on column order to an extent that makes it too fragile IMO.

In any case, if you are naming your columns XX_1, XX_2, ..., XX_n then you have >2D data and should either split it up into multiple dataframes as I suggested or look into other data structures like xarray.

# ? Dec 17, 2018 22:52

cinci zoo sniper: Mar 15, 2013

^^ For a single column, list of columns, or slice of rows [] and .loc behave identically. For single rows, list of rows, slice of columns, or combined selection of rows and columns within one operation .loc is the only appropriate option, and please don't ask me why.

Dr Subterfuge posted:

This seems like it would work?
Python code:
test.loc[:, 'a_01':'a_03'].apply(do_stuff)
I'm pretty sure .loc returns a view? If it doesn't I don't understand anything.

Right, this would work but as SurgicalOntologist points out, it relies on column order (also see comment above about slicing columns). I don't think there's anything severely wrong with that, but I prefer to defensively avoid doing operations like that.
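
The column-order concern can be seen in a small sketch. It reuses the column names from the earlier example, but the ordering with "hello" placed between the a_ columns is an assumption made purely to show the effect.

```python
import pandas as pd

# Same columns as the earlier example, but "hello" now sits between the a_ columns.
test = pd.DataFrame(
    {
        "a_01": ["foo", "bar", "baz"],
        "hello": [1, 2, 3],
        "a_02": ["foo", "bar", "baz"],
        "a_03": ["foo", "bar", "baz"],
    }
)

# A label slice takes every column positioned from 'a_01' through 'a_03', inclusive.
by_slice = list(test.loc[:, "a_01":"a_03"].columns)

# Name matching ignores position.
by_name = list(test.filter(like="a_", axis=1).columns)

print(by_slice)
print(by_name)
```

Here the slice also picks up "hello", so a string method applied to the result would hit an integer column, while the name-based selection returns only the three a_ columns.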

# ? Dec 17, 2018 23:28

Dominoes: Sep 20, 2007

Does anyone else dislike the if __name__ == __main__ syntax? I still have to look it up every time.

Dominoes fucked around with this message at 02:46 on Dec 18, 2018

# ? Dec 18, 2018 02:43

bob dobbs is dead: Oct 8, 2017; I love peeps; Nap Ghost

Dominoes posted:

Does anyone else dislike the if __name__ == __main__ syntax? I still have to look it up every time.

you forgot quote marks

at this point it's muscle memory to me

# ? Dec 18, 2018 02:59

Cute n Popular: Oct 12, 2012

I was in the same position not too long ago and I can't recommend Effective Python enough after a goon recommended it in this thread. I did find that I occasionally needed external resources to supplement the material but that's more on me being a novice than anything else.

I also picked up a lot by doing advent of code and going through various solutions that were posted online.

Cute n Popular fucked around with this message at 08:23 on Dec 18, 2018

# ? Dec 18, 2018 08:20

cinci zoo sniper: Mar 15, 2013

Dominoes posted:

Does anyone else dislike the if __name__ == __main__ syntax? I still have to look it up every time.

Sign me up on the "this feels awkward" list.

# ? Dec 18, 2018 10:09

QuarkJets: Sep 8, 2008

I dislike that syntax as well even if I'm used to it now

It feels like it's a hack rather than a feature

# ? Dec 18, 2018 10:54

necrotic: Aug 2, 2005; I owe my brother big time for this!

What would you propose it look like instead?

# ? Dec 18, 2018 15:42

Furism: Feb 21, 2006; Live long and headbang

necrotic posted:

That is the server telling you it will not accept the request for not matching whatever schema it expects. Inspecting the traffic wouldn't help you there. It does look like you're using requests correctly. Does the API also expect a body in the request instead of only the file part?

Yes that was my understanding of the problem too. The API documentation only says this:

I wanted to inspect the traffic to make sure my code did what I thought it did (being new at Python).
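
One way to see what requests would put on the wire without decrypting any traffic is to build the request and prepare it without sending it. The sketch below is an illustration, not a diagnosis: the address follows the earlier example, while the token, file name, and payload bytes are placeholders.

```python
import requests

# Placeholder values: the address echoes the earlier example, the rest is made up.
controller_address = "https://192.168.1.10"
bearer_token = "12345"
payload = b"example file contents"

request = requests.Request(
    "POST",
    controller_address + "/files?type=multipart",
    headers={"Authorization": "Bearer {0}".format(bearer_token)},
    files={"file": ("example.bin", payload)},
)
prepared = request.prepare()  # builds headers and body without opening a connection

print(prepared.method, prepared.url)
for name, value in prepared.headers.items():
    print("{0}: {1}".format(name, value))
print(prepared.body.decode("utf-8", errors="replace"))
```

The printed body shows the multipart boundary, the part headers, and the file bytes, which can then be compared against whatever the API documentation asks for.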

# ? Dec 18, 2018 16:37

necrotic: Aug 2, 2005; I owe my brother big time for this!

Looks like it is expecting a specific mime type on the file payload. I don't know how to do that with requests off the top of my head, but look around for customizing the mime type on the file in the request.

# ? Dec 18, 2018 16:43

Furism: Feb 21, 2006; Live long and headbang

Tried that (sorry, that wasn't in my original code) this way:

code:

headers={'Authorization': 'Bearer {0}'.format(self.__bearerToken),
                     'Content-Type': 'multipart/form-data'},

Still no luck. I'll look around.

# ? Dec 18, 2018 16:46

necrotic: Aug 2, 2005; I owe my brother big time for this!

No, the file itself is attached with a different content type. It's multipart, like email.

here https://stackoverflow.com/questions/15746558/how-to-send-a-multipart-related-with-requests-in-python

# ? Dec 18, 2018 16:48

baka kaba: Jul 19, 2003; PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

Can't you just do it like this?
http://docs.python-requests.org/en/master/user/quickstart/#post-a-multipart-encoded-file
(second example, specifying the content-type explicitly)

also are you doing the auth right? It looks like it's complaining that the auth header is bad in some way
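
A minimal sketch of that second quickstart form, with the content type given per file part. The URL pattern and token are placeholders, and "application/octet-stream" is only an assumed type, since the thread never states which one the API expects. For contrast, the sketch also prepares a request with the request-level Content-Type set by hand.

```python
import requests

url = "https://192.168.1.10/files?type=multipart"  # address pattern from the thread
auth = {"Authorization": "Bearer 12345"}  # made-up token
payload = b"example file contents"

# (filename, data, content type): the third item labels this one part.
# "application/octet-stream" is a placeholder; the API's expected type is unknown here.
files = {"file": ("example.bin", payload, "application/octet-stream")}
typed = requests.Request("POST", url, headers=auth, files=files).prepare()

# Setting the request-level Content-Type by hand stops requests from filling in
# its own value, so the boundary parameter goes missing.
manual_headers = dict(auth, **{"Content-Type": "multipart/form-data"})
manual = requests.Request("POST", url, headers=manual_headers, files=files).prepare()

print(typed.headers["Content-Type"])
print(manual.headers["Content-Type"])
```

The first header carries a generated boundary and the part itself is tagged with the chosen type; the second is the bare string that was passed in, which a server generally cannot use to split the body into parts.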

# ? Dec 18, 2018 17:07

cinci zoo sniper: Mar 15, 2013

necrotic posted:

What would you propose it look like instead?

Literally anything that isn't 26 "out-of-context" characters?

# ? Dec 18, 2018 17:42

Dominoes: Sep 20, 2007

necrotic posted:

What would you propose it look like instead?

def main():, where main is a built-in.

Dominoes fucked around with this message at 20:11 on Dec 18, 2018

# ? Dec 18, 2018 20:08

cinci zoo sniper: Mar 15, 2013

Dominoes posted:

def main():, where main is a built-in.

I was thinking about some "on import:" construct, but this would probably do better. Either way I agree with this approach; my main philosophical argument is against the comparison of an ostensible implicit.

# ? Dec 18, 2018 20:20

bob dobbs is dead: Oct 8, 2017; I love peeps; Nap Ghost

Dominoes posted:

def main():, where main is a built-in.

save it for python 4, i guess

(lots of peeps have a main() in python scripts already, so if you make

code:

if __name__ == "__main__":
    do_some_shit()
    main()

and then make the main() the entrance bit, shenanigans)

# ? Dec 18, 2018 20:20

Nippashish: Nov 2, 2005; Let me see you dance!

You could also just commit to not import your scripts as modules and then you don't need any double underscore shenanigans.

# ? Dec 18, 2018 20:22

cinci zoo sniper: Mar 15, 2013

bob dobbs is dead posted:

save it for python 4, i guess

(lots of peeps have a main() in python scripts already, so if you make
code:
if __name__ == "__main__":
    do_some_shit()
    main()
and then make the main() the entrance bit, shenanigans)

I guess they could then add a new reserved keyword or the like, e.g. what I thought of, to preserve legacy code. Or probably some actually competent solution, I'm not a compsci person by a large and wide margin.

# ? Dec 18, 2018 20:23

QuarkJets: Sep 8, 2008

Nippashish posted:

You could also just commit to not import your scripts as modules and then you don't need any double underscore shenanigans.

It's cool and good to have a file that is both importable and runnable as a script
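
For reference, a file that works both ways might be laid out like this minimal sketch; `greet` and the greeting text are hypothetical, chosen only to have something to import.

```python
def greet(name):
    """Reusable piece that other modules can import."""
    return "hello, {0}".format(name)


def main():
    """Script entry point; runs automatically only when the file is executed directly."""
    print(greet("world"))
    return 0


# __name__ is "__main__" when the file is run as a script,
# and the module's own name when it is imported.
if __name__ == "__main__":
    main()
```

Importing the file gives access to `greet` without printing anything, while running it directly calls `main()`.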

# ? Dec 18, 2018 20:37

necrotic: Aug 2, 2005; I owe my brother big time for this!

baka kaba posted:

Can't you just do it like this?
http://docs.python-requests.org/en/master/user/quickstart/#post-a-multipart-encoded-file
(second example, specifying the content-type explicitly)

Yeah that looks way better. I just did a phone search :effort:

# ? Dec 18, 2018 20:50

Nippashish: Nov 2, 2005; Let me see you dance!

QuarkJets posted:

It's cool and good to have a file that is both importable and runnable as a script

I guess what I'm trying to say is that having a slightly weird syntax to do a slightly weird thing is one of the less objectionable features of python imo.

# ? Dec 18, 2018 21:18

QuarkJets: Sep 8, 2008

Lots of languages treat a function named main() as, well, main. It wouldn't have been unusual for Python to do the same thing. It's not really objectionable, just a weird design choice to break from convention and make everyone check the value of the __name__ variable.

# ? Dec 18, 2018 21:24
