Python information and short questions megathread.

Sweeper: Nov 29, 2007; The Joe Buck of Posting; Dinosaur Gum

is there anyone working on eliminating the GIL still?

# ? Sep 17, 2010 11:50

king_kilr: May 25, 2007

Sweeper posted:

is there anyone working on eliminating the GIL still?

At present, not as far as I know. If you'd like to know why it's an incredibly hard problem press 1, to speak with an operator press 2, to rant and rave about how python-core are idiots press /dev/null.

# ? Sep 17, 2010 15:02

Lurchington: Jan 2, 2003; Forums Dragoon

3.2 had a new GIL implementation: http://docs.python.org/dev/py3k/whatsnew/3.2.html#multi-threading

It's supposed to help with threads thrashing
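
A minimal sketch, assuming Python 3.2 or later: the rewritten GIL replaces the old "check every N bytecode instructions" scheme with a time-based switch interval, which can be read and tuned through sys.getswitchinterval() and sys.setswitchinterval(). The helper name with_switch_interval is made up for illustration.

```python
import sys


def with_switch_interval(seconds, func):
    """Run func() with a temporary GIL switch interval, then restore the old one."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return func()
    finally:
        # Always put the interpreter-wide setting back.
        sys.setswitchinterval(previous)


# Example: read back the interval while a 10 ms setting is in effect.
observed = with_switch_interval(0.01, sys.getswitchinterval)
```

A larger interval means fewer forced thread switches; whether that helps or hurts depends on the workload, so treat any particular value as something to measure rather than a recommendation.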

# ? Sep 17, 2010 16:01

Sweeper: Nov 29, 2007; The Joe Buck of Posting; Dinosaur Gum

king_kilr posted:

At present, not as far as I know. If you'd like to know why it's an incredibly hard problem press 1, to speak with an operator press 2, to rant and rave about how python-core are idiots press /dev/null.

I realize it is a hard problem, but how do other languages get around the issue?

# ? Sep 17, 2010 16:46

tef: May 30, 2004; -> some l-system crap ->

the gil is a consequence of the threading model of python. other languages have different models, and different problems.

in java, you have to have explicit locking, which causes its own problems, and in erlang you can't share memory at all, which has different tradeoffs.

ultimately there are two problems: the threading model in python demands automatic locking, and either you have fine-grained locks (which destroy the single-thread case) or one big lock (which destroys the multi-thread case somewhat).

the other problem is that the original implementation of the one big lock didn't cooperate with the underlying os scheduler, and could thrash.

basically, removing the gil makes other things worse, and there is something to be said about refcounting here too.

how do you get around the gil in python? stop using threads. threads are terrible. use multiprocessing.

go back to work people, nothing to see here.
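
A rough sketch of the "use multiprocessing" advice for CPU-bound work, written for Python 3. Each worker is a separate process with its own interpreter and its own GIL, so the workers do not contend for one lock. The names count_primes and parallel_counts are hypothetical, and the prime counter is only a stand-in for real work.

```python
from multiprocessing import Pool


def count_primes(limit):
    """CPU-bound toy task: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def parallel_counts(limits, workers=2):
    """Run count_primes for each limit in a pool of worker processes."""
    with Pool(processes=workers) as pool:
        # map() pickles each argument, ships it to a worker, and collects results in order.
        return pool.map(count_primes, limits)
```

Call parallel_counts([10000, 20000]) from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely on platforms that start fresh interpreters instead of forking. Arguments and results are pickled between processes, which is the price paid for not sharing memory.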

# ? Sep 17, 2010 16:58

king_kilr: May 25, 2007

Sweeper posted:

I realize it is a hard problem, but how do other languages get around the issue?

Ruby (MRI): GIL
Perl: Native threading, but no shared state.
Lua: No native threading.
JVM (including Jython): Locking on everything, escape analysis in the JIT to remove some locks.
CLR (including IronPython): Same.
PHP: No native threads.

It's really a 2 out of 3 situation:

1) Good single threaded performance.
2) Backwards compatibility.
3) Free-threading.

Needing to lock all objects for all ops (particularly INCREF/DECREF) results in bad single-threaded performance (a 50% hit); switching to a better GC to alleviate these problems causes backwards-compatibility issues.
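
To make the INCREF/DECREF point concrete, here is a small CPython-specific sketch (Python 3) that watches an object's reference count move as names and containers pick it up and drop it. The function name refcount_deltas is made up; sys.getrefcount() is real, but its absolute values are an implementation detail, so only the differences are compared.

```python
import sys


def refcount_deltas():
    """Return how the refcount of one object changes, relative to a baseline."""
    obj = object()
    base = sys.getrefcount(obj)

    alias = obj                      # binding a second name: one INCREF
    after_alias = sys.getrefcount(obj)

    holder = [obj]                   # storing it in a list: another INCREF
    after_list = sys.getrefcount(obj)

    del alias                        # dropping the name: one DECREF
    holder.clear()                   # emptying the list: another DECREF
    after_release = sys.getrefcount(obj)

    return (after_alias - base, after_list - base, after_release - base)
```

Almost every operation on every object touches these counters, which is why guarding them with per-object locks or atomic operations is so costly for single-threaded code.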

# ? Sep 18, 2010 00:13

tripwire: Nov 19, 2004; ghost flow

king_kilr posted:

Ruby (MRI): GIL
Perl: Native threading, but no shared state.
Lua: No native threading.
JVM (including Jython): Locking on everything, escape analysis in the JIT to remove some locks.
CLR (including IronPython): Same.
PHP: No native threads.

It's really a 2 out of 3 situation:

1) Good single threaded performance.
2) Backwards compatibility.
3) Free-threading.

Needing to lock all objects for all ops (particularly INCREF/DECREF) results in bad single-threaded performance (a 50% hit); switching to a better GC to alleviate these problems causes backwards-compatibility issues.

Is there any reason why python 3 didn't pick 1 and 3 in that list? (Did they, and I'm just not aware?)

# ? Sep 18, 2010 00:15

king_kilr: May 25, 2007

tripwire posted:

Is there any reason why python 3 didn't pick 1 and 3 in that list? (Did they, and I'm just not aware?)

No, the status of the GIL is unchanged in 3 (except for Antoine's work to improve its performance). I don't know why they made this decision (I became a Pythonista relatively recently), but I suppose it was because no one volunteered to do the work. Given py3k is already backwards incompatible it would have been the right time.

Also, it's still not really possible to keep 100% of the single-threaded speed: locking on objects is still going to be a hit, but locking on INCREF/DECREF just happens to be completely untenable.

# ? Sep 18, 2010 00:20

m0nk3yz: Mar 13, 2002; Behold the power of cheese!

tripwire posted:

Is there any reason why python 3 didn't pick 1 and 3 in that list? (Did they, and I'm just not aware?)

We picked 1: the initial patch(es) for free threading seriously harmed single-threaded performance, and since Python has historically not been used in heavily multithreaded environments, the choice was made. After that point, no one picked the work back up, and no one volunteered to take it on for Python 3.0.

I do think it will eventually be fixed; but it takes someone funding the work.

# ? Sep 18, 2010 01:17

king_kilr: May 25, 2007

If you don't care about backwards compatibility it's not such a hard problem (technically speaking), it's still a lot of work, but if you're a company who needs it I think it could be done in under 6 months of developer time. I suppose you could even compile time flag it up and make it backwards compatible, but that seems like a bad idea IMO.

# ? Sep 18, 2010 02:00

m0nk3yz: Mar 13, 2002; Behold the power of cheese!

king_kilr posted:

If you don't care about backwards compatibility it's not such a hard problem (technically speaking), it's still a lot of work, but if you're a company who needs it I think it could be done in under 6 months of developer time. I suppose you could even compile time flag it up and make it backwards compatible, but that seems like a bad idea IMO.

Yup, when I make my millions, I'm going to pay someone to JFFI (Just loving Fix It)

# ? Sep 18, 2010 02:04

tef: May 30, 2004; -> some l-system crap ->

Also, replacing the locking and threading model requires an extensive rewrite of the c-extensions :woop:

# ? Sep 18, 2010 13:17

king_kilr: May 25, 2007

tef posted:

Also, replacing the locking and threading model requires an extensive rewrite of the c-extensions

Yeah, that's why I said, "if you ignore backwards compatibility"

# ? Sep 18, 2010 19:04

Clanpot Shake: Aug 10, 2006; shake shake!

I wrote a short python script that will scan an XML file and count instances of an attribute called 'id'. Every value in that attribute must be unique, which doesn't always happen (which is why I wrote this script). The XML file is composed of section elements which can nest, and I'm storing the id of the section that the duplicates appear on.

I'm storing the ids, their count, line numbers, and containing section id in a dictionary:

pre:

ids = {'id':{'count':c,
             'lines':[line1,line2...],
             'sections':[sec1,sec2]}}

When printing out the results of this, I use this:

pre:

dups = False
for id in ids:
    if ids[id]['count'] > 1:
        if not dups:
            print "Found duplicate IDs in file!"
        dups = True
        print id," occurs",ids[id]['count'],"times:\n\tLINE\t\tID"
        for i in range(len(ids[id]['lines'])):
            print "\t-%6s\t\t%6s"%(ids[id]['lines'][i],ids[id]['sections'][i])
if not dups:
    print 'File contained no duplicate IDs'

This is a lot sloppier than I would like, particularly the second for loop. Is there some other way to write this print loop?

# ? Sep 20, 2010 18:03

king_kilr: May 25, 2007

pre:

dups = False
for id, data in ids.iteritems():
    if data['count'] > 1:
        if not dups:
            print "Found duplicate IDs in file!"
        dups = True
        print id, " occurs %s times:\n\tLINE\t\tID" % data['count']
        for line, section in zip(data['lines'], data['sections']):
            print "\t-%6s\t\t%6s" % (line, section)
if not dups:
    print 'File contained no duplicate IDs'

That should do it.
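
The code in this thread is Python 2. A hedged Python 3 rendering of the same loop is sketched below: iteritems() becomes items(), and the report is returned as a list of lines instead of being printed so it is easy to check. The function name duplicate_report is an assumption, and the spacing of the "occurs" line differs slightly from the print statement above.

```python
def duplicate_report(ids):
    """Build the duplicate-ID report for a dict shaped like the one in the question."""
    lines = []
    for id_, data in ids.items():
        if data["count"] > 1:
            if not lines:
                # Emit the banner once, before the first duplicate.
                lines.append("Found duplicate IDs in file!")
            lines.append("%s occurs %s times:\n\tLINE\t\tID" % (id_, data["count"]))
            # zip() walks the two parallel lists together, so no index is needed.
            for line, section in zip(data["lines"], data["sections"]):
                lines.append("\t-%6s\t\t%6s" % (line, section))
    if not lines:
        lines.append("File contained no duplicate IDs")
    return lines
```

Printing is then just `print("\n".join(duplicate_report(ids)))`.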

# ? Sep 20, 2010 19:15

Clanpot Shake: Aug 10, 2006; shake shake!

king_kilr posted:

That should do it.

Thanks, this works. I'm looking at my professor's old notes for python and they don't include zip() or iteritems(). What exactly do these do?

\/\/\/ That website loses something when you have javascript disabled.

Clanpot Shake fucked around with this message at 21:07 on Sep 20, 2010

# ? Sep 20, 2010 20:55

shrughes: Oct 11, 2008; (call/cc call/cc)

Clanpot Shake posted:

Thanks, this works. I'm looking at my professor's old notes for python and they don't include zip() or iteritems(). What exactly do these do?

Check out http://tinyurl.com/2uqyyyy

# ? Sep 20, 2010 21:04

king_kilr: May 25, 2007

Clanpot Shake posted:

Thanks, this works. I'm looking at my professor's old notes for python and they don't include zip() or iteritems(). What exactly do these do?

\/\/\/ That website loses something when you have javascript disabled.

iteritems() lets you loop over a dictionary's (key, value) tuples; zip() lets you basically pair two iterables, yielding items from them in parallel.
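
A tiny illustration of both, written for Python 3, where dict.iteritems() no longer exists and dict.items() plays the same role. The function name pairs_demo and the sample data are made up.

```python
def pairs_demo():
    """Show (key, value) iteration over a dict and pairwise iteration with zip()."""
    ages = {"alice": 30, "bob": 25}

    # items() yields one (key, value) tuple per entry, in insertion order.
    items = [(name, age) for name, age in ages.items()]

    # zip() yields the first elements together, then the second elements, and so on.
    zipped = list(zip([10, 20, 30], ["a", "b", "c"]))

    return items, zipped
```

In Python 2, zip() builds a list up front and itertools.izip() is the lazy variant; in Python 3, zip() itself is lazy, hence the list() call.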

# ? Sep 20, 2010 21:29

Clanpot Shake: Aug 10, 2006; shake shake!

king_kilr posted:

iteritems() lets you loop over a dictionary's (key, value) tuples; zip() lets you basically pair two iterables, yielding items from them in parallel.

That's awesome - that's exactly what I needed to do. Can zip() take any number of arguments?

Also thanks to you both.

# ? Sep 20, 2010 21:46

kuffs: Mar 29, 2007; Projectile Dysfunction

Clanpot Shake posted:

That's awesome - that's exactly what I needed to do. Can zip() take any number of arguments?

He gave you a lmgtfy link and you didn't follow the advice?
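
For the record, the answer is yes: zip() accepts any number of iterables, including none, and stops at the shortest one. A short Python 3 sketch (zip_demo is a made-up name):

```python
def zip_demo():
    """Exercise zip() with three, uneven, zero, and unpacked arguments."""
    three = list(zip("ab", [1, 2], [True, False]))   # three iterables -> 3-tuples
    uneven = list(zip([1, 2, 3], "xy"))              # stops at the shortest input
    empty = list(zip())                              # no arguments -> nothing to yield
    columns = list(zip(*[(1, "a"), (2, "b")]))       # unpacking rows "transposes" them
    return three, uneven, empty, columns
```

The last form, zip(*rows), is a common idiom for turning a list of rows into a list of columns.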

# ? Sep 20, 2010 21:57

FamDav: Mar 29, 2008

I'm having trouble trying to be an awful python coder. I'm trying to obfuscate this

code:

Seq = Hailstone(int(raw_input('start value? ')))
print "\n".join([str(Seq)[1:-1], " ".join([str(len(Seq)-1), ('steps' if len(Seq) != 2 else 'step')])])

Into a one liner. The idea would be that I replace the first instance of "Seq" with "Seq = Hailstone(int(raw_input('start value? ')))". However, this gives me a TypeError, with the full error code being "TypeError: 'Seq' is an invalid keyword argument for this function".

Any ideas why this horrible affront to python isn't working? Do I need to slaughter more lambs?

# ? Sep 23, 2010 19:28

king_kilr: May 25, 2007

I'm not answering this question. You are terrible.

# ? Sep 23, 2010 20:01

Plorkyeran: Mar 22, 2007; To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed

Assignment in python is a statement, not an expression.

In this case you can use a lambda to do what you want.

# ? Sep 23, 2010 20:36

tef: May 30, 2004; -> some l-system crap ->

You can't do assignment inside an expression in python, so not inside another assignment and not inside a function call.

Additionally: things that have side effects often return None to stop you composing them (i.e. foo.sort() is essentially a statement, not a function).

And foo(x=y) passes the keyword argument x in python, which is where your TypeError comes from.

if you're being sick and twisted, you want to use lambda to bind names inside of oneliners:

print (lambda Seq:"\n".join([str(Seq)[1:-1], " ".join([str(len(Seq)-1), ('steps' if len(Seq) != 2 else 'step')])]))(Hailstone(int(raw_input('start value? '))))

Bonus points: use a ycombinator
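
A hedged Python 3 sketch of the two points above: a name followed by = inside a call is parsed as a keyword argument, and an immediately-called lambda can bind a name inside a single expression. The original Hailstone function is not shown in the thread, so hailstone below is an assumed implementation that returns the sequence as a list; one_liner and name_in_call_is_keyword are made-up names.

```python
def hailstone(n):
    """Assumed stand-in for the poster's Hailstone(): the Collatz sequence from n to 1."""
    seq = [n]
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        seq.append(n)
    return seq


def name_in_call_is_keyword():
    """str(Seq=...) is a call with a keyword argument named Seq, not an assignment."""
    try:
        str(Seq=[1, 2])
    except TypeError:
        return True
    return False


def one_liner(start):
    """Bind seq via a lambda parameter so the whole report is one expression."""
    return (lambda seq: "\n".join([
        str(seq)[1:-1],
        " ".join([str(len(seq) - 1), "steps" if len(seq) != 2 else "step"]),
    ]))(hailstone(start))
```

Since Python 3.8 there is also the assignment expression `(seq := hailstone(start))`, which binds a name inside an expression without the lambda.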

# ? Sep 23, 2010 20:36

tef: May 30, 2004; -> some l-system crap ->

for example:

code:

print "\n".join(x.rstrip() for x in (lambda y: y(y,3,3))\
    (lambda x,n,w: ["%s/%s\\%s"%(" "*j, " "*(2*i)," "*j)\
    for (i,j) in zip(xrange(0,w),xrange(w-1,0,-1))]+\
    ["%s"%("-"*w*2)] if n == 0 else ["%s%s%s"%(" "*(w*2**\
    (n-1)), l, " "*(w*2**(n-1))) for l in x(x,n-1,w)]+\
    ["%s%s"%(l,l) for l in x(x,n-1,w)]))

# ? Sep 23, 2010 20:40

MaberMK: Feb 1, 2008; BFFs

tef posted:

for example:

code:

print "\n".join(x.rstrip() for x in (lambda y: y(y,3,3))\
    (lambda x,n,w: ["%s/%s\\%s"%(" "*j, " "*(2*i)," "*j)\
    for (i,j) in zip(xrange(0,w),xrange(w-1,0,-1))]+\
    ["%s"%("-"*w*2)] if n == 0 else ["%s%s%s"%(" "*(w*2**\
    (n-1)), l, " "*(w*2**(n-1))) for l in x(x,n-1,w)]+\
    ["%s%s"%(l,l) for l in x(x,n-1,w)]))

The Perl thread called...

# ? Sep 23, 2010 20:41

tef: May 30, 2004; -> some l-system crap ->

MY PEP8

# ? Sep 23, 2010 20:51

FamDav: Mar 29, 2008

tef posted:

for example:

code:

print "\n".join(x.rstrip() for x in (lambda y: y(y,3,3))\
    (lambda x,n,w: ["%s/%s\\%s"%(" "*j, " "*(2*i)," "*j)\
    for (i,j) in zip(xrange(0,w),xrange(w-1,0,-1))]+\
    ["%s"%("-"*w*2)] if n == 0 else ["%s%s%s"%(" "*(w*2**\
    (n-1)), l, " "*(w*2**(n-1))) for l in x(x,n-1,w)]+\
    ["%s%s"%(l,l) for l in x(x,n-1,w)]))

Crosspost this in TCC. Thanks!

EDIT: For clarification, y-combinators will help me with writing Hailstone as part of my one-liner, yes?
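
Roughly, yes. A lambda has no name to call itself by, so the usual trick is to pass the function to itself, which is what the `(lambda y: y(y,3,3))` part of the example does. A minimal Python 3 sketch, assuming Hailstone is meant to return the sequence as a list (strictly this is self-application rather than a textbook Y combinator):

```python
# The outer lambda receives the recursive body f and returns a one-argument
# function that starts the recursion by handing f to itself.
hailstone = (lambda f: lambda n: f(f, n))(
    lambda f, n: [n] if n == 1 else [n] + f(f, n // 2 if n % 2 == 0 else 3 * n + 1)
)
```

Each step recurses one level deeper, so very long sequences can hit Python's recursion limit; a plain loop does not have that problem.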

FamDav fucked around with this message at 22:09 on Sep 23, 2010

# ? Sep 23, 2010 21:58

tef: May 30, 2004; -> some l-system crap ->

for my next trick:

print "\n".join(x.rstrip() for x in (lambda y:y["t"](y,5,1))({"t":lambda y,n,w:[" "*w for _ in xrange(1,w)]+["__"*w] if n == 0 else ["%s%s%s"%(" "*(len(t)//2),t," "*(len(t)//2)) for t in y["t"](y,n-1,w)]+["%s%s"%m for m in zip(y["l"](y,n-1,w),y["r"](y,n-1,w))],"l":lambda y,n,w:["%s/%s"%(" "*i," "*j) for (i,j) in zip(xrange(w-1,-1,-1),xrange(w,w*2))] if n==0 else ["%s%s%s"%(" "*(len(t)//2),t," "*(len(t)//2)) for t in y["r"](y,n-1,w)]+["%s%s"%m for m in zip(y["t"](y,n-1,w),y["l"](y,n-1,w))],"r":lambda y,n,w:["%s\\%s"%(" "*i," "*j) for (j,i) in zip(xrange(w-1,-1,-1),xrange(w,w*2))] if n==0 else ["%s%s%s"%(" "*(len(t)//2),t," "*(len(t)//2)) for t in y["l"](y,n-1,w)]+ ["%s%s"%m for m in zip(y["r"](y,n-1,w),y["t"](y,n-1,w))]}))

and I'll stop now before someone explodes

tef fucked around with this message at 00:22 on Sep 24, 2010

# ? Sep 24, 2010 00:14

JimboMaloi: Oct 10, 2007

Unbelievably newbie question here. I'm just starting to learn Python (some previous background in Java, bash, and MATLAB), and while working through the tutorial I got to the discussion of functions.

In the tutorial they introduce the .append method, saying that

code:

 a.append(b)

is equivalent to

code:

 a = a + [b]

Now while this makes perfect sense, my question is why you wouldn't simply say

code:

 a += [b]

This is, as far as I understand it, exactly the same as the second bit of code, and is just as, if not more, brief as using the .append method, which leads to my question: is there any functional distinction between the .append method and the += code that I used?

# ? Sep 24, 2010 00:28

tef: May 30, 2004; -> some l-system crap ->

generally, x = x + [1] is not the same as x.append(1)

prefer using append over anything else.

x = x + [1] creates a new list made from x and [1], and then assigns it to x.
x += [1] should add in place (but it isn't guaranteed to do this, as it will fall back to normal add if __iadd__ isn't implemented)
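
The difference only shows up when something else still refers to the original object. A small sketch (aliasing_demo is a made-up name) that keeps a second reference around and checks what each form does to it:

```python
def aliasing_demo():
    """Compare x = x + [...], x += [...], and the tuple fallback, using aliases."""
    x = [0]
    alias_x = x
    x = x + [1]                 # builds a new list and rebinds x; alias_x is untouched
    plus = (x is alias_x, alias_x)

    y = [0]
    alias_y = y
    y += [1]                    # list implements __iadd__, so this mutates in place
    y.append(2)                 # append also mutates in place
    in_place = (y is alias_y, alias_y)

    t = (0,)
    alias_t = t
    t += (1,)                   # tuples have no __iadd__, so this falls back to t = t + (1,)
    fallback = (t is alias_t, alias_t)

    return plus, in_place, fallback
```

This is why a function that does `arg += [1]` on a list argument changes the caller's list, while `arg = arg + [1]` does not.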

# ? Sep 24, 2010 00:58

tripwire: Nov 19, 2004; ghost flow

JimboMaloi posted:

Unbelievably newbie question here. I'm just starting to learn Python (some previous background in Java, bash, and MATLAB), and while working through the tutorial I got to the discussion of functions.

In the tutorial they introduce the .append method, saying that
code:
 a.append(b) 
is equivalent to
code:
 a = a + [b] 
Now while this makes perfect sense, my question is why you wouldn't simply say
code:
 a += [b] 
This is, as far as I understand it, exactly the same as the second bit of code, and is just as, if not more, brief as using the .append method, which leads to my question: is there any functional distinction between the .append method and the += code that I used?

To be pedantic, this:

code:

a = a + [b]

is actually equivalent to this:

code:

a.extend([b])

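You're making a new list with one element in it and extending the previous list with it, rather than appending. (Strictly, a + [b] builds a brand-new list and rebinds a, while extend mutates the existing list in place, so the two only match in the resulting contents.)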

When the python interpreter has to evaluate an expression like (a + b), I imagine it checks for and uses an __add__ method on the operands (and maybe a __radd__ method as well, since addition is not commutative in every context). In the context of lists, the order of operands matters for addition; the resulting list is just the elements of the two lists concatenated in the order they were supplied.
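
That guess is close to how it works: Python tries the left operand's __add__ first and, if the left operand cannot handle the addition, the right operand's __radd__. A minimal sketch with a made-up Tagged class that can sit on either side of a list:

```python
class Tagged:
    """Hypothetical list wrapper used only to show __add__ versus __radd__."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        # Handles Tagged + list: our items come first.
        if isinstance(other, list):
            return Tagged(self.items + other)
        return NotImplemented

    def __radd__(self, other):
        # Handles list + Tagged: the plain list comes first.
        if isinstance(other, list):
            return Tagged(other + self.items)
        return NotImplemented
```

Returning NotImplemented (rather than raising) lets Python try the other operand or report a TypeError itself.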

If your question is whether you should use a.extend(somelist) or a+=somelist, then the answer is that it kiinnnnnnnnnda depends, but you should in 99% of cases use extend and be explicit.

A case where I could see using += instead is if you know the type of a and b is going to change down the line to something else, but you will still perform what is conceptually addition.
One example I can think of is if a and b become strings at some point. The += operator will still concatenate the elements of a and b together, whereas extend/append aren't defined on strings. But you really shouldn't be adding strings together like that anyway. (it's quadratic time! use lists of strings and ''.join instead!)
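
A small sketch of the two ways to build a string from pieces (the function names are made up). Repeated concatenation may copy the partial result on every step, which is where the quadratic worst case comes from; join() sizes the result once. CPython can often optimize the += form in place, but that is an implementation detail rather than a guarantee.

```python
def build_with_concat(parts):
    """Repeated +=: each step may copy everything accumulated so far."""
    result = ""
    for part in parts:
        result += part
    return result


def build_with_join(parts):
    """Collect the pieces and join once: a single pass over the data."""
    return "".join(parts)
```

Both return the same string; the difference is only in how much copying may happen along the way.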

# ? Sep 24, 2010 01:22

MaberMK: Feb 1, 2008; BFFs

tripwire posted:

really tiny poo poo

Reminds me of a question I've always wondered about but never asked... how does using a formatting string compare to ''.join() with respect to performance?
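
The honest answer is to measure it for the strings in question. A minimal sketch using the standard timeit module; compare_formatting and the sample strings are assumptions, and no timings are quoted here because they vary by interpreter version and machine.

```python
import timeit


def compare_formatting(number=100000):
    """Time %-formatting against str.join for three short strings."""
    setup = "a, b, c = 'spam', 'eggs', 'ham'"
    return {
        "format": timeit.timeit("'%s %s %s' % (a, b, c)", setup=setup, number=number),
        "join": timeit.timeit("' '.join([a, b, c])", setup=setup, number=number),
    }
```

Each value is the total time in seconds for `number` runs of the statement, so compare the two entries against each other rather than reading them as absolute costs.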

# ? Sep 24, 2010 04:36

Bodhi Tea: Oct 2, 2006; seconds are secular, moments are mine, self is illusion, music's divine.

Can someone tell me the proper way to open an url like this:
http://www.example.com/read.php?id=123&c=456

for reading?

I have an url that I know exists but I can't seem to open it; I keep getting 404's and blank pages in python. I'm guessing that I'm passing the GET params wrong or something.

Bodhi Tea fucked around with this message at 04:33 on Sep 25, 2010

# ? Sep 25, 2010 02:27

Scaevolus: Apr 16, 2007

Bodhi Tea posted:

Can someone tell me the proper way to open an url like this:
http://www.example.com/read.php?id=123&c=456

for reading?

I have an url that I know exists but I can't seem to open it; I keep getting 404's and blank pages in python. I'm guessing that I'm passing the GET params wrong or something.

Try faking your user agent.

# ? Sep 25, 2010 04:05

Bodhi Tea: Oct 2, 2006; seconds are secular, moments are mine, self is illusion, music's divine.

Scaevolus posted:

Try faking your user agent.

Thanks! That worked.

# ? Sep 25, 2010 04:38

RobotEmpire: Dec 8, 2007

How would one go about approaching a client-side script for scanning e-mails in Exchange?

I'm probably explaining that poorly, so, let me explain. We have a huge mass-mailing system for our press releases. Many, many e-mail addresses in our database are no longer valid. We have a special mailbox that receives the "Unable to deliver" messages. I would like to write a script that scans the e-mails in the inbox, parses them and writes e-mail addresses to a text file. Is this possible?

edit: I suppose the real challenge is exporting the text from the e-mails into a readable format. From there it's pretty straightforward, yeah?

RobotEmpire fucked around with this message at 23:02 on Sep 26, 2010

# ? Sep 26, 2010 22:42

king_kilr: May 25, 2007

Yep, getting access to the data is the biggest thing. If you can run the script on the machine that has access, you should be able to control Exchange (assuming that's what you're running) directly via COM. Or if you're using IMAP or something you can just use imaplib to get the data.
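
A rough sketch of the imaplib route in Python 3, assuming the mailbox is reachable over IMAP with SSL and that the bounce messages are standard delivery status notifications carrying a "Final-Recipient: rfc822; address" line. Both are assumptions: bounces generated by Exchange may be formatted differently, and the host, user, and password are placeholders. The function names are made up.

```python
import imaplib
import re

# Matches the recipient line found in standard delivery status notifications.
FINAL_RECIPIENT = re.compile(
    r"^Final-Recipient:\s*rfc822;\s*(\S+)", re.IGNORECASE | re.MULTILINE
)


def failed_recipients(raw_message):
    """Extract bounced addresses from one raw message given as bytes."""
    text = raw_message.decode("utf-8", errors="replace")
    return [addr.strip("<>").lower() for addr in FINAL_RECIPIENT.findall(text)]


def collect_bounces(host, user, password, mailbox="INBOX"):
    """Scan every message in the mailbox and return the sorted set of failed addresses."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        conn.select(mailbox, readonly=True)        # read-only: leave the mailbox untouched
        _, data = conn.search(None, "ALL")
        addresses = set()
        for num in data[0].split():
            _, parts = conn.fetch(num, "(RFC822)")
            for part in parts:
                if isinstance(part, tuple):        # (envelope info, raw message bytes)
                    addresses.update(failed_recipients(part[1]))
        return sorted(addresses)
    finally:
        conn.logout()
```

Writing the result to a text file is then a matter of `"\n".join(collect_bounces(...))`.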

# ? Sep 26, 2010 23:06

shrughes: Oct 11, 2008; (call/cc call/cc)

RobotEmpire posted:

How would one go about approaching a client-side script for scanning e-mails in Exchange?

I spent much of my former job doing exactly this.

If you're talking about Exchange 2007 or later, there is one good way to do it in most circumstances: Exchange Web Services. Google it, figure out how to use it. There's a decent book about it. You'll want to use C#. Make sure you have good error reporting though: you'll eventually need to tweak your XML parser because Exchange likes to send invalid XML. Unless that was fixed.

If you need to access earlier versions of Exchange, the easiest way is probably their WebDAV interface. You'll probably want to use C#.

Beware that you might have to do some postprocessing of Exchange email addresses to canonicalize them or to simply convert from EX addresses to SMTP addresses. Maybe that was just with MAPI, I forget. Also, you'll sometimes get addresses like "Sam"@example.com instead of sam@example.com and both should be treated as equal.
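
A minimal sketch of the quoted-versus-unquoted case only, assuming that stripping the quotes and lowercasing is an acceptable notion of "equal" for this mailing list (mail servers may in principle treat the local part as case-sensitive). The function name canonical_address is made up, and converting EX addresses to SMTP is not covered.

```python
def canonical_address(address):
    """Normalize "Sam"@example.com and sam@example.com to the same string."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep:
        raise ValueError("not an email address: %r" % address)
    # Drop one pair of surrounding double quotes from the local part, if present.
    if len(local) >= 2 and local[0] == local[-1] == '"':
        local = local[1:-1]
    return "%s@%s" % (local.lower(), domain.lower())
```

Comparing the canonical forms, or collecting them in a set, then removes such duplicates before the list is written out.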

Just be thankful you don't have to parse Domino email addresses.

RobotEmpire posted:

edit: I suppose the real challenge is exporting the text from the e-mails into a readable format. From there it's pretty straightforward, yeah?

IIRC Exchange has internal fields where it converts the body of a message to something consisting of just text. You can look up these fields' hex codes in the MAPI documentation (hahaha!) and get them through EWS or WebDAV that way. Really, forget the documentation, and get the free utility MFCMAPI and look around. (You'll also begin to feel like the protagonist of "They Live".)

# ? Sep 26, 2010 23:27

RobotEmpire: Dec 8, 2007

Well, I'm not a system administrator. No admin privileges at all, I'm just a normal user. Just trying to solve an annoying problem that I encounter in the normal course of my (non-programmer) workday.

# ? Sep 26, 2010 23:35
