  • Locked thread
Stoatbringer
Sep 15, 2004

naw, you love it you little ho-bot :roboluv:

I've been using Python for about six months now, and I really like it.

Apart from the indentation, which I consider to be dangerous nonsense which will only end in tears. I don't mind using it, but the old-school part of my brain is always screaming "One slip of the auto-formatter, or accidentally deleting a tab will break everything and nobody will ever know why! Oh woe, woe unto the poor sod who has to maintain this code in five years time!" :bahgawd:

brosmike
Jun 26, 2009

Stoatbringer posted:

Apart from the indentation, which I consider to be dangerous nonsense which will only end in tears. I don't mind using it, but the old-school part of my brain is always screaming "One slip of the auto-formatter, or accidentally deleting a tab will break everything and nobody will ever know why! Oh woe, woe unto the poor sod who has to maintain this code in five years time!" :bahgawd:

How exactly is that any different from the risk that you (or your auto-formatter) could accidentally delete a }?

dis astranagant
Dec 14, 2006

brosmike posted:

How exactly is that any different from the risk that you (or your auto-formatter) could accidentally delete a }?

If nothing else, most any text editor or ide worth a drat can tell you how your parens/braces/brackets match up. And many languages that use them will throw an error if you have a stray one.

brosmike
Jun 26, 2009

dis astranagant posted:

If nothing else, most any text editor or ide worth a drat can tell you how your parens/braces/brackets match up.

I don't see how this is better than using indentation, whereupon any human eye worth a drat can tell how your code blocks match up.

dis astranagant posted:

And many languages that use them will throw an error if you have a stray one.

I think this is pretty much a neutral trade-off; it's true that a deleted tab in a python script is more likely to result in an error not caught til runtime than a deleted brace in a C program, but giving whitespace semantic meaning also allows you to eliminate errors from things like missing semicolons that mark statement endings. (Those would often be caught as syntax errors, but then, so would most instances of deleting a tab in a python script)

Detetsu
Jan 14, 2006

Your loyal assistant Dr. Meowgon is all over this one.

Is there an easy answer for getting SSL running on a Win 7 python installation?

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Stoatbringer posted:

I've been using Python for about six months now, and I really like it.

Apart from the indentation, which I consider to be dangerous nonsense which will only end in tears. I don't mind using it, but the old-school part of my brain is always screaming "One slip of the auto-formatter, or accidentally deleting a tab will break everything and nobody will ever know why! Oh woe, woe unto the poor sod who has to maintain this code in five years time!" :bahgawd:

The biggest thing that should make you feel better about this is that it's not a problem.

Billions of lines of code in real applications are a testament to that.

Stabby McDamage
Dec 11, 2005

Doctor Rope

brosmike posted:

I don't see how this is better than using indentation, whereupon any human eye worth a drat can tell how your code blocks match up.

The human eye can't spot ALL indentation problems, e.g.


Yeah, it's a dumb one -- spaces and tabs at the same time, and you shouldn't do that, blah blah blah, but it still is an invisible mistake.

That said, it's one that almost never occurs in practice. I actually had to play with the example a bit just to make it be an error.
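For anyone who wants to see the kind of mistake being described, here is a minimal sketch. The snippet in `source` is hypothetical: the last line is indented with a single tab while the rest use spaces. Python 2 expands a tab to the next multiple of 8 columns, so the `return` silently lands inside the `if` block. Python 3 refuses to guess and raises `TabError` at compile time.

```python
# Hypothetical snippet: three lines indented with spaces, the last one with a tab.
source = (
    "def f(x):\n"
    "    if x:\n"
    "        x = x + 1\n"
    "\treturn x\n"
)


def check(src):
    """Return the name of the exception raised while compiling src, or 'ok'."""
    try:
        compile(src, "<example>", "exec")
    except SyntaxError as exc:  # IndentationError and TabError are subclasses
        return type(exc).__name__
    return "ok"


print(check(source))                        # mixed tab and spaces
print(check(source.replace("\t", "    ")))  # spaces only
```

In an editor that shows tabs as 4 columns, both versions look identical, which is the whole point of the complaint.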

spankweasel
Jan 4, 2006

#!/usr/bin/python -tt solves the tabs/spaces problem

TOO SCSI FOR MY CAT
Oct 12, 2008

this is what happens when you take UI design away from engineers and give it to a bunch of hipster art student "designers"
Well, yeah, it's hard to spot mistakes if you're using a broken editor. If your editor didn't render { or }, programming in C'd be awfully hard too.

Every decent editor can render tabs. Go find the preference, turn it on, and be happy.

Stabby McDamage
Dec 11, 2005

Doctor Rope

Janin posted:

Well, yeah, it's hard to spot mistakes if you're using a broken editor. If your editor didn't render { or }, programming in C'd be awfully hard too.

Every decent editor can render tabs. Go find the preference, turn it on, and be happy.

Wow, that was super smug, even for a post about text editors. I was just trying to show that spacing errors can exist, but they're largely pathological.

What do you mean "render tabs"? You mean one that displays some kind of glyph for them? I've never needed a feature like that, because again, the error I showed never actually comes up in practice.

German Joey
Dec 18, 2004

Stabby McDamage posted:

Wow, that was super smug, even for a post about text editors. I was just trying to show that spacing errors can exist, but they're largely pathological.

What do you mean "render tabs"? You mean one that displays some kind of glyph for them? I've never needed a feature like that, because again, the error I showed never actually comes up in practice.

Oh, well, good thing you decided to make a big deal out of something that doesn't exist then!

chemosh6969
Jul 3, 2004

code:
cat /dev/null > /etc/professionalism

I am in fact a massive asswagon.
Do not let me touch computer.

Janin posted:

Well, yeah, it's hard to spot mistakes if you're using a broken editor. If your editor didn't render { or }, programming in C'd be awfully hard too.

Every decent editor can render tabs. Go find the preference, turn it on, and be happy.

The only time I have dumb poo poo like that happen is when I load a file that was done in one editor, that uses spaces, and then load it in one that by default uses tabs.

Then any decent IDE has a switch that fixes the poo poo.

MaberMK
Feb 1, 2008

BFFs
The solution to this problem is to use soft tabs and code in vim

BeefofAges
Jun 5, 2004

Cry 'Havoc!', and let slip the cows of war.

This is the dumbest argument.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

BeefofAges posted:

This is the dumbest argument.

No it's not.

This is the dumbest argument!

BeefofAges
Jun 5, 2004

Cry 'Havoc!', and let slip the cows of war.

Thermopyle posted:

No it's not.

This is the dumbest argument!

Nuh uh!

Opinion Haver
Apr 9, 2007

I'm using xml.etree to do some XML parsing, and I came across some rather curious behavior:

code:
for node in xml:
    if node.find("invalid-data"): print "hi"
produces no output, but
code:
for node in xml:
    if node.find("invalid-data") != None: print "hi"
does. What's going on here?

Jonnty
Aug 2, 2007

The enemy has become a flaming star!

yaoi prophet posted:

I'm using xml.etree to do some XML parsing, and I came across some rather curious behavior:

code:
for node in xml:
    if node.find("invalid-data"): print "hi"
produces no output, but
code:
for node in xml:
    if node.find("invalid-data") != None: print "hi"
does. What's going on here?

Print the output of find() directly. It might be illegally returning False or something.

Opinion Haver
Apr 9, 2007

Oh, apparently elements with no subelements test as false, and since the invalid-data elements have no subelements, I have to use '!= None' or 'is not None'. That's... really kind of a stupid design decision.

No Safe Word
Feb 26, 2005

also it's more "pythonic" to say is not None than to do !=. None is a singleton so there's only ever one None.
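A minimal sketch of the safe pattern, using a made-up document (the element names other than `invalid-data` are assumptions). `find()` returns `None` when nothing matches, and an element with no children has length 0, which is why the plain truth test above printed nothing. Newer Python releases also warn when you truth-test an Element, so the explicit `is not None` check is the one to keep.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: one item carries an empty <invalid-data/> child.
doc = ET.fromstring(
    "<root>"
    "<item><invalid-data/></item>"
    "<item><ok/></item>"
    "</root>"
)

flagged = []
for node in doc:
    found = node.find("invalid-data")
    # Compare against None explicitly; a childless element has len() == 0.
    if found is not None:
        flagged.append(node)
        print("hi", len(found))
```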

freezepops
Aug 21, 2007
witty title not included
Fun Shoe
I have some code that uses the distance formula, and currently it takes ~20seconds to finish a job. This code:

code:
    def distance3D(self,x,y,z,x1,y1,z1):
        dist = math.sqrt((x-x1)*(x-x1) + (y-y1)*(y-y1) + (z-z1)*(z-z1))
        return dist
Is faster than this code, which takes ~30seconds:

code:
    def distance3D(self,x,y,z,x1,y1,z1):
        dist = (x-x1)*(x-x1) + (y-y1)*(y-y1) + (z-z1)*(z-z1)
        return dist
Why would adding square root make my code faster?

dis astranagant
Dec 14, 2006

Are you using that distance as a loop counter or something? sqrt(big pile of numbers) is a very different thing from (big pile of numbers)

freezepops
Aug 21, 2007
witty title not included
Fun Shoe
It's looping through an image, so it's run ~3x on each pixel in the image, but the values are discarded after comparison, and the max value would be 255^2*3.
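If the distances really are only compared and then thrown away, the square root can be skipped altogether by comparing squared values. A rough sketch, assuming the comparison is against some fixed radius (`is_within` and its `radius` parameter are hypothetical names, not from the code above):

```python
def squared_distance(x, y, z, x1, y1, z1):
    """Squared Euclidean distance between two 3D points."""
    dx, dy, dz = x - x1, y - y1, z - z1
    return dx * dx + dy * dy + dz * dz


def is_within(pixel, target, radius):
    """True if pixel lies within radius of target, without calling sqrt."""
    # sqrt is monotonic for non-negative values, so comparing the squares
    # gives the same answer as comparing the distances themselves.
    return squared_distance(*(pixel + target)) <= radius * radius


print(squared_distance(0, 0, 0, 255, 255, 255))
print(is_within((127, 127, 130), (127, 127, 127), 3))
```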

tripwire
Nov 19, 2004

        ghost flow

freezepops posted:

It's looping through an image, so it's run ~3x on each pixel in the image, but the values are discarded after comparison, and the max value would be 255^2*3.

Well you're wrong about it being faster when you add an extra function call to math.sqrt at least.


code:
setup_trailer = '''
import random, math
width,height = 512,512
values = range(256)
pixels = [ tuple( random.choice(values) for _ in xrange(3)) 
    for _ in xrange(width*height) ]
'''


distance_1_setup = '''
def distance(x,y,z,x1,y1,z1):
    dist = math.sqrt((x-x1)*(x-x1) + (y-y1)*(y-y1) + (z-z1)*(z-z1))
    return dist

''' + setup_trailer

distance_2_setup = '''
def distance(x,y,z,x1,y1,z1):
    dist = (x-x1)*(x-x1) + (y-y1)*(y-y1) + (z-z1)*(z-z1)
    return dist

''' + setup_trailer

distance_3_setup = '''
def distance(x,y,z,x1,y1,z1):
    return (
        (x1-x)**2 +
        (y1-y)**2 +
        (z1-z)**2 )

''' + setup_trailer



statement = '''
index = 0
for x in xrange(width):
    for y in xrange(height):
        pixel = pixels[index]
        distance(pixel[0],pixel[1],pixel[2],127,127,127)
        index += 1
'''

import timeit

print timeit.timeit(statement,distance_1_setup,number=20)
print timeit.timeit(statement,distance_2_setup,number=20)
print timeit.timeit(statement,distance_3_setup,number=20)
Output:
9.2920000553131104
5.871999979019165
5.0909998416900635

tripwire fucked around with this message at 04:32 on Jun 22, 2011

German Joey
Dec 18, 2004
maybe it would be faster to cache the function call?

Computer viking
May 30, 2011
Now with less breakage.

German Joey posted:

maybe it would be faster to cache the function call?

And on an even more brute-force methodological level, I'm sure a dash of Cython would speed that up a lot.

More to the point, if it's genuinely faster when wrapped in a sqrt, the most obvious guess is that the compiler produces better code in the latter case, either because you trigger some heuristic, or because it's able to infer more useful info (e.g. about types)? Just out of curiosity, does it change anything time-wise if you replace the sqrt() with e.g. float()?

Computer viking fucked around with this message at 19:04 on Jun 22, 2011

FoiledAgain
May 6, 2007

I'm not familiar with the use of triple quotes in tripwire's post. How does that work? Is this because timeit needs strings? (I've never used timeit so I have no idea.) Or is this some other convention? I'm getting an IndentationError when I copy the code, so I apparently don't understand this.

Computer viking
May 30, 2011
Now with less breakage.

FoiledAgain posted:

I'm not familiar with the use of triple quotes in tripwire's post. How does that work? Is this because timeit needs strings? (I've never used timeit so I have no idea.) Or is this some other convention? I'm getting an IndentationError when I copy the code, so I apparently don't understand this.

Looks like timeit wants code strings, yes. Triple quotes can contain newlines and single quotes, so they're useful for things like that.
As for indent errors, it worked for me, though I copied each block on its own. I also got the same results, so it seems to have been a fluke...
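To spell out the triple-quote part, here is a smaller sketch of the same pattern. `timeit.timeit` takes the statement and the setup as source-code strings, and a triple-quoted string is just an ordinary string that may span several lines. The names `setup_code` and `stmt` are only for illustration. Code inside such a string has to start at column 0, which is one likely way to get an IndentationError when pasting.

```python
import timeit

# Setup code runs once; it must be valid source on its own, starting at column 0.
setup_code = '''
import math

def distance(x, y, z, x1, y1, z1):
    return math.sqrt((x - x1) ** 2 + (y - y1) ** 2 + (z - z1) ** 2)
'''

# The statement is what actually gets timed, repeated `number` times.
stmt = "distance(1, 2, 3, 127, 127, 127)"

elapsed = timeit.timeit(stmt, setup_code, number=1000)
print(elapsed)  # seconds for all 1000 runs; varies by machine
```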

edit:
Just for the record, Cython is indeed a good bit faster. Using the definitions above, I get this:
Distance 1: 4.63275718689
Distance 2: 3.84116697311
Distance 3: 3.87355780602

pre:
>>> import pyximport; pyximport.install()
>>> distance_4_setup="from distance_test import distance;" + setup_trailer
>>> print timeit.timeit(statement,distance_4_setup,number=20)
1.72920703888
And "distance_test.pyx" is this:
pre:
def distance(int x, int y, int z, int x1, int y1, int z1):
    dist = (x-x1)*(x-x1) + (y-y1)*(y-y1) + (z-z1)*(z-z1)
    return dist

Computer viking fucked around with this message at 19:34 on Jun 22, 2011

Unknownmass
Nov 3, 2007
I am new to Python and getting back into programming after a few years. My question is I have a tab-delimited text file with indices and then groupings of data. What would be the best way to import these and rank them? Also, if possible, even import them as separate groups. I have been trying to use numpy but have not had too much luck so far. Thanks

Computer viking
May 30, 2011
Now with less breakage.

Unknownmass posted:

I am new to Python and getting back into programming after a few years. My question is I have a tab-delimited text file with indices and then groupings of data. What would be the best way to import these and rank them? Also, if possible, even import them as separate groups. I have been trying to use numpy but have not had too much luck so far. Thanks

I don't quite get the structure here, could you elaborate?

brosmike
Jun 26, 2009

Unknownmass posted:

I am new to Python and getting back into programming after a few years. My question is I have a tab-delimited text file with indices and then groupings of data. What would be the best way to import these and rank them? Also, if possible, even import them as separate groups. I have been trying to use numpy but have not had too much luck so far. Thanks

What you describe is a bit vague, but probably pretty easy to do (you probably don't need to bother with numpy for the importing). Can you give us an actual example of the format you're trying to read? Telling us what you mean by "separate groups" of data, as well as how you want to rank the data, would help us help you.

Unknownmass
Nov 3, 2007
Sorry for being vague. The file is an Excel matrix that has been exported as a tab-delimited text file, of 30 points split into 3 groups (points 1-10 group 1, 11-20 group 2 and 21-30 group 3). Each of the 30 points has a distance to each other, i.e.:

A B C D
A 0 2 5 10
B 2 0 4 8
C 5 4 0 3
D 10 8 3 0

That is the general structure, but with 30 points, and the first column and row are the index points (in the example A,B,C,D). What I am trying to do is import all the points and then rank them in some manner, like largest to smallest. Hope this helps. I will post what I have so far soon, but it's likely to be ugly as I'm just returning to coding. Thanks for the help.

BeefofAges
Jun 5, 2004

Cry 'Havoc!', and let slip the cows of war.

There are modules that let you read Excel files directly through Python, you know.

FoiledAgain
May 6, 2007

Unknownmass posted:

That is the general structure, but with 30 points, and the first column and row are the index points (in the example A,B,C,D). What I am trying to do is import all the points and then rank them in some manner, like largest to smallest. Hope this helps. I will post what I have so far soon, but it's likely to be ugly as I'm just returning to coding. Thanks for the help.

Is this the kind of thing you want to do?

code:
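with open(your_file_name_here) as infile:
    lines = infile.readlines()
for line in lines[1:]:  # skip the header row
    # drop the row label, then compare as numbers rather than strings
    values = [int(v) for v in line.strip().split('\t')[1:]]
    values.sort(reverse=True)  # largest to smallest
    print values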

Lurchington
Jan 2, 2003

Forums Dragoon

BeefofAges posted:

There are modules that let you read Excel files directly through Python, you know.

specifically, http://pypi.python.org/pypi/xlrd

it's pretty nice

Computer viking
May 30, 2011
Now with less breakage.

Unknownmass posted:

Sorry for being vague. The file is an Excel matrix that has been exported as a tab-delimited text file, of 30 points split into 3 groups (points 1-10 group 1, 11-20 group 2 and 21-30 group 3). Each of the 30 points has a distance to each other, i.e.:
pre:
     A  B  C  D
   A 0  2  5  10
   B 2  0  4  8
   C 5  4  0  3
   D 10 8  3  0
That is the general structure, but with 30 points, and the first column and row are the index points (in the example A,B,C,D). What I am trying to do is import all the points and then rank them in some manner, like largest to smallest. Hope this helps. I will post what I have so far soon, but it's likely to be ugly as I'm just returning to coding. Thanks for the help.

If I get it right, that's a distance matrix for all the points. What do you count as the value of a single point? Something derived from the distances, or do you have a separate table for that? Also, are the points grouped just by the external knowledge that the first ten are group 1 and so on, or is this encoded somehow?

BTW, the [ pre] tag is useful for fixed-width text. ;)

Unknownmass
Nov 3, 2007

Computer viking posted:

If I get it right, that's a distance matrix for all the points. What do you count as the value of a single point? Something derived from the distances, or do you have a separate table for that? Also, are the points grouped just by the external knowledge that the first ten are group 1 and so on, or is this encoded somehow?

BTW, the [ pre] tag is useful for fixed-width text. ;)

Yes it is a distance matrix. The files I'm currently working with are just the distance values, and not the points. The points are just grouped by the external knowledge and have to be separated out. Thanks for everyone's help.

Computer viking
May 30, 2011
Now with less breakage.

Unknownmass posted:

Yes it is a distance matrix. The files I'm currently working with are just the distance values, and not the points. The points are just grouped by the external knowledge and have to be separated out. Thanks for everyone's help.

Right, which still doesn't answer what you're sorting the points by. :)

Anyway. To read a distance matrix like that into a numpy array, you can do something like this:
pre:
import numpy as np

infile = open("distance.txt", "r")
header = infile.readline().strip().split("\t")
header_len = len(header)
data = []
rownames = []
for line in infile:
	parts = line.strip().split("\t")
	if len(parts) < header_len:
		break
	rownames.append(parts[0])
	values = [int(p) for p in parts[1:] ]
	data.append(values)

data_array = np.array(data)
That leaves you with the header and the row names (should be identical), a list of lines ("data"), and a numpy array of the same numbers ("data_array"). It's possible to compact this down to just a few lines by nesting two list comprehensions, but ... let's not go there.

I'm still not sure what you're sorting by, so I'll use the sum of distances as a placeholder. To do this, you basically want to sort a list of (key, value) pairs on the value. A neat way is to use operator.itemgetter to create a "get the second element" function, and give that to "sorted" as its key. (Remember that we count from 0, so the second element is index 1.)

pre:
import operator
sum_of_distances = map(sum,data_array)
name_with_distance = zip(header,sum_of_distances)
nwd_sorted = sorted(name_with_distance, key=operator.itemgetter(1))
(Of course, you could just swap the order of the arguments in the zip function, to put the value first ... but that wouldn't let me talk about itemgetter.)

Oh, and output:
pre:
>>> header
['a', 'b', 'c', 'd', 'e']
>>> data_array
array([[0, 1, 1, 5, 4],
       [1, 0, 2, 5, 4],
       [1, 2, 0, 3, 3],
       [5, 5, 3, 0, 5],
       [4, 4, 3, 5, 0]])
>>> name_with_distance
[('a', 11), ('b', 12), ('c', 9), ('d', 18), ('e', 16)]
>>> nwd_sorted
[('c', 9), ('a', 11), ('b', 12), ('e', 16), ('d', 18)]
As for the groups, uhm. You can get group N by grabbing header[N*10:(N+1)*10] and data_array[N*10:(N+1)*10, N*10:(N+1)*10], then work with those (slices are from-and-including:to-but-not-including).
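The same slicing idea without numpy, as a rough sketch on the four-point example matrix from earlier in the thread. The helper names `group_block` and `rank_by_total` are made up for illustration, and the group size of 2 is an assumption so the toy data splits evenly (the real file would use 10).

```python
from operator import itemgetter

names = ["A", "B", "C", "D"]
matrix = [
    [0, 2, 5, 10],
    [2, 0, 4, 8],
    [5, 4, 0, 3],
    [10, 8, 3, 0],
]
group_size = 2  # assumption for the toy data; the real groups hold 10 points


def group_block(names, matrix, n, size):
    """Names and within-group distances for group n (counting from 0)."""
    lo, hi = n * size, (n + 1) * size  # from-and-including : to-but-not-including
    return names[lo:hi], [row[lo:hi] for row in matrix[lo:hi]]


def rank_by_total(names, matrix):
    """Pair each name with its row sum, largest total first."""
    totals = [sum(row) for row in matrix]
    return sorted(zip(names, totals), key=itemgetter(1), reverse=True)


print(rank_by_total(names, matrix))
print(group_block(names, matrix, 1, group_size))
```

The sum of distances is still just a placeholder ranking; swap in whatever measure actually matters.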

Computer viking fucked around with this message at 20:17 on Jun 24, 2011

Clandestine!
Jul 17, 2010
Here with more stupid questions. I finished a text handling program, which was painfully easy to do (it counted the words and sentences in a file, nothing too crazy). I, however, have been stumped for the past hour on another text handling program which should've been even easier: one using a dictionary to count all of the words in a text file and display the output in alphabetical order.

code:
word_counts = {}
word_items = {}

myfile = open('gettysburg.txt', 'r')

for words in myfile:
    word_items[words] = word_counts.items()
    word_items.sort()
    
print word_items

(I'm using a text file of the Gettysburg address, in case anyone cares :v: ) I have a feeling that this is FAR too simple and I am doing things badly; as well, I'm getting an attribute error that says that the dictionary object has no sort function. For reference: I'm using the online guide "How to Think Like a Computer Scientist" now; it's been pretty good to me so far, but it's providing no hints right now.

tripwire
Nov 19, 2004

        ghost flow
First: plain dictionaries are awkward for counting. There is a defaultdict class in the collections module which you can use for counting, however (and if you are using 2.7 or higher, there's even a specialized Counter class).

What do you think this line does?
code:
word_items[words] = word_counts.items()
You never update the contents of the dictionary "word_counts", but you query it a whole lot. Also, calling sort() on a dictionary is a non sequitur: dictionaries are only mappings between keys and values, and they have no sort() method. If you want, though, you can ask the dictionary for its keys and pass those to the "sorted" function to get the keys in sorted order.
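A tiny sketch of that last point, with made-up counts (the words and numbers in `word_counts` are not from the real text):

```python
# Hypothetical counts, just to have something to sort.
word_counts = {"nation": 5, "a": 7, "people": 3}

# dict has no sort() method, which is where the AttributeError comes from.
has_sort = hasattr(word_counts, "sort")
print(has_sort)

# sorted() accepts any iterable; iterating a dict yields its keys.
ordered_words = sorted(word_counts)
for word in ordered_words:
    print(word, word_counts[word])
```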

I don't know what your textfile looks like, but I just googled for a text file of the address and got this:
http://morphadorner.northwestern.edu/morphadorner/techtalk/sentenceandtokenoffsets/gettysburg.txt

I think this is what you are trying to do:
code:
from collections import defaultdict
word_tally = defaultdict(int)
#a default dict takes a factory function as an argument. whenever you lookup a key
#which isn't in the dictionary, it uses that factory function to make a value for
#the key. In this case, int() is used to return the integer value 0.

with open('gettysburg.txt', 'r') as myfile:
    for line in myfile:
        for word in line.strip().split(' '):
            word_tally[word] += 1

for word in sorted( word_tally.keys() ):
    print word, word_tally[word]
Do you understand what this code is doing?
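For comparison, a sketch of the same tally using collections.Counter (available from Python 2.7 on). To keep it self-contained it counts an inline sample instead of reading gettysburg.txt. `sample_text` and `count_words` are names made up for this example, and lowercasing the words is an extra assumption about what should count as the same word.

```python
from collections import Counter

sample_text = """Four score and seven years ago our fathers brought forth
on this continent a new nation conceived in liberty and dedicated
to the proposition that all men are created equal"""


def count_words(text):
    """Tally whitespace-separated words, ignoring case."""
    # split() with no argument handles spaces, tabs and newlines alike,
    # and never yields empty strings.
    return Counter(word.lower() for word in text.split())


word_tally = count_words(sample_text)
for word in sorted(word_tally):
    print(word, word_tally[word])
```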

tripwire fucked around with this message at 17:48 on Jun 26, 2011
