Python information and short questions megathread.


Lonely Wolf: Jan 20, 2003; Will hawk false idols for heaps and heaps of dough.

Yes, threads are not completely free. You'd get a better improvement by splitting the files up into n batches each processed by n threads where each thread processes each of the files in its batch serially.

# ? Apr 14, 2011 22:40


Deus Rex: Mar 5, 2005

Dren posted:

The sys.stdout answer sounds good enough for you but I'd like to mention that I like to build my print strings like so:
code:
out_str = '\n'.join([
    '---------Condition A----------------',
    'Hits:\t {0}'.format(hits),
    'Misses:\t {0}'.format(misses),
    'Correct Rejections:\t {0}'.format(correct_rejections),
    'False Alarms:\t {0}'.format(false_alarms),
    ])

print out_str
It helps avoid the sort of problem you're running into and helps keep lines limited to 80 characters.

Triple-quotes are nice for things like this:

code:

out_str = """
---------Condition A----------------
Hits:\t {0}
Misses:\t {1}
Correct Rejections:\t {2}
False Alarms:\t {3}""".format(hits, misses, correct_rejections,
false_alarms)

# ? Apr 14, 2011 22:41

Stabby McDamage: Dec 11, 2005; Doctor Rope

Modern Pragmatist posted:

So if I want to dispatch jobs to multiple threads, theoretically they should be long-running jobs right? I'm assuming there is some overhead associated with the spawning of a thread.

I have the following code:
code:
for root,dirs,files in os.walk(path):
    for file in files[2:]:
        performOperation(file)
Ideally I would like to speed this up. The problem is that performOperation only requires about 0.001 seconds to complete per file. I have about 50k files.

Theoretically if I broke the files into two chunks and sent each of those to their own threads, would I really see much of a performance increase or would the reading/writing from disk be a bottleneck?

What would be the best way to approach this?

The best way would be to implement it and see. I'd divide the array into N pieces for N threads, then play with N to see the performance effects.
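A minimal sketch of that experiment, written for Python 3 with concurrent.futures rather than the Python 2 used elsewhere in the thread. perform_operation here is a hypothetical stand-in for the poster's performOperation, and the file names are made up. Keep in mind that in CPython the GIL stops threads from running Python bytecode in parallel, so threads mostly help when the per-file work is dominated by I/O.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def split_into_batches(items, n):
    """Split items into n roughly equal batches (some may be empty)."""
    return [items[i::n] for i in range(n)]


def perform_operation(path):
    # Hypothetical stand-in for the poster's performOperation().
    return len(path)


def process_batch(batch):
    # Each worker walks its own batch serially.
    return [perform_operation(p) for p in batch]


def run(paths, n_threads):
    batches = split_into_batches(paths, n_threads)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        results = list(pool.map(process_batch, batches))
    # Flatten the per-batch results into one list.
    return [r for batch in results for r in batch]


paths = ["file%05d.dat" % i for i in range(1000)]  # made-up names
for n in (1, 2, 4, 8):
    start = time.perf_counter()
    run(paths, n)
    print(n, "threads:", round(time.perf_counter() - start, 4), "s")
```

Swap in the real per-file function and the real file list, then compare the timings for different N before deciding whether threads are worth it.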

# ? Apr 15, 2011 00:51

bakersk8r6301: Mar 24, 2008

So im trying to write a program that will search within a sequence of DNA, find a START codon, and then replace the "A" with the current count number in a given frame and output the modified sequence.

i.e. if i enter: AAAATGAATGGCTAACTTTTGATG

it should output something like:

Frame 1: AAA1TGAATGGCTAACTTTTG2TG
Frame 2: AAAATGA1TGGCTAACTTTTGATG
Frame 3: AAAATGAATGGCTAACTTTTGATG

any ideas??

# ? Apr 15, 2011 01:00

Stabby McDamage: Dec 11, 2005; Doctor Rope

bakersk8r6301 posted:

So im trying to write a program that will search within a sequence of DNA, find a START codon, and then replace the "A" with the current count number in a given frame and output the modified sequence.

i.e. if i enter: AAAATGAATGGCTAACTTTTGATG

it should output something like:

Frame 1: AAA1TGAATGGCTAACTTTTG2TG
Frame 2: AAAATGA1TGGCTAACTTTTGATG
Frame 3: AAAATGAATGGCTAACTTTTGATG

any ideas??

Certainly doable, but we'd need more info. What constitutes a start codon? An end codon? A frame?

The answer will undoubtedly involve regular expressions from the re module.

# ? Apr 15, 2011 01:08

bakersk8r6301: Mar 24, 2008

Stabby McDamage posted:

Certainly doable, but we'd need more info. What constitutes a start codon? An end codon? A frame?

The answer will undoubtedly involve regular expressions from the re module.

START codon is usually ATG. End codon isn't necessary. A frame refers to each possible way that the sequence can be read three letters at a time; each group of three letters is called a codon.

# ? Apr 15, 2011 01:12

Stabby McDamage: Dec 11, 2005; Doctor Rope

bakersk8r6301 posted:

START codon is usually ATG. End codon isn't necessary. A frame refers to each possible way that the sequence can be read three letters at a time; each group of three letters is called a codon.

Oh, so there are only 3 frames max, one for each offset into the string (0,1,2)? Then:

code:

from UserString import MutableString
# mutable strings let us modify the string in place, which makes jamming in numbers easier;
# otherwise we'd have to build a result string as we went.

def mark_start_codons(sequence, frame=0, start='ATG'):
	sequence = MutableString(sequence)
	index=1
	for i in xrange(frame, len(sequence), 3):
		if sequence[i:i+3] == start:
			sequence[i:i+3] = "%d%s" % (index,start[1:])
			index += 1
	return sequence


s='AAAATGAATGGCTAACTTTTGATG'

print "original: %s" % s
for frame in xrange(3):
	print "frame %d:  %s" % (frame, mark_start_codons(s,frame))

Output:

code:

original: AAAATGAATGGCTAACTTTTGATG
frame 0:  AAA1TGAATGGCTAACTTTTG2TG
frame 1:  AAAATGA1TGGCTAACTTTTGATG
frame 2:  AAAATGAATGGCTAACTTTTGATG

This was a fun problem...I tried some stuff based on itertools at first, then a crazy regex, but in the end the easiest thing was making the string mutable and just walking through it.
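For anyone on Python 3, where UserString.MutableString no longer exists, here is a rough equivalent that walks a list of characters instead. It is only a sketch of the same idea; the function name is reused from the post above, and it replaces just the first letter of each match, so counts of 10 or more do not shift later positions.

```python
def mark_start_codons(sequence, frame=0, start="ATG"):
    """Replace the first letter of each in-frame start codon with a running count."""
    chars = list(sequence)  # lists are mutable, strings are not
    count = 1
    # Step three letters at a time, starting at the frame offset.
    for i in range(frame, len(sequence) - len(start) + 1, 3):
        if sequence[i:i + len(start)] == start:
            chars[i] = str(count)
            count += 1
    return "".join(chars)


s = "AAAATGAATGGCTAACTTTTGATG"
print("original: %s" % s)
for frame in range(3):
    print("frame %d:  %s" % (frame, mark_start_codons(s, frame)))
```

It works for any input string; only s needs to change.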

# ? Apr 15, 2011 02:12

duck monster: Dec 15, 2004

Any PyCharm gurus here?

After growing slowly sick of eclipse/pydev's bullshit and piss-poor code completion, I thought I'd give pycharm 30 day trial a go, after all its Jetbrains, it must be good right?

It pretty much HAS been a great IDE, and the mercurial support is spot on but I've hit an absolute show-stopper of a problem.

I've been using expandrive on my mac to hook into my dev server where I keep the source and stuff, but had some connectivity issues the other day due to my wireless being shite. In the end the freeze ups pissed me off too much and I force quit (aka "best quit"!) the program.

Now whenever I restart it complains of a corrupt index and just sits there for hours on end "rebuilding" it.

Is there anyway to fix this. Its rendered the program unable to be used, and I really like this thing, but can't get into the thing.

Not finding much joy on google either, alas. The one place I did see a guy ask about the same problem just had some asprergic dorks hurf durfing about how he should be using VIM.

# ? Apr 15, 2011 04:06

BeefofAges: Jun 5, 2004; Cry 'Havoc!', and let slip the cows of war.

Email their support people, perhaps?

# ? Apr 15, 2011 04:44

duck monster: Dec 15, 2004

BeefofAges posted:

Email their support people, perhaps?

Seems there were some weird hg-blah files splattered around that I just moved out, then after a bit of tweaking Mercurial's archives were fixed and PyCharm was fixed.

Bleah. It's a great IDE, but I won't deny that it's flaky. I'll give it a bit more time.

# ? Apr 15, 2011 06:49

Harold Ramis Drugs: Dec 6, 2010; by Y Kant Ozma Post

I got a couple books from a surplus sale for Python 2, and I was wondering if the information in them would still be mostly usable for learning Python 3. I'm completely new to computer programming, and I want to get some experience before I take my first programming class this Summer. I'm still awaiting word from the instructor, but I'm almost certain that the main language taught in that class will be Python 3.

Dive Into Python was mentioned in the OP. I haven't really read that far into it, but it looks like it's written for novice programmers. If this is the case, then this guide will probably be sufficient for me to start.

# ? Apr 15, 2011 20:06

BeefofAges: Jun 5, 2004; Cry 'Havoc!', and let slip the cows of war.

It shouldn't be especially hard to transition from Python 2 to Python 3.

# ? Apr 15, 2011 20:14

MaberMK: Feb 1, 2008; BFFs

Harold Ramis Drugs posted:

I got a couple books from a surplus sale for Python 2, and I was wondering if the information in them would still be mostly usable for learning Python 3. I'm completely new to computer programming, and I want to get some experience before I take my first programming class this Summer. I'm still awaiting word from the instructor, but I'm almost certain that the main language taught in that class will be Python 3.

Dive Into Python was mentioned in the OP. I haven't really read that far into it, but it looks like it's written for novice programmers. If this is the case, then this guide will probably be sufficient for me to start.

Do not, under any circumstances, ever, for any reason, read Dive Into Python.

Read Learn Python the Hard Way instead.

# ? Apr 15, 2011 20:18

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. -Bertrand Russell

MaberMK posted:

Do not, under any circumstances, ever, for any reason, read Dive Into Python.

This internet person speaks the truth.

# ? Apr 15, 2011 20:46

Harold Ramis Drugs: Dec 6, 2010; by Y Kant Ozma Post

Ok, ok thanks for the pointer. I'm currently on Exercise 0 of Python the Hard Way and it's directing me to run Python through the command prompt. CMD doesn't recognize the command "python" though. How do I get it to do this?

Edit: Nevermind, I had to restart CMD

Harold Ramis Drugs fucked around with this message at 21:31 on Apr 15, 2011

# ? Apr 15, 2011 21:26

duck monster: Dec 15, 2004

Harold Ramis Drugs posted:

Ok, ok thanks for the pointer. I'm currently on Exercise 0 of Python the Hard Way and it's directing me to run Python through the command prompt. CMD doesn't recognize the command "python" though. How do I get it to do this?

Edit: Nevermind, I had to restart CMD

I'm not sure if Python for Windows (I haven't used Windows in any major capacity in 4-5 years, Mac user here) still includes IDLE, but IDLE at least USED to be included, and was a simple but functional little editor and shell.

I will also say, that the python tutorial on the python website is *superb*. Thats how I learned nearly a decade ago, and I did most of it in one afternoon alt-tabbing at work, and by the end of the day I had a functional serial port controller for some hardware I was developing.

HOWEVER if you have no computing skills at all, it might not be the best approach as from memory it does pre-suppose some prior knowledge of basic level programming concepts.

edit: Two more things. You MIGHT be better off with Python 2.7 if you want to start playing around with modules and stuff. Python has two "branches" at this point, the 2.x series and the 3.x series. While 3.x has been around for a few years, a lot of module authors and OS vendors have dragged their feet moving across to it, so 2.7 is the gold standard for most real-world stuff. 3.x is a superior version of the language, but it's different enough that some things don't work the same once you get into in-depth stuff. However, 2.7 is sort of a bridge between the two: it backports enough of 3.x's features that code written with 3.x in mind can often be made to work on it. This of course is a controversial suggestion, so expect others to disagree with me!

Secondly, download IPython, which is a replacement shell for the Python interpreter (written in Python). It's fantastic and adds tab completion, proper command history and a big bunch of other features. It's really handy to just load a module and type module.<hit tab> and have it list the classes and functions available, then interrogate the help data in the module for more clues on using it. Most coders suck at writing documentation, but having the docstrings and module help available makes life so much easier.
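The same kind of poking around can be approximated in the plain interpreter with dir() and the inspect module. This is a small Python 3 sketch using the standard json module purely as an example target; IPython's tab completion and help are more convenient, this just shows where the information comes from.

```python
import inspect
import json

# Roughly what "json.<tab>" would list: the module's public names.
public_names = [name for name in dir(json) if not name.startswith("_")]
print(public_names)

# The docstring that help(json.dumps) would display; print its first line.
doc = inspect.getdoc(json.dumps)
print(doc.splitlines()[0])
```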

duck monster fucked around with this message at 01:41 on Apr 16, 2011

# ? Apr 16, 2011 01:30

Your Computer: Oct 3, 2008; Grimey Drawer

Trying to think of something to code in Python at 4AM led me to write this:

code:

import os
import time
import shutil

num = int(__file__[0:-3])
num += 1

new = str(num)+".py"
old = str(num-1)+".py"

if(num <= 10):
    shutil.copy(__file__, new)
    time.sleep(0.5)
    if(os.path.exists(old)):
        os.remove(old)
    time.sleep(0.5)

    os.system("python "+new)

A self-replicating-and-destroying program.

That said, I've been wanting to get into Python but I'm so used to Java now that it's a bit hard to get used to. Are there any tutorials or similar written for (semi-experienced) Java developers? I really like Python and would love to start coding stuff in it instead of Java.

# ? Apr 16, 2011 02:40

posting smiling: Jun 22, 2008

Speaking of late night coding, I just did this while helping a friend in an intro to Python class.

code:

from sys import stdout

(lambda b,p,s:stdout.write(s(b))
or map(lambda i:b.__setitem__(*(
lambda m,p:(ord(m)-ord('a'),p))(
raw_input('Move for player %s: '
%p[i%2]),p[i%2]))or stdout.write
(s(b)),range(9)))([chr(ord('a')+
i) for i in range(9)],['X','O'],
lambda r:'\n'.join(' '.join(r[i:
i+3]) for i in range(0,7,3))+'\n')

(Tic-tac-toe)

# ? Apr 16, 2011 10:51

MaberMK: Feb 1, 2008; BFFs

Classicist posted:

Speaking of late night coding, I just did this while helping a friend in an intro to Python class.

code:

from sys import stdout

(lambda b,p,s:stdout.write(s(b))
or map(lambda i:b.__setitem__(*(
lambda m,p:(ord(m)-ord('a'),p))(
raw_input('Move for player %s: '
%p[i%2]),p[i%2]))or stdout.write
(s(b)),range(9)))([chr(ord('a')+
i) for i in range(9)],['X','O'],
lambda r:'\n'.join(' '.join(r[i:
i+3]) for i in range(0,7,3))+'\n')

(Tic-tac-toe)

# ? Apr 16, 2011 16:57

tripwire: Nov 19, 2004; _{ghost flow}

FoiledAgain posted:

I've inherited code which has blocks of print statements like this:
code:
print '---------Condition A----------------'
print 'Hits:\t ', hits
print 'Misses:\t ', misses
print 'Correct Rejections:\t ', correct_rejections
print 'False Alarms:\t ', false_alarms
Is there any way to easily convert these blocks so that they print to file instead? My (clumsy) idea is to do something like Find->Replace "print" -> "print>>filename, ", but is there a more pythony thing to do?

vvvv Wow, that's so simple. Thanks!

I have fallen in love with column select mode in notepad++/pycharm

code:

print >> myfile, '---------Condition A----------------'
print >> myfile, 'Hits:\t ', hits
print >> myfile, 'Misses:\t ', misses
print >> myfile, 'Correct Rejections:\t ', correct_rejections
print >> myfile, 'False Alarms:\t ', false_alarms
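In Python 3, print is a function and the `print >> myfile` form is gone. Two ways to get the same effect are sketched below: passing file= explicitly, or leaving the print calls untouched and redirecting stdout around them with contextlib.redirect_stdout. The report function and the sample numbers are made up for illustration, and io.StringIO stands in for a real open file.

```python
import contextlib
import io


def report(hits, misses, correct_rejections, false_alarms, out=None):
    # out=None means "wherever sys.stdout currently points".
    print('---------Condition A----------------', file=out)
    print('Hits:\t ', hits, file=out)
    print('Misses:\t ', misses, file=out)
    print('Correct Rejections:\t ', correct_rejections, file=out)
    print('False Alarms:\t ', false_alarms, file=out)


# Option 1: pass the destination explicitly.
explicit = io.StringIO()  # stands in for open("results.txt", "w")
report(10, 2, 7, 1, out=explicit)

# Option 2: leave the print calls alone and redirect stdout around them.
redirected = io.StringIO()
with contextlib.redirect_stdout(redirected):
    report(10, 2, 7, 1)

print(explicit.getvalue() == redirected.getvalue())
```

Option 2 is the closer match for inherited code, since the existing print blocks do not have to be edited at all.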

# ? Apr 16, 2011 18:15

FoiledAgain: May 6, 2007

Classicist posted:

Speaking of late night coding, I just did this while helping a friend in an intro to Python class.

code:

from sys import stdout

(lambda b,p,s:stdout.write(s(b))
or map(lambda i:b.__setitem__(*(
lambda m,p:(ord(m)-ord('a'),p))(
raw_input('Move for player %s: '
%p[i%2]),p[i%2]))or stdout.write
(s(b)),range(9)))([chr(ord('a')+
i) for i in range(9)],['X','O'],
lambda r:'\n'.join(' '.join(r[i:
i+3]) for i in range(0,7,3))+'\n')

(Tic-tac-toe)

There's no winning condition. It plays until there is no more room, then outputs a list of Nones. Also you can "overwrite" symbols (which adds a new element to the game, I suppose).
(I'm still impressed though)

# ? Apr 16, 2011 21:37

bakersk8r6301: Mar 24, 2008

Stabby McDamage posted:

Oh, so there are only 3 frames max, one for each offset into the string (0,1,2)? Then:

code:

from UserString import MutableString
# mutable strings let us modify the string in place, which makes jamming in numbers easier;
# otherwise we'd have to build a result string as we went.

def mark_start_codons(sequence, frame=0, start='ATG'):
	sequence = MutableString(sequence)
	index=1
	for i in xrange(frame, len(sequence), 3):
		if sequence[i:i+3] == start:
			sequence[i:i+3] = "%d%s" % (index,start[1:])
			index += 1
	return sequence


s='AAAATGAATGGCTAACTTTTGATG'

print "original: %s" % s
for frame in xrange(3):
	print "frame %d:  %s" % (frame, mark_start_codons(s,frame))

Output:

code:

original: AAAATGAATGGCTAACTTTTGATG
frame 0:  AAA1TGAATGGCTAACTTTTG2TG
frame 1:  AAAATGA1TGGCTAACTTTTGATG
frame 2:  AAAATGAATGGCTAACTTTTGATG

This was a fun problem...I tried some stuff based on itertools at first, then a crazy regex, but in the end the easiest thing was making the string mutable and just walking through it.

This isn't making too much sense. I'm looking for a code with a simple iteration that can do this for ANY given sequence.

# ? Apr 16, 2011 22:23

posting smiling: Jun 22, 2008

FoiledAgain posted:

There's no winning condition. It plays until there is no more room, then outputs a list of Nones. Also you can "overwrite" symbols (which adds a new element to the game, I suppose).
(I'm still impressed though)

Funny you should mention that

code:

from sys import stdout

(lambda b,p,s,z,w:stdout.write(s(b))or z
(lambda f:(lambda i:stdout.write("Cat's"
" game.\n")if i==9 else b.__setitem__(*(
lambda p:(lambda m:(ord(m)-ord('a'),p))(
z(lambda f:lambda:(lambda k:k[0] if k and
'a'<=k[0]<'j'and k[0] in b else f())(
raw_input('Move for player %s: '%p)))())
)(p[i%2]))or stdout.write(s(b))or(lambda
w:f(i+1) if not w else stdout.write('Pl'
'ayer %s won!\n'%w))(z(lambda f:lambda i
:False if not i else reduce(lambda a,b:a
==b and a,[b[k]for k in i[0]])or f(i[1:]
))(w))))(0))([chr(ord('a')+i) for i in
range(9)],['X','O'],lambda r:'\n'.join(
' '.join(r[i:i+3])for i in range(0,7,3))
+'\n',lambda f:(lambda x:f(lambda*args:x
(x)(*args)))(lambda x:f(lambda*args:x(x)
(*args))),(lambda r:r+zip(*r)+[range(0,9
,4)]+[range(2,7,2)])([range(i,i+3)for i
in range(0,7,3)]))

Now does input validation and checks for winning conditions.

# ? Apr 16, 2011 23:32

Stabby McDamage: Dec 11, 2005; Doctor Rope

bakersk8r6301 posted:

This isn't making too much sense. I'm looking for a code with a simple iteration that can do this for ANY given sequence.

You're going to have to be more specific, because it does use iteration and can work for any given sequence.

Do you mean you want it to take input from the user? You could just have it call raw_input() where s is assigned or take a filename argument or whatever.

# ? Apr 16, 2011 23:53

bakersk8r6301: Mar 24, 2008

New problem: I'm trying to create a script that will add up the molecular weight of a given protein. I've made a dictionary with all the amino acids and their molecular weights. The question is, how do I add up the weight of a user-inputted protein? Which function or method should I use?
I have this so far:

code:

protein = raw_input("Enter a protein sequence: ")

MW = 0.0
weights = {'A':'89.093','G':'75.067', 'M':'149.211','S':'105.093',
           'C':'121.158','H':'155.155','N':'132.118','T':'119.119',
           'D':'133.103', 'I':'131.173', 'P':'115.131', 'V':'117.146',
           'E':'147.129', 'K':'146.188', 'Q':'146.145', 'W':'204.225',
           'F':'165.189', 'L':'131.173', 'R':'174.201', 'Y':'181.189'}

for x in protein:

Where do I go from here?
For example, if I enter the sequence "VLSPADKTNVKAAW" it should print out something like: MW = 1499.7
Help plzz!

# ? Apr 17, 2011 00:22

Scaevolus: Apr 16, 2007

bakersk8r6301 posted:

Where do I go from here?

If you know that there will only be valid letters (the weights are stored as strings, so convert each one with float()):
total = sum(float(weights[c]) for c in protein)

A bit more lax, ignores unknown characters and converts to uppercase first:
total = sum(float(weights.get(c, 0.0)) for c in protein.upper())
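One caveat: a plain sum of the table values for "VLSPADKTNVKAAW" comes to about 1733.9, not the 1499.7 quoted in the question. The gap is 13 x 18.015, which suggests the expected figure subtracts one water molecule per peptide bond. That is an inference from the arithmetic, not something stated in the thread, so treat the WATER constant and the function name below as assumptions. A Python 3 sketch with the weights stored as floats:

```python
WATER = 18.015  # assumed mass lost per peptide bond (one H2O)

WEIGHTS = {
    'A': 89.093, 'G': 75.067, 'M': 149.211, 'S': 105.093,
    'C': 121.158, 'H': 155.155, 'N': 132.118, 'T': 119.119,
    'D': 133.103, 'I': 131.173, 'P': 115.131, 'V': 117.146,
    'E': 147.129, 'K': 146.188, 'Q': 146.145, 'W': 204.225,
    'F': 165.189, 'L': 131.173, 'R': 174.201, 'Y': 181.189,
}


def molecular_weight(protein):
    """Sum the residue weights, minus one water per peptide bond."""
    residues = protein.upper()
    total = sum(WEIGHTS[c] for c in residues)  # KeyError on unknown letters
    bonds = max(len(residues) - 1, 0)
    return total - bonds * WATER


mw = molecular_weight("VLSPADKTNVKAAW")
print("MW = %.1f" % mw)  # prints "MW = 1499.7"
```

If the course only wants the raw sum, drop the water correction and use the one-liners above.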

Scaevolus fucked around with this message at 00:27 on Apr 17, 2011

# ? Apr 17, 2011 00:25

_aaron: Jul 24, 2007; The underscore is silent.

I'm trying to install the Polygon package (http://polygon.origo.ethz.ch/download) for Python 3, and I'm running into an issue that's making me tear my hair out.

There hasn't been a binary released for the Python 3 version of the package (I'm running Windows 7), so I'm trying to build it myself. I'm running "python setup.py install" from the command line, and I'm getting an error in cygwinccompiler.py (part of the Python 3.2 release, in the Python32\lib\distutils\ folder). Line 124 in this file is trying to determine the version of "ld" (whatever that is), and it's giving me the following error:

TypeError: Unorderable types: NoneType() >= str()

I assume this means it can't compare Python None to a string. Looking at the cygwinccompiler module, it seems that the only way the version that it's comparing can be None is if it can't find the ld.exe file. I know that file is on my machine; it's at C:\MinGW\mingw32\bin. Running that file in the command line with the -v option tells me that it's version 2.21. So I hard-coded that in the cygwinccompiler.py file (stupid, but I'm desperate) only to have it fail when trying to check the version of gcc.

So it seems like something is fundamentally messed up with either my Python or MinGW installation which is preventing me from building this package. I've never built any other package like this before, and this is getting fairly frustrating. Any ideas on what I can try next?

edit: OK, I needed to restart cmd. Jesus christ I need to stop drinking so heavily while I do this. Also, I could only get v3.0.3 of Polygon installed, something might be wrong with 3.0.4

_aaron fucked around with this message at 01:34 on Apr 17, 2011

# ? Apr 17, 2011 01:26

tripwire: Nov 19, 2004; _{ghost flow}

Here's a question about Python and memory usage/garbage collection.

From reading the documentation, I'm aware that Python's memory management is based on reference counting, so ideally once there are no further references to an object it gets cleaned up automatically, although objects caught in reference cycles aren't guaranteed to be collected.
I'm also aware that for various reasons, CPython often cannot return all of the memory it allocates to the operating system. For example, I've read that the sys module always holds a reference to the last traceback, with a big fat copy of all its locals and globals. There are other optimizations, like lists of recently used floats and ints, which I think will never be returned to the OS.

The gc module will actually tell you which items are sticking around and which ones are waiting to be collected with gc.get_objects() and gc.garbage.

I'm trying to write a somewhat memory efficient text processing script- I have a class which takes a file object and yields consecutive entries parsed from that file (each entry is an instance of another class which uses properties to provide getters/setters on the data within).

It seemed sensible to pass in a file object in the constructor, and have the class implement iterable (i.e. has next method and __iter__ returns self). What I'm noticing though is that despite my thinking that this approach would only need to keep the current snippet of data in memory, if I try to stream a few different files through it, the memory usage just grows and grows and grows until it dies from a MemoryError exception.

Looking at the output of gc.get_objects(), the problem seems to be that every single Entry instance adds a new set of getter/setter functions, so even if the entries are discarded in a loop their functions stick around soaking up all the RAM. Is this a quirk in Python's garbage collection, or are these property functions really referred to by something somewhere?

# ? Apr 17, 2011 06:39

king_kilr: May 25, 2007

a) Not Python, *C*Python, b) I can't tell what you're describing, perhaps pasting some code would clarify.

# ? Apr 17, 2011 06:45

Jam2: Jan 15, 2008; With Energy For Mayhem

Looking for a Python library for image processing. More specifically, I need to import an image so I can perform an RGB to HSI conversion. I'm rewriting some Mathematica code in Python to start learning. MMCA does a lot of this stuff internally.

What's a good way to go about finding libraries? I googled and found this one:

http://www.pythonware.com/products/pil/

However, I can't tell if this is the *best* one available, except for the fact that it's the top search result.

# ? Apr 17, 2011 07:05

tripwire: Nov 19, 2004; _{ghost flow}

I don't have the code on this computer but something like this lovely example.

code:

class Entry(object):
    def __init__(self, line):
        self._field1 = line[0:24]
        self._field2 = line[24:44]
        self._field3 = line[44:66]
        self._field4 = line[66:].rstrip()
    
    @property
    def field1(self):
        return self._field1
    
    @field1.setter
    def field1(self, value):
        self._field1 = value
    
    @property
    def field2(self):
        return self._field2
    
    @field2.setter
    def field2(self, value):
        self._field2 = value
    
    @property
    def field3(self):
        return self._field3
    
    @field3.setter
    def field3(self, value):
        self._field3 = value
    
    @property
    def field4(self):
        return self._field4
    
    @field4.setter
    def field4(self, value):
        self._field4 = value

class Header(object):
    def __init__(self, line):
        self._field1 = line[0:44]
        self._field2 = line[44:]
    
    @property
    def field1(self):
        return self._field1
    
    @field1.setter
    def field1(self, value):
        self._field1 = value
    
    @property
    def field2(self):
        return self._field2
    
    @field2.setter
    def field2(self, value):
        self._field2 = value
    

class MyCrazyFixedWidthFileFormat(object):
    def __init__(self, file_object):
        self.file_object = file_object
        self.header = Header(file_object.read(81))
        
        def next_iteration(self):
            line = self.file_object.read(81)
            if line.startswith("some value indicating trailer"):
                raise StopIteration
            return Entry(line)
        
        def first_iteration(self):
            self.next = next_iteration
            return self.header
            
        self.next = first_iteration
        
    def __iter__(self):
        return self
    
    def close(self):
        self.file_object.close()

I'll open a large file, feed it into that class and iterate through it looking for entries of interest and skipping whatever I don't want; however, every time I make an instance of that class with a new file, I'll find more and more of those getters and setters sitting around, impervious to any collection.

# ? Apr 17, 2011 07:10

DirtyDiaperMask: Aug 11, 2003; by Ozmaugh

tripwire posted:

Looking at the output of gc.get_objects(), the problem seems to be that every single Entry instance adds a new set of getter/setter functions, so even if the entries are discarded in a loop their functions stick around soaking up all the RAM. Is this a quirk in Python's garbage collection, or are these property functions really referred to by something somewhere?

Call get_referrers() on the returned objects of interest and see what's referencing them? Turn on debugging and see what's left in gc.garbage after a collection? Do any instances of Entry or Header or the main class stick uncollected, or just the property functions?
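A rough Python 3 sketch of that kind of inspection, using only the gc module. The Entry class here is a throwaway stand-in rather than the poster's real class, and count_instances and referrer_types are hypothetical helper names.

```python
import gc


class Entry(object):
    def __init__(self, line):
        self.line = line


def count_instances(cls):
    # gc.get_objects() lists every container object the collector tracks.
    return sum(1 for obj in gc.get_objects() if type(obj) is cls)


def referrer_types(obj):
    # Names of the types currently holding a reference to obj.
    return sorted({type(r).__name__ for r in gc.get_referrers(obj)})


entries = [Entry("line %d" % i) for i in range(3)]
before = count_instances(Entry)
kinds = referrer_types(entries[0])

del entries
gc.collect()
after = count_instances(Entry)

print("live before:", before)
print("referrers:", kinds)
print("live after:", after)
```

If the count does not fall back after the last reference should be gone, the referrer list is where to look. Calling gc.set_debug(gc.DEBUG_SAVEALL) before a collection keeps everything the collector finds in gc.garbage for inspection.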

DirtyDiaperMask fucked around with this message at 07:13 on Apr 17, 2011

# ? Apr 17, 2011 07:11

tripwire: Nov 19, 2004; _{ghost flow}

Jam2 posted:

Looking for a Python library for image processing. More specifically, I need to import an image so I can perform an RGB to HSI conversion. I'm rewriting some Mathematica code in Python to start learning. MMCA does a lot of this stuff internally.

What's a good way to go about finding libraries? I googled and found this one:

http://www.pythonware.com/products/pil/

However, I can't tell if this is the *best* one available, except for the fact that it's the top search result.

PIL is as good as you're going to get for things like loading and saving images in a variety of formats, although its support for things like animated GIFs is terrible, as I've posted in this thread before.

I've done a LOT of image processing stuff with it and found it fine; you'll find that if you take a PIL image and throw it into numpy/scipy however, you will have all the power you are used to from matlab at your disposal.
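For the RGB to HSI step itself, here is a per-pixel sketch in plain Python 3 using the common textbook formulation (intensity as the channel mean, saturation from the minimum channel, hue from an arccos). It assumes channel values already scaled to the range 0 to 1, so 8-bit values loaded from an image would need dividing by 255 first. The function name is made up, and with numpy the same arithmetic could be applied to whole arrays at once.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert r, g, b in [0, 1] to (hue in degrees, saturation, intensity)."""
    intensity = (r + g + b) / 3.0
    if intensity == 0:
        return 0.0, 0.0, 0.0  # pure black
    saturation = 1.0 - min(r, g, b) / intensity
    numerator = 0.5 * ((r - g) + (r - b))
    # Mathematically non-negative; max() guards against rounding error.
    denominator = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if denominator == 0:
        hue = 0.0  # gray: hue is undefined, report 0 by convention
    else:
        ratio = max(-1.0, min(1.0, numerator / denominator))
        hue = math.degrees(math.acos(ratio))
        if b > g:
            hue = 360.0 - hue
    return hue, saturation, intensity


for pixel in [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]:
    print(pixel, rgb_to_hsi(*pixel))
```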

# ? Apr 17, 2011 07:14

Jam2: Jan 15, 2008; With Energy For Mayhem

tripwire posted:

PIL is as good as you're going to get for things like loading and saving images in a variety of formats, although its support for things like animated GIFs is terrible, as I've posted in this thread before.

I've done a LOT of image processing stuff with it and found it fine; you'll find that if you take a PIL image and throw it into numpy/scipy however, you will have all the power you are used to from matlab at your disposal.

Numpy/Scipy is exactly what I need. Thanks.

How do these add-ons work exactly? If I install them locally for development, what happens when I deploy my code onto a server to run on the web? Will I need to have these add-ons installed there as well? Is there a way to take only the parts of the add-ons I actually use once deployed? I guess I'm just not sure how the guts of the system fit together.

# ? Apr 17, 2011 08:28

Captain Capacitor: Jan 21, 2008; The code you say?

Jam2 posted:

Numpy/Scipy is exactly what I need. Thanks.

How do these add-ons work exactly? If I install them locally for development, what happens when I deploy my code onto a server to run on the web? Will I need to have these add-ons installed there as well? Is there a way to take only the parts of the add-ons I actually use once deployed? I guess I'm just not sure how the guts of the system fit together.

1. Download this
2. python script-you-just-downloaded.py <-- Can be done on both your machine and the server
3. `source bin/activate` And you should be all set

You'll need the development version/files for Python and the necessary build tools. Those depend on what platform you're using.

# ? Apr 17, 2011 14:36

Jam2: Jan 15, 2008; With Energy For Mayhem

Captain Capacitor posted:

1. Download this
2. python script-you-just-downloaded.py <-- Can be done on both your machine and the server
3. `source bin/activate` And you should be all set

You'll need the development version/files for Python and the necessary build tools. Those depend on what platform you're using.

What does this do, sync libraries/modules from local machine to and from server?

# ? Apr 17, 2011 15:38

Captain Capacitor: Jan 21, 2008; The code you say?

Jam2 posted:

What does this do, sync libraries/modules from local machine to and from server?

Not necessarily. It creates an isolated, self contained Python environment separate from the OS one. It just makes it trivial to keep them in sync.

Client Machine:
1. Install packages as necessary.
2. `pip freeze > reqs.txt`
3. Upload reqs.txt to Server

Server:
1. `pip install -r reqs.txt`

"reqs.txt" becomes your de-facto environment. If you ever need to share the project with anyone else, you can just send them that and they'll get the same package versions as you.

Edit:
Or if you're really comfortable with it all, use Fabric to automate everything.

Captain Capacitor fucked around with this message at 16:56 on Apr 17, 2011

# ? Apr 17, 2011 16:54

king_kilr: May 25, 2007

tripwire posted:

I don't have the code on this computer but something like this lovely example.

code:

class Entry(object):
    def __init__(self, line):
        self._field1 = line[0:24]
        self._field2 = line[24:44]
        self._field3 = line[44:66]
        self._field4 = line[66:].rstrip()
    
    @property
    def field1(self):
        return self._field1
    
    @field1.setter
    def field1(self, value):
        self._field1 = value
    
    @property
    def field2(self):
        return self._field2
    
    @field2.setter
    def field2(self, value):
        self._field2 = value
    
    @property
    def field3(self):
        return self._field3
    
    @field3.setter
    def field3(self, value):
        self._field3 = value
    
    @property
    def field4(self):
        return self._field4
    
    @field4.setter
    def field4(self, value):
        self._field4 = value

class Header(object):
    def __init__(self, line):
        self._field1 = line[0:44]
        self._field2 = line[44:]
    
    @property
    def field1(self):
        return self._field1
    
    @field1.setter
    def field1(self, value):
        self._field1 = value
    
    @property
    def field2(self):
        return self._field2
    
    @field2.setter
    def field2(self, value):
        self._field2 = value
    

class MyCrazyFixedWidthFileFormat(object):
    def __init__(self, file_object):
        self.file_object = file_object
        self.header = Header(file_object.read(81))
        
        def next_iteration(self):
            line = self.file_object.read(81)
            if line.startswith("some value indicating trailer"):
                raise StopIteration
            return Entry(line)
        
        def first_iteration(self):
            self.next = next_iteration
            return self.header
            
        self.next = first_iteration
        
    def __iter__(self):
        return self
    
    def close(self):
        self.file_object.close()

Well of course, you're creating a new function for each MyCrazyFixedWidthFileFormat instance. Why not write it like this, like a normal person?

code:

class MyCrazyFixedWidthFileFormat(object):
    def __init__(self, file_object):
        self.file_object = file_object
        self.header = Header(file_object.read(81))
        self.first = True

    def __iter__(self):
        return self
    
    def next(self):
        if self.first:
            self.first = False
            return self.header
        line = self.file_object.read(81)
        # Stop at the trailer record, or at EOF if the trailer is missing.
        if not line or line.startswith("some value indicating trailer"):
            raise StopIteration
        return Entry(line)


    def close(self):
        self.file_object.close()
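To make the "new function per instance" point concrete, here is a small sketch. `PerInstance` and `PerClass` are hypothetical stand-ins, not the real file-format class; they only show what happens when a `def` runs inside `__init__` versus once in the class body.

```python
class PerInstance(object):
    """Hypothetical stand-in: defines its iteration step inside __init__."""

    def __init__(self):
        def step(self):
            return "entry"
        # A plain function stored on the instance. It is not bound, so it
        # would have to be called as obj.step(obj).
        self.step = step


class PerClass(object):
    """Hypothetical stand-in: defines the step once, on the class."""

    def step(self):
        return "entry"


a, b = PerInstance(), PerInstance()
print(a.step is b.step)                    # False: one function object each
print(a.step.__code__ is b.step.__code__)  # True: the compiled code is shared
print("step" in vars(PerClass()))          # False: nothing stored per instance
```

So the per-instance cost is a small function object each time, not a fresh copy of the code, which is why it only matters if you create a lot of instances.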

# ? Apr 17, 2011 18:46

tripwire: Nov 19, 2004; _{ghost flow}

king_kilr posted:

Well of course, you're creating a new function for each MyCrazyFixedWidthFileFormat instance. Why not write it like this, like a normal person?

code:

class MyCrazyFixedWidthFileFormat(object):
    def __init__(self, file_object):
        self.file_object = file_object
        self.header = Header(file_object.read(81))
        self.first = True

    def __iter__(self):
        return self
    
    def next(self):
        if self.first:
            self.first = False
            return self.header
        line = self.file_object.read(81)
        # Stop at the trailer record, or at EOF if the trailer is missing.
        if not line or line.startswith("some value indicating trailer"):
            raise StopIteration
        return Entry(line)


    def close(self):
        self.file_object.close()

Sorry, I guess I didn't explain myself clearly enough. I'm only dealing with about 4-5 of those FixedWidthFileFormat instances; 4-5 functions is what, a few KB of memory? The problem is that if those files have a lot of entries, all of the getters and setters on the entries which are no longer referenced still persist. If I create a new instance of FixedWidthFileFormat using the same source file and watch the memory as it goes through the file, I can see that it always returns to roughly the same level; seemingly the entries are being cached somewhere/somehow. If I make an instance which goes through a new file, then it seems to create all these new orphans, and so on for every new source file.

tripwire fucked around with this message at 21:13 on Apr 17, 2011

# ? Apr 17, 2011 21:09

king_kilr: May 25, 2007

tripwire posted:

Sorry, I guess I didn't explain myself clearly enough. I'm only dealing with about 4-5 of those FixedWidthFileFormat instances; 4-5 functions is what, a few KB of memory? The problem is that if those files have a lot of entries, all of the getters and setters on the entries which are no longer referenced still persist. If I create a new instance of FixedWidthFileFormat using the same source file and watch the memory as it goes through the file, I can see that it always returns to roughly the same level; seemingly the entries are being cached somewhere/somehow. If I make an instance which goes through a new file, then it seems to create all these new orphans, and so on for every new source file.

No, you get one getter/setter per class, not per instance. If you're seeing an effective "leak" in your code, something else is going on. Post more, preferably in the form of a minimal reduced test case.

A handful of functions is probably less than a KB.
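If you want to check the per-class claim yourself, something like this would do. It's a cut-down sketch with a one-field `Entry` standing in for the class quoted above, and the weakref part assumes CPython, where an object is freed as soon as its last reference goes away.

```python
import gc
import weakref


class Entry(object):
    """Cut-down stand-in for the quoted Entry class (one field only)."""

    def __init__(self, line):
        self._field1 = line[0:24]

    @property
    def field1(self):
        return self._field1

    @field1.setter
    def field1(self, value):
        self._field1 = value


a = Entry("x" * 81)
b = Entry("y" * 81)

# The property object lives in the class __dict__, shared by every instance.
shared = Entry.__dict__["field1"]
print(isinstance(shared, property))  # True
print(sorted(vars(a)))               # ['_field1']: no getter/setter stored here

# Drop the last reference and see whether the instance is really gone.
ref = weakref.ref(a)
del a
gc.collect()
print(ref() is None)                 # True on CPython once nothing references it
```

If the weakref check comes back False in your real code, something is still holding a reference to the entries, and that's the thing to hunt down.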

# ? Apr 17, 2011 23:38
