JetsGuy
Sep 17, 2003

science + hockey
=
LASER SKATES

Suspicious Dish posted:

Inheriting from object in Python 2 gives you a new-style class. It's a fancy system, but if you're using Python 2, always inherit from object.

Ok, so I'm confused, when I declare a class, I'll do something like:

Python code:
class hurf:

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = str(c)

    def durf(self, d):
        do_something
What is wrong with doing that? :confused:

evensevenone
May 12, 2001
Glass is a solid.
Nothing. You're just getting an old-style class (which is generally fine). If you do class hurf(object): then you get a new-style class, which has some advantages (basically the new object model makes a lot more sense and is more powerful). And if you're using Python 3 you're always getting new-style classes.

There are a lot of resources out there comparing the two so I won't go into it all.

QuarkJets
Sep 8, 2008

Suspicious Dish posted:

So, you can't make objects threaded. Threads are like additional programs that run in the same memory space (in Linux, that's literally the only difference from forked processes — they get their own PID and everything)

The issue with this is that if one thread reads memory and then another thread writes memory, you end up with inconsistency issues between the two threads. Since Python's objects are refcounted, this can be disastrous: one object's refcount is at 1, Thread 1 reads it, Thread 2 increments it, and then Thread 1 decrements it to 0 and destroys the object, leaving Thread 2 with some freed memory. Boom.

To account for this, Python introduced something called the Global Interpreter Lock, or GIL, and a thread must take it when it wants to deal with Python objects. All Python threads wait on this lock until it becomes available, and the performance of the GIL is quite poor when combined with how kernel scheduling works.

Note that the GIL is only taken when something wants to interact with Python, so a thread can be waiting on I/O (read, poll), and another thread can come in and do its stuff. Some libraries like NumPy also release the GIL when some kinds of calculations go on, since they've been extremely careful to make all their special calculation objects thread-safe and such.

So, your directory system, it depends on what method you want to thread. If you want to thread the filling of each host object, perhaps through readdir (through os.walk or os.listwhatever), then each thread is going to be hanging in I/O, waiting to grab the lock, and it's unlikely you'll see much of a speedup, but try it out and profile. Computers are so complex that I can make a guess as to some performance thing based on everything I know, but I can't be 100% sure at all, and have been proven wrong before.

Note that you might have bugs when you start to thread! That's OK. Note that there still can be race conditions and thread bugs in your Python code — the GIL doesn't prevent those.

Some people use what's called a job thread model, where they have one thread dedicated to Graphics, one thread dedicated to disk I/O, one thread dedicated to Sound, etc. This is usually a good model for video games because most APIs used there, especially ones like OpenGL, are in no way thread-safe.

I tend to use what's called a worker thread model, where threads are spawned off to do a very specific task (read this PNG file from disk, decode the bytes), and where I share pretty much nothing except some starting task data, and the result data when everything is complete. It's effectively a separate process without the overhead of IPC.

Oh boy, that was probably a bit too much. If you read all this and have any questions on it, feel free to ask more questions. It's complex stuff.
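
One point from the quoted post, that the GIL does not protect your own code from race conditions, can be sketched roughly like this. It is an illustration rather than anything posted in the thread, and the names run and bump are made up. An increment is a read-modify-write, so several threads can interleave and lose updates unless something like threading.Lock guards it.

```python
import threading


def run(num_threads=4, bumps_per_thread=10000):
    """Increment one shared counter from several threads, guarded by a lock."""
    state = {"count": 0}
    lock = threading.Lock()

    def bump():
        for _ in range(bumps_per_thread):
            # The lock makes the read-modify-write below atomic with
            # respect to the other threads.
            with lock:
                state["count"] += 1

    threads = [threading.Thread(target=bump) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["count"]


total = run()
```

With the lock in place the total should always equal num_threads * bumps_per_thread. Removing the with lock: line may or may not show lost updates on a given run, which is part of what makes these bugs unpleasant to track down.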

This is all easy to understand, from the way you've described it. Thanks.

My project consists of a step in which specific files are read in and then a second step in which calculations are performed using data from those files, but the files and the data in the files are not being changed (IE results are stored in new variables and saved elsewhere on the disk, MySQL insertions are performed, etc). So I could set up a Queue for the read-in (which shouldn't benefit much due to a file I/O bottleneck) and another queue for the operations, and before queuing operations I would just need to wait for the read-in queue to empty, right? Alternatively, if the file I/O queue really doesn't improve speed at all, then I could just read everything in normally and then set up a queue for processing the data, so long as I'm careful about not changing the data that is being operated on.
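
A minimal sketch of the second plan (read everything in first, then feed a queue of read-only work to worker threads) might look like the following. The process function and the file names are placeholders, not anything from the actual project, and the import dance is only there so the same code runs on Python 2 and 3.

```python
import threading

try:
    import queue  # Python 3
except ImportError:
    import Queue as queue  # Python 2


def process(record):
    """Stand-in for the real calculation; it only reads its input."""
    return sum(record)


def worker(tasks, results):
    while True:
        item = tasks.get()
        if item is None:  # sentinel: no more work for this thread
            break
        name, record = item
        results.put((name, process(record)))


def run(data, num_workers=4):
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [threading.Thread(target=worker, args=(tasks, results))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for item in data.items():
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # Every worker has finished, so draining the result queue is safe here.
    collected = {}
    while not results.empty():
        name, value = results.get()
        collected[name] = value
    return collected


totals = run({"a.dat": (1, 2, 3), "b.dat": (4, 5, 6)})
```

Because the workers only read their inputs and hand back new values through the results queue, nothing shared is mutated. For CPU-bound pure-Python calculations the GIL will likely limit the speedup from threads, so it is worth profiling before committing to this design.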

I'm very comfortable with coding in Python and using Python, I've just never done Computer Science, which I see as "understanding what the code is making the computer actually do." I have a lot of experience in C++ as well, so I know all about memory management and reference passing and how memory management in Python works differently (IE basically Python uses something analogous to smart pointers: a chunk of memory can only be cleaned up after the number of active references to that chunk of memory becomes 0, all done automatically). I'm just really clueless in threading and multiprocessing, and I'm not well-versed in the "guts" of Python, just the high-level stuff.

AKA there is a gap between what many of the hard science programs teach (a bare-bones introduction to scientific computing) and what is actually needed in the real world of hard science (actual computer science knowledge for real and efficient scientific computing), and individually overcoming this gap is what I've been trying to do for the last few years. This July I'll become eligible for free remote-learning university courses through my employer; I already have plans to get some intro-level CS courses under my belt in the hope of building a better understanding of what is going on under the hood.

Thern posted:

You can release the GIL? I need to investigate this further as I have something that is very I/O bound. And Multiprocessing is a bit of hack I feel.

According to this page the GIL is always unlocked when doing I/O. I don't know whether this helps you.

QuarkJets fucked around with this message at 20:58 on Mar 26, 2013

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
The relationship between "Computer Science" and "Programming" is really vague and everybody has some different understanding of what they mean. I just meant more low-level programming more than anything, but it seems you have a decent handle on that.

So, I'm going to step it up to a higher level. What you want isn't threading, but parallelism. Threading is a technology that can be used to implement parallelism. I don't know enough details about your problem (your problem statement is an extremely generic, common problem), but it sounds like there's no reason that you couldn't run ten programs on four or so different computers that crunch the numbers, and then you put them into a database at the end.

There's of course no reason that you couldn't just have one program and a bunch of threads, but that design makes it harder to distribute to different computers in the end. Or one computer and multiple processes, etc.

There's of course trade-offs to all of these designs, and understanding all of these trade-offs is something that you'll come to learn as an engineer, but it's important to think of threading as a technology that can solve lots of problems, parallelization being one of them, not a solution.

There is so much to explore in this topic, but to pick one example, try Googling for "MapReduce". We've expanded this topic enough so that this might be a better question for the Ask General Questions, or perhaps it's time for a new General Concurrency/Parallelism thread.

FoiledAgain
May 6, 2007

Emacs Headroom posted:

I would think about splitting off all the classes into their own files, maybe inside of a package. For instance you could have
code:
LanguageFuckery/
    __init__.py
    Segment.py
    Nobule.py
    GlottalFricative.py
    PBase/
        __init__.py
        Morphemes.py
        Phonemes.py
        Monotremes.py
Then you could put all this in your path, and in your main you would be importing stuff like
Python code:
import LanguageFuckery.Segment as Segment
from LanguageFuckery import PBase

my_monotremes = PBase.Monotremes.global_monotreme_list()
or whatever.

I tried playing around with this, and I guess I'm not understanding something because I'm still running into import problems. For instance, virtually all my modules need import itertools. I thought I could import that once in my main.py module, but if I do that all my other modules throw NameError at me. It's not an effective solution to do import itertools in every module, because then why split them apart in the first place. Everything is sitting in /Lib/site-packages/LanguageFuckery so the problem is not in locating the modules.

evensevenone
May 12, 2001
Glass is a solid.
You need to do it in every module you write, because each module has its own namespace and can't mess with the namespace of whatever imported it. This is actually a good thing, otherwise one module could cause bugs in another module.
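
To see the per-module namespace in action without creating files, here is a small sketch that builds two throwaway modules with types.ModuleType. The module names helper and fixed_helper and the function pairs are invented for the demonstration. The main script imports itertools, but the first stand-in module still cannot see it.

```python
import itertools  # imported here, in the "main" namespace only
import types

SOURCE = (
    "def pairs(items):\n"
    "    return list(itertools.combinations(items, 2))\n"
)

# A stand-in module whose body never imports itertools itself.
helper = types.ModuleType("helper")
exec(SOURCE, helper.__dict__)

try:
    helper.pairs("abc")
    raised = False
except NameError:
    # The lookup happens in helper's own namespace, not in ours.
    raised = True

# The same body, but with its own import at the top.
fixed = types.ModuleType("fixed_helper")
exec("import itertools\n" + SOURCE, fixed.__dict__)
result = fixed.pairs("abc")
```

Repeating import itertools in each file is cheap: after the first import the module is cached in sys.modules, so later imports just bind the name again.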

Also, you shouldn't be putting your own code in /Lib/site-packages, or need to.

FoiledAgain
May 6, 2007

evensevenone posted:

You need to do it in every module you write, because each module has its own namespace and can't mess with the namespace of whatever imported it. This is actually a good thing, otherwise one module could cause bugs in another module.

I have a pretty naive understanding of how importing works, so let me ask this:
if module A imports module B and C, and module B needs things from module C, will I always have to write import C in module B? In other words, can B access stuff that A imported by virtue of it having been imported into A, or does it have to import that stuff itself? (I hope this makes sense.)

edit: my original question for context.

quote:

Also, you shouldn't be putting your own code in /Lib/site-packages, or need to.

I did some googling, and came across some forum/blog posts that suggested this. Is it a bad idea stylistically, or is there something more serious that could go wrong? (I'd actually be glad to know this is a bad idea since I found it really inconvenient)

FoiledAgain fucked around with this message at 22:47 on Mar 26, 2013

Emacs Headroom
Aug 2, 2003

FoiledAgain posted:

I did some googling, and came across some forum/blog posts that suggested this. Is it a bad idea stylistically, or is there something more serious that could go wrong? (I'd actually be glad to know this is a bad idea since I found it really inconvenient)

Add the directory with all your work to your PYTHONPATH (or to sys.path) when you're working on it. When it's mature and ready to distribute / put on github, you can use setuptools or whatever to make a package that will go into site-packages.

evensevenone
May 12, 2001
Glass is a solid.

FoiledAgain posted:

I have a pretty naive understanding of how importing works, so let me ask this:
if module A imports module B and C, and module B needs things from module C, will I always have to write import C in module B? In other words, can B access stuff that A imported by virtue of it having been imported into A, or does it have to import that stuff itself? (I hope this makes sense.)

edit: my original question for context.


I did some googling, and came across some forum/blog posts that suggested this. Is it a bad idea stylistically, or is there something more serious that could go wrong? (I'd actually be glad to know this is a bad idea since I found it really inconvenient)


If module B wants to use C directly, you need an import C in B.

Now, you could import B and C in A, and make an object in A that is of a class from C, and pass that object to B, and B could use it. But you couldn't make an object in B that is of a class in C, without importing C.

If you think about the full names of the objects as they are created/instantiated/passed it makes sense.

FoiledAgain
May 6, 2007

evensevenone posted:

Now, you could import B and C in A, and make an object in A that is of a class from C, and pass that object to B, and B could use it. But you couldn't make an object in B that is of a class in C, without importing C.

This is a really helpful explanation. Thank you!

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

evensevenone posted:

Now, you could import B and C in A, and make an object in A that is of a class from C, and pass that object to B, and B could use it. But you couldn't make an object in B that is of a class in C, without importing C.

Well, sure you can. If A passes an object of C to B, then you can use type(obj_C)() in B.

Importing is about names, not objects or anything else like that. If you want to access the value at some name in another module, you have to import it.
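
A tiny sketch of that trick, with everything squeezed into one file for brevity. Pretend Card lives in module C and make_another lives in module B, which never imports C; both names are made up for the example.

```python
class Card(object):  # pretend this class is defined in module C
    def __init__(self, rank="Ace"):
        self.rank = rank


def make_another(obj):  # pretend this function is defined in module B
    # B never names Card; it asks the object for its class and calls that.
    return type(obj)()


original = Card("King")
fresh = make_another(original)
```

The new object is a Card built with default arguments, so this only works when the class can be constructed without parameters. It shows why importing is about names: B can reach the class through the object it was handed even though the name Card is not bound in B.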

evensevenone
May 12, 2001
Glass is a solid.
That's true. Although, you'd have to be an rear end in a top hat.

DoggesAndCattes
Aug 2, 2007

spankweasel posted:

Assigning a card from the deck to a player is akin to moving one element from a list to another.

# non-random version
deck = ['2H', '3H', '4H', '5H']
player = []
player.append(deck.pop(0))  # draw the top card
# now deck == ['3H', '4H', '5H'] and player == ['2H']

Right?

So, you just need to manipulate the lists.

You could use random.choice to pick one of the cards. Then use deck.remove(choice) followed by player.append(choice).
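
Spelled out, the random version described above could look like this. The function name draw_card is borrowed from the pseudo function in the quote; the rest is a sketch, not anyone's posted solution.

```python
import random


def draw_card(deck, hand):
    """Move one randomly chosen card from deck to hand and return it."""
    card = random.choice(deck)
    deck.remove(card)
    hand.append(card)
    return card


deck = ['2H', '3H', '4H', '5H']
player = []
drawn = draw_card(deck, player)
```

Both lists are modified in place, so the caller does not need to reassign anything. Calling this on an empty deck raises IndexError from random.choice, which is one place to add the error checking.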

evilentity posted:

I was bored so I did this for you. Try to use this as guidelines for your own work and dont just copy verbatim. At least try to understand why and how it does things. Ive commented things that I thought might be hard. There is no error checking so you could add that.

Python code:
import random

DECK_ID = 0
PLAYER_ID = 1
COMP_ID = 2

SUITS = ("hearts", "diamonds", "spades", "clubs")
RANKS = ("Ace", "Two", "Three", "Four", "Five", "Six", "Seven",
            "Eight", "Nine", "Ten", "Jack", "Queen", "King")

def create_deck():
    deck = []
    for suit in SUITS:
        for rank in RANKS:
            deck.append([rank, suit, DECK_ID])
    return deck

def clear_deck(deck):
    # change id in card to deck
    for card in deck:
        card[2] = DECK_ID
   
def print_deck(deck):
    # print simple header
    print ' ID  CARD               LOCATION'
    for i, card in enumerate(deck):
        # create string with cards name
        card_name = card[0] + ' of ' + card[1] 
        # create columns, :>3 pads string to len 3
        print '{id:>3}  {card:<17}  {deck}'.format(id=i, card=card_name, deck=card[2])
    # footer
    print ' ID  CARD               LOCATION'

def new_hand(deck, id):
    # create list with cards in deck that are not in other hands
    available_cards = [x for x in deck if x[2] == DECK_ID]
    # get 5 random cards from them
    hand = random.sample(available_cards, 5)
    # cards in hand are references to cards in deck
    # so changing them affects deck as well
    for card in hand:
        card[2] = id
    return hand

def print_hand(hand):
    for card in hand:
        print card
    
def main():
    deck = create_deck()
    player_hand = new_hand(deck, PLAYER_ID)
    print_hand(player_hand)
    comp_hand = new_hand(deck, COMP_ID)
    print_hand(comp_hand)
    
    print_deck(deck)
    clear_deck(deck)
    print_deck(deck)
        
if __name__ == '__main__':
    main()
Feel free to improve this.

Popper posted:

This is basically how people are taught classes.
Think like this:

Python code:

class Deck(object):

    def __init__(self):
        self.suits = []  # suit names go here
        self.ranks = []  # rank names go here
        self.cards = []

    def build(self):
        # Cards have three attributes: suit, rank, and the player who owns them.
        # Here they all belong to "Dealer".
        for suit in self.suits:
            for rank in self.ranks:
                self.cards.append(Card(suit, rank, "Dealer"))
etc...

Card should be a class too.

There are faster ways to do this with dicts and namedtuples but get a class implementation working first.
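
For reference, one way the namedtuple variant could go, as a rough sketch. The field order (rank, suit, owner) and the deal method are choices made for this example, not something Popper specified.

```python
from collections import namedtuple

SUITS = ("hearts", "diamonds", "spades", "clubs")
RANKS = ("Ace", "Two", "Three", "Four", "Five", "Six", "Seven",
         "Eight", "Nine", "Ten", "Jack", "Queen", "King")

# A lightweight, immutable record with named fields.
Card = namedtuple("Card", ["rank", "suit", "owner"])


class Deck(object):
    def __init__(self):
        self.cards = [Card(rank, suit, "Dealer")
                      for suit in SUITS for rank in RANKS]

    def deal(self, owner):
        """Remove the last card and return a copy owned by `owner`."""
        card = self.cards.pop()
        # namedtuples cannot be modified, so _replace builds a new one.
        return card._replace(owner=owner)


deck = Deck()
first = deck.deal("Player")
```

Because the cards are immutable, ownership cannot be changed by poking at a shared list entry the way the list-of-lists version does; every change of owner produces a new Card.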

Sorry that this was last page, but I just wanted to say thanks for your help and advice. I haven't looked at anything you guys posted because I've been so busy studying for a math test over the weekend, Spanish, and catching up on this computer sciency stuff. I already turned in what I have, but thanks for helping me out so I can get some inspiration and figure out how all this code stuff works. Thanks!

Popper
Nov 15, 2006

JetsGuy posted:

What is wrong with doing that? :confused:

If you're going to use inheritance in 2.x, you will at some point need to inherit from object.
Classes work fine without it but you can't use super().
At the end of the day I do it out of habit, but it's a good habit.
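
A short sketch of the super() point, using made-up class names Base and Child. The explicit two-argument form is used so the same code runs on 2.x and 3.x.

```python
class Base(object):  # inheriting from object makes this new-style on 2.x
    def __init__(self, a):
        self.a = a


class Child(Base):
    def __init__(self, a, b):
        # Delegate to the parent initializer instead of repeating it.
        super(Child, self).__init__(a)
        self.b = b


c = Child(1, 2)
```

On Python 2, dropping (object) from Base turns it into an old-style class and the super() call fails with a TypeError. On Python 3 every class is new-style, so it works either way.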

Dren
Jan 5, 2001

Pillbug
Anyone know a python 2.7 lib that can give me a file like object handle to the binary data in an mp3 file?

I want to md5sum it.

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug
Use the mmap module.

Python code:
#!/usr/bin/env python2
from __future__ import print_function
from argparse import ArgumentParser
from hashlib import sha256
from mmap import mmap, ACCESS_READ

def hash_file(filename):
    with open(filename, 'rb') as f:
        m = mmap(f.fileno(), length=0, access=ACCESS_READ)
        h = sha256(m).hexdigest()
        m.close()
        return h

if __name__ == '__main__':
    p = ArgumentParser()
    p.add_argument('filename', nargs='+')
    args = p.parse_args()
    for f in args.filename:
        print('{}  {}'.format(hash_file(f), f))
code:
$ ./hash.py test.mp3 
af96c138010186b1606473fbbf28221c761d43a00d0ee8559d4239ac1db45ee9  test.mp3
Substituting a broken hash function is left as an exercise.
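
If mapping the whole file is not wanted, a chunked read is a common alternative. This is a sketch, not a replacement for the code above: hash_file_chunked is a made-up name, the default algorithm is md5 only because that is what was asked for, and the temporary file stands in for a real mp3.

```python
import hashlib
import os
import tempfile


def hash_file_chunked(filename, algorithm="md5", chunk_size=64 * 1024):
    """Hash a file by feeding it to hashlib one chunk at a time."""
    h = hashlib.new(algorithm)
    with open(filename, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


# Demo on a throwaway file.
payload = b"not really an mp3" * 10000
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as tmp:
    tmp.write(payload)
digest = hash_file_chunked(demo_path)
os.remove(demo_path)
```

Like the mmap version, this hashes every byte in the file, metadata included, so two copies of the same audio with different tags will still produce different digests.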

The Gripper
Sep 14, 2004
i am winner

Dren posted:

Anyone know a python 2.7 lib that can give me a file like object handle to the binary data in an mp3 file?

I want to md5sum it.
You could try PyMedia, I think all you'd need to do is:
Python code:
import pymedia.audio.acodec as acodec
import pymedia.muxer as muxer
demuxer = muxer.Demuxer('mp3')
f = open("dongs.mp3", 'rb')
data = f.read()
frames = demuxer.parse(data)
That should get you just the actual compressed audio so you can compare with other mp3s (I'm assuming you want to find duplicate MP3s with differing metadata).

e; I guess this doesn't give you a file-like object though, but it can probably be shoehorned in to something like it.

The Gripper fucked around with this message at 15:21 on Mar 27, 2013

Dren
Jan 5, 2001

Pillbug
PS - If anyone wants to do some interesting reading about the GIL here's a python 3 ticket where some valiant efforts were made to try to replace the GIL with a scheduler. http://bugs.python.org/issue7946

One of the topics that comes up is that the current GIL really sucks at task switching when you use the threading model of I/O bound thread + computationally bound thread. That is, under the current GIL either the I/O bound thread or the computationally bound thread tends to hog execution instead of evenly splitting time. Solving this task switching problem would go a long way toward removing the stigma surrounding python threading. Unfortunately, the scheduler implementations in that ticket were rejected for 3.2.

Dren
Jan 5, 2001

Pillbug
Lysidas, thank you for your suggestion. Perhaps I wasn't clear but I actually wanted the solution The Gripper suggested. I want to compare file data (using a broken hash function) without taking the metadata into account.

The Gripper posted:

You could try PyMedia, I think all you'd need to do is:
Python code:
import pymedia.audio.acodec as acodec
import pymedia.muxer as muxer
demuxer = muxer.Demuxer('mp3')
f = open("dongs.mp3", 'rb')
data = f.read()
frames = demuxer.parse(data)
That should get you just the actual compressed audio so you can compare with other mp3s (I'm assuming you want to find duplicate MP3s with differing metadata).

e; I guess this doesn't give you a file-like object though, but it can probably be shoehorned in to something like it.

Thanks, I should be able to shoehorn that into something that the md5sum module will accept.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
If you have two different MP3 files that were gotten from two different sources, it's extremely unlikely that they'll compare equal, even after removing metadata. There are too many MP3 encoders out there, with too many encoder options and things to tweak. Even if you have two different uncompressed audio tracks ripped from the same album, minor variations in CDs, CD disc drives and rippers can produce wildly different results.

You probably want to build an audio fingerprint instead to compare similar audio tracks.

Dren
Jan 5, 2001

Pillbug

Suspicious Dish posted:

If you have two different MP3 files that were gotten from two different sources, it's extremely unlikely that they'll compare equal, even after removing metadata. There are too many MP3 encoders out there, with too many encoder options and things to tweak. Even if you have two different uncompressed audio tracks ripped from the same album, minor variations in CDs, CD disc drives and rippers can produce wildly different results.

You probably want to build an audio fingerprint instead to compare similar audio tracks.

Yeah, I'm doing this for someone else. I told him the same thing. However, he's concerned about honest duplicate data rather than stuff that fingerprints the same.

pymedia is loving me. Doesn't build with gcc 4.x and doesn't find any of the libraries it wants on ubuntu.

Emacs Headroom
Aug 2, 2003

Suspicious Dish posted:

You probably want to build an audio fingerprint instead to compare similar audio tracks.

Plus the audio fingerprinting will be more fun to write. You'll have to decide on a good feature space based on the spectrogram or something, which will be interesting (you could also use PCA to find features or employ some machine learning if you have enough examples).

Dren
Jan 5, 2001

Pillbug
In case anyone is interested, here is a "good enough" solution using audioread (https://github.com/sampsyo/audioread). You can get audioread with pip.

Python code:
import audioread
import hashlib

with audioread.audio_open("poop.mp3") as f:
  m = hashlib.md5()
  for buf in f:
    m.update(buf)
  print m.hexdigest()
The current APIs for doing this stuff are lacking. audioread got me sort of where I want to go. However, it hands me back a PCM version of the audio datastream. It's not what I wanted but it's close enough.

It seems most music file libraries are coded to do some specific task but there are no good general purpose ones. E.g. mutagen has support for python native metadata reading with a ton of container types but doesn't give you a handle to the music datastream. pyglet can play lots of different audio file types but that's all it can do, play them. There is no hook between the decode step and the play step. pymedia is a piece of a larger project called pycar "a media center for a car based PC( personal companion )". As such, it hasn't been updated since 2006, only builds on GCC 3.x, and fails to detect the existence of installed dev libraries it depends on.

If mutagen would mmap and return back the music datastream it'd be nearly a one stop shop. For the returned music to be generally useful there would need to be a hook to decode the datastream to a common format like PCM, the way audioread does.

I saw some fingerprinting libraries while I was poking around but since I'm not particularly interested in them I'm not gonna do a write up about them.

Dren
Jan 5, 2001

Pillbug
Here's a program to generate a pickled version of a dictionary that maps file paths to hashes of audio data.

http://pastebin.com/7kbARcgN

It uses coroutines. Also, you can interrupt it with ctrl+c then resume from where you left off the next time you execute.
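
The pastebin code is not reproduced here, so the following is only a guess at the general shape: a generator-based coroutine that receives (path, digest) pairs through send() and records them in a dictionary, which can then be pickled and reloaded to resume later. The names collector and store and the sample paths are invented.

```python
import pickle


def collector(store):
    """Coroutine: receive (path, digest) pairs via send() and record them."""
    while True:
        path, digest = yield
        store[path] = digest


store = {}
sink = collector(store)
next(sink)  # prime the coroutine so it is paused at the first yield
sink.send(("music/a.mp3", "0123abcd"))
sink.send(("music/b.mp3", "4567ef01"))
sink.close()

# Round-trip through pickle, as a resumable run would do between sessions.
restored = pickle.loads(pickle.dumps(store))
already_done = set(restored)  # paths that can be skipped next time
```

On restart, a producer would check already_done before hashing a file again, which is one plausible way to get the resume-after-ctrl+c behaviour described.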

edit: If anyone wants to get nuts hook up multiprocessing or multithreading to the coroutine stuff so that multiple files are processed at a time.

edit 2: I wrote a mt version. GIL is limiting me to 130%-225% cpu while processing with 4 producer threads and 1 consumer but that's better than 100%.

edit 3: fixed some bugs in the mt version and I'm up to 400%-450% cpu w/ 4 producers and 1 consumer

Dren fucked around with this message at 22:02 on Mar 27, 2013

JetsGuy
Sep 17, 2003

science + hockey
=
LASER SKATES
Man, I tried to install pyqt today, and easy_install bitched and moaned that there wasn't a setup script. This isn't a great start for my exploration... :/

EDIT:
And the source straight from riverbank is all screwy. You need to install sip first, and the README there says to use a file that doesn't exist.

EDIT2:
Figured out part of it, still getting fun errors out the rear end...

For the life of me, I can't figure out what the poo poo this is bitching about. The only thing google says is to have XTools with command line support, which I do...

quote:

clang++ -c -pipe -mmacosx-version-min=10.6 -fno-strict-aliasing -O2 -fPIC -Wall -W -DQT_DISABLE_DEPRECATED_BEFORE=0x040900 -DQT_NO_DEBUG -DQT_GUI_LIB -DQT_CORE_LIB -I/Users/JetsGuy/Qt5.0.1/5.0.1/clang_64/mkspecs/macx-clang -I. -I/Library/Frameworks/Python.framework/Versions/2.7/include/python2.7 -I../../QtCore -I. -I/Users/JetsGuy/Qt5.0.1/5.0.1/clang_64/include -I/Users/JetsGuy/Qt5.0.1/5.0.1/clang_64/include/QtGui -I/Users/JetsGuy/Qt5.0.1/5.0.1/clang_64/lib/QtGui.framework/Versions/5/Headers -I/Users/JetsGuy/Qt5.0.1/5.0.1/clang_64/include/QtCore -I/Users/JetsGuy/Qt5.0.1/5.0.1/clang_64/lib/QtCore.framework/Versions/5/Headers -I. -I/System/Library/Frameworks/OpenGL.framework/Versions/A/Headers -I/System/Library/Frameworks/AGL.framework/Headers -o qpycore_qabstracteventdispatcher.o qpycore_qabstracteventdispatcher.cpp
make[2]: clang++: No such file or directory
make[2]: *** [qpycore_qabstracteventdispatcher.o] Error 1
make[1]: *** [all] Error 2
make: *** [all] Error 2

JetsGuy fucked around with this message at 05:07 on Mar 28, 2013

Dren
Jan 5, 2001

Pillbug
The error you posted seems to indicate that you don't have the clang++ compiler installed, have a screwy makefile, or some combination of the two.

Try installing the clang++ compiler.

accipter
Sep 12, 2003
PySide has pre-compiled binaries for OS X.

Dominoes
Sep 20, 2007

Learning Qt with no programming experience is proving to be a PITA. The Zetcode tutorial has some great example code that I've been able to use and manipulate, but is fairly limited in what it demonstrates.

There are several other tutorials online, all of which seem to use very different syntax, and I can't get any of them to work. I set up Qt Designer and have been creating windows, then analyzing the code using pyuic, but can't get the GUI to display, even when trying wrapper styles from several tutorials. Can anyone explain how to get Qt Designer 5 code -> pyuic to run in Python 3? I'm not getting any errors, just an output of >>>.

What style do you recommend as the best way to learn Qt? I'm planning to stick with the Zetcode style, but am having some issues doing things not directly described in it. For example, how can I have two separate layout widgets in a main window? This seems dramatically more difficult than learning basic Python; it might be due to a lack of strong tutorials like Codecademy's.

Dominoes fucked around with this message at 04:29 on Mar 29, 2013

evensevenone
May 12, 2001
Glass is a solid.
I can't help you with Designer, but you can use QFrames to nest layouts.

wwb
Aug 17, 2004

wwb posted:

I'm starting to get into python a bit. I've done a few tutorials, learned the hard way, etc. But I'm running into an issue my google-fu fails at -- how does one properly structure a non-trivial python project.

For example, let's say you decided to roll your own RSS aggregator after Google told you to GFY. I come from a .NET background, given that I would structure it something like:

-- RSSReader.Core -> class library that provides most of the core functionality to the project, no UI just a .DLL
-- RSSReader.Core.Tests -> unit tests for said library
-- RSSReader.Web -> Web app dependent upon RSSReader.Core
-- RSSReader.Cli -> Command line app for cron jobs, etc.

Is this just insane to approach this way in Python? Or if it isn't how would I build this?

If it helps I'm using jetbrains PyCharm as my IDE.

Bumping this a bit as it got subsumed by a much more exciting conversation about multithreaded python.

Anyhow, any advice, guys?

Smarmy Coworker
May 10, 2008

by XyloJW
Why does packaging in Python have to be so weird :(

I am following instructions but stuff doesn't work. My package structure is:
code:
package/
| __init__.py
|
| subpackage1/
| | __init__.py
| | {files}
|
| subpackage2/
| | __init__.py
| | {files}
But for a file in subpackage1,
Python code:
import package.subpackage2.x
- ImportError: No module named package.subpackage2.x

from package.subpackage2 import x
- ImportError: No module named package.subpackage2

import subpackage2.x
- ImportError: No module named subpackage2.x

from subpackage2 import x
- ImportError: No module named subpackage2
how do I even do this


edit:
I used PyDev for Eclipse to convert the main package folder to a source folder because it wasn't for some reason and it works?? Doesn't make sense to me though.

edit again:
It only works in Eclipse :(

Smarmy Coworker fucked around with this message at 17:18 on Mar 29, 2013

Thern
Aug 12, 2006

Say Hello To My Little Friend
I'm giving this a shot because this confuses me in Python sometimes too.

code:
from package.subpackage2.x import ClassName
from package.subpackage2.x import *

OnceIWasAnOstrich
Jul 22, 2006

If the package/ folder isn't in your PYTHONPATH then the interpreter doesn't know how to find those. I don't think it will automatically search parent directories to find other packages; nothing necessarily indicates to Python that what you have is a subpackage. You can only import things from the same package or from something on PYTHONPATH. Presumably converting it to a source folder causes Eclipse to add the folder to your PYTHONPATH for whatever method it uses to run the script.
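
Here is a self-contained sketch of that explanation. It rebuilds the layout from the question in a temporary directory (the file user.py and the constant ANSWER are invented for the demo), then shows that once the directory containing package/ is on sys.path, a module in subpackage1 can import from subpackage2 by its full name.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
files = {
    "package/__init__.py": "",
    "package/subpackage1/__init__.py": "",
    "package/subpackage1/user.py":
        "from package.subpackage2 import x\nVALUE = x.ANSWER\n",
    "package/subpackage2/__init__.py": "",
    "package/subpackage2/x.py": "ANSWER = 42\n",
}
for rel, body in files.items():
    path = os.path.join(root, *rel.split("/"))
    folder = os.path.dirname(path)
    if not os.path.isdir(folder):
        os.makedirs(folder)
    with open(path, "w") as f:
        f.write(body)

# Put the directory that CONTAINS package/ on the search path.
sys.path.insert(0, root)
user = importlib.import_module("package.subpackage1.user")
```

Setting the PYTHONPATH environment variable to that same directory before launching Python has the equivalent effect without touching sys.path in code, which is presumably what Eclipse arranges behind the scenes.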

OnceIWasAnOstrich fucked around with this message at 17:59 on Mar 29, 2013

DoggesAndCattes
Aug 2, 2007

Hi, I'm trying the exercises in chapter 9 of How to Think Like a Computer Scientist. I'm going through this whole book that was recommended for the intro CompSci class that I'm currently taking.

Book: http://openbookproject.net/thinkcs/python/english2e/ch09.html

I have a few questions about some of the exercises.

Exercise #1
Write a loop that traverses:

code:
['spam!', 1, ['Brie', 'Roquefort', 'Pol le Veq'], [1, 2, 3]]
and prints the length of each element. What happens if you send an integer to len? Change 1 to 'one' and run your solution again.

This is what I wrote:
code:
testlist = ['spam!', 'one', ['Brie', 'Roquefort', 'Pol le Veq'], ['1', '2', '3']]
elem1 = testlist[2]
elem2 = testlist[3]

for index, value in enumerate(testlist):
    print len(testlist[index])

print "Now for the first nested list!"
for index, value in enumerate(elem1):
    print len(elem1[index])

print "Now for the second nested list!"
for index, value in enumerate(elem2):
    print len(elem2[index])
Is there a better way of doing this, or anything I should consider to make this simpler or more efficient while staying Pythonic?


EXERCISE #2
Open a file named ch09e02.py with the following content:

code:
#  Add your doctests here:
"""
"""

# Write your Python code here:


if __name__ == '__main__':
    import doctest
    doctest.testmod()
Add each of the following sets of doctests to the docstring at the top of the file and write Python code to make the doctests pass.

Here's one of the doctests it wants me to pass.
code:
"""
  >>> a_list[3]
  42
  >>> a_list[6]
  'Ni!'
  >>> len(a_list)
  8
"""
And this is the code I wrote.
code:
a_list = [0,0,0,42,0,0,'Ni!',0]

def test(a_list):
    """
        >>> a_list[3]
        42
        >>> a_list[6]
        'Ni!'
        >>> len(a_list)
        8
    """
    return a_list

if __name__ == '__main__':
    import doctest
    doctest.testmod()
I run it through the interpreter and I get this:

*** DocTestRunner.merge: '__main__.test' in both testers; summing outcomes.
*** DocTestRunner.merge: '__main__' in both testers; summing outcomes.

Now I'll go through the examples in the chapter listed above and get it to say that it's passed (at least I remember doing so about an hour ago), so why am I getting this now?

Question About a Piece of Code I wrote for an Exercise
code:
list1 = [1,2,3,4]
list2 = [5,6,7,8]

def add_vectors(u,b):
    index = 0
    list3 = [i for i in u]
    for index, value in enumerate(u):
        answer = u[index] + b[index]
        list3[index] = answer
        index += 1
    return list3

list3 = add_vectors(list1,list2)
print list3
Well, I wrote this to pass a doctest in one of the exercises, but I don't understand a piece of code. I actually copied it from another project I wrote earlier in this semester.
list3 = [i for i in u]
I don't know how to translate i for i in u into an English statement. I know what it does, but I don't know how to say it if I were having a conversation with someone. I get as far as "list3 is assigned the list with as many indices..." and then I can't really think of how to say it.

Thanks for your help!

QuarkJets
Sep 8, 2008

[i for i in u] will create a list that is just all of the elements in u. If u is list1, then list3 will be filled with the same values as list1 with that line. You don't appear to be using the value that enumerate gives you, so instead of keeping track of an index and incrementing it you could just:

code:
    list3 = [i for i in u]
    for i in xrange(len(u)):
        list3[i] = u[i]+b[i]
    return list3
Or the far more awesome one-line way:

code:
    return [u[i]+b[i] for i in xrange(len(u))]
This generates an index i that loops over the length of u. For each value of i, a new entry in the list is created that is equal to u[i]+b[i].

This isn't great, though, because if b is shorter than u then the code will bomb out with an IndexError.
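One way around the length problem is to pair the elements up with zip and check the lengths first. This is just a sketch of that idea, not the book's intended answer; raising ValueError on a mismatch is my own assumption about what you'd want to happen:

```python
def add_vectors(u, v):
    """Return the element-wise sum of two equal-length lists."""
    # Fail loudly on mismatched lengths instead of hitting an IndexError
    # (or silently truncating, which is what zip alone would do).
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return [a + b for a, b in zip(u, v)]


print(add_vectors([1, 2, 3, 4], [5, 6, 7, 8]))  # [6, 8, 10, 12]
```

No index bookkeeping at all, and it reads as "a plus b, for each pair a, b".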

Smarmy Coworker
May 10, 2008

by XyloJW

OnceIWasAnOstrich posted:

If the package/ folder isn't in your PYTHONPATH then the interpreter doesn't know how to find those. I don't think it will automatically search parent directories to find other packages; nothing indicates to Python that what you have is necessarily a subpackage. You can only import things in the same package or things on PYTHONPATH. Presumably converting it to a source folder causes Eclipse to add the folder to your PYTHONPATH for whatever method it uses to run the script.

why is Python so weird??

What if I did something with site or added the main package location to sys.path in every module? I don't really want to dick around with PYTHONPATH right now because I'm only doing test stuff.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
It's actually pretty standard -- it's known as "the search path problem". Eclipse just sets up the search path for you. Don't mess around with sys.path or the site module. Those are bad ideas and will break your environment.

evilentity
Jun 25, 2010

ARACHNOTRON posted:

why is Python so weird??

What if I did something with site or added the main package location to sys.path in every module? I don't really want to dick around with PYTHONPATH right now because I'm only doing test stuff.

What is so weird about it? It searches the folder the file is in, packages under that folder, the path, and the libs. It doesn't magically know where your stuff is located. Try structuring your project in a different way. The path is hardly magic. If you really must, you can use this:
Python code:
import imp
foo = imp.load_source('module.name', '/path/to/file.py')
via SO

Mad Pino Rage posted:

Write a loop that traverses:

code:
['spam!', 1, ['Brie', 'Roquefort', 'Pol le Veq'], [1, 2, 3]]
and prints the length of each element. What happens if you send an integer to len? Change 1 to 'one' and run your solution again.

Python code:
testlist = ['spam!', 'one', ['Brie', 'Roquefort', 'Pol le Veq'], ['1', '2', '3']]

for item in testlist:
    print len(item)
...
Unless you need the index for something other than access, you don't need to enumerate. Are you sure you have to do the same for the inner lists?
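As for the "what happens if you send an integer to len" part: len raises a TypeError for objects that have no length. A small sketch with the original list from the exercise; safe_len is a made-up helper, only there to show the error being caught:

```python
testlist = ['spam!', 1, ['Brie', 'Roquefort', 'Pol le Veq'], [1, 2, 3]]


def safe_len(item):
    """Return len(item), or None if item has no length (e.g. an integer)."""
    try:
        return len(item)
    except TypeError:
        return None


lengths = [safe_len(item) for item in testlist]
print(lengths)  # [5, None, 3, 3]
```

Without the try/except, the loop stops at the integer 1 with a traceback, which is why the exercise then asks you to change it to 'one'.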

Mad Pino Rage posted:

Open a file named ch09e02.py with the following content:

I run it through the interpreter and I get this:

*** DocTestRunner.merge: '__main__.test' in both testers; summing outcomes.
*** DocTestRunner.merge: '__main__' in both testers; summing outcomes.

Now I'll go through the examples in the chapter listed above and get it to say that it's passed (at least I remember doing so about an hour ago), so why am I getting this now?
Did you forget -v after your script for verbose output?
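doctest prints nothing at all when every test passes, so silence is a pass. If you want to see the counts without relying on -v, here is a rough sketch that runs the exercise's doctests through doctest's runner classes directly. The name DOCTESTS and the label 'ch09e02' are just placeholders I picked:

```python
import doctest

a_list = [0, 0, 0, 42, 0, 0, 'Ni!', 0]

# The doctests from the exercise, kept as a plain string.
DOCTESTS = """
>>> a_list[3]
42
>>> a_list[6]
'Ni!'
>>> len(a_list)
8
"""

# Parse the string into a DocTest, giving it the names it needs to see.
parser = doctest.DocTestParser()
test = parser.get_doctest(DOCTESTS, {'a_list': a_list}, 'ch09e02', None, 0)

# Run it quietly; the result carries the failed and attempted counts.
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print("failed=%d attempted=%d" % (results.failed, results.attempted))
```

Zero failed out of three attempted means the doctests pass.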

Mad Pino Rage posted:

Well, I wrote this to pass a doctest in one of the exercises, but I don't understand a piece of code. I actually copied it from another project I wrote earlier in this semester.
list3 = [i for i in u]
I don't know how to translate i for i in u into an English statement. I know what it does, but I don't know how to say it if I were having a conversation with someone. I get as far as "list3 is assigned the list with as many indices..." and then I can't really think of how to say it.

Thanks for your help!
It's a list comprehension. [i for i in u] reads as "i, for each i in u", and the general shape is [(do something with) item for (each) item in list].
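If it helps to say it out loud, the comprehension is shorthand for an ordinary loop that appends to a new list. A quick sketch (the names copied and looped are just for illustration):

```python
u = [1, 2, 3, 4]

# The comprehension...
copied = [i for i in u]

# ...does the same thing as this explicit loop.
looped = []
for i in u:
    looped.append(i)

print(copied == looped)  # True
print(copied is u)       # False: it is a new list, not another name for u
```

So in English: "make a new list containing i, for each i in u".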

QuarkJets
Sep 8, 2008

ARACHNOTRON posted:

why is Python so weird??

What if I did something with site or added the main package location to sys.path in every module? I don't really want to dick around with PYTHONPATH right now because I'm only doing test stuff.

C and other languages work in a similar way; would you prefer it if Python/C/etc recursively searched your entire file system every time that you tried to import something?

Setting your PYTHONPATH only takes a second. Just point it to the directory where you're currently testing things. It would also be way faster than any of the workarounds that you're considering.
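To show what pointing PYTHONPATH at a directory buys you, here is a throwaway sketch. It builds a package/subpackage2/x.py layout like the one from the earlier post in a temp directory (the layout and the function name import_with_pythonpath are assumptions for the demo), then runs a child interpreter with PYTHONPATH set to the directory that contains package/:

```python
import os
import shutil
import subprocess
import sys
import tempfile


def import_with_pythonpath():
    """Build a throwaway package, then import it from a child interpreter."""
    root = tempfile.mkdtemp()
    try:
        pkg_dir = os.path.join(root, "package", "subpackage2")
        os.makedirs(pkg_dir)
        # __init__.py files mark the directories as packages.
        for directory in (os.path.join(root, "package"), pkg_dir):
            open(os.path.join(directory, "__init__.py"), "w").close()
        with open(os.path.join(pkg_dir, "x.py"), "w") as handle:
            handle.write("class ClassName(object):\n    pass\n")

        # Point PYTHONPATH at the directory that *contains* package/.
        env = dict(os.environ, PYTHONPATH=root)
        output = subprocess.check_output(
            [sys.executable, "-c",
             "from package.subpackage2.x import ClassName; "
             "print(ClassName.__name__)"],
            env=env,
        )
        return output.decode().strip()
    finally:
        # Clean up the temporary tree whether or not the import worked.
        shutil.rmtree(root)


print(import_with_pythonpath())
```

The child process never touches sys.path itself; the environment variable alone is enough for the import to resolve.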


Smarmy Coworker
May 10, 2008

by XyloJW
It would be nice if it searched back until it didn't find an init file. I don't know.

I have only done Java packaging and it kinda just works, especially with JARs :(
