QuarkJets
Sep 8, 2008

theguyoncouch posted:

Alright i'll mess with pylint instead

Python code:
class Employee:
    'base class called for employee'
    def __init__(self, lastname, firstname, eid,
                 clockin, clockout, flag, admin):
        self.lastname = lastname
        self.firstname = firstname
        self.eid = eid
        self.clockin = clockin
        self.clockout = clockout
        self.flag = flag
        self.admin = admin

You can still use defaults, like you had in your original example. For instance, if you wanted to make flag and admin optional parameters that default to False:

Python code:
class Employee:
    'base class called for employee'
    def __init__(self, lastname, firstname, eid,
                 clockin, clockout, flag=False, admin=False):
        self.lastname = lastname
        self.firstname = firstname
        self.eid = eid
        self.clockin = clockin
        self.clockout = clockout
        self.flag = flag
        self.admin = admin
Having 8 arguments in a class constructor is fine, just disable that feature. It's warning you because you can hypothetically design a function or class method to have fewer arguments. For instance, if some of the arguments were redundant or unused, then you'd want to remove those.
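
One way to switch the check off for just this class is an inline pylint pragma. This is a minimal sketch, assuming the message being raised is pylint's too-many-arguments check (the message name may differ between pylint versions); the sample values at the bottom are made up for illustration.

```python
class Employee:
    'base class called for employee'
    # A pragma on its own line applies to the rest of the enclosing block,
    # so only this class body is exempted from the check.
    # pylint: disable=too-many-arguments
    def __init__(self, lastname, firstname, eid,
                 clockin, clockout, flag=False, admin=False):
        self.lastname = lastname
        self.firstname = firstname
        self.eid = eid
        self.clockin = clockin
        self.clockout = clockout
        self.flag = flag
        self.admin = admin


# Hypothetical sample data: flag and admin fall back to their defaults.
worker = Employee('Doe', 'Jane', 7, '09:00', '17:00')
boss = Employee('Roe', 'Rick', 1, '08:00', '18:00', admin=True)
```

The same check can also be relaxed project-wide in the pylint configuration file instead of per class, which may be the better choice if many constructors look like this.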

QuarkJets fucked around with this message at 20:09 on Sep 11, 2014

grate deceiver
Jul 10, 2009

Just a funny av. Not a redtext or an own ok.
This is probably something super dumb, but can anyone tell me why this thing doesn't work:

code:
def pick_best_word(hand, words_points):
    best_word = ""
    best_score = 0
    for word in words_points:
        temp_hand = hand
        error = 0
        for letter in word:
            if letter in temp_hand and temp_hand[letter] > 0:
                temp_hand[letter] -= 1                 
            else:
                error += 1
        if error == 0 and words_points[word] > best_score: 
            best_word = word
            best_score = words_points[word]
    return best_word
Both arguments are dictionaries, hand containing letters paired with the number of times a letter can be used, and words_points containing words and score for each word. For a given collection of letters, this function should return the highest scoring word that can be composed of those letters (I know it's inefficient as hell, that's beside the point). When I run this function, it returns an empty string. The problem is, the function somehow modifies the original hand, despite the fact that all operations are done on temp_hand. When I print the contents of hand after the function runs, all values are zeroed out.

Why is it happening? Shouldn't temp_hand be refreshed on every pass of the loop? And why is the original hand modified in the first place?

coaxmetal
Oct 21, 2010

I flamed me own dad

grate deceiver posted:

This is probably something super dumb, but can anyone tell me why this thing doesn't work:

code:
def pick_best_word(hand, words_points):
    best_word = ""
    best_score = 0
    for word in words_points:
        temp_hand = hand
        error = 0
        for letter in word:
            if letter in temp_hand and temp_hand[letter] > 0:
                temp_hand[letter] -= 1                 
            else:
                error += 1
        if error == 0 and words_points[word] > best_score: 
            best_word = word
            best_score = words_points[word]
    return best_word
Both arguments are dictionaries, hand containing letters paired with the number of times a letter can be used, and words_points containing words and score for each word. For a given collection of letters, this function should return the highest scoring word that can be composed of those letters (I know it's inefficient as hell, that's beside the point). When I run this function, it returns an empty string. The problem is, the function somehow modifies the original hand, despite the fact that all operations are done on temp_hand. When I print the contents of hand after the function runs, all values are zeroed out.

Why is it happening? Shouldn't temp_hand be refreshed on every pass of the loop? And why is the original hand modified in the first place?

what you want is
code:
from copy import copy
temp_hand = copy(hand)
what you are doing isn't making a different object, just assigning the name 'temp_hand' to exactly the same dictionary. Not sure if that would solve the problem entirely, but that is why hand is being modified.

alternatively, instead of using copy, you could do something like

code:
temp_hand = {}
temp_hand.update(hand)
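
For reference, here is a sketch of the whole function with the copy made once per candidate word. The `playable` flag and the sample `hand` and `words_points` values are assumptions added for illustration, not part of the original post.

```python
from copy import copy


def pick_best_word(hand, words_points):
    best_word = ""
    best_score = 0
    for word in words_points:
        # A fresh shallow copy for every word, so the caller's dict is never touched.
        temp_hand = copy(hand)
        playable = True
        for letter in word:
            if temp_hand.get(letter, 0) > 0:
                temp_hand[letter] -= 1
            else:
                playable = False
                break
        if playable and words_points[word] > best_score:
            best_word = word
            best_score = words_points[word]
    return best_word


# Made-up sample data: 'abba' scores highest but needs two of each letter.
hand = {'a': 1, 'b': 1, 'c': 1}
words_points = {'cab': 7, 'abba': 10, 'ab': 4}
best = pick_best_word(hand, words_points)
```

A shallow copy is enough here because the values are plain integers; a dict whose values are themselves mutable would need copy.deepcopy instead.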

coaxmetal fucked around with this message at 21:19 on Sep 11, 2014

grate deceiver
Jul 10, 2009

Just a funny av. Not a redtext or an own ok.
Thanks, that was it.

So does it work this way with all data types or is it just a dictionary thing? Like, if I did list2 = list1, it wouldn't make a copy as well?

Edison was a dick
Apr 3, 2010

direct current :roboluv: only

coaxmetal posted:

code:
temp_hand = {}
temp_hand.update(hand)

Or
code:
temp_hand = dict(hand)

namaste friends
Sep 18, 2004

by Smythe
I'm trying to scrape CLI output and I've written a crude finite state machine to help me. I'm just not sure that I'm doing this right so any feedback is appreciated.

The input looks something like this:

code:
  Node: 1
  Interface: interface1
  NIC Name: eth0
  Abstract: 1
                 subnetA:poolA
  Addresses: 1
                    128.9.1.1

  Node: 1
  Interface: interface2
  NIC Name: eth1
  Abstract: 3
                 subnetB:poolB
                 subnetB:poolC
                 subnetB:poolD
  Addresses: 3
                    128.10.1.1
                    128.10.1.2
                    128.10.1.3
I'm taking this output and stuffing it into a dict. As you can see, the problem here is that the 'Abstract' and 'Addresses' attributes each have a variable number of elements.

My code looks something like this:

code:
class State():

    def __init__(self, state):
        self.state = state
        self.sub_state = None

    def proc_enter(self, in_state):
        print 'Entering State Change'
        self.state = in_state

    def proc_exit(self, in_state):
        print 'Exiting State Change'
        self.state = in_state

    def proc_sub_state_enter(self, sub_state):
        self.sub_state = sub_state

    def proc_sub_state_exit(self):
        self.sub_state = None

def process(in_array):

    machine = State(False)
    data = []

    for element in in_array:
        if re.search(r'Node:', element):
            data_element = defaultdict()
            data_element['Node'] = re.findall(r'Node:\s(\d)', element)[0]
            print 'Begin processing'
            machine.proc_enter(True)

        elif re.search(r'Abstract', element):
            print 'Entering Abstract'
            if re.findall(r'Abstract:\s(\d+)', element)[0] != 0:
                machine.proc_sub_state_enter('Abstract')

        elif machine.sub_state == 'Abstract' and re.search(r'\S+:\S+', element):
            print 'Processing Abstract'
            data_element.setdefault('Abstract', list()).append(element)

        elif machine.sub_state == 'Abstract' and re.search(r'Addresses', element):
            print 'Entering Addresses'
            if re.findall(r'Addresses:\s(\d+)', element)[0] != 0:
                machine.proc_sub_state_enter('Addresses')

        elif machine.sub_state == 'Addresses' and re.search(r'\b\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}\b', element):
            print 'Processing Addresses'
            data_element.setdefault('Addresses', list()).append(element)

        elif machine.state is True:
            <process the rest of the fields>

        if element == '':
            print 'End processing'
            data.append(data_element)
            machine.proc_exit(False)
            machine.proc_sub_state_exit()

    print json.dumps(data, indent=4)
My code works quite nicely. I think I'm going to precompile my regular expressions and just reuse them to speed up the code. Have I implemented the FSM right or is there room for improvement?

Thanks

coaxmetal
Oct 21, 2010

I flamed me own dad

Edison was a dick posted:

Or
code:
temp_hand = dict(hand)

oh yeah I forgot you could just pass a dict to dict. That's probably the best way to do it.

SurgicalOntologist
Jun 17, 2004

dict has a copy method. Someone needed to say it.

And yes, it works that way with all data types. When you assign something in Python, whatever object that name used to point to is unaffected. This is a really good explanation.
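
A short sketch of the same behaviour with lists, plus the dict copy method. The variable names are only for illustration.

```python
list1 = [1, 2, 3]
list2 = list1                  # a second name for the same list object
list2.append(4)                # visible through both names
same_object = list2 is list1

list3 = list(list1)            # a new list holding the same items (shallow copy)
list3.append(5)                # list1 does not see this

hand = {'a': 1}
temp_hand = hand.copy()        # dict's own copy method
temp_hand['a'] -= 1            # hand is left alone

count = 1
other = count
other += 1                     # rebinds `other` to a new int; `count` still refers to 1
```

Assignment never copies; it only binds a name. Whether a later change shows up through the other name depends on whether the object was mutated in place (append, item assignment) or the name was rebound to something else.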

Edison was a dick
Apr 3, 2010

direct current :roboluv: only

Cultural Imperial posted:

My code works quite nicely. I think I'm going to precompile my regular expressions and just reuse them to speed up the code.

It probably won't save you much time, since you're not using enough different regular expressions for any of them to fall out of the built-in regular expression cache, and the cache lookup is pretty quick.
However, if precompiling your regular expressions makes it more readable, go ahead.
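
If you do go that way, a rough sketch of what precompiled patterns could look like is below. The constant names are made up, and the dots in the address pattern are escaped here, which is a small change from the posted code (an unescaped dot matches any character).

```python
import re

# Compile once at module level and reuse inside the loop.
NODE_RE = re.compile(r'Node:\s(\d+)')
ABSTRACT_RE = re.compile(r'Abstract:\s(\d+)')
ADDRESS_RE = re.compile(r'\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b')

sample_lines = ['  Node: 1', '  Abstract: 3', '                    128.9.1.1']

node = NODE_RE.search(sample_lines[0]).group(1)
# Captured groups are strings, so convert before comparing with a number.
abstract_count = int(ABSTRACT_RE.search(sample_lines[1]).group(1))
address = ADDRESS_RE.search(sample_lines[2]).group(0)
```

Note that the captured count is a string: comparing it directly with 0 is always unequal, so the int conversion matters if the zero case should be skipped.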

You are using collections.defaultdict wrong though. You should be constructing it like

code:
data_element = defaultdict(list)
And using it like

code:
data_element['Abstract'].append(element)
If you're using setdefault, it doesn't even need to be a defaultdict, as regular dicts have that method.

In general, I prefer defaultdicts over setdefault, since you need to construct the default value you want to pass into it every time you call setdefault, since python doesn't lazily evaluate function parameters.
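
A small sketch of that difference, using a hypothetical factory function that records how often it is called:

```python
from collections import defaultdict

calls = []


def make_list():
    # Hypothetical factory that notes every invocation.
    calls.append(1)
    return []


# setdefault: the default argument is built on every call, needed or not.
plain = {}
plain.setdefault('Abstract', make_list()).append('subnetA:poolA')
plain.setdefault('Abstract', make_list()).append('subnetB:poolB')
setdefault_calls = len(calls)

# defaultdict: the factory runs only when a missing key is first accessed.
del calls[:]
lazy = defaultdict(make_list)
lazy['Abstract'].append('subnetA:poolA')
lazy['Abstract'].append('subnetB:poolB')
defaultdict_calls = len(calls)
```

Both dicts end up with the same contents; the only difference is how many throwaway default objects were constructed along the way.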


If you can help it, avoid parsing command output, as a lot of the time the author assumes that it's output designed to be consumed by humans, so you end up needing to rework your code every time you update your version of the tool.

If there's an official API then it's definitely worth working out how to use it in preference to parsing the output of the command line tool.

Edison was a dick fucked around with this message at 22:20 on Sep 11, 2014

Literally Elvis
Oct 21, 2013


Did you guys present at the APUG meeting last night? I could have sworn someone involved in Bokeh had posted here before, but I was too impressed/dumbstruck to have bothered with the usual stairs/protected ordeal.

Comrade Gritty
Sep 19, 2011

This Machine Kills Fascists
More pandas questions!

I have a dataframe which has a DatetimeIndex with an entry for each day. I want to resample this with:

code:
df.resample("W", how="sum")
This works fine except the start and end of my dataframe fall within the middle of what pandas is considering a week and thus the numbers are wrong/incomplete compared to the rest of the graph. Is there a sane way I can filter the dataframe (either before or after the resample) so it'll just exclude the leading/trailing partial weeks? I can figure it out manually and just hardcode the dates but i'd rather not need to update this script anytime I add more data.

BigRedDot
Mar 6, 2008

Literally Elvis posted:

Did you guys present at the APUG meeting last night? I could have sworn someone involved in Bokeh had posted here before, but I was too impressed/dumbstruck to have bothered with the usual stairs/protected ordeal.

Yes, that was me giving the talk. :)

SurgicalOntologist
Jun 17, 2004

Steampunk Hitler posted:

More pandas questions!

I have a dataframe which has a DatetimeIndex with an entry for each day. I want to resample this with:

code:
df.resample("W", how="sum")
This works fine except the start and end of my dataframe fall within the middle of what pandas is considering a week and thus the numbers are wrong/incomplete compared to the rest of the graph. Is there a sane way I can filter the dataframe (either before or after the resample) so it'll just exclude the leading/trailing partial weeks? I can figure it out manually and just hardcode the dates but i'd rather not need to update this script anytime I add more data.

This is just simple indexing.

code:
df.resample("W", how="sum").iloc[1:-1, :]
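
A self-contained sketch of the idea on made-up data. It uses the method-chaining spelling, resample("W").sum(), because later pandas releases dropped the how= keyword; the dates and the "value" column are assumptions chosen so that the first and last weeks are partial.

```python
import pandas as pd

# Daily data from a Wednesday to a Tuesday, so both edge weeks are incomplete.
idx = pd.date_range("2014-09-03", "2014-09-23", freq="D")
df = pd.DataFrame({"value": 1}, index=idx)

# "W" bins are weeks ending on Sunday.
weekly = df.resample("W").sum()

# Drop the first and last bins, which only cover part of a week here.
trimmed = weekly.iloc[1:-1]
```

Slicing with iloc[1:-1] always drops the edge bins, so if the data happens to start or end exactly on a week boundary it will discard a complete week too.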

Comrade Gritty
Sep 19, 2011

This Machine Kills Fascists

SurgicalOntologist posted:

This is just simple indexing.

code:
df.resample("W", how="sum").iloc[1:-1, :]

A durr, I'm still coming to grips with all the different pandas stuff :/

On the plus side, charts:

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS
I think I asked this in the Django thread but since Bokeh is coming up here I may as well ask now. Is there any way to integrate Bokeh visualizations into Django other than using an iframe?

Ideally I'd love if the plot server could generate a lot of the boilerplate interface code and allow me to render it via a Django template. Is this possible?

BigRedDot
Mar 6, 2008

Blinkz0rz posted:

I think I asked this in the Django thread but since Bokeh is coming up here I may as well ask now. Is there any way to integrate Bokeh visualizations into Django other than using an iframe?

Ideally I'd love if the plot server could generate a lot of the boilerplate interface code and allow me to render it via a Django template. Is this possible?

Absolutely. All our examples happen to use Flask but that's just because it's quick and simple. There are a lot of ways to embed Bokeh plots; the User Guide section on embedding has recently been expanded and improved: http://bokeh.pydata.org/docs/user_guide_embedding.html. Hopefully that page has enough information to get you going, but if not, please let me know and I will update the docs. You probably want the components function (if you want all the data inline in the document), or one of the autoload functions (if you want the data in a sidecar .js script, or on the bokeh-server).

Edit: Though since you mentioned using an iframe, you probably have plots on a bokeh-server? In that case you'd want autoload_server

namaste friends
Sep 18, 2004

by Smythe

Edison was a dick posted:

It probably won't save you much time, since you're not using enough different regular expressions for any of them to fall out of the built-in regular expression cache, and the cache lookup is pretty quick.
However, if precompiling your regular expressions makes it more readable, go ahead.

You are using collections.defaultdict wrong though. You should be constructing it like

code:
data_element = defaultdict(list)
And using it like

code:
data_element['Abstract'].append(element)
If you're using setdefault, it doesn't even need to be a defaultdict, as regular dicts have that method.

In general, I prefer defaultdicts over setdefault, since you need to construct the default value you want to pass into it every time you call setdefault, since python doesn't lazily evaluate function parameters.


If you can help it, avoid parsing command output, as a lot of the time the author assumes that it's output designed to be consumed by humans, so you end up needing to rework your code every time you update your version of the tool.

If there's an official API then it's definitely worth working out how to use it in preference to parsing the output of the command line tool.

Cool thanks!

FoiledAgain
May 6, 2007

I'm wondering if anyone could tell me about alternatives to Python's built-in pickle module? We're currently using this on a project as a simple way to save and load user data, but some problems have come up recently that make me wonder about other options. In particular, this is a very active codebase, so pickled instances of classes often fail to be unpickled correctly when the attributes of their classes have been changed. We have an easy work-around, but it is annoying. This is even more annoying for a user of this program who has to update their saved files every time we update our code. This is the first time that I've had to deal with pickling/serialization, so I'm not even sure where to go looking. I know that there are various security issues surrounding this, so maybe it's also worth mentioning that users should never exchange pickled objects with each other, and the typical use case is for a user to save and load files from their own computer.

KICK BAMA KICK
Mar 2, 2009

FoiledAgain posted:

I'm wondering if anyone could tell me about alternatives to Python's built-in pickle module?
Is it the kind of data that might be suitable to store in a database via an ORM? Of the ones not tied to a web framework I think SQLAlchemy is the big one but there are others designed to be more lightweight and friendly. If you've encapsulated your code well it's not really a massive deal to incorporate one into your existing code.

QuarkJets
Sep 8, 2008

FoiledAgain posted:

I'm wondering if anyone could tell me about alternatives to Python's built-in pickle module? We're currently using this on a project as a simple way to save and load user data, but some problems have come up recently that make me wonder about other options. In particular, this is a very active codebase, so pickled instances of classes often fail to be unpickled correctly when the attributes of their classes have been changed. We have an easy work-around, but it is annoying. This is even more annoying for a user of this program who has to update their saved files every time we update our code. This is the first time that I've had to deal with pickling/serialization, so I'm not even sure where to go looking. I know that there are various security issues surrounding this, so maybe it's also worth mentioning that users should never exchange pickled objects with each other, and the typical use case is for a user to save and load files from their own computer.

Could you provide an example of how this is breaking? IIRC, a pickled class instance should be a self-contained instance, so unpickling it shouldn't fail.

And what's your work-around?

Why does your user data need to be written as a class instance that is apparently prone to being redefined? Could the issue be solved by simplifying what gets saved? For instance, if you pickled a bunch of primitive variables instead of class instances, then you could redefine classes all you want and not break things. You could define a class that does the saving, loading, and returning of the basic user data that you want to pickle

An alternative to pickling is an sqlite3 database, which would still require you to simplify what's getting saved
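
A minimal sketch of the sqlite3 route, storing only primitive values. The table layout and the sample rows are made up; a real schema would depend on what the user data actually contains.

```python
import sqlite3

# ':memory:' keeps the example self-contained; a real program would pass a file path.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE words (spelling TEXT PRIMARY KEY, frequency INTEGER)')

# Save plain values rather than class instances.
conn.executemany('INSERT INTO words VALUES (?, ?)', [('cat', 12), ('dog', 7)])
conn.commit()

# Load them back; classes can be rebuilt from these rows however the current code likes.
rows = conn.execute(
    'SELECT spelling, frequency FROM words ORDER BY spelling').fetchall()
conn.close()
```

Because only strings and integers are stored, changing the class definitions does not invalidate the saved file; only a change to the table layout would.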

QuarkJets fucked around with this message at 03:18 on Sep 13, 2014

FoiledAgain
May 6, 2007

KICK BAMA KICK posted:

Is it the kind of data that might be suitable to store in a database via an ORM?

Thanks for this suggestion. "ORM" is a technical term I didn't really know, and it brings up a whole new world of search results for me to look through.

QuarkJets posted:

Could you provide an example of how this is breaking? IIRC, a pickled class instance should be a self-contained instance, so unpickling it shouldn't fail.

Sure. The basic object that a user interacts with is a Corpus (although they don't know this because there's a GUI). Sometimes we add new methods to the corpus (e.g. corpus.get_random_subset() or somesuch) or we change the attributes (e.g. we add corpus.specifier, get rid of corpus.custom). When an older Corpus that was pickled before these changes is unpickled, we get an AttributeError. This seems to make sense, given what I understood from the Python docs.

quote:

And what's your work-around?

There's an option for users to import/export Corpus from/as a text file, so if they get caught by an UnpicklingError, they can recreate the object from a previous text file. However, creating an object anew from a text file is REALLY slow, and we'd like to have the user do it only once, then use the much faster unpickling every time after.

quote:

Why does your user data need to be written as a class instance that is apparently prone to being redefined? Could the issue be solved by simplifying what gets saved? For instance, if you pickled a bunch of primitive variables instead of class instances, then you could redefine classes all you want and not break things. You could define a class that does the saving, loading, and returning of the basic user data that you want to pickle.

The object of interest is a Corpus, and a Corpus consists of Words, which have numerous attributes, some of which are objects themselves. Users can modify the attributes of a Corpus, and even add new attributes, which is part of the reason this is "prone to being redefined" as you put it. Often, changes that users make are changes which affect every individual Word, so there isn't an obvious shortcut other than saving the whole Corpus at once. The project is also early enough in development that we are frequently making major updates.

quote:

An alternative to pickling is an sqlite3 database, which would still require you to simplify what's getting saved

I will look at this too. Thanks!

Nippashish
Nov 2, 2005

Let me see you dance!
A simple option is to just write all the words (and whatever else you need to store I guess) into a text file and make Corpus objects aware of how the files are formatted so they can read and write them. There really isn't any need to involve a database if you just want to stick a bunch of data in a file.

SurgicalOntologist
Jun 17, 2004

You should be able to write a __setstate__ method that handles missing attributes. When you add something to the class definition, also add handling to __setstate__ regarding how to set it when it doesn't exist in the pickled file.
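
A rough sketch of what that could look like, assuming a Corpus whose state is just its attribute dict and a hypothetical specifier attribute that was added after some files had already been saved:

```python
class Corpus:
    def __init__(self, words):
        self.words = words
        self.specifier = None  # hypothetical attribute added in a later release

    def __setstate__(self, state):
        # Files pickled before `specifier` existed will not contain it,
        # so supply a default instead of hitting AttributeError later on.
        state.setdefault('specifier', None)
        self.__dict__.update(state)


# Imitate what unpickling does: create the instance without calling __init__,
# then pass the saved attribute dict to __setstate__.
old_state = {'words': ['blah', 'blah']}   # state as saved by an older version
restored = Corpus.__new__(Corpus)
restored.__setstate__(old_state)
```

Attributes that were removed from the class can be dropped in the same method, for example with state.pop('custom', None), so that stale data does not linger on restored objects.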

suffix
Jul 27, 2013

Wheeee!

FoiledAgain posted:

The object of interest is a Corpus, and a Corpus consists of Words, which have numerous attributes, some of which are objects themselves. Users can modify the attributes of a Corpus, and even add new attributes, which is part of the reason this is "prone to being redefined" as you put it. Often, changes that users make are changes which affect every individual Word, so there isn't an obvious shortcut other than saving the whole Corpus at once. The project is also early enough in development that we are frequently making major updates.

So major changes like renaming properties are going to require some manual handling no matter what.

One way to do it is to implement __getstate__ and __setstate__ to control how the object is pickled.

Python code:
class Corpus:
    def __init__(self, words):
        self.words = words

    def __getstate__(self):
        return {'version': 2, 'words': self.words}

    def __setstate__(self, state):
        if state['version'] == 1:
            self.words = state['wrods'] # renamed field
        elif state['version'] == 2:
            self.words = state['words']
        else:
            raise Exception('Unknown Corpus format')
And make sure you have tests, because code like this can easily break without you noticing.
Python code:
class TestCorpus(unittest.TestCase):
    def test_deserialization_from_v1(self):
        corpus_v1_pickle = b'\x80\x03c__main__\nCorpus\nq\x00)\x81q\x01}q\x02(X\x05\x00\x00\x00wrodsq\x03]q\x04(X\x04\x00\x00\x00blahq\x05h\x05eX\x07\x00\x00\x00versionq\x06K\x01ub.'
        self.assertEqual(['blah', 'blah'], pickle.loads(corpus_v1_pickle).words)
It's going to be a similar pattern even if you use some other serialization format. Include a version when you save the data, and handle quirks from older versions when you load it.
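
For instance, here is a sketch of the same versioning idea with JSON instead of pickle. The function names and the file layout are assumptions for illustration, and it only handles a plain list of words.

```python
import json

CURRENT_VERSION = 2


def dump_corpus(words):
    # Always write the version next to the data.
    return json.dumps({'version': CURRENT_VERSION, 'words': words})


def load_corpus(text):
    state = json.loads(text)
    version = state.get('version')
    if version == 1:
        return state['wrods']  # field name used by version 1 files
    if version == 2:
        return state['words']
    raise ValueError('Unknown Corpus format: %r' % (version,))


saved = dump_corpus(['blah', 'blah'])
loaded = load_corpus(saved)
legacy = load_corpus('{"version": 1, "wrods": ["blah"]}')
```

A text format like this also stays readable across code changes, at the cost of writing the conversion to and from your own classes by hand.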

Literally Elvis
Oct 21, 2013

Symbolic Butt posted:

That Business object thing sounds like it's begging to be a namedtuple.
I wanted to take the time to say thanks for recommending this. I just implemented namedtuples instead of that dumb Business object, and I feel way better about it. It also made it possible to eliminate duplicate items from the list the old fashioned way (set(a_list)) and seriously cut my line count down by a few dozen lines. I also implemented the other suggestions you made, most notably configparser.
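
A tiny illustration of why the set trick works with namedtuples: they compare and hash by value, like ordinary tuples. The field names and entries here are made up.

```python
from collections import namedtuple

# Hypothetical fields; the real Business record may hold different data.
Business = namedtuple('Business', ['name', 'city'])

a_list = [
    Business('Acme', 'Austin'),
    Business('Acme', 'Austin'),   # duplicate entry
    Business('Bolt', 'Dallas'),
]

unique = set(a_list)              # duplicates collapse because equal tuples hash alike
```

This only works while every field is itself hashable; a list-valued field would make set() raise TypeError. A set also does not preserve the original order of the list.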

Thanks again.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
I'm learning the very basics of using Python to scrape a page for data. Do methods or difficulty significantly change when you have to use a username and password to access the page? Is it as simple as an additional 2 lines to the script, one for user and one for pw? For instance, if I wanted to scrape data from my Fantasy Football League on Yahoo.

*I realize this question is pretty drat vague...

Hughmoris fucked around with this message at 01:22 on Sep 16, 2014

Symbolic Butt
Mar 22, 2009

(_!_)
Buglord

Literally Elvis posted:

I wanted to take the time to say thanks for recommending this. I just implemented namedtuples instead of that dumb Business object, and I feel way better about it. It also made it possible to eliminate duplicate items from the list the old fashioned way (set(a_list)) and seriously cut my line count down by a few dozen lines. I also implemented the other suggestions you made, most notably configparser.

Thanks again.

I'm glad I could help you! :tipshat:

Hughmoris posted:

I'm learning the very basics of using Python to scrape a page for data. Do methods or difficulty significantly change when you have to use a username and password to access the page? Is it as simple as an additional 2 lines to the script, one for user and one for pw? For instance, if I wanted to scrape data from my Fantasy Football League on Yahoo.

*I realize this question is pretty drat vague...

In this case instead of scraping the page it's far easier to work with the api yahoo provides you: https://developer.yahoo.com/fantasysports/guide/

Just look around how to use oauth with requests and it'll be a breeze

theguyoncouch
May 23, 2014

QuarkJets posted:

You can still use defaults, like you had in your original example. For instance, if you wanted to make flag and admin optional parameters that default to False:

Python code:
class Employee:
    'base class called for employee'
    def __init__(self, lastname, firstname, eid,
                 clockin, clockout, flag=False, admin=False):
        self.lastname = lastname
        self.firstname = firstname
        self.eid = eid
        self.clockin = clockin
        self.clockout = clockout
        self.flag = flag
        self.admin = admin
Having 8 arguments in a class constructor is fine, just disable that feature. It's warning you because you can hypothetically design a function or class method to have fewer arguments. For instance, if some of the arguments were redundant or unused, then you'd want to remove those.

To keep learning, I changed my class, which really had no use other than as a glorified ordered dict, to a namedtuple at the suggestion of a friend. The program runs the same, but I think I made pylint mad again.

W: 22,22: Access to a protected member _replace of a client class (protected-access)
E: 27,23: Instance of 'Employee' has no 'eid' member (no-member)

How would I go about using _replace without triggering this warning or the error below it? Context below.

Python code:
def create_employee_template():
    'creates the employee namedtuple'
    field_names = ['lastname', 'firstname', 'eid',
                   'timestamp', 'flag', 'admin']
    employee = namedtuple('Employee', field_names)
    current = employee('', '', '', '', 'False', 'False')
    return current

current = create_employee_template()
current = current._replace(eid=input('Please enter your id: '))
Again, the code runs with no problem, yet pylint doesn't like it. Is this a case of ignoring pylint and carrying on, or am I not doing this the correct way?
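
One possible arrangement, offered as a sketch rather than a verdict on what pylint wants: define the namedtuple type once at module level, build instances with the values you already have, and silence the protected-access message on the line that calls _replace (it is a documented namedtuple method; the underscore only avoids clashes with field names). Whether the no-member error goes away depends on the pylint version, so treat that part as an assumption. The helper name and sample id are made up.

```python
from collections import namedtuple

# Module-level type, so tools can see the field names in one place.
Employee = namedtuple('Employee', ['lastname', 'firstname', 'eid',
                                   'timestamp', 'flag', 'admin'])


def create_employee(eid, lastname='', firstname='', timestamp='',
                    flag=False, admin=False):
    'creates an Employee, filling in blanks for anything not supplied'
    return Employee(lastname, firstname, eid, timestamp, flag, admin)


# In the real program the id would come from input(); a literal keeps this runnable.
current = create_employee(eid='1234')

# namedtuples are immutable, so _replace returns a new instance.
updated = current._replace(flag=True)  # pylint: disable=protected-access
```

Passing the id in at construction time avoids the first _replace call entirely, and using real booleans instead of the strings 'False' keeps truth tests from always succeeding.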

ohgodwhat
Aug 6, 2005

Speaking of pylint, and flake8, are there any tools that turn their output into something a bit prettier? I'm considering trying to gently push coding standards on the non devs who contribute to some of our code base, and nicer output would make it more palatable to them, I believe.

Also, has anyone had any luck with getting vim's YouCompleteMe to work with conda environments instead of the default python install? There's some code in jedi to handle virtualenvs, but it doesn't work with conda, I don't believe.

BeefofAges
Jun 5, 2004

Cry 'Havoc!', and let slip the cows of war.

ohgodwhat posted:

Speaking of pylint, and flake8, are there any tools that turn their output into something a bit prettier? I'm considering trying to gently push coding standards on the non devs who contribute to some of our code base, and nicer output would make it more palatable to them, I believe.

Also, has anyone had any luck with getting vim's YouCompleteMe to work with conda environments instead of the default python install? There's some code in jedi to handle virtualenvs, but it doesn't work with conda, I don't believe.

Set up http://pre-commit.com/ and you'll get automatic flake8 runs every time your people try to commit, with a decent (not amazing) interface.

Harriet Carker
Jun 2, 2009

I asked a while back but didn't get any bites, so I'll throw it out again. I'm at a high-beginner level with coding (took a C++ class in University and went through the entire Codecademy Python lesson, as well as made a few small projects of my own) and I'm looking to learn more by self-study. I bought a few books (Python Cookbook and Python Essential Reference, both by Beazley) but neither is quite what I'm looking for. I want something almost like a college textbook, with a chapter followed by projects/questions that I can attempt followed by solutions. Any good recommendations?

ShadowHawk
Jun 25, 2000

CERTIFIED PRE OWNED TESLA OWNER

dantheman650 posted:

I asked a while back but didn't get any bites, so I'll throw it out again. I'm at a high-beginner level with coding (took a C++ class in University and went through the entire Codecademy Python lesson, as well as made a few small projects of my own) and I'm looking to learn more by self-study. I bought a few books (Python Cookbook and Python Essential Reference, both by Beazley) but neither is quite what I'm looking for. I want something almost like a college textbook, with a chapter followed by projects/questions that I can attempt followed by solutions. Any good recommendations?
Check out (the later parts of) the free Think Python PDF

Harriet Carker
Jun 2, 2009

ShadowHawk posted:

Check out (the later parts of) the free Think Python PDF

Wow, this looks great. Thank you!

Lyon
Apr 17, 2003
Think Python is pretty great. It is available as a print book and also totally free online (HTML, PDF), which is good because it may be too easy for you.

There is also Problem Solving with Algorithms and Data Structures which is also available in print or for free online (HTML).

Once you've got a bit more Python fundamentals the Python Cookbook is phenomenal. I'm actually reading that right now and really enjoying it. Lots of good stuff in there.

As far as other decent Python books... Fundamentals of Python: Data Structures was okay. If you're interested in learning a bit about the classic data structures (stacks, queues, linked lists, etc) and their implementation in Python it isn't bad. The Python code isn't the greatest and some people will question whether implementing those data structures in Python is worthwhile but it was still pretty interesting.

Python Algorithms: Mastering Basic Algorithms in the Python Language is another decent book but I would do the above resources before this one. It was a bit more complex than I was expecting and jumped pretty quickly into graph theory which I wasn't quite prepared for at the time.

There are also some really great Coursera classes you could do. The two I've done are An Introduction to Interactive Programming in Python and Principles of Computing, both by the same professors at Rice University. They are currently running a third one called Algorithmic Thinking, which is the most complex of the three. I have unfortunately fallen behind in that one due to work but I'm going to try and catch back up.

There are also some great books on using the standard library. Good examples include The Python Standard Library by Example, Core Python Application Programming, and Python in Practice.

Harriet Carker
Jun 2, 2009

Lyon posted:

Think Python is pretty great. It's available as a print book and also totally free online (HTML, PDF), which is good because it may be too easy for you.

There is also Problem Solving with Algorithms and Data Structures which is also available in print or for free online (HTML).

Once you've got a bit more Python fundamentals the Python Cookbook is phenomenal. I'm actually reading that right now and really enjoying it. Lots of good stuff in there.

As far as other decent Python books... Fundamentals of Python: Data Structures was okay. If you're interested in learning a bit about the classic data structures (stacks, queues, linked lists, etc) and their implementation in Python it isn't bad. The Python code isn't the greatest and some people will question whether implementing those data structures in Python is worthwhile but it was still pretty interesting.

Python Algorithms: Mastering Basic Algorithms in the Python Language is another decent book but I would do the above resources before this one. It was a bit more complex than I was expecting and jumped pretty quickly into graph theory which I wasn't quite prepared for at the time.

There are also some really great Coursera classes you could do. The two I've done are An Introduction to Interactive Programming in Python and Principles of Computing, both by the same professors at Rice University. They are currently running a third one called Algorithmic Thinking, which is the most complex of the three. I have unfortunately fallen behind in that one due to work but I'm going to try and catch back up.

There are also some great books on using the standard library. Good examples include The Python Standard Library by Example, Core Python Application Programming, and Python in Practice.

Thanks a lot! I can tell the Python Cookbook is going to be great once I'm more practiced with the fundamentals. It's actually nice reading through Think Python so far just to get a crystal clear and concise set of definitions and good practices from the very start; Codecademy was good at getting me up and running on making my own scripts but I feel like I didn't get a great overview of what was actually going on behind the scenes. I will check out those Coursera classes as well!

SurgicalOntologist
Jun 17, 2004

I have an idea for a mini-project and I'd like to run it by here to see what people think, and if there's a better way to accomplish this.

The problem: I have a separate virtualenv/conda environment for every project, I do a lot of my work in IPython Notebooks, and much of that is remote work. Since I have separate envs, I usually have to ssh in, check my tmux sessions for what servers are already running, possibly start up a new server, note what port it's on, and open up that port in the firewall before finally connecting. It's tedious and one of these days I'm bound to find myself wanting to get some work done without access to one of my authorized ssh keys.

The idea: Set up a web server such that if you access the URL <server>/<env> (or perhaps <server>/ipython/<env>), you will be connected to the notebook server for the environment named in the URL. If a notebook server is not already running for that environment, one will be spun up.

Issues:
  • Framework? I've never done web stuff before, but the decorator notation of Flask looks nice. Are there other minimal frameworks I should consider?
  • Starting notebook servers. Does IPython have an API for this (apparently not) or do I have to shell out for lots of stuff?
  • When do I shut them down? Adding a home page at <server>/ipython to start and stop servers is probably a good option but that makes it much less minimal. Any downside to leaving them running?
  • Directing traffic. This is my biggest roadblock conceptually. Every Notebook server must run on a unique port. This means I must do some reverse proxy work, and furthermore I will have to change the redirection rules on the fly. That rules out all the easy options, I think. Are there any reverse proxy implementations in Python that expose an API? Or is there a better way...

Edit: just realized that nginx has a graceful restart control signal (HUP). This could work, but it would still be nice to decouple this app from the reverse proxy somehow. It would work for a purely local solution but would be hard to package as a stand-alone drop-in app for others to use. Still, it would be relatively easy to change the nginx configuration, gracefully restart it, and have Flask serve a redirect to the new address. Tempting.

Alternative: Write a shell script to add a location block to my nginx reverse proxy and create a new ipython profile with the same port. Call the shell script every time I make a new environment. Easy but not as fun.
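
A rough sketch of the port bookkeeping this idea would need, using only the standard library. Everything here is hypothetical: the NotebookRegistry class, the find_free_port helper, and the /ipython/<env>/ URL layout are assumptions for illustration, and the command is only built, never run. It assumes `ipython notebook --port=N --no-browser` is how a server gets started from inside the activated env.

```python
import socket


def find_free_port():
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]


class NotebookRegistry:
    """Hypothetical bookkeeping: one notebook server (and port) per env."""

    def __init__(self):
        self.ports = {}  # env name -> assigned port

    def port_for(self, env):
        """Return the port assigned to env, assigning a new one if needed."""
        if env not in self.ports:
            port = find_free_port()
            # The OS may hand back a port we already gave to another env.
            while port in self.ports.values():
                port = find_free_port()
            self.ports[env] = port
        return self.ports[env]

    def launch_command(self, env):
        """Build (but do not run) the command line for this env's server."""
        return ["ipython", "notebook", "--no-browser",
                "--port={}".format(self.port_for(env))]

    def nginx_location(self, env):
        """Render an nginx location block that proxies /ipython/<env>/."""
        return ("location /ipython/{env}/ {{\n"
                "    proxy_pass http://127.0.0.1:{port};\n"
                "}}\n").format(env=env, port=self.port_for(env))


registry = NotebookRegistry()
print(registry.launch_command("myproject"))
print(registry.nginx_location("myproject"))
```

Binding to port 0 lets the OS pick the port, but another process could grab it before the notebook starts, so a real version would want to retry. Websocket proxy headers and the notebook's own URL prefix setting are left out.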

SurgicalOntologist fucked around with this message at 01:08 on Sep 17, 2014

FoiledAgain
May 6, 2007

Question about pickles and bytes. I have the following code, which works.

code:
import io
import pickle


class Unpickler(pickle.Unpickler):

    def __init__(self, file):
        # Buffer the whole file in memory so load() can read from it
        self.data = io.BytesIO(file.read())
        super().__init__(self.data)

    def load(self):
        # This overrides the original Unpickler.load() function
        try:
            while True:
                bite = self.data.read(1)
                dispatch[bite[0]](self)  # does something appropriate with the current byte
        except pickle._Stop as stopinst:
            # success!
            return stopinst.value

However, this minimally different load() function does not work:

code:

def load(self):
    try:
        for bite in self.data:
            dispatch[bite[0]](self)
    except pickle._Stop as stopinst:
        #success!
        return stopinst.value
The error I am getting is raised in pickle.py, and it says "Wrong protocol number: 67". My understanding is that for a bytes object b, b[0] is an int, so I'm guessing that the 67 is one of these ints that is not being treated correctly by pickle. But how come calling self.data.read(1) doesn't cause the same problem? Does the for-loop not iterate over chunks of the same size as what read(1) does?
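
Not a definitive answer, but a minimal sketch of the difference, assuming self.data is a plain io.BytesIO: iterating over a BytesIO yields whole lines (split after each b"\n"), not single bytes, and each step advances the stream past the line it returned. The bytes in `raw` are made up for illustration and are not a real pickle stream.

```python
import io

# Made-up bytes for illustration only; this is not a valid pickle stream.
raw = b"\x80\x03abc\ndef\nghi"

# Iterating over a BytesIO yields lines, split after each b"\n".
by_line = list(io.BytesIO(raw))

# Calling read(1) until it returns b"" yields one byte at a time.
buf = io.BytesIO(raw)
by_byte = list(iter(lambda: buf.read(1), b""))

# One iteration step consumes the whole first line from the stream.
stream = io.BytesIO(raw)
first_line = next(stream)
pos_after_first_line = stream.tell()

print(by_line)
print(len(by_byte), by_byte[:3])
print(first_line, pos_after_first_line)
```

So in the for-loop version, bite[0] is only the first byte of each line, and anything that later reads from the same stream starts after that line. That would be consistent with a stray byte such as 67 (ord("C")) being taken for the protocol number, though that last part is a guess about how the dispatch functions read their arguments.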

Haystack
Jan 23, 2005

SurgicalOntologist posted:

Framework? I've never done web stuff before, but the decorator notation of Flask looks nice. Are there other minimal frameworks I should consider?

I always like advocating for Pyramid since I think it's somewhat better designed, but Flask should be more than ok.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Symbolic Butt posted:

I'm glad I could help you! :tipshat:


In this case, instead of scraping the page, it's far easier to work with the API Yahoo provides: https://developer.yahoo.com/fantasysports/guide/

Just look around how to use oauth with requests and it'll be a breeze

It definitely is not a breeze! :mad:

I'm in way over my head with this OAuth crap, and I can't find a good simple example of someone using the Yahoo API with Python. Scraping is way easier compared to this. :smith:

SurgicalOntologist
Jun 17, 2004

Haystack posted:

I always like advocating for Pyramid since I think it's somewhat better designed, but Flask should be more than ok.

That's actually one I haven't looked into so thanks for the tip, I'll check it out.
