MockingQuantum
Jan 20, 2012



Dominoes posted:

Start a project. Codecademy's Python course is all you need for now.

Well I'll be looking things up every ten seconds, but I suppose that's a viable way of learning the language better. I'm just concerned about developing poor programming habits out of inexperience or something like that.

FoiledAgain
May 6, 2007

MockingQuantum posted:

Well I'll be looking things up every ten seconds, but I suppose that's a viable way of learning the language better.

Not to be too sarcastic, but what else did you expect to do? Magically divine the answers ahead of time? Of course you'll spend a lot of time looking stuff up when you start. Everyone does.

MockingQuantum
Jan 20, 2012



FoiledAgain posted:

Not to be too sarcastic, but what else did you expect to do? Magically divine the answers ahead of time? Of course you'll spend a lot of time looking stuff up when you start. Everyone does.

Well yes, that's inevitable, but I guess I'm asking, as someone who is completely new to programming in general: Codecademy only gets you so far, so am I better off continuing with structured self-instruction, or try a project when I have only a loose understanding of how dicts and classes (for example) work in python? I get it if there's not a good answer that someone else can give me, but I'm trying to at least minimize frustration.

Symbolic Butt
Mar 22, 2009

(_!_)
Buglord

sharktamer posted:

Is this gross?

Yes kinda, not really because it has a return or print thing, in a debugging context that sounds normal. But it's gross because you're calling inspect.stack just to see if you're gonna sort a thing or not... That's way too ingenious.

Symbolic Butt
Mar 22, 2009

(_!_)
Buglord

MockingQuantum posted:

Well I'll be looking things up every ten seconds, but I suppose that's a viable way of learning the language better. I'm just concerned about developing poor programming habits out of inexperience or something like that.

I guess you could read a lot of other people's code to get a better instinct on how to do things. Preferably cool people.

You'll end up seeing a lot of features that you're not familiar with, so it'll be good motivation to dig into the official documentation.

baka kaba
Jul 19, 2003

PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

MockingQuantum posted:

Well yes, that's inevitable, but I guess I'm asking, as someone who is completely new to programming in general: Codecademy only gets you so far, so am I better off continuing with structured self-instruction, or try a project when I have only a loose understanding of how dicts and classes (for example) work in python? I get it if there's not a good answer that someone else can give me, but I'm trying to at least minimize frustration.

You could have a look at CheckIO if you want to explore a bit further, they have a lot of structured tasks and challenges that go in lots of different directions, and generally leave you to find your own solution. Once you've written some code that solves the challenge, you can look at other people's solutions and pick up some neat tricks

Having an idea of what you eventually want to do is always a good thing though, because at some point you need to jump in and get on with it. And it'll probably be bad at first, but that's all part of the learning process. You can learn to not be bad in general, pick up some useful tools and patterns, but specific projects have their own unique aspects you'll have to work out yourself, that you can't necessarily prepare yourself for.

If you're stuck for project ideas though, that's a whole other problem!

KICK BAMA KICK
Mar 2, 2009

If you want a text, Think Python: How to Think Like a Computer Scientist (check the OP or Google) is a good bit more comprehensive than I remember Codecademy being when I was going through the same process. But that and "just start something" are not exclusive. If you want to learn how to build an actual application of any kind, I never really found anything better than the iterative process of "do it the wrong way, come to understand why it's wrong and in doing so learn what would have been the right way, repeat." I'm still writing garbage but it's a decidedly better class of garbage than it was when I started.

The single most important general principle in my experience is to properly structure your code so it's easy to refactor as you learn (like in the Model-View-* patterns). Small specific Python things I wish I'd known earlier? Use the logging module instead of littering your code with print everywhere for quick "wait, what is that variable getting set to?" debugging. Almost every problem has a solution in itertools. Use PyCharm; it's constantly giving you "that's not how you do that" warnings about Python conventions that will stick with you even when you're writing code outside of it.
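
For the logging suggestion above, here is a minimal sketch of what swapping a debugging print for a log call could look like. The function apply_discount and its numbers are made up purely to have something to inspect; only the logging calls themselves come from the standard library.

```python
import logging

# Configure once, near program start; DEBUG lets every message through.
logging.basicConfig(level=logging.DEBUG,
                    format="%(levelname)s %(name)s: %(message)s")
log = logging.getLogger(__name__)


def apply_discount(price, percent):
    """Hypothetical function, here only so there is something to debug."""
    discounted = price * (100 - percent) / 100
    # Instead of print(discounted): this line can stay in the code and be
    # silenced later by raising the level to INFO or WARNING.
    log.debug("price=%r percent=%r discounted=%r", price, percent, discounted)
    return discounted


result = apply_discount(80, 25)
```

The point of the design is that the debug lines don't have to be deleted afterwards; changing the level in one place turns them all off.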

QuarkJets
Sep 8, 2008

MockingQuantum posted:

Well yes, that's inevitable, but I guess I'm asking, as someone who is completely new to programming in general: Codecademy only gets you so far, so am I better off continuing with structured self-instruction, or try a project when I have only a loose understanding of how dicts and classes (for example) work in python? I get it if there's not a good answer that someone else can give me, but I'm trying to at least minimize frustration.

Start a project. You'll get a better feel for how things work by actually using those things. Then go and see if your understanding matches the documentation and the understanding that others have developed.

I think that the most effective kind of learning is studying coupled with actual work. The structured instruction gets your foot in the door, and then you do some projects (because there is no substitute for experience), and then you repeat that. In my experience, the best software developers don't just *know* everything, they review documentation and they keep an active interest in the field in addition to doing a lot of actual coding.

sharktamer
Oct 30, 2011

Shark tamer ridiculous

Symbolic Butt posted:

Yes kinda, not really because it has a return or print thing, in a debugging context that sounds normal. But it's gross because you're calling inspect.stack just to see if you're gonna sort a thing or not... That's way too ingenious.

Yeah, I get that too. It's just I'm not sure how else I'd do this without having code repeated everywhere.

edit: OK I think I've got a nicer solution. I got the solution from [url=http://stackoverflow.com/questions/2828953/silence-the-stdout-of-a-function-in-python-without-trashing-sys-stdout-and-resto]here[/url]. I'm using the contextlib solution, so now I have my "the_cool_task" function from before both printing and returning. The function that then calls it and only wants the return value then uses the with nostdout() mentioned in that post. Looks and feels a lot cleaner, even if it is still a bit gross.

code:
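import contextlib
import cStringIO
import sys

from fabric.api import execute, runs_once, task

@contextlib.contextmanager
def nostdout():
    save_stdout = sys.stdout
    sys.stdout = cStringIO.StringIO()
    try:
        yield
    finally:
        # restore stdout even if the wrapped code raises
        sys.stdout = save_stdout

@task
def the_cool_task():
    out = '1\n2\n4\n5\n3'
    print out
    return out

@task
@runs_once
def sort_output(f):
    with nostdout():
        results = execute(f)
    print sort_results(results)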
I'm losing a little bit of fabric's output that's usually given when running the execute function, but that's really no big loss at all.

e2: Since I can never leave anything alone, would it be possible to somehow just override/decorate fabric's execute function to not print anything? I suppose that would be doing the exact same thing as I'm already doing here anyway.

sharktamer fucked around with this message at 14:30 on Jan 15, 2015
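
For anyone reading this on Python 3.4 or later: the standard library has contextlib.redirect_stdout, which does the same job as a hand-rolled nostdout(). A rough sketch follows, with Fabric left out entirely so it runs on its own; sort_results here is a guess at what the helper does (numeric sort of the lines), not the real one.

```python
import contextlib
import io


def the_cool_task():
    """Stand-in for the task above: prints and returns the same text."""
    out = "1\n2\n4\n5\n3"
    print(out)
    return out


def sort_results(text):
    """Hypothetical helper: sort the lines numerically."""
    return "\n".join(sorted(text.splitlines(), key=int))


# Anything printed inside the block goes to the buffer, and sys.stdout
# is restored on exit even if the_cool_task() raises.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    results = the_cool_task()

sorted_text = sort_results(results)
print(sorted_text)
```

The captured text is still available in buffer.getvalue() if it turns out to be needed after all.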

Rlazgoth
Oct 10, 2003
I attack people at random.
As someone who is currently learning how to program, I'll be seconding most of the advice already given. I completed the course at Codecademy - and it was absolutely necessary given that I had no prior experience in programming (unless HTML/CSS counts?) - but the actual learning only truly began once I started working on a project. I reckon that most people end up reading one of the many available books after completing the course, but I skipped that step and so far, so good.

Getting acquainted with basic logic such as if's, for's and such is necessary, but I soon realized that any practical programming would require some degree of familiarity with the existing modules/libraries, at the very least to avoid the pitfall of programming something which already exists and could be simply imported from elsewhere. With that said, just by learning through a hands-on approach, in order to complete my project I had to familiarize myself with 7 modules, understand what MVC is, and learn how to design a GUI in the process. That's significantly more than what I would have learned in a similar timespan from doing course-work or reading a book, with the added benefit of yielding a tangible output at its conclusion, which is a great motivator.

The initial "breaking in" of learning a programming language, especially the first one, is the most difficult part. Afterwards, learning-by-doing comes as a natural process of programming.

QuarkJets
Sep 8, 2008

I'd also suggest learning how to use git. Version control is very important whether you're working in a professional setting or just doing a throwaway project. Looking around on github is also a great way to find something interesting to work on, or you might come up with an idea that hasn't been implemented yet

MockingQuantum
Jan 20, 2012



Thank you everyone for the advice, I feel like I have better perspective on how to proceed, and what pitfalls to expect. Now I'll go make Battleship or something.

Fergus Mac Roich
Nov 5, 2008

Soiled Meat
I'm in here because I'm following a Google Python tutorial, and I have a question about an exercise. The task I'm working on right now is printing a sorted list of the top 20 words in a document along with their count. Each word gets its own line. I used a dictionary because that seems the most natural structure besides a list of tuples, and since tuples are immutable, I thought a dictionary would be more efficient (not making a new tuple every time I increment the count, etc.). I'm a relatively new programmer, but I know a bit of C.

code:
   
for k, v in sorted(word_dict.items(), key=val, reverse=True):
    print "%s %d" % (k, v)
(val simply returns the value associated with the key, pardon the lovely method name)

1. Am I right that there's no built-in way to slice a dictionary? I want to cut it off when it reaches 20 items, and I could easily break the loop when an iterator reaches 20, but that seems to be coming from the C center of my brain, and I'm thinking Python might have something more elegant. I'm aware I could use something like for x in range(), but I don't know how or even if that can be combined with the loop there. I was considering actually converting and saving it to a list of tuples, since I'm thinking that's what sorted(my args) does to that dictionary anyway. If I did that I could rather easily use slice notation.
2. Since a dictionary isn't guaranteed to be sorted in any particular way, is that sorted(etc etc) expression being re-evaluated for every iteration of the loop, or does it stay in memory somewhere?

KICK BAMA KICK
Mar 2, 2009

Fergus Mac Roich posted:

1. Am I right that there's no built-in way to slice a dictionary? I want to cut it off when it reaches 20 items, and I could easily break the loop when an iterator reaches 20, but that seems to be coming from the C center of my brain, and I'm thinking Python might have something more elegant. I'm aware I could use something like for x in range(), but I don't know how or even if that can be combined with the loop there. I was considering actually converting and saving it to a list of tuples, since I'm thinking that's what sorted(my args) does to that dictionary anyway. If I did that I could rather easily use slice notation.
There's not really a way to slice a dictionary per se (dictionaries aren't ordered, so where would you be slicing them?), but the thing you want to slice isn't a dictionary. word_dict.items() is a list of tuples, and thus so is the output of sorted. You can slice that just like you'd expect (at this point, it would be advisable to break it out onto its own line for readability). You can also inline the sorting key function as a lambda, which is quite conventional as long as it doesn't get too complicated. I have no objection to separating that out and giving it a name but either way I think it'd be a little odd to use a def rather than lambda (not that that was evident in your code, just saying.)
Python code:
top_twenty = sorted(word_dict.items(), key=lambda item: item[1], reverse=True)[:20]
for k, v in top_twenty: ...

quote:

2. Since a dictionary isn't guaranteed to be sorted in any particular way, is that sorted(etc etc) expression being re-evaluated for every iteration of the loop, or does it stay in memory somewhere?
The sorted should only get evaluated once upon entering the loop but the above way should show that even more explicitly.
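
To make both answers concrete, here is a small sketch written for Python 3 (where items() returns a view rather than a list, but sorted() accepts any iterable and always hands back a new list, so the slicing works the same way). The counts are made up, and the calls list exists only to show how often the key function runs.

```python
word_dict = {"the": 12, "cat": 3, "sat": 5, "mat": 1}  # made-up counts

calls = []


def by_count(item):
    calls.append(item)  # record each call so we can count them afterwards
    return item[1]


# sorted() builds a brand-new list once; the slice then keeps the first N.
top_two = sorted(word_dict.items(), key=by_count, reverse=True)[:2]

for word, count in top_two:
    print("%s %d" % (word, count))
```

The key function runs once per dictionary entry, and the loop afterwards just walks the finished list, so nothing is re-sorted per iteration.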

Fergus Mac Roich
Nov 5, 2008

Soiled Meat
I don't actually know what a lambda is(except vaguely) because it doesn't appear in K&R C so I guess I'll figure that out and throw it in there. The rest of your post makes perfect sense and answers my questions, thanks.

baka kaba
Jul 19, 2003

PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

Here's a nice lambda overview
http://www.secnetix.de/olli/Python/lambda_functions.hawk

Also it doesn't really answer your general question, but There's A Library For That!
https://docs.python.org/3/library/collections.html?highlight=counter#collections.Counter

KICK BAMA KICK
Mar 2, 2009

Fergus Mac Roich posted:

I don't actually know what a lambda is(except vaguely) because it doesn't appear in K&R C so I guess I'll figure that out and throw it in there. The rest of your post makes perfect sense and answers my questions, thanks.
Lambdas are anonymous functions -- one-liners you can use exactly like a function; the key of sorted and the like is the canonical example of their use in Python.
Python code:
def val(item):
    """Get the second item of the given tuple, e.g., the value of a (key, value) pair generated by dict.items()"""
    return item[1]
or
code:
val = lambda item: item[1]
are pretty much exactly equivalent. For a function you bind to a name, PEP 8 recommends the def form; inlining a lambda anonymously like I did in the previous post is the more common use, and is fine as long as it's simple and doesn't make the resulting line unreadable.

SelfOM
Jun 15, 2010
Does anyone know if in matplotlib you can initialize subplot axes dynamically without knowing beforehand how many subplots you are going to have? I want to be able to use methods like ax.add_patch in convenience functions that just add plots sequentially. The other option I was thinking of is using holder objects that would call add_patch after everything is set.

Dominoes
Sep 20, 2007

Of note, you can also use key=operator.itemgetter(1)

You'd use lambda or itemgetter() in this example, but you'd pass a normal function into key if you're sorting based on a more complex rule.
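
A quick side-by-side of the three options, as a sketch with made-up data. The tie-breaking rule in count_then_word is just an example of something too involved for a comfortable one-liner.

```python
from operator import itemgetter

pairs = [("pear", 3), ("fig", 7), ("apple", 3), ("kiwi", 5)]  # made-up data

# Simple rule: sort by the second element of each pair.
by_lambda = sorted(pairs, key=lambda item: item[1], reverse=True)
by_getter = sorted(pairs, key=itemgetter(1), reverse=True)


def count_then_word(item):
    """More involved rule: highest count first, ties broken alphabetically."""
    word, count = item
    return (-count, word)


by_rule = sorted(pairs, key=count_then_word)
```

The lambda and itemgetter versions give identical results; the named function earns its keep once the rule needs a docstring.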

Fergus Mac Roich
Nov 5, 2008

Soiled Meat

baka kaba posted:

Here's a nice lambda overview
http://www.secnetix.de/olli/Python/lambda_functions.hawk

Also it doesn't really answer your general question, but There's A Library For That!
https://docs.python.org/3/library/collections.html?highlight=counter#collections.Counter


KICK BAMA KICK posted:

Lambdas are anonymous functions -- one-liners you can use exactly like a function; the key of sorted and the like is the canonical example of their use in Python.
Python code:
def val(item):
    """Get the second item of the given tuple, e.g., the value of a (key, value) pair generated by dict.items()"""
    return item[1]
or
code:
val = lambda item: item[1]
are pretty much exactly equivalent. For a function you bind to a name, PEP 8 recommends the def form; inlining a lambda anonymously like I did in the previous post is the more common use, and is fine as long as it's simple and doesn't make the resulting line unreadable.


Dominoes posted:

Of note, you can also use key=operator.itemgetter(1)

You'd use lambda or itemgetter() in this example, but you'd pass a normal function into key if you're sorting based on a more complex rule.

Thanks guys! The amount of crap I get "for free" in this language compared to writing it myself in C is pretty awesome. Especially not having to call a function just to allocate memory for a new struct or whatever.

baka kaba
Jul 19, 2003

PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

Yeah, it's silly huh? You might want to take a look at this too, if you have a bit of time:
http://www.dabeaz.com/generators/Generators.pdf

KICK BAMA KICK
Mar 2, 2009

Fergus Mac Roich posted:

Thanks guys! The amount of crap I get "for free" in this language compared to writing it myself in C is pretty awesome. Especially not having to call a function just to allocate memory for a new struct or whatever.
Two of the best sources of "oh, you can just do that?" are the itertools and collections modules. (Intros: itertools, collections).

SurgicalOntologist
Jun 17, 2004

KICK BAMA KICK posted:

Two of the best sources of "oh, you can just do that?" are the itertools and collections modules. (Intros: itertools, collections).

Speaking of getting stuff "for free"... of note for the problem posted is collections.Counter:

Python code:
from collections import Counter

# split() so that Counter tallies words rather than single characters
word_dict = Counter(open(filename).read().split())
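
Counter also covers the "top 20" part of the exercise through most_common(). A small sketch, using a literal string as a stand-in for the file contents and 2 instead of 20 to keep the output short; note that Counter tallies whatever items it is handed, so the text has to be split into words first (a raw string would be counted character by character).

```python
from collections import Counter

text = "the cat and the hat and the bat"  # stand-in for the file contents

word_counts = Counter(text.split())

# most_common(n) returns the n highest counts as (word, count) pairs,
# already sorted from most to least frequent.
top_two = word_counts.most_common(2)
for word, count in top_two:
    print("%s %d" % (word, count))
```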

suffix
Jul 27, 2013

Wheeee!

sharktamer posted:

Yeah, I get that too. It's just I'm not sure how else I'd do this without having code repeated everywhere.

edit: OK I think I've got a nicer solution. I got the solution from [url=http://stackoverflow.com/questions/2828953/silence-the-stdout-of-a-function-in-python-without-trashing-sys-stdout-and-resto]here[/url]. I'm using the contextlib solution, so now I have my "the_cool_task" function from before both printing and returning. The function that then calls it and only wants the return value then uses the with nostdout() mentioned in that post. Looks and feels a lot cleaner, even if it is still a bit gross.

code:
@contextlib.contextmanager
def nostdout():
    save_stdout = sys.stdout
    sys.stdout = cStringIO.StringIO()
    yield
    sys.stdout = save_stdout

@task
def the_cool_task():
    out = '1\n2\n4\n5\n3'
    print out
    return out

@task
@runs_once
def sort_output(f):
    with nostdout():
        results = execute(f)
    print sort_results(results)
I'm losing a little bit of fabric's output that's usually given when running the execute function, but that's really no big loss at all.

e2: Since I can never leave anything alone, would it be possible to somehow just override/decorate fabric's execute function to not print anything? I suppose that would be doing the exact same thing as I'm already doing here anyway.

You could have a helper task or function that always returns the result, and have both the_cool_task and sort_output use that? It's usually a good idea to separate calculations from the input/output part for reuse.
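
Roughly what that could look like, as a sketch in plain Python 3 with the Fabric decorators and execute() left out so it runs on its own. The name cool_result and the numeric sort are assumptions for illustration, not part of the original code.

```python
def cool_result():
    """Pure calculation: returns the text, prints nothing."""
    return "1\n2\n4\n5\n3"


def the_cool_task():
    """Thin wrapper for interactive use: print and return."""
    out = cool_result()
    print(out)
    return out


def sort_output():
    """Needs only the value, so it calls the quiet helper directly."""
    lines = cool_result().splitlines()
    return "\n".join(sorted(lines, key=int))


sorted_text = sort_output()
```

With the calculation separated out there is nothing left to silence, so the stdout juggling goes away.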

KICK BAMA KICK
Mar 2, 2009

Working with SQLAlchemy, I've got something like an association object that represents a relationship between say an Employee and an Employer:
code:
class EmploymentTenure(Base):
    """Represents a record of an Employee's employment with an Employer in a single position over a single contiguous period of time."""
    __tablename__ = 'employmenttenure'

    # Associations
    employee_id = sa.Column(sa.Integer, sa.ForeignKey('employee.id'), primary_key=True)
    employer_id = sa.Column(sa.Integer, sa.ForeignKey('employer.id'), primary_key=True)
    employee = sao.relationship('Employee', uselist=False, back_populates='_employer_history_q')
    employer = sao.relationship('Employer', uselist=False, back_populates='_employee_history_q')

    # Dates
    start_date = sa.Column(sa.Date, nullable=False)
    end_date = sa.Column(sa.Date)
I'd like to define some positions/job titles and add a position column to the EmploymentTenure class/table. There would only be a handful of job titles and I don't intend to create or edit them at runtime. So far, this calls for just an arbitrary discriminator constrained to a predefined set of arbitrary values, which sounds like an enum. SQLAlchemy has an Enum type, or I could use an Integer column with a Python enum.IntEnum.

Thing is, I'd like to attach some data to each position, ideally to access like JobTitles.Manager.base_salary. I'm kinda stuck on the least gross way to do it -- do I have to just define a simple enum (whether a Python enum or the native SQLAlchemy type) in one place, and then a dictionary mapping the value the enum assigns to each job title to a dict or NamedTuple or other object containing the attributes associated with each job title? Like:
code:
class EmploymentTenure(Base):
    # other stuff
    position = sa.Column(sa.Enum('manager', 'salesman'))

# Elsewhere
job_position_data = {
    'manager': {
        'base_salary': 50000,
        'description': 'Manage stuff'
    },
    'salesman': {
        'base_salary': 40000,
        'description': 'Sell things'
    }
}
This is one of those questions where I'm pretty sure it's a commonly done thing and there must be a right answer that's obvious to someone else but I'm not coming up with the right terms to search so I'm throwing it out there.

SurgicalOntologist
Jun 17, 2004

(I don't have a lot of database experience) It sounds like maybe you'd want a JobTitle table? I'm not sure if the fact that they're unlikely to change is reason against doing so. Hell it may even be a good reason to enshrine them in the database itself.

QuarkJets
Sep 8, 2008

SurgicalOntologist posted:

(I don't have a lot of database experience) It sounds like maybe you'd want a JobTitle table? I'm not sure if the fact that they're unlikely to change is reason against doing so. Hell it may even be a good reason to enshrine them in the database itself.

This was going to be my suggestion. If you want data associated with each type of JobTitle (things like BaseSalary and MinYearsExperience and Description or whatever else), then you may as well create a table for that, and then have a JobTitleID in your employee table.
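
A rough sketch of the table layout being suggested, using the standard library's sqlite3 rather than SQLAlchemy so it runs without any setup. The table and column names are invented for illustration, and the salary figures are the ones from the dict in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE job_title (
        id          INTEGER PRIMARY KEY,
        name        TEXT NOT NULL UNIQUE,
        base_salary INTEGER NOT NULL,
        description TEXT
    );
    CREATE TABLE employment_tenure (
        employee_id  INTEGER NOT NULL,
        employer_id  INTEGER NOT NULL,
        job_title_id INTEGER NOT NULL REFERENCES job_title(id),
        start_date   TEXT NOT NULL,
        end_date     TEXT,
        PRIMARY KEY (employee_id, employer_id)
    );
    INSERT INTO job_title (id, name, base_salary, description) VALUES
        (1, 'manager', 50000, 'Manage stuff'),
        (2, 'salesman', 40000, 'Sell things');
    INSERT INTO employment_tenure VALUES (10, 20, 1, '2015-01-17', NULL);
""")

# Join the tenure row to its job title to read the attached data.
row = conn.execute("""
    SELECT jt.name, jt.base_salary
    FROM employment_tenure AS et
    JOIN job_title AS jt ON jt.id = et.job_title_id
    WHERE et.employee_id = ? AND et.employer_id = ?
""", (10, 20)).fetchone()
```

In SQLAlchemy terms this would be one more mapped class plus a ForeignKey column and a relationship on the tenure class.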

KICK BAMA KICK
Mar 2, 2009

Thanks for the input. I'm not sure about putting the job title data in its own table though; while I don't need to edit those attributes at runtime I would like to be able to edit them during the development process and touching the db(s) every time seemed undesirable. This probably makes more sense if I mention that this isn't like a business application, it's a prototype for a dumb sim/game idea.

Thinking about it more I'm leaning back toward just defining a class for each job title with its data as class attributes, each with a unique discriminator value. Then construct a dict mapping the discriminator to the class, put a column for the discriminator on the employment record table and give the employment record objects a job_title property that gets the job title class object from the dict by its value from the discriminator column. That's less weird than it sounded to me at first.
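
A minimal sketch of that idea in plain Python, with no SQLAlchemy involved: EmploymentTenure here is an ordinary class whose position attribute stands in for the discriminator column. The class names and the JOB_TITLES dict are assumptions for illustration.

```python
class JobTitle:
    """Base class; subclasses carry their data as class attributes."""
    discriminator = None
    base_salary = 0
    description = ""


class Manager(JobTitle):
    discriminator = "manager"
    base_salary = 50000
    description = "Manage stuff"


class Salesman(JobTitle):
    discriminator = "salesman"
    base_salary = 40000
    description = "Sell things"


# Map the stored discriminator value back to its class.
JOB_TITLES = {cls.discriminator: cls for cls in (Manager, Salesman)}


class EmploymentTenure:
    """Stand-in for the mapped class; `position` plays the role of the column."""

    def __init__(self, position):
        self.position = position

    @property
    def job_title(self):
        return JOB_TITLES[self.position]


tenure = EmploymentTenure("manager")
salary = tenure.job_title.base_salary
```

Editing a salary during development then means changing one class attribute, with no database migration involved.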

Fergus Mac Roich
Nov 5, 2008

Soiled Meat
Question about generators. I have a generator that yields lists of words(one list per call, obviously). Is there any way to use another generator expression to yield each element of each list?

I feel like the solution is staring me in the face but there's some kind of gap in my understanding of generators that prevents me from seeing it. I want to avoid writing a function if possible.

edit: Wow, I think the answer might actually be in the itertools module you guys helpfully pointed out.
double edit: Okay, I thought the answer was itertools.chain(), but I couldn't get that to work. Here's an insanely stupid construct that did work:

code:
word_lists = (words for wordslists in doc_lines for words in wordslists)
Now if I can only actually create a mental model of what's happening in that expression, because I kind of trial and error'd it out. I'm fixing my variable names to be more comprehensible.

I'm really liking generators. Check out my new function:

code:
import collections

def word_count(filename):
  ret_word_count = collections.Counter()
  with open(filename) as f:  # closes the file once the counting is done
    doc_lines = (line.lower().split() for line in f)
    individual_words = (w for wlists in doc_lines for w in wlists)
    for w in individual_words:
      ret_word_count[w] += 1
  return ret_word_count
I made that from this:

code:
def word_count(filename):
  f = open(filename, 'r')
  words = {}
  for line in f:
    line = line.lower().split()
    for word in line:
      if word in words:
        words[word] += 1
      else:
        words[word] = 1
  return words
My new code might still be somewhat stupid, not sure. Obviously this is just for a beginner's exercise so pretty low stakes here but if anyone knows of anything further I could do to make it better I'm all ears. I tend to learn a lot from refactoring code over and over.

Fergus Mac Roich fucked around with this message at 02:59 on Jan 17, 2015

ShadowHawk
Jun 25, 2000

CERTIFIED PRE OWNED TESLA OWNER

Fergus Mac Roich posted:

Question about generators. I have a generator that yields lists of words(one list per call, obviously). Is there any way to use another generator expression to yield each element of each list?

I feel like the solution is staring me in the face but there's some kind of gap in my understanding of generators that prevents me from seeing it. I want to avoid writing a function if possible.

edit: Wow, I think the answer might actually be in the itertools module you guys helpfully pointed out.
double edit: Okay, I thought the answer was itertools.chain(), but I couldn't get that to work. Here's an insanely stupid construct that did work:

code:
word_lists = (words for wordslists in doc_lines for words in wordslists)
Now if I can only actually create a mental model of what's happening in that expression, because I kind of trial and error'd it out. I'm fixing my variable names to be more comprehensible.

What you want is equivalent to:
Python code:
def individual_words_from_wordlists(wordlists):
  for wordlist in wordlists:
    for word in wordlist:
      yield word
You called wordlists "doc_lines" and you called "word" "words", but that's the gist of it.

Yes, Python chained for loops in a single expression are a bit yodaspeak. I'm not sure why.

ShadowHawk fucked around with this message at 03:28 on Jan 17, 2015

baka kaba
Jul 19, 2003

PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

Fergus Mac Roich posted:

edit: Wow, I think the answer might actually be in the itertools module you guys helpfully pointed out.
double edit: Okay, I thought the answer was itertools.chain(), but I couldn't get that to work. Here's an insanely stupid construct that did work:

code:
word_lists = (words for wordslists in doc_lines for words in wordslists)
Now if I can only actually create a mental model of what's happening in that expression, because I kind of trial and error'd it out. I'm fixing my variable names to be more comprehensible.

The structure is basically like nesting for loops, so this:
Python code:
word for wordlist in doc_lines for word in wordlist
is basically equivalent to this:
Python code:
for wordlist in doc_lines:
    for word in wordlist:
        word
The word bit on the left is your final output expression, which could be anything like word.upper() or word*2 or whatever you want to produce. Then, reading left to right, you list each of your for loop expressions, which iterate over something. Just like an inner for loop can refer to a variable in an outer for loop, you can refer back to earlier values in the expression - stuff to the left, basically.
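
A runnable version of the comparison, as a sketch with made-up word lists. Both forms are collected into a list here only so the results can be compared; word.upper() is used as the output expression.

```python
doc_lines = [["the", "cat"], [], ["sat", "down"]]  # made-up word lists

# Generator expression: output expression first, then the loops,
# outermost loop first.
flat_expr = list(word.upper() for wordlist in doc_lines for word in wordlist)

# The same thing spelled out as nested loops.
flat_loop = []
for wordlist in doc_lines:
    for word in wordlist:
        flat_loop.append(word.upper())
```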

e- beaten, but also chain should work, but there are two versions of it

itertools.chain(*iterables)
Make an iterator that returns elements from the first iterable until it is exhausted,
then proceeds to the next iterable, until all of the iterables are exhausted.
Used for treating consecutive sequences as a single sequence.

classmethod chain.from_iterable(iterable)
Alternate constructor for chain(). Gets chained inputs from
a single iterable argument that is evaluated lazily.


(This is an amateur explanation coming up) This is a little tricky, but you see that asterisk in the first one? That unpacks a collection of arguments into the individual elements. It turns 'hey chain, handle this box of stuff' into 'hey chain, handle this and this and also this'. Without the asterisk, chain will treat the box (say your collection of iterables) as the -single- iterable you want it to operate on, and it will happily hand you each element inside it - but those elements are the things you actually wanted it to iterate over. So you need to unpack that collection as you hand it to chain, so it gets multiple elements instead of the container. If that makes sense!

Like this:
Python code:
item for item in chain([[a, b, c], [1, 2, 3]])
# iterates over items in the list, producing [a, b, c] and then [1, 2, 3]

item for item in chain(*[[a, b, c], [1, 2, 3]])
# unpacks into the equivalent of
item for item in chain([a, b, c], [1, 2, 3])
# now there are two iterables for chain to iterate over, producing the elements a b c 1 2 3
The alternate constructor up there does the unpacking for you, so you can just do chain.from_iterable([[a, b, c], [1, 2, 3]]) and it will take your single iterable container and open it up to get at the sweet sweet iterables inside, and then hand each of them to chain so it can iterate over them in turn.

baka kaba fucked around with this message at 04:21 on Jan 17, 2015
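
The same three cases in a form that can be pasted into an interpreter, as a sketch. Strings are used in place of the bare a, b, c so the names are defined, and the last line feeds from_iterable a generator to show it working without unpacking.

```python
from itertools import chain

nested = [["a", "b", "c"], [1, 2, 3]]

# One argument: chain iterates over `nested` itself and yields the two
# inner lists unchanged.
not_flat = list(chain(nested))

# Unpacked: chain receives two iterables and yields their elements in turn.
flat_star = list(chain(*nested))

# from_iterable takes the outer iterable directly and walks it lazily,
# so it also works when the outer iterable is a generator.
flat_lazy = list(chain.from_iterable(line.split() for line in ["a b", "c d"]))
```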

Fergus Mac Roich
Nov 5, 2008

Soiled Meat

ShadowHawk posted:

What you want is equivalent to:
Python code:
def individual_words_from_wordlists(wordlists):
  for wordlist in wordlists:
    for word in wordlist:
      yield word
You called wordlists "doc_lines" and you called "word" "words", but that's the gist of it.

Yes, Python chained for loops in a single expression are a bit yodaspeak. I'm not sure why.

I kind of figured that I might be able to do it with a generator function, but in this case I decided(somewhat arbitrarily) that I preferred to be as succinct as I could - I had a strong sense that a generator expression could do it. I'll have to read the style guide to get a better sense of how variable names are supposed to look, I was just using whatever matched the concept I had in my head of what was happening with the function.


baka kaba posted:

...
(This is an amateur explanation coming up) This is a little tricky, but you see that asterisk in the first one? That unpacks a collection of arguments into the individual elements. It turns 'hey chain, handle this box of stuff' into 'hey chain, handle this and this and also this'. The difference is that chain will treat the box (say, your collection of iterables) as the -single- iterable you want it to operate on, and it will happily hand you each element inside it - but those elements are the things you actually wanted it to iterate over. So you need to unpack that collection as you hand it to chain, so it gets multiple elements instead of the container. If that makes sense!
...

Good post, thanks. I should have read the docs more closely, I made a totally wrong assumption about the asterisk and then proceeded to miss the alternate version of chain. I'll swap my solution for chain, actually, because I think it's much more clear.

KICK BAMA KICK
Mar 2, 2009

Fergus Mac Roich posted:

double edit: Okay, I thought the answer was itertools.chain(), but I couldn't get that to work. Here's an insanely stupid construct that did work:
code:
word_lists = (words for wordslists in doc_lines for words in wordslists)
Nothing stupid about it, that's Pythonic as hell. Don't get crazy about nesting generator expressions/list comprehensions, especially if you're throwing in conditions and calculations, but I think one level like that is usually fine.
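For anyone puzzling over the order of the for clauses, here is a small sketch (sample data and function name are made up) showing that the nested generator expression runs its loops in the same order as the written-out version, with the yielded value moved to the front:

```python
doc_lines = [['the', 'quick'], ['brown', 'fox']]  # hypothetical sample data

def individual_words(wordlists):
    # explicit nested loops, outer loop first
    for wordlist in wordlists:
        for word in wordlist:
            yield word

# same loops in the same order, just written on one line
words = (word for wordlist in doc_lines for word in wordlist)

from_function = list(individual_words(doc_lines))
from_expression = list(words)
print(from_expression)
```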

baka kaba
Jul 19, 2003

Also, y'know, Counter() takes an iterable too - you can just stick your word generator in there, like Counter(words), and it will consume it and build up the totals itself. No need to increment things! You don't even need to create a variable: you can just make your generators and then do return Counter(words). See the lines of code vanish before your eyes like dissipating steam!
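A quick sketch of that, assuming a made-up list of word lists standing in for the real document:

```python
from collections import Counter

doc_lines = [['the', 'cat'], ['the', 'hat']]  # hypothetical sample data
words = (word for wordlist in doc_lines for word in wordlist)

# Counter consumes the generator and tallies each word as it goes
counts = Counter(words)
print(counts.most_common(1))
```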

Well, back to Java for me :negative:

ShadowHawk
Jun 25, 2000

CERTIFIED PRE OWNED TESLA OWNER

KICK BAMA KICK posted:

Lambdas are anonymous functions -- one-liners you can use exactly like a function; the key of sorted and the like is the canonical example of their use in Python.
The best quote on this I heard is that if we just called lambda "makefunction" no one would have any problem with it.

quote:

Python code:
def val(item):
    """Get the second item of the given tuple, e.g., the value of a (key, value) pair generated by dict.items()"""
    return item[1]
or
code:
val = lambda item: item[1]
are pretty much exactly equivalent, and the latter is more Pythonic for such a simple function; inlining them anonymously like I did in the previous post is also very common as long as they're simple and don't make the resulting line unreadable.
The latter isn't more Pythonic; the PEP 8 style guide explicitly recommends against it (always use a def statement rather than assigning a lambda to a name). If you need to use the function on more than the line it's created on, you should just use a def statement.
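As far as I understand it, the practical reason PEP 8 gives is the function's name in tracebacks and reprs. A small sketch (val_lambda and pair are names made up for the comparison):

```python
def val(item):
    """Return the second element of a pair."""
    return item[1]

val_lambda = lambda item: item[1]  # the pattern PEP 8 advises against

pair = ('spam', 3)  # hypothetical (key, value) pair

# both behave identically when called
print(val(pair), val_lambda(pair))

# but only the def carries a useful name for tracebacks and debugging
print(val.__name__, val_lambda.__name__)
```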

Fergus Mac Roich
Nov 5, 2008

Soiled Meat
The line between brevity and incoherence is a difficult one to grasp for me, I think.

SurgicalOntologist
Jun 17, 2004

Ah, but sometimes brevity is quite coherent.

Not only can Counter take an iterable, but str.split with no arguments splits on any run of whitespace, so spaces, newlines, and blank lines are all handled. Therefore, your entire code can be replaced with this:

Python code:
from collections import Counter

def word_counts(filename):
    with open(filename) as f:
        raw_text = f.read()
    return Counter(word.lower() for word in raw_text.split())
I don't think any sane person could call that incoherent.
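To see the whitespace handling without needing a file on disk, here is a sketch using a made-up string in place of the file contents:

```python
from collections import Counter

raw_text = "The cat\n\nthe   Hat\n"  # hypothetical file contents

# no argument: split on any run of whitespace, including blank lines
tokens = raw_text.split()
counts = Counter(word.lower() for word in tokens)

print(tokens)
print(counts)
```

Note that punctuation stays attached to the words, so "cat." and "cat" would be counted separately unless you strip it yourself.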

E: you were probably referring to lambdas, so disregard my "incoherence" jabs.

And because I like functional programming in Python and think the toolz library is really cool:
Python code:
from operator import methodcaller
from collections import Counter
from toolz import compose
from toolz.curried import map

word_counts = compose(
    Counter,
    map(str.lower),
    str.split,
    methodcaller('read'),  # wish this could just be file.read
    open,
)
E2: messed it up a few times. If you prefer the opposite order, import pipe instead:

Python code:
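from operator import methodcaller
from collections import Counter
from toolz import pipe
from toolz.curried import map

def word_counts(filename):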
    return pipe(filename,
        open,
        methodcaller('read'),
        str.split,
        map(str.lower),
        Counter,
    )
I think I like the second way better.
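If you don't have toolz installed, the idea behind the two orderings can be sketched with just the standard library. This is not how toolz implements them, only an illustration; pipe_sketch and compose_sketch are made-up names:

```python
from functools import reduce

def pipe_sketch(value, *funcs):
    # feed value through funcs from left to right
    return reduce(lambda acc, func: func(acc), funcs, value)

def compose_sketch(*funcs):
    # build one function that applies funcs from right to left
    return lambda value: pipe_sketch(value, *reversed(funcs))

# compose order: the LAST function listed runs first (upper, then split)
shout = compose_sketch(str.split, str.upper)
composed = shout('hey chain')

# pipe order: functions run in the order they are listed
piped = pipe_sketch('hey chain', str.upper, str.split)

print(composed)
print(piped)
```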

SurgicalOntologist fucked around with this message at 06:02 on Jan 17, 2015

Fergus Mac Roich
Nov 5, 2008

Soiled Meat
Very true. I'll think of some more apps to write and be back when I have more questions. Thank you all.

KICK BAMA KICK
Mar 2, 2009

ShadowHawk posted:

The latter isn't more Pythonic; the PEP 8 style guide explicitly recommends against it (always use a def statement rather than assigning a lambda to a name). If you need to use the function on more than the line it's created on, you should just use a def statement.
OK, I'll concede the PEP 8 point. I never actually use a named lambda in reality; that was just for illustration of the concept. But would you really rather see a full-fledged def than a lambda assigned to a name on its own line, for something that's simple but barely too long to comfortably fit as an anonymous argument, like the key of sorted? Because despite whatever PEP 8 has to say, the latter seems way more readable to me.
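For the specific case of a sorted key that just picks an element out of a pair, there's a third option worth knowing about. A sketch with made-up word counts, comparing the inline lambda with operator.itemgetter from the standard library:

```python
from operator import itemgetter

counts = {'the': 3, 'cat': 1, 'hat': 2}  # hypothetical word counts

# inline anonymous key function
by_lambda = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# stdlib alternative that needs neither a lambda nor a def
by_getter = sorted(counts.items(), key=itemgetter(1), reverse=True)

print(by_lambda)
```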

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

SurgicalOntologist posted:

Ah, but sometimes brevity is quite coherent.

Not only can Counter take an iterable, but str.split with no arguments splits on any run of whitespace, so spaces, newlines, and blank lines are all handled. Therefore, your entire code can be replaced with this:

Python code:
from collections import Counter

def word_counts(filename):
    with open(filename) as f:
        raw_text = f.read()
    return Counter(word.lower() for word in raw_text.split())
I don't think any sane person could call that incoherent.

E: you were probably referring to lambdas, so disregard my "incoherence" jabs.

And because I like functional programming in Python and think the toolz library is really cool:
Python code:
from operator import methodcaller
from collections import Counter
from toolz import compose
from toolz.curried import map

word_counts = compose(
    Counter,
    map(str.lower),
    str.split,
    methodcaller('read'),  # wish this could just be file.read
    open,
)
E2: messed it up a few times. If you prefer the opposite order, import pipe instead:

Python code:
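from operator import methodcaller
from collections import Counter
from toolz import pipe
from toolz.curried import map

def word_counts(filename):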
    return pipe(filename,
        open,
        methodcaller('read'),
        str.split,
        map(str.lower),
        Counter,
    )
I think I like the second way better.

Don't write Lisp in Python, please.
