Python information and short questions megathread.

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Python information and short questions megathread.

Lysidas: Jul 26, 2002; John Diefenbaker is a madman who thinks he's John Diefenbaker.; Pillbug

The names of the columns are accessible as the columns attribute, so the column name that you want is df.columns[1].

# ? Aug 5, 2014 18:02

the: Jul 18, 2004; by Cowcaster

What I'm saying is that I'm trying to do this:

code:

lp = pd.DataFrame.plot(l,x=" CloseDate Id")

And getting

KeyError: u'no item named CloseDate Id'

edit:

vikingstrike posted:

Have you tried df['CloseDate Id']?

code:

In [27]: l[' CloseDate Id']
KeyError: u'no item named  CloseDate Id'

In [28]: l['CloseDate Id']
KeyError: u'no item named CloseDate Id'

Lysidas posted:

The names of the columns are accessible as the columns attribute, so the column name that you want is df.columns[1].

code:

In [29]: l.columns[1]
Out[29]: 'CloseDate'

this works!
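
For anyone hitting the same KeyError: the lookup only succeeds when the string matches a column label exactly, stray whitespace included. A minimal sketch, assuming a made-up frame standing in for l (the labels and their leading spaces are hypothetical, not taken from the real data):

```python
import pandas as pd

# Hypothetical stand-in for `l`; the leading spaces in the labels are an assumption.
df = pd.DataFrame({" Id": [1, 2], " CloseDate": ["2014-08-01", "2014-08-05"]})

# Listing the labels shows the exact strings, including any whitespace.
print(list(df.columns))

# Normalising the labels once makes later lookups predictable.
df.columns = df.columns.str.strip()
close_dates = df["CloseDate"]
```

Looking a label up by position, as in l.columns[1], sidesteps the problem in the same way because it never retypes the string.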

the fucked around with this message at 18:07 on Aug 5, 2014

# ? Aug 5, 2014 18:04

the: Jul 18, 2004; by Cowcaster

Appending onto this, I now can't get a cumsum() model to work with this dataframe.

When I try:

l.cumsum() I get the following:

TypeError: unsupported operand type(s) for +: 'datetime.date' and 'datetime.date'

Which I understand why that's happening, but what I'm trying to do is show a cumulative sum of the amounts over time. So, I'll try to specify an axis:

l.cumsum(axis='Amount')

ValueError: No axis named Amount for object type <class 'pandas.core.frame.DataFrame'>

Yet:

code:

In [52]: l.describe()
Out[52]: 
             Amount
count   1642.000000
mean    3661.104318
std     4529.367173

# ? Aug 5, 2014 18:55

Jose Cuervo: Aug 25, 2004

the posted:

Appending onto this, I now can't get a cumsum() model to work with this dataframe.

When I try:

l.cumsum() I get the following:

TypeError: unsupported operand type(s) for +: 'datetime.date' and 'datetime.date'

Which I understand why that's happening, but what I'm trying to do is show a cumulative sum of the amounts over time. So, I'll try to specify an axis:

l.cumsum(axis='Amount')

ValueError: No axis named Amount for object type <class 'pandas.core.frame.DataFrame'>

Yet:
code:
In [52]: l.describe()
Out[52]: 
             Amount
count   1642.000000
mean    3661.104318
std     4529.367173

I am pretty sure axis in this context refers to rows (axis=0) or columns (axis=1).

EDIT: perhaps try df['Amount'].cumsum()
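
A rough sketch of that suggestion, assuming a small invented frame with the same two columns (the values are made up, and sort_values is the spelling in current pandas): sort by date first, then take the cumulative sum of the one numeric column so the dates are never added together.

```python
import datetime

import pandas as pd

# Invented data; only the column names come from the thread.
df = pd.DataFrame({
    "CloseDate": [
        datetime.date(2014, 8, 1),
        datetime.date(2014, 8, 3),
        datetime.date(2014, 8, 2),
    ],
    "Amount": [100.0, 50.0, 25.0],
})

# Order the rows in time, then accumulate only the numeric column.
df = df.sort_values("CloseDate")
df["Running"] = df["Amount"].cumsum()
print(df)
```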

# ? Aug 5, 2014 18:57

Ahz: Jun 17, 2001; PUT MY CART BACK? I'M BETTER THAN THAT AND YOU! WHERE IS MY BUTLER?!

Jose Cuervo posted:

I am pretty sure axis in this context refers to rows (axis=0) or columns (axis=1).

EDIT: perhaps try df['Amount'].cumsum()

Dropping in to say that the pandas documentation sucks for a stats newb. I finally got my groups and filters working for my main reports, but does anyone know a good "pandas for dummies" article/site out there? I find the syntax of pandas' filtering and grouping methods highly unintuitive. For example, this is the method I found that works for filtering by row value:

Python code:

data_frame = data_frame.loc[data_frame[column_name] >= low_value]
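
It may read less strangely when split into two steps: the comparison builds a boolean Series, and .loc keeps the rows where that Series is True. A small sketch with invented data (the "score" column and the threshold are placeholders for column_name and low_value):

```python
import pandas as pd

# Placeholder data; the names mirror the snippet above.
data_frame = pd.DataFrame({"score": [3, 8, 5, 10]})
column_name = "score"
low_value = 5

# Step 1: one True/False per row.
mask = data_frame[column_name] >= low_value

# Step 2: keep only the rows where the mask is True.
filtered = data_frame.loc[mask]
print(filtered)
```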

# ? Aug 5, 2014 19:19

Jose Cuervo: Aug 25, 2004

Ahz posted:

Dropping in to say that the pandas documentation sucks for a stats newb. I finally got my groups and filters working for my main reports, but does anyone know a good "pandas for dummies" article/site out there? I find the syntax of pandas' filtering and grouping methods highly unintuitive. For example, this is the method I found that works for filtering by row value:
Python code:
data_frame = data_frame.loc[data_frame[column_name] >= low_value]

The pandas documentation for cumsum() seems pretty straightforward to me where it says "Return cumulative sum over requested axis", and all the documentation I have read refers to axis as either row (or index) or column.
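
To make the axis point concrete, here is a tiny sketch on an invented two-column frame: axis=0 accumulates down each column, axis=1 accumulates across each row.

```python
import pandas as pd

# Invented numbers, just to show the direction of accumulation.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

down = df.cumsum(axis=0)    # running total down the rows, per column
across = df.cumsum(axis=1)  # running total across the columns, per row

print(down)
print(across)
```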

# ? Aug 5, 2014 19:36

vikingstrike: Sep 23, 2007; whats happening, captain

Ahz posted:

Dropping in to say that the pandas documentation sucks for a stats newb. I finally got my groups and filters working for my main reports, but does anyone know a good "pandas for dummies" article/site out there? I find the syntax of pandas' filtering and grouping methods highly unintuitive. For example, this is the method I found that works for filtering by row value:
Python code:
data_frame = data_frame.loc[data_frame[column_name] >= low_value]

Wes McKinney, the creator of pandas, has a book called Python for Data Analysis that goes through a bunch of stuff with pandas and matplotlib and may be helpful.

# ? Aug 5, 2014 20:18

Ahz: Jun 17, 2001; PUT MY CART BACK? I'M BETTER THAN THAT AND YOU! WHERE IS MY BUTLER?!

Jose Cuervo posted:

The pandas documentation for cumsum() seems pretty straightforward to me where it says "Return cumulative sum over requested axis", and all the documentation I have read refers to axis as either row (or index) or column.

Contrast with (grouping a data frame by timestamp in minute increments):

http://pandas.pydata.org/pandas-docs/dev/generated/pandas.Series.groupby.html#pandas.Series.groupby

Python code:

grouped_data = data_frame.groupby([pandas.Grouper(freq='60s',key='content_choice_create_ts'),'question_choice_id']).count()

Thank god for Stack Overflow, because passing that Grouper object as a parameter would not have occurred to me from reading the pandas docs.

I think the official documentation has pretty horrible or non-existent example cases, which is what I usually learn from.
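
For reference, a self-contained sketch of that grouping on invented rows. Only the two key column names come from the snippet above; the "answer" column and every value are made up.

```python
import pandas as pd

# Invented rows: three in the 20:00 minute and the 20:01 minute combined.
data_frame = pd.DataFrame({
    "content_choice_create_ts": pd.to_datetime([
        "2014-08-05 20:00:05",
        "2014-08-05 20:00:40",
        "2014-08-05 20:01:10",
        "2014-08-05 20:01:20",
    ]),
    "question_choice_id": [1, 1, 1, 2],
    "answer": ["a", "b", "c", "d"],
})

# Bucket by 60-second window on the timestamp column, then by choice id,
# and count the rows that fall in each (window, id) pair.
grouped_data = data_frame.groupby(
    [pd.Grouper(freq="60s", key="content_choice_create_ts"), "question_choice_id"]
).count()
print(grouped_data)
```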

# ? Aug 5, 2014 20:27

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. - Bertrand Russell

I'm looking at resurrecting something I wrote a long time ago where I stupidly wasn't consistent in my usage of single vs double quotes. I find this slightly irritating.

Any tools out there for converting all of these to something more consistent?

# ? Aug 5, 2014 21:12

The March Hare: Oct 15, 2006; Je rêve d'un
Wayne's World 3; Buglord

Thermopyle posted:

I'm looking at resurrecting something I wrote a long time ago where I stupidly wasn't consistent in my usage of single vs double quotes. I find this slightly irritating.

Any tools out there for converting all of these to something more consistent?

You could just open sublime text, add the files to the current project, cmd shift f, and do a find replace on all files in the project. Might hit snags but it is a way to do it.

# ? Aug 5, 2014 22:01

JHVH-1: Jun 28, 2002

drat, I was just going to check out PythonTidy to see what it can do after reading the post about cleaning up quotes and realized I broke yum by installing mx.DateTime. It had some overflow error preventing yum from working. Removed it and re-installed from whatever the recent version is and things are good again.

# ? Aug 5, 2014 22:07

the: Jul 18, 2004; by Cowcaster

Interview questions, in case anyone wants to try:

1. Write a program that finds all occurrences of letters in a string and stores them to a dict.

2. Fizzbuzz (1-100, 3 for fizz, 5 for buzz, 3+5 for fizzbuzz)
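
For question 2, one possible answer as a sketch (the function name is just a label chosen here; plenty of other shapes are equally fine):

```python
def fizzbuzz(n):
    """Return the FizzBuzz word for n, or n itself as a string."""
    if n % 15 == 0:  # divisible by both 3 and 5
        return "fizzbuzz"
    if n % 3 == 0:
        return "fizz"
    if n % 5 == 0:
        return "buzz"
    return str(n)


lines = [fizzbuzz(n) for n in range(1, 101)]
print("\n".join(lines))
```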

# ? Aug 5, 2014 22:22

EAT THE EGGS RICOLA: May 29, 2008

What level of position is this for?

# ? Aug 5, 2014 22:46

the: Jul 18, 2004; by Cowcaster

EAT THE EGGS RICOLA posted:

What level of position is this for?

Fullstack backend Python developer at a small startup of less than a dozen people.

# ? Aug 5, 2014 23:12

The March Hare: Oct 15, 2006; Je rêve d'un
Wayne's World 3; Buglord

the posted:

Fullstack backend Python developer at a small startup of less than a dozen people.

You are making me think I can maybe get a job... hrmmmm.

# ? Aug 5, 2014 23:13

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. - Bertrand Russell

The March Hare posted:

You could just open sublime text, add the files to the current project, cmd shift f, and do a find replace on all files in the project. Might hit snags but it is a way to do it.

Yeah, but I'm not sure of all the gotchas this might hit and I was hoping someone else had thought through them all. One gotcha I can think of right off the bat is strings containing quotes.
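
One way to dodge most of those gotchas is to let the standard library's tokenize module find the string literals instead of a regex. What follows is only a sketch under loud assumptions: prefer_double_quotes is a made-up helper name, it deliberately skips anything risky (prefixed strings, triple quotes, strings containing a double quote or a backslash), and it has not been run against a real codebase.

```python
import io
import tokenize


def prefer_double_quotes(source):
    """Rewrite simple 'single-quoted' literals as "double-quoted" ones."""
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        simple_single = (
            tok.type == tokenize.STRING
            and text.startswith("'")          # no r/b/u/f prefix
            and not text.startswith("'''")    # leave triple-quoted strings alone
            and '"' not in text               # would need escaping
            and "\\" not in text              # escapes are left untouched
        )
        if simple_single:
            # Same length as before, so token positions stay valid.
            tok = tok._replace(string='"' + text[1:-1] + '"')
        tokens.append(tok)
    return tokenize.untokenize(tokens)


sample = "a = 'x'\nb = 'say \"hi\"'\nc = \"it's\"\n"
converted = prefer_double_quotes(sample)
print(converted)
```

Strings that already contain the other quote character are left as they are, which is exactly the case a blind find and replace would break.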

# ? Aug 5, 2014 23:22

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

Thermopyle posted:

Yeah, but I'm not sure of all the gotchas this might hit and I was hoping someone else had thought through them all. One gotcha I can think of right off the bat is strings containing quotes.

I usually just ctrl+d a few of 'em in Sublime Text, switch 'em, ctrl+d a few more, switch 'em. Makes it a little easier to spot the gotchas than doing a global find & replace.

# ? Aug 5, 2014 23:49

ohgodwhat: Aug 6, 2005

the posted:

1. Write a program that finds all occurrences of letters in a string and stores them to a dict.

Stores the letters? Like a list of letters all equal to the key, I presume? Or a dict where the values are the number of occurrences, which makes more sense to me?

Either way, don't be the guy who put "Theory of Auto-Meta" down as a course he just took in grad school.

# ? Aug 6, 2014 00:05

the: Jul 18, 2004; by Cowcaster

ohgodwhat posted:

Stores the letters? Like a list of letters all equal to the key, I presume? Or a dict where the values are the number of occurrences, which makes more sense to me?

Either way, don't be the guy who put "Theory of Auto-Meta" down as a course he just took in grad school.

Their example was:

string = "abcdaa"

dict = {a:3, b:1, c:1, d:1}

# ? Aug 6, 2014 00:20

tef: May 30, 2004; -> some l-system crap ->

the posted:

Their example was:

string = "abcdaa"

dict = {a:3, b:1, c:1, d:1}

code:

>>> import collections
>>> collections.Counter("abcdaa")
Counter({'a': 3, 'c': 1, 'b': 1, 'd': 1})
>>>

# ? Aug 6, 2014 00:22

OnceIWasAnOstrich: Jul 22, 2006

the posted:

1. Write a program that finds all occurrences of letters in a string and stores them to a dict.

This is the homework question I give on the first day of teaching Python to biologists who don't know how to program, if we aren't including the day that their homework is installing Python.

# ? Aug 6, 2014 00:33

EAT THE EGGS RICOLA: May 29, 2008

Thermopyle posted:

Yeah, but I'm not sure of all the gotchas this might hit and I was hoping someone else had thought through them all. One gotcha I can think of right off the bat is strings containing quotes.

I think PyCharm will give you a hand with that. If not, it will inspect looking for lots of different errors/pep8 violations/etc.

There's a free version and a pro version that comes with a free 30 day trial.

# ? Aug 6, 2014 00:33

EAT THE EGGS RICOLA: May 29, 2008

OnceIWasAnOstrich posted:

This is the homework question I give on the first day of teaching Python to biologists who don't know how to program, if we aren't including the day that their homework is installing Python.

I guess that question and fizz buzz are not the worst if all you're looking for is "have you ever seen python before y/n"

# ? Aug 6, 2014 00:35

Zarithas: Jun 18, 2008

tef posted:

code:

>>> import collections
>>> collections.Counter("abcdaa")
Counter({'a': 3, 'c': 1, 'b': 1, 'd': 1})
>>>

This is the most idiomatic way.

The answers they were probably looking for, to show you know how to use dicts and such:

code:

from collections import defaultdict

d = defaultdict(int)

for c in string:
    d[c] += 1

code:

d = {}

for c in string:
    d[c] = d.setdefault(c, 0) + 1
    # Or
    # d[c] = d.get(c, 0) + 1

or the ugly way

code:

d = {}

for c in string:
    if c not in d:
        d[c] = 1
    else:
        d[c] += 1

I'm also quite surprised that this and fizzbuzz were the only things asked for a "fullstack backend Python dev" position. These seem more like questions I'd ask a complete beginner to the language. I might throw this in just to see if a candidate knows about builtin and standard lib functionality like setdefault and defaultdict, but I'd also ask some much harder questions as well.

Is this just a stage 1 interview? It might make sense if they're just weeding out the cruft.

Zarithas fucked around with this message at 02:40 on Aug 6, 2014

# ? Aug 6, 2014 02:38

the: Jul 18, 2004; by Cowcaster

Yes, it is a Stage 1 interview. I probably didn't get through, since the 3rd and 4th questions were about jQuery and MySQL.

# ? Aug 6, 2014 03:07

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. - Bertrand Russell

EAT THE EGGS RICOLA posted:

I think PyCharm will give you a hand with that. If not, it will inspect looking for lots of different errors/pep8 violations/etc.

There's a free version and a pro version that comes with a free 30 day trial.

Yeah, I've used Pro for years. AFAICT it doesn't really have anything beyond regex search and replace. I'll dig in to it deeper later.

# ? Aug 6, 2014 03:14

namaste friends: Sep 18, 2004; by Smythe

Thermopyle posted:

Yeah, I've used Pro for years. AFAICT it doesn't really have anything beyond regex search and replace. I'll dig in to it deeper later.

It's got a really neat remote auto uploader and remote compiler.

# ? Aug 6, 2014 03:15

ohgodwhat: Aug 6, 2005

Zarithas posted:

Is this just a stage 1 interview? It might make sense if they're just weeding out the cruft.

What are some good, harder python specific questions anyway?

I asked about meta classes once and the guy just went 'I know all about multiple classes, yes!' uuuugh

# ? Aug 6, 2014 03:27

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. - Bertrand Russell

Cultural Imperial posted:

It's got a really neat remote auto uploader and remote compiler.

Sorry, I meant that it didn't have anything besides the search and replace for the purposes I was talking about.

It has tons of other neat features.

# ? Aug 6, 2014 03:31

SurgicalOntologist: Jun 17, 2004

Okay I've been banging my head on something for a couple days so I'm coming here for help.

In part as an exercise, I'm trying to convert a project to use asyncio. The part I'm working on now is managing a server process called out with the asyncio version of subprocess.

I'm running into a problem with stopping the process. Here's the integration(?) test I'm trying to pass:

Python code:

@async_test
def test_server_process_noop(loop):
    """Make sure the server exits after the with statement."""
    with (yield from Server([], loop=loop)) as server:
        assert server.is_running
    assert not server.is_running

(Calling the server with an empty list will do nothing, no stdout or stderr output, just hang indefinitely. So I'm just trying to test if I can start a process and terminate it.)

I had to jump through some hoops to set up the async_test decorator (got it from a Stack Overflow question) as well as the yield from in the with statement (got the technique from asyncio.locks.Lock). Both seem to be working: async_test is working fine in unit tests, and __enter__ and __exit__ are getting called appropriately according to my debug messages.

Anyways... what happens is the test fails on the second assert, but when I check ps -ef it's not actually running. Here's the relevant parts of the source code:

Python code:

class Server:
    ...

    @property
    def is_running(self):
        return self.process.returncode is None if self.process else False

    ...

    @asyncio.coroutine
    def stop(self, kill=False):
        if self.is_running:
            self.cancel_monitoring()
            if kill:
                self.process.kill()
            else:
                self.process.terminate()
            info('{} sent to server process.'.format('SIGKILL' if kill else 'SIGTERM'))

            debug('yielding from coroutine Server.process.wait')
            exitcode = yield from self.process.wait()
            debug('exit code: {}'.format(exitcode))

    ...

As you may have guessed, self.process is an asyncio.subprocess.Process object. Here are the relevant logs:

code:

server.py                  255 DEBUG    running coroutine Server.stop with asyncio.async
base_events.py             791 DEBUG    poll 0.000 took 0.000 seconds
server.py                  211 DEBUG    canceling Task monitoring stdout
server.py                  211 DEBUG    canceling Task monitoring stderr
server.py                  193 INFO     SIGTERM sent to server process.
server.py                  195 DEBUG    yielding from coroutine Server.process.wait

And then it just ends, or yields control back to the test I suppose.

The interesting thing is that we never get to that next debug message that gives us the exit code, and the process apparently is still running when it gives control back to the test, but it closes soon after. I suppose Server.stop is ceding control back to the test. So I tried putting asyncio.sleep in the test, before the final assert, hoping to cede control back to Server.stop. But, same problem.

Compare to the example in the docs.

Now I'm thinking it's because of the async_test decorator, so here it is:

Python code:

def async_test(func):
    @functools.wraps(func)
    def wrapper(loop, *args, **kwargs):
        coro = asyncio.coroutine(func)
        loop.run_until_complete(coro(loop, *args, **kwargs))
    return wrapper

Maybe since run_until_complete only refers to the test coroutine, but not Server.stop, the event loop stops before that yield from can finish (but then the sleep should have helped, right?). Not sure what to do about it if that's the case.

Anyways, if any of you actually read through all that, thanks. This is difficult stuff and I'm in over my head. Maybe this isn't such a big deal if it just affects testing (nothing will ever need to happen after the with statement), but it could be indicative of bigger problems. Testing that the process exits is important because if it doesn't strange errors will start happening on the next run.

Also, I checked outside of Python and the "hanging noop server" does indeed respond to SIGTERM properly.

SurgicalOntologist fucked around with this message at 07:51 on Aug 6, 2014

# ? Aug 6, 2014 07:49

BeefofAges: Jun 5, 2004; Cry 'Havoc!', and let slip the cows of war.

ohgodwhat posted:

What are some good, harder python specific questions anyway?

I asked about meta classes once and the guy just went 'I know all about multiple classes, yes!' uuuugh

I consider metaclasses to be a pretty esoteric language feature, and what I expect most people to know about them is that you probably shouldn't use them unless you have a really good reason.

Asking about list comprehensions is fair game - they're a neat Python feature that people should be familiar with.

You could try asking if Python is pass-by-value or pass-by-reference. (trick question, it's not quite either - http://robertheaton.com/2014/02/09/pythons-pass-by-object-reference-as-explained-by-philip-k-dick/ ) I wouldn't expect people to answer this correctly, but it would be interesting to see them try to reason their way through it based on the Python behavior they're familiar with.
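
A short sketch of the behavior that question is fishing for (the function names are just labels): rebinding a parameter inside a function does not affect the caller, but mutating the object it refers to does.

```python
def rebind(items):
    # Points the local name at a new list; the caller's list is untouched.
    items = [0]


def mutate(items):
    # Changes the one list object that caller and callee both refer to.
    items.append(4)


numbers = [1, 2, 3]

rebind(numbers)
after_rebind = list(numbers)

mutate(numbers)
after_mutate = list(numbers)

print(after_rebind, after_mutate)
```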

# ? Aug 6, 2014 16:23

Munkeymon: Aug 14, 2003; Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.

the posted:

Fullstack backend Python developer at a small startup of less than a dozen people.

Here's the one I did for an interview I had the other week:

quote:

Using the CDC Birth Vital Statistics data set and Python, design and develop one or more insightful views of the data, data facets, or subsets of data. What did you learn and can demonstrate from the data that surprised you? Bonus points for using AppEngine.

Because they're an AppEngine shop. I was surprised how quick and easy it was to get my idiot baby "compute r of any two things and group by a third" idea working on AppEngine having never used it before and I'm not even a full time Python dev right now.

What I'm getting at is think of something harder or just steal that idea

# ? Aug 6, 2014 16:51

the: Jul 18, 2004; by Cowcaster

I'm guessing this is possible, but I need to know how.

I want to drop rows in a dataframe where the value of said row for Column Y meets a condition (say if that cell = 0). An example would be dropping all rows from a Census data set where the person's age was below 5.

The Pandas documentation on the drop() command is... sparse. From searching, I've seen ways to drop rows based on naming the specific row, but I need a conditional statement here.

edit: This solved it:

code:

l = l[l.Amount != 0]
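
For comparison, drop() can do the same job if it is handed the index labels of the unwanted rows. A sketch on invented data (only the Amount column name comes from the thread), showing that both spellings give the same result:

```python
import pandas as pd

# Invented data standing in for `l`.
df = pd.DataFrame({"Amount": [0, 10, 0, 25], "Name": ["a", "b", "c", "d"]})

# Boolean-mask version, as in the edit above.
kept_by_mask = df[df["Amount"] != 0]

# drop() version: look up the labels of the rows to remove, then drop them.
kept_by_drop = df.drop(df.index[df["Amount"] == 0])

print(kept_by_mask.equals(kept_by_drop))
```

The mask version is the one most pandas code uses; the drop() form mainly helps when the labels to remove are already known.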

the fucked around with this message at 17:23 on Aug 6, 2014

# ? Aug 6, 2014 17:21

vikingstrike: Sep 23, 2007; whats happening, captain

the posted:

I'm guessing this is possible, but I need to know how.

I want to drop rows in a dataframe where the value of said row for Column Y meets a condition (say if that cell = 0). An example would be dropping all rows from a Census data set where the person's age was below 5.

The Pandas documentation on the drop() command is... sparse. From searching, I've seen ways to drop rows based on naming the specific row, but I need a conditional statement here.

edit: This solved it:
code:
l = l[l.Amount != 0]

Drop is for deleting variables in a data frame.

# ? Aug 6, 2014 18:12

ShadowHawk: Jun 25, 2000; CERTIFIED PRE OWNED TESLA OWNER

the posted:

I'm guessing this is possible, but I need to know how.

I want to drop rows in a dataframe where the value of said row for Column Y meets a condition (say if that cell = 0). An example would be dropping all rows from a Census data set where the person's age was below 5.

The Pandas documentation on the drop() command is... sparse. From searching, I've seen ways to drop rows based on naming the specific row, but I need a conditional statement here.

edit: This solved it:
code:
l = l[l.Amount != 0]

I think this copies the whole object in memory, then reassigns the original -- is there a more efficient way to do that?

# ? Aug 6, 2014 18:28

SurgicalOntologist: Jun 17, 2004

After some more investigation, I've determined that asyncio.subprocess.Process.wait hangs on my machine for any subprocess, even something like ls. Even the example in the asyncio docs (which runs python -m platform) hangs at yield from proc.wait(). Any idea what could be happening here?

Edit: hmm, seems to be because I'm using asyncio.new_event_loop() rather than asyncio.get_event_loop(). Maybe I'm onto the solution.
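
A possible explanation, offered with some hedging: in the asyncio of that era on Unix, the child watcher that notices subprocess exits was tied to the default event loop, so a loop made with new_event_loop() could miss the exit unless it was also installed with asyncio.set_event_loop(). The sketch below is not the Server class from the earlier post. It uses the newer async/await spelling and asyncio.run, and the sleeping child is an invented stand-in for the hanging server. It shows the start, terminate, wait sequence completing on one loop:

```python
import asyncio
import sys


async def start_and_stop():
    # Invented stand-in for the "hanging noop server": a child that just sleeps.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "import time; time.sleep(60)"
    )
    was_running = proc.returncode is None

    proc.terminate()
    # wait() resolves once the loop has been told the child exited;
    # the timeout keeps a hang from blocking forever.
    exit_code = await asyncio.wait_for(proc.wait(), timeout=10)
    return was_running, exit_code


was_running, exit_code = asyncio.run(start_and_stop())
print(was_running, exit_code)
```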

SurgicalOntologist fucked around with this message at 21:11 on Aug 6, 2014

# ? Aug 6, 2014 21:07

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

I want to write a python library for interacting with a particular API (which does not have a python library yet) using the requests library. Is it appropriate to raise exceptions if something like a login() call fails? Should it be a custom exception type? Is there a nice library I can look at for ideas about this kind of stuff?

# ? Aug 6, 2014 21:42

namaste friends: Sep 18, 2004; by Smythe

fletcher posted:

I want to write a python library for interacting with a particular API (which does not have a python library yet) using the requests library. Is it appropriate to raise exceptions if something like a login() call fails? Should it be a custom exception type? Is there a nice library I can look at for ideas about this kind of stuff?

What's wrong with using requests exception code?

# ? Aug 6, 2014 21:59

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

Cultural Imperial posted:

What's wrong with using requests exception code?

The API seems to always return a 200 OK with details about the error in the response payload, rather than using meaningful HTTP codes for certain errors.
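
In that situation a small exception hierarchy owned by the library seems reasonable. A sketch only: the class names, the "error", "code" and "message" keys, and the "bad_credentials" value are all invented here, since the real payload format isn't shown. It takes an already-parsed payload so it does not depend on requests.

```python
class ApiError(Exception):
    """Base class for errors reported inside a 200 OK payload."""


class LoginFailed(ApiError):
    """Raised when the payload says the credentials were rejected."""


def check_payload(payload):
    """Return the payload unchanged, or raise if it carries an error.

    `payload` is the dict parsed from the response body; the key names
    used below are hypothetical.
    """
    error = payload.get("error")
    if error is None:
        return payload
    message = error.get("message", "")
    if error.get("code") == "bad_credentials":
        raise LoginFailed(message)
    raise ApiError(message)


ok = check_payload({"result": 1})
print(ok)
```

Callers can then catch ApiError for anything the service reports, or LoginFailed specifically, while network-level failures still surface as the usual requests exceptions.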

# ? Aug 6, 2014 23:21

ShadowHawk: Jun 25, 2000; CERTIFIED PRE OWNED TESLA OWNER

SurgicalOntologist posted:

After some more investigation, I've determined that asyncio.subprocess.Process.wait hangs on my machine for any subprocess, even something like ls. Even the example in the asyncio docs (which runs python -m platform) hangs at yield from proc.wait(). Any idea what could be happening here?

Edit: hmm, seems to be because I'm using asyncio.new_event_loop() rather than asyncio.get_event_loop(). Maybe I'm onto the solution.

Just a stab in the dark but do you have a namespace collision by defining your own stop() in the above?

# ? Aug 7, 2014 01:50
