  • Locked thread
tef
May 30, 2004

-> some l-system crap ->
You made me worry there :3:

German Joey
Dec 18, 2004

well how about that.

JetsGuy
Sep 17, 2003

science + hockey
=
LASER SKATES

ShoulderDaemon posted:

If you're running a sh script, then the script starts its own shell. It doesn't borrow the shell it's invoked through.


There is the interactive shell that you start your python script in.
Then there is a different, noninteractive shell that python invokes because you set shell=True.
Then there is a different, noninteractive shell that the shell script runs in.

If you set shell=False, then python will avoid uselessly starting that extra shell in the middle, and you can avoid some security issues.

OK, just for my own edification: I don't start a Python shell. When I say I am running scripts from the shell, I mean from the bash prompt. Is that first shell the one that Python is running in order to execute the script?

So I guess my question then is what's the point of invoking shell=True then? I could have sworn that I had trouble running bash commands from within an interactive shell in the past, but now it seems to be ok. If stuff like subprocess.Popen will work with or without shell=True, what benefit do you get using it?

fritz
Jul 26, 2003

For that merge sort example, isn't that an awful lot of calls to .append()?

Emacs Headroom
Aug 2, 2003

fritz posted:

For that merge sort example, isn't that an awful lot of calls to .append()?

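I believe append only resizes the list when the pre-allocated space is full, and each new allocation is proportionally bigger than the last (in CPython the growth is a modest fraction rather than strict doubling), so the list gets copied only O(log n) times and append is amortized constant time.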

http://stackoverflow.com/questions/311775/python-create-a-list-with-initial-capacity

edit: and of course the entries in the list itself consist of references, so there's no larger penalty for appending larger objects etc.
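
If you want to see the over-allocation for yourself, here is a rough sketch. It assumes CPython (sys.getsizeof on a list reflects the allocated capacity there) and Python 3; count_resizes is just a name made up for this example, and the exact number of resizes is an implementation detail that can change between versions.

```python
import sys

def count_resizes(n):
    """Append n items one at a time and count how often the allocation grows."""
    items = []
    last_size = sys.getsizeof(items)
    resizes = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            # The reported size only changes when the list was reallocated
            resizes += 1
            last_size = size
    return resizes

# Far fewer resizes than appends, which is why append is cheap on average
print(count_resizes(100000))
```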

fritz
Jul 26, 2003

Ridgely_Fan posted:

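I believe append only resizes the list when the pre-allocated space is full, and each new allocation is proportionally bigger than the last (in CPython the growth is a modest fraction rather than strict doubling), so the list gets copied only O(log n) times and append is amortized constant time.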

http://stackoverflow.com/questions/311775/python-create-a-list-with-initial-capacity

edit: and of course the entries in the list itself consist of references, so there's no larger penalty for appending larger objects etc.

Oh, cool, maybe I was thinking of a different language.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

JetsGuy posted:

So I guess my question then is what's the point of invoking shell=True then?

To allow morons to write code like this:

code:
p = subprocess.Popen("%s | sort | uniq | wc -l" % (command,), shell=True, stdout=subprocess.PIPE)
p.stdout.read()
Don't do that. Any use of shell=True is an instant red flag.
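
For comparison, a minimal sketch of getting the same count without a shell, assuming Python 3. The command is passed as a list of arguments, so nothing in it is ever interpreted by a shell, and the sort | uniq | wc -l part is done in Python. count_unique_lines and the demo command are hypothetical names invented for this example.

```python
import subprocess
import sys

def count_unique_lines(argv):
    """Run argv (a list of arguments, no shell involved) and count distinct output lines."""
    output = subprocess.check_output(argv)
    # A set drops duplicate lines, like sort | uniq; len() plays the role of wc -l
    return len(set(output.splitlines()))

# Hypothetical command for the demo: a child Python process that prints three lines
demo = [sys.executable, "-c", "print('a'); print('b'); print('a')"]
print(count_unique_lines(demo))
```

Because the arguments never pass through a shell, a filename containing spaces, semicolons or backticks is just a filename.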

leterip
Aug 25, 2004

tef posted:

Despite this, people will tell you its performance without profiling it. Don't listen to these people, they are bad people. (also they do not know how python's sort works)

You guys might enjoy finding out how sort works in python :v: http://en.wikipedia.org/wiki/Timsort

Oh right, I forgot Timsort is amazing and uses ranges of already sorted data to run quicker. Thanks for pointing that out.
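
A rough way to see that effect is to count comparisons, sketched here for Python 3. It assumes the built-in sort is Timsort-like (true for CPython); count_comparisons is a name made up for the example, and the exact counts will vary by version, so none are quoted.

```python
import functools
import random

def count_comparisons(data):
    """Sort a copy of data and return how many comparisons the sort made."""
    calls = [0]  # a list so the nested function can update the counter

    def compare(a, b):
        calls[0] += 1
        return (a > b) - (a < b)

    sorted(data, key=functools.cmp_to_key(compare))
    return calls[0]

n = 1000
already_sorted = list(range(n))
shuffled = already_sorted[:]
random.Random(0).shuffle(shuffled)  # fixed seed so the run is repeatable

print("sorted input:  ", count_comparisons(already_sorted))
print("shuffled input:", count_comparisons(shuffled))
```

The already-sorted input should need far fewer comparisons than the shuffled one, which is the "runs of sorted data" trick at work.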

tef
May 30, 2004

-> some l-system crap ->
Tim Peters :swoon:

lunar detritus
May 6, 2009


Thanks everyone! I just began learning so I wasn't sure what was meant by "linear time".

JetsGuy
Sep 17, 2003

science + hockey
=
LASER SKATES

Suspicious Dish posted:

To allow morons to write code like this:

code:
p = subprocess.Popen("%s | sort | uniq | wc -l" % (command,), shell=True, stdout=subprocess.PIPE)
p.stdout.read()
Don't do that. Any use of shell=True is an instant red flag.

I'm gonna have to admit that I'm an idiot and I don't see what that command would do... :(

fritz posted:

For that merge sort example, isn't that an awful lot of calls to .append()?

I used to use append a LOT, but I found numpy.vstack and numpy.hstack to be even better (and often times clearer to use).

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

JetsGuy posted:

I'm gonna have to admit that I'm an idiot and I don't see what that command would do... :(

It's a standard UNIX thing to get the number of unique lines in a file.

No Safe Word
Feb 26, 2005

Suspicious Dish posted:

It's a standard UNIX thing to get the number of unique lines in a file.

Neglecting the fact that sort has a flag -u that makes sort | uniq unnecessary :v:

tef
May 30, 2004

-> some l-system crap ->
creeping featurism :bahgawd:

http://www.in-ulm.de/~mascheck/various/uuoc/kp.pdf

JetsGuy
Sep 17, 2003

science + hockey
=
LASER SKATES
Crossposting this from the sci computing thread. Since a few of y'all use matplotlib, I figure you may want to know.

Jetsguy posted:

JetsGuy posted:

Argh, I'm sorry, this is a really stupid question, but I can't for the life of me seem to google the right words to find what I want.

When I want to format the tick marks in a matplotlib graph, I generally do stuff like:

code:
for label in ax1.xaxis.get_ticklabels():
    label.set_fontsize(20)
for lines in ax1.xaxis.get_ticklines():
    lines.set_markeredgewidth(1)
    lines.set_markersize(12)
for lines in ax1.xaxis.get_ticklines(minor=True):
    lines.set_markeredgewidth(1)
    lines.set_markersize(6)
for label in ax1.yaxis.get_ticklabels():
    label.set_fontsize(20)
for lines in ax1.yaxis.get_ticklines():
    lines.set_markeredgewidth(1)
    lines.set_markersize(12)
for lines in ax1.yaxis.get_ticklines(minor=True):
    lines.set_markeredgewidth(1)
    lines.set_markersize(6)
However, back when someone asked me to teach them to plot in python, I found that matplotlib had finally made this poo poo easier and there was a MUCH easier set of commands to format how the ticks look.

I cannot, for the life of me, find this command. Can anyone help, please?

Lucky for me, one of my colleagues takes awesome notes and had hardcopies of all the poo poo I taught about plotting with python.

turns out you can do all that poo poo with the new command:
matplotlib.pyplot.tick_params()

http://matplotlib.sourceforge.net/users/whats_new.html#tick-params

You *need* matplotlib 1.1.0+ though
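
For reference, a sketch of what the loops above might collapse to with tick_params. It assumes that labelsize, length and width correspond to the font size, marker size and marker edge width set by hand earlier; the plotted data and the Agg backend are placeholders so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax1 = plt.subplots()
ax1.plot([0, 1, 2, 3], [0, 1, 4, 9])  # placeholder data
ax1.minorticks_on()

# Major ticks on both axes: label size 20, tick length 12, tick width 1
ax1.tick_params(axis="both", which="major", labelsize=20, length=12, width=1)
# Minor ticks on both axes: tick length 6, tick width 1
ax1.tick_params(axis="both", which="minor", length=6, width=1)

fig.canvas.draw()
```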

lunar detritus
May 6, 2009


I managed to make a spider that gets the info I needed with scrapy but I'm getting strings like this in the json file:
code:
"{"ozname": ["1492: La conquista del para\u00edso / 1492 The Conquest of Paradise"]"
When I read and print the json file I get:
code:
"u'ozname': [u'1492: La conquista del para\xedso / 1492 The Conquest of Paradise']}"
How can I transform those encoded bytes to their unicode characters? The previous string should say 'La conquista del paraíso'.

I tried unicode, decode and encode but I'm pretty sure I'm doing something wrong because nothing works. :smith:



vvvvvv
The real page displays it as "1492: La conquista del paraíso / 1492 The Conquest of Paradise" so I think scrapy is encoding it weird.


Never mind, the json exporter escapes unicode characters by default.
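
For anyone else who hits this, a small sketch with the standard json module (not scrapy's exporter, whose options may differ), written for Python 3. The \u00ed in the file is just an escaped spelling of the same character, so loading the file gives back the right string either way; ensure_ascii=False only changes how it looks on disk.

```python
import json

record = {"ozname": ["1492: La conquista del para\u00edso / 1492 The Conquest of Paradise"]}

escaped = json.dumps(record)                       # default: non-ASCII is written as \uXXXX
readable = json.dumps(record, ensure_ascii=False)  # keeps the accented character as-is

print(escaped)
print("\\u00ed" in escaped)   # the escape sequence is present in the default output
print("\u00ed" in readable)   # the real character is present in the readable output

# Both spellings decode back to exactly the same data
assert json.loads(escaped) == json.loads(readable) == record
```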

lunar detritus fucked around with this message at 20:43 on May 23, 2012

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

gmq posted:

I managed to make a spider that gets the info I needed with scrapy but I'm getting strings like this in the json file:
code:
"{"ozname": ["1492: La conquista del para\u00edso / 1492 The Conquest of Paradise"]"

Is this the data you get back from the server? You're hosed.

Jo
Jan 24, 2005

:allears:
Soiled Meat
This is driving me nuts and Google isn't helping.

code:
from Tkinter import *

class App(object):
	def __init__(self, parent):
			frame = Frame(parent);
			label = Label(frame, text="blah");
			label.grid(row=1, column=1); # This kills the app.
			frame.pack();

root = Tk();
app = App(root);
root.mainloop();
The label.grid() action appears to kill my program. It just sits and spins, doing nothing. I can use grid on most any other widget. The salt on the wound is that I can't debug using pdb; input locks up when that instruction gets run. I might be able to try a little harder, but I'm just peeved about the whole scenario. Any guesses as to why grid doesn't work on labels?

Emacs Headroom
Aug 2, 2003

Jo posted:

This is driving me nuts and Google isn't helping.

code:
from Tkinter import *

class App(object):
	def __init__(self, parent):
			frame = Frame(parent);
			label = Label(frame, text="blah");
			label.grid(row=1, column=1); # This kills the app.
			frame.pack();

root = Tk();
app = App(root);
root.mainloop();
The label.grid() action appears to kill my program. It just sits and spins, doing nothing. I can use grid on most any other widget. The salt on the wound is that I can't debug using pdb; input locks up when that instruction gets run. I might be able to try a little harder, but I'm just peeved about the whole scenario. Any guesses as to why grid doesn't work on labels?

I don't know about Tk stuff, but if the input and ctrl-c have locked, I think it's usually because a thread has stolen the GIL and your main thread can't get it back to process the input.

Not that the knowledge will help in any way.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
There's a reason no real applications are built on Tkinter.

fart simpson
Jul 2, 2005

DEATH TO AMERICA
:xickos:

By the way, you don't use semicolons to end lines in Python

onionradish
Jul 6, 2006

That's spicy.
I want to find specific sequence patterns in a list of tuples, based on the second element in the tuple.

For example, given a list like:
code:
[(dog, animal), (cow, animal), (corn, vegetable), (granite, mineral), (carrot, vegetable), (cat, animal), (cow, animal)]
I'd like to find patterns in the sequence, like "animal, animal" and get [[dog, cow],[cat, cow]] or "vegetable, mineral, vegetable" and get [corn, granite, carrot], or "vegetable, vegetable" and get nothing.

I can think of ways to do this by manually iterating through the list and testing the second tuple element against a bunch of IF statements, but suspect there's a smarter, more Python-like way to do it. I've just started with the language, and am already blown away by how much code Python's constructors eliminate.

king salmon
Oct 30, 2011

by Cowcaster

onionradish posted:

I want to find specific sequence patterns in a list of tuples, based on the second element in the tuple.

For example, given a list like:
code:
[(dog, animal), (cow, animal), (corn, vegetable), (granite, mineral), (carrot, vegetable), (cat, animal), (cow, animal)]
I'd like to find patterns in the sequence, like "animal, animal" and get [[dog, cow],[cat, cow]] or "vegetable, mineral, vegetable" and get [corn, granite, carrot], or "vegetable, vegetable" and get nothing.

I can think of ways to do this by manually iterating through the list and testing the second tuple element against a bunch of IF statements, but suspect there's a smarter, more Python-like way to do it. I've just started with the language, and am already blown away by how much code Python's constructors eliminate.

Use a dictionary:
d = {'animal': [dog, cow, cat], 'vegetable': [corn, carrot], 'mineral': [granite]}

Then you can access the dictionary by category: d['animal'] gives you a list of your animals.

onionradish
Jul 6, 2006

That's spicy.
Will that give me the sequence, though? The order will matter, and that's the part I'm not sure how to do "elegantly." For example, "animal, vegetable" vs. "vegetable, animal" should return [cow, corn] and [carrot, cat] respectively. Maybe a Regular Expression?

However, the dictionary form would be helpful elsewhere in code as a list of items in a category. Is there an easy way to convert the list example (assumed assigned to a variable) to that dictionary format? I'm doing it currently by iterating the list with separate constructors with custom IF statements. The categories (second tuple parameter) are hard-wired, but the entries (first tuple parameter) will vary.

onionradish fucked around with this message at 18:35 on May 24, 2012

Modern Pragmatist
Aug 20, 2008

onionradish posted:

I want to find specific sequence patterns in a list of tuples, based on the second element in the tuple.

For example, given a list like:
code:
[(dog, animal), (cow, animal), (corn, vegetable), (granite, mineral), (carrot, vegetable), (cat, animal), (cow, animal)]
I'd like to find patterns in the sequence, like "animal, animal" and get [[dog, cow],[cat, cow]] or "vegetable, mineral, vegetable" and get [corn, granite, carrot], or "vegetable, vegetable" and get nothing.

I can think of ways to do this by manually iterating through the list and testing the second tuple element against a bunch of IF statements, but suspect there's a smarter, more Python-like way to do it. I've just started with the language, and am already blown away by how much code Python's constructors eliminate.
code:
[name,category] = zip(*a)	# a is the list you provided
EDIT: Oh. I see what you want to do. You can disregard that.

Emacs Headroom
Aug 2, 2003
It actually took a minute to figure out what you were doing. So you have a list of things and a list of categories, and order in the list matters, and you want to return examples where the categories go in a particular sequence?

For this I'd probably switch over to numpy.

Use integer arrays, and say let 'animal' = 0, 'vegetable' = 2, 'mineral' = 3

Then you can do something like:

Python code:
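def find_example_locations(category_array, example):
    # Compare as lists: == on numpy arrays is elementwise and can't be used in an if
    example = list(example)
    n = len(example)
    locations = []
    # +1 so the last window at the end of the array is checked too
    for i in range(0, len(category_array) - n + 1):
        if list(category_array[i:i+n]) == example:
            locations.append(i)
    return locations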
That would give you the indices where your examples occurred, and you can grab the entries from your thing_list (or thing_array) with 'dog', 'cow', etc. (or integers representing these things).

If performance isn't a big deal (give or take a constant factor), you could use lists of integers instead of numpy arrays, but it would be somewhat slower.
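
Here is a sketch of the whole idea end to end using plain lists, written for Python 3. The integer codes, the split into things/labels and the matches helper are all made up for illustration; the windowed comparison is one straightforward way to do the lookup, not necessarily the fastest.

```python
def find_example_locations(categories, example):
    """Return every index where the sequence `example` starts in `categories`."""
    n = len(example)
    return [i for i in range(len(categories) - n + 1)
            if categories[i:i + n] == example]

things = ['dog', 'cow', 'corn', 'granite', 'carrot', 'cat', 'cow']
labels = ['animal', 'animal', 'vegetable', 'mineral', 'vegetable', 'animal', 'animal']

# Hypothetical integer codes; any distinct integers would do
codes = {'animal': 0, 'vegetable': 1, 'mineral': 2}
category_list = [codes[label] for label in labels]

def matches(pattern):
    """Look up the things at each location where the category pattern occurs."""
    example = [codes[label] for label in pattern]
    return [things[i:i + len(example)]
            for i in find_example_locations(category_list, example)]

print(matches(['animal', 'animal']))
print(matches(['vegetable', 'mineral', 'vegetable']))
print(matches(['vegetable', 'vegetable']))
```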

FoiledAgain
May 6, 2007

onionradish posted:

However, the dictionary form would be helpful elsewhere in code as a list of items in a category. Is there an easy way to convert the list example (assumed assigned to a variable) to that dictionary format? I'm doing it currently by iterating the list with separate constructors with custom IF statements. The categories (second tuple parameter) are hard-wired, but the entries (first tuple parameter) will vary.


I would do it this way:

Python code:
from collections import defaultdict

thelist = [('dog', 'animal'), ('cow', 'animal'), ('corn', 'vegetable'), ('granite', 'mineral'), 
('carrot', 'vegetable'), ('cat', 'animal'), ('cow', 'animal')]

d = defaultdict(list)

for item in thelist:
    d[item[1]].append(item[0])

print d['animal']

>>> ['dog', 'cow', 'cat', 'cow']

defaultdict makes a dictionary that, when you look up a key that isn't there yet, calls the factory you gave it to create a default value. In this case I gave it list, so every missing key starts out as an empty list. Notice by the way that I passed list itself, which is a type, and not list(), which would just be an empty list ([]).

You can do the same thing with a try...except block:

Python code:

thelist = [('dog', 'animal'), ('cow', 'animal'), ('corn', 'vegetable'), ('granite', 'mineral'), 
('carrot', 'vegetable'), ('cat', 'animal'), ('cow', 'animal')]

d = dict()

for item in thelist:
    try:
        d[item[1]].append(item[0])
    except KeyError:
        d[item[1]] = list()
        d[item[1]].append(item[0])

FoiledAgain fucked around with this message at 20:36 on May 24, 2012

Emacs Headroom
Aug 2, 2003
You don't have to use defaultdict or exceptions to get default values in Python dictionaries:

Python code:
d = {}

for thing in thelist:
    d[thing[1]] = d.get(thing[1], []).append(thing[0])

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug

Ridgely_Fan posted:

You don't have to use defaultdict or exceptions to get default values in Python dictionaries:

Python code:
d = {}

for thing in thelist:
    d[thing[1]] = d.get(thing[1], []).append(thing[0])

Your code doesn't work.

code:
In [1]: thelist = [('dog', 'animal'), ('cow', 'animal'), ('corn', 'vegetable'), ('granite', 'mineral'), 
   ...: ('carrot', 'vegetable'), ('cat', 'animal'), ('cow', 'animal')]

In [2]: d = {}

In [3]: for thing in thelist:
   ...:     d[thing[1]] = d.get(thing[1], []).append(thing[0])
   ...:     
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/home/lysidas/<ipython-input-3-b935eb784dfb> in <module>()
      1 for thing in thelist:
----> 2     d[thing[1]] = d.get(thing[1], []).append(thing[0])
      3 

AttributeError: 'NoneType' object has no attribute 'append'

In [4]: d
Out[4]: {'animal': None}
list.append doesn't return the list object, it returns None.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
setdefault

Python code:
d = {}

for thing, category in thelist:
    d.setdefault(category, []).append(thing)
Except this is a bit silly, as I don't think this is what he wants to do, as he wants consecutive pattern matches. Here:

Python code:
def groups(L):
    """
    >>> groups([1, 2, 3, 4])
    [(1, 2), (2, 3), (3, 4)]
    """
    return zip(L, L[1:])

def find_pattern(L, pattern):
    for (thing1, category1), (thing2, category2) in groups(L):
        if (category1, category2) == pattern:
            yield thing1, thing2
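
The pair version can be stretched to patterns of any length by zipping n shifted copies of the list. This is only a sketch for Python 3; windows is a made-up helper name, and thelist is the example data from earlier in the thread.

```python
def windows(L, n):
    """Yield every run of n consecutive items of L as a tuple."""
    return zip(*[L[i:] for i in range(n)])

def find_pattern(L, pattern):
    """Yield tuples of things whose categories match pattern, in order."""
    pattern = tuple(pattern)
    for window in windows(L, len(pattern)):
        # Split the window of (thing, category) pairs into two parallel tuples
        things, categories = zip(*window)
        if categories == pattern:
            yield things

thelist = [('dog', 'animal'), ('cow', 'animal'), ('corn', 'vegetable'),
           ('granite', 'mineral'), ('carrot', 'vegetable'), ('cat', 'animal'),
           ('cow', 'animal')]

print(list(find_pattern(thelist, ('animal', 'animal'))))
print(list(find_pattern(thelist, ('vegetable', 'mineral', 'vegetable'))))
print(list(find_pattern(thelist, ('vegetable', 'vegetable'))))
```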

Emacs Headroom
Aug 2, 2003

Lysidas posted:

Your code doesn't work.

Huh, that's odd. I guess you can't use the "get" method to default return empty lists. Oops.

Suspicious Dish posted:

Except this is a bit silly, as I don't think this is what he wants to do, as he wants consecutive pattern matches. Here:

Well I think he wanted both. He wants to get the items for pairs of categories as well as the items in the individual categories themselves (something like getting word counts, word bigrams, and word trigrams by categories I guess).

Emacs Headroom fucked around with this message at 01:10 on May 25, 2012

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

Ridgely_Fan posted:

Huh, that's odd. I guess you can't use the "get" method to default return empty lists. Oops.

No, it's that list.append returns None.

Emacs Headroom
Aug 2, 2003
Ah, so I should have done:

Python code:
for thing in thelist:
    d[thing[1]] = d.get(thing[1], []) + [thing[0]]
Although it's pretty dumb-looking.

lunar detritus
May 6, 2009


I'm having a small problem writing a JSON file.

"TypeError: <Cast 'Josh Hartnett' as 'Eben Oleson'> is not JSON serializable"

Python code:
if hasattr(mq, 'cast'): movie['cast'] = mq.cast
movies[item['ozid']] = movie
originalmovies.write(json.dumps(movies, sort_keys=True, indent=2))
This is the content of movie['cast']

quote:

>>> print cast
[<Cast 'Josh Hartnett' as 'Eben Oleson'>, <Cast 'Melissa George' as 'Stella Oleson'>, <Cast 'Ben Foster' as 'Der Fremde'>, <Cast 'Danny Huston' as 'Marlow'>, <Cast 'Mark Boone Junior' as 'Beau'>, <Cast 'Craig Hall' as 'Wilson Bulosan'>, <Cast 'Manu Bennett' as 'Billy'>, <Cast 'Nathaniel Lees' as 'Carter Davies'>, <Cast 'Elizabeth Hawthorne' as 'Lucy Ikos'>, <Cast 'Joel Tobeck' as 'Doug Hertz'>]

I'm not sure what the problem is, any ideas?

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
It's a custom Cast object, which isn't directly serializable to JSON? What kind of JSON data do you want?

lunar detritus
May 6, 2009


I have a JSON file exported by scrapy with movie names. I'm writing a script that takes those movie names and queries them against The Movie DB and saves the info (of all the movies) in a new JSON file.

It worked fine until I added 'Cast', 'Genres' and 'Crew' which are lists. It works if I use str(mq.cast) but they stop being lists. Maybe I can process them later into lists again by using commas as delimiters?

I don't really care about the format of the final file as long as I can easily process it later to get it into Django. :v:

vvvvvvvvv
EDIT: No, I want to get them into Django later, I just want a file right now. :smith:
They are objects produced by the pytmdb3 module.

lunar detritus fucked around with this message at 04:03 on May 25, 2012

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

gmq posted:

I have a JSON file exported by scrapy with movie names. I'm writing a script that takes those movie names and queries them against The Movie DB and saves the info (of all the movies) in a new JSON file.

It worked fine until I added 'Cast', 'Genres' and 'Crew' which are lists. It works if I use str(mq.cast) but they stop being lists. Maybe I can process them later into lists again by using commas as delimiters?

I don't really care about the format of the final file as long as I can easily process it later to get it into Django. :v:

Right, so they're Django models. Use the Django serialization stuff to serialize your model to JSON.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

gmq posted:

EDIT: No, I want to get them into Django later, I just want a file right now. :smith:
They are objects produced by the pytmdb3 module.

Sorry, I missed this, as you didn't reply. Seems there's no easy way to export the data. You can try cast._data, but that may not be filled in properly.

lunar detritus
May 6, 2009


Python code:
movie['genres'] = [str(s) for s in mq.genres]
did the trick. I think json.dumps didn't like the unescaped single quotes inside each item in the list.

Thanks though!
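
For what it's worth, str() works here because plain strings are something json.dumps knows how to write, while the library's own objects are not. If you'd rather keep the fields separate, json.dumps takes a default= hook that is called for anything it can't serialize. A sketch for Python 3 follows; the Cast class below is a hypothetical stand-in with guessed attribute names, not the real class from the library.

```python
import json

class Cast(object):
    """Hypothetical stand-in for the library's cast object; the real class may differ."""
    def __init__(self, name, character):
        self.name = name
        self.character = character

def encode_cast(obj):
    # json.dumps calls this only for objects it cannot serialize on its own
    if isinstance(obj, Cast):
        return {"name": obj.name, "character": obj.character}
    raise TypeError("%r is not JSON serializable" % (obj,))

movie = {"cast": [Cast("Josh Hartnett", "Eben Oleson")]}
text = json.dumps(movie, default=encode_cast, sort_keys=True, indent=2)
print(text)
```

Reading the file back then gives a list of small dicts instead of one long string to split on commas.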

onionradish
Jul 6, 2006

That's spicy.
Thanks for the ideas on pattern matching! A big help, especially learning about zip(); I'll play around with code this weekend.
