  • Locked thread
drainpipe
May 17, 2004

AAHHHHHHH!!!!
Oh cool, I haven't looked at that. Are you talking about the DataFrames structure? I'm not familiar with Python and its packages so I'm using this class also as an opportunity to learn Python.

vikingstrike
Sep 23, 2007

whats happening, captain

drainpipe posted:

Oh cool, I haven't looked at that. Are you talking about the DataFrames structure? I'm not familiar with Python and its packages so I'm using this class also as an opportunity to learn Python.

Yep.

Cingulate
Oct 23, 2012

by Fluffdaddy

drainpipe posted:

I'm coding up the k-means clustering algorithm for a homework assignment. I'm given a list of data points (each with two coordinates) and I need to put them into k clusters. The way I'm doing it is to define k empty lists and fill them up with indices of my data points according to which cluster they belong to. I just realized that adding an extra column to the data array to indicate cluster assignment might work just as well.

edit: actually, that probably wouldn't be good since I need to calculate the means of a cluster and accessing the points of a cluster would be much harder if I had to read it from a field.
I'm not sure what your results are, but assuming you have a list of indices for k clusters and want to stash their ordinal list positions into k lists,

full_list = [[jj for jj, cc in enumerate(cluster_indices) if cc == ii] for ii in sorted(list(set(cluster_indices)))]

Probably not the smartest way to go about it though.

Assuming you have a numpy array where the first column is the value and the second column is the cluster index, you'd get the cluster mean for example via

arr[arr[:, 1] == cluster_index, 0].mean()

or

arr[np.where(arr[:, 1] == cluster_index)[0], 0].mean()

Pandas is more comfortable though.

Nippashish
Nov 2, 2005

Let me see you dance!
Pandas is both a bit much and not quite the right fit for this imo. It makes updating the means easier, but it makes finding the cluster assignments harder, so you might as well do it in straight numpy where both things are fairly easy.

Storing your cluster ids in the same matrix as your features is probably not ideal. If you keep them separate you can do this:

code:
import numpy as np

def square_distance(x, y):
    # pairwise squared distances: |x|^2 + |y|^2 - 2 * x.y
    x2 = np.sum(x*x, axis=1, keepdims=True)
    y2 = np.sum(y*y, axis=1, keepdims=True)
    return x2 + y2.T - 2 * np.dot(x, y.T)

# index of the nearest center for each row of data
cluster_ids = np.argmin(square_distance(data, centers), axis=1)
If you're learning to do data in python then both pandas and numpy should be high on your list of things to learn though.
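For what it's worth, the two steps being discussed (assign each point to its nearest center, then recompute each center as the mean of its points) could be sketched in plain numpy roughly like this. It's only a sketch under assumptions: `data` is an (n, 2) float array, `centers` is a (k, 2) float array, and the helper name `kmeans_step` and the toy points are made up for illustration, not taken from the assignment.

```python
import numpy as np

def square_distance(x, y):
    # Pairwise squared Euclidean distances between rows of x and rows of y.
    x2 = np.sum(x * x, axis=1, keepdims=True)
    y2 = np.sum(y * y, axis=1, keepdims=True)
    return x2 + y2.T - 2 * np.dot(x, y.T)

def kmeans_step(data, centers):
    # Assignment step: index of the nearest center for every point.
    cluster_ids = np.argmin(square_distance(data, centers), axis=1)
    # Update step: each center becomes the mean of its assigned points.
    # An empty cluster keeps its old center (a choice made for this sketch).
    new_centers = centers.copy()
    for c in range(len(centers)):
        members = data[cluster_ids == c]
        if len(members) > 0:
            new_centers[c] = members.mean(axis=0)
    return cluster_ids, new_centers

# Hypothetical toy data: two obvious groups of two points each.
data = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
cluster_ids, new_centers = kmeans_step(data, centers)
```

Repeating `kmeans_step` until `cluster_ids` stops changing gives the usual iteration; keeping the ids in their own array is what makes the boolean mask `data[cluster_ids == c]` so short.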

long-ass nips Diane
Dec 13, 2010

Breathe.

I'm probably missing something incredibly stupid, but I've been staring at this and I can't figure it out:

code:
for i in range(0, (size * size) - 1, size):  # Scan every row
    for j in range(i, (i + (size)), 1):  # Check every cell in the row
        if Board.CellList[j].flag == 1:
            rwin += 1
        if rwin == size - 1:  # If we're 1 away from a win
            for k in range(i, (i + (size - 1)), 1):  # Figure out which cell is empty
                if Board.CellList[k].flag == 0:
                    return k  # Return the empty cell
The second if statement, if rwin == size - 1, never fires, even when rwin is equal to that value. When I watch it in the debugger it just skips over that block of code like it's not even there.

edit: if I hardcode rwin to be the right number, it works, but not if I start it at 0 and increment it.

long-ass nips Diane fucked around with this message at 03:01 on Apr 9, 2016

QuarkJets
Sep 8, 2008

Cingulate posted:

Possibly slightly better:

code:
list_of_lists = [[] for _ in range(k)]
Also yes, my guess would also be there are very few instances where you want to do that - pre-allocate lists of empty lists of varying length. Elaborate the problem.

You wrote exactly the same code but with one variable name changed. There is no meaningful difference here

QuarkJets
Sep 8, 2008

Swagger Dagger posted:

I'm probably missing something incredibly stupid, but I've been staring at this and I can't figure it out:

code:
snip
The second if statement, if rwin == size - 1, never fires, even when rwin is equal to that value. When I watch it in the debugger it just skips over that block of code like it's not even there.

edit: if I hardcode rwin to be the right number, it works, but not if I start it at 0 and increment it.

Difficult to say without having more code. Try printing rwin, size, type(rwin), and type(size) right before the first for loop so that you can be sure that everything is the right value and right type going in. Your circumstance makes it sound like maybe rwin isn't an integer
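Since the rest of the code isn't shown, the following is not a diagnosis of why the check never fires. It's just one way the row scan could be restructured so that the counter starts from zero for every row and the comparison happens once per row, after counting. It assumes the board is a flat list of flags of length size * size, with 1 meaning "taken by the player being checked" and 0 meaning "empty"; `find_row_completion` and `flags` are hypothetical stand-ins for the real method and for `Board.CellList[j].flag`.

```python
def find_row_completion(flags, size):
    """Return the index of the empty cell in a row that is one move away
    from being complete, or None if there is no such row."""
    for start in range(0, size * size, size):  # first index of each row
        row = range(start, start + size)
        # Count is recomputed from scratch for each row.
        rwin = sum(1 for j in row if flags[j] == 1)
        empty = [j for j in row if flags[j] == 0]
        if rwin == size - 1 and len(empty) == 1:
            return empty[0]
    return None

# Hypothetical 3x3 board; 2 stands for a cell taken by the other player
# (that value is an assumption, the original post only shows 0 and 1).
flags = [1, 0, 2,
         1, 1, 0,
         0, 0, 0]
result = find_row_completion(flags, 3)
print(result)
```

Printing `rwin` and `size` inside the loop, as suggested above, is still the quickest way to see what the original code is actually comparing.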

KICK BAMA KICK
Mar 2, 2009

This is also exactly the kind of thing using a debugger will usually solve. A good IDE like PyCharm will have one built in, where you can step through your code line by line as it executes and examine the state of all the variables at any time.

QuarkJets
Sep 8, 2008

sorry, i can't use pycharm because i ssh into work from a tropical beach and the internet is a bit dodgy for anything more graphically intense than vim

Nippashish
Nov 2, 2005

Let me see you dance!

QuarkJets posted:

sorry, i can't use pycharm because i ssh into work from a tropical beach and the internet is a bit dodgy for anything more graphically intense than vim

Get pycharm's pro version and use a deployment + remote interpreter.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Munkeymon posted:

Extension methods don't do what you want?

I don't know what you mean - I don't understand how extension methods would be relevant to this, but if you can point me to something I don't know about them that would be great.

hooah
Feb 6, 2006
WTF?
One of the students in the intro class I'm a TA for did this in his project in order to update the data structure (D and D_new are both dictionaries):
Python code:
D = {**D_new, **D}
What the hell is that? I tried Googling and only found stuff about kwargs for **, or a single star on the left for extended iterable unpacking, but none of the examples seem to apply to what this kid did.

Asymmetrikon
Oct 30, 2009

I believe you're a big dork!
It's parameter unpacking; you can do it whenever calling any function to pass a dict as keyword args. Seems like it's working the same as if you did:
Python code:
D = dict(**D_new, **D)
Interestingly,
Python code:
a = [1, 2, 3]
b = [4, 5, 6]
ls = [*a, *b]
# ls == [1, 2, 3, 4, 5, 6]
also works.
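For reference, `{**D_new, **D}` is the unpacking-in-a-dict-display syntax added in Python 3.5 (PEP 448), so no function call is involved. A small sketch with made-up dictionaries shows how it behaves: when a key appears in more than one operand, the later operand wins, so here the values already in `D` take precedence over `D_new`. One difference from the `dict(**D_new, **D)` form that may matter: the call form needs string keys and raises a TypeError when the same key shows up in both.

```python
# Made-up example dictionaries.
D = {"a": 1, "b": 2}
D_new = {"b": 20, "c": 30}

# Later operand (D) wins on the shared key "b".
merged = {**D_new, **D}
print(merged["b"])

# The function-call form rejects duplicate keyword names.
try:
    dict(**D_new, **D)
    call_form_ok = True
except TypeError:
    call_form_ok = False
print(call_form_ok)
```

If the intent was for the new values to win instead, swapping the operands (`{**D, **D_new}`) or calling `D.update(D_new)` would do that.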

Cingulate
Oct 23, 2012

by Fluffdaddy

QuarkJets posted:

You wrote exactly the same code but with one variable name changed. There is no meaningful difference here
It's a very minor change, but 1. i/I as a variable is sometimes discouraged because it evokes imaginary numbers (don't ask me, I just see this all the time), 2. using _ makes a bit clearer we don't care about it.

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Cingulate posted:

It's a very minor change, but 1. i/I as a variable is sometimes discouraged because it evokes imaginary numbers (don't ask me, I just see this all the time), 2. using _ makes a bit clearer we don't care about it.

I don't like this practice of using _ as a name for an "unused" variable. I think it's a bad convention. Firstly _ already means something in the standard Python interactive console, secondly it's opaque and baffling if you haven't seen it before. I prefer to use a name like "_unused" or similar. You only write it in one place after all.

Cingulate
Oct 23, 2012

by Fluffdaddy

Hammerite posted:

I don't like this practice of using _ as a name for an "unused" variable. I think it's a bad convention. Firstly _ already means something in the standard Python interactive console, secondly it's opaque and baffling if you haven't seen it before. I prefer to use a name like "_unused" or similar. You only write it in one place after all.
Hm ... is there a one-character option you like better?

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Cingulate posted:

Hm ... is there a one-character option you like better?

Well, no there isn't. There's no one-character name that means "this is an unused variable the author doesn't care about" to me, although I know _ is that to some people.

Like I say, given that it's by definition something you'll type only once, there's not much need for terseness.

baka kaba
Jul 19, 2003

PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

I always thought the idea was to make unused variables disappear into the background, so it makes the code's meaning easier to parse

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS
If you're looking for self-contained readability your best bet is
Python code:
list_of_lists = list(map(lambda x: [], range(k)))
x doesn't exist outside the context of the lambda so there's no scope leakage and you don't have to worry about maintaining an extra variable anywhere.

I'm sure there's an even more concise way of using map() to do this that makes [] a callable so you don't even need the lambda.

Nippashish
Nov 2, 2005

Let me see you dance!
code:
list_of_lists = [f() for f in [lambda:[]]*k]

Asymmetrikon
Oct 30, 2009

I believe you're a big dork!
Python code:
class Repeat:
    def __init__(self, n):
        self.n = n
    def times(self, x):
        return [x for _ in range(self.n)]

Repeat(5).times([])

OnceIWasAnOstrich
Jul 22, 2006

Asymmetrikon posted:

Python code:
class Repeat:
    def __init__(self, n):
        self.n = n
    def times(self, x):
        return [x for _ in range(self.n)]

Repeat(5).times([])

Python code:
from itertools import repeat
list(repeat([],5))

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Asymmetrikon posted:

Python code:
class Repeat:
    def __init__(self, n):
        self.n = n
    def times(self, x):
        return [x for _ in range(self.n)]

Repeat(5).times([])

You don't get 5 different lists with this, you get 5 references to the same list.
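The distinction is easy to check with `is`. A quick sketch (the variable names are just for illustration): a comprehension over an existing object `x` yields the same object every time, while writing the literal `[]` inside the comprehension builds a new list on each iteration.

```python
n = 5
x = []

shared = [x for _ in range(n)]     # n references to the one list x
distinct = [[] for _ in range(n)]  # a new list object per iteration

shared[0].append("item")
distinct[0].append("item")

print(all(s is shared[0] for s in shared))  # True
print(shared[1])                            # ['item']
print(distinct[1])                          # []
```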

Asymmetrikon
Oct 30, 2009

I believe you're a big dork!

Hammerite posted:

You don't get 5 different lists with this, you get 5 references to the same list.

You're right

Python code:
import copy

class Repeat:
    def __init__(self, n):
        self.n = n
    def times(self, x):
        return [copy.deepcopy(x) for _ in range(self.n)]

Repeat(5).times([])

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS

OnceIWasAnOstrich posted:

Python code:
from itertools import repeat
list(repeat([],5))

This is the best solution

Nippashish
Nov 2, 2005

Let me see you dance!

Blinkz0rz posted:

This is the best solution
Except it's wrong.
code:
Python 3.5.1 |Continuum Analytics, Inc.| (default, Dec  7 2015, 11:16:01) 
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from itertools import repeat
>>> a = list(repeat([], 5))
>>> a
[[], [], [], [], []]
>>> a[0].append('buttes')
>>> a
[['buttes'], ['buttes'], ['buttes'], ['buttes'], ['buttes']]
>>> 

Blinkz0rz
May 27, 2001

MY CONTEMPT FOR MY OWN EMPLOYEES IS ONLY MATCHED BY MY LOVE FOR TOM BRADY'S SWEATY MAGA BALLS
Man I wish there was something in the documentation for itertools that explained how repeat works wrt object identity.

BigRedDot
Mar 6, 2008

quote:

I don't like this practice of using _ as a name for an "unused" variable. I think it's a bad convention. Firstly _ already means something in the standard Python interactive console, secondly it's opaque and baffling if you haven't seen it before. I prefer to use a name like "_unused" or similar. You only write it in one place after all.
I can count the times I've used the standard interactive console _ in the last 20 years on zero fingers.

OnceIWasAnOstrich
Jul 22, 2006

Nippashish posted:

Except it's wrong.


Yeah I was just pointing out identical functionality to that class in the standard library.

Let's keep this code golf rolling. Don't need the outside list call in py2.

Python code:
lol = list(map(list, [[]]*k))

OnceIWasAnOstrich fucked around with this message at 00:11 on Apr 10, 2016

Cingulate
Oct 23, 2012

by Fluffdaddy
alist = [[l] for l in 'alist']
for sublist in alist: sublist.pop()

Dominoes
Sep 20, 2007

Python code:

class ListWrapper(list):
    def __init__(self):
        super().__init__()

    def items(self):
        return [i for i in self]


class CopyFactory:
    def __init__(self, n):
        self.n = n
    
    def set_n(self, n):
        self.n = n
        
    def get_n(self):
        return self.n
   

class CopyItem:
    def __init__(self):
        self.item = []
        
    def copy(self):
        return self.get_item()
        
    def get_item(self):
        return self.item
        
    def set_item(self, item):
        self.item = item
   
   
class Copy:
    def __init__(self, factory):
        self.factory = factory
        self.items = ListWrapper().items()
        
    def get_factory(self):
        return self.factory
        
    def set_factory(self, factory):
        self.factory = factory
        
    def get_items(self):
        return self.items
        
    def add_item(self, item):
        self.items.append(item)
        
    def populate_items(self):
        for i in range(self.factory.get_n()):
            self.add_item(CopyItem().get_item())
        
        
factory = CopyFactory(5)
list_of_lists = Copy(factory)
list_of_lists.populate_items()
list_of_lists.get_items()

Dominoes fucked around with this message at 00:40 on Apr 10, 2016

QuarkJets
Sep 8, 2008

Cingulate posted:

Hm ... is there a one-character option you like better?

Yes, the letter i. It's probably the most common single-character "throwaway" variable name for an integer in all of computer science, across almost all languages.

In Matlab code it's problematic because i is a predefined variable in Matlab's enormous namespace. That's more Matlab's problem than anything. It's certainly not an issue in Python code.

Mr. Nemo
Feb 4, 2016

I wish I had a sister like my big strong Daddy :(
Hello, I hope this is the thread for stupid newbie questions; otherwise, I'm sorry and please point me in the right direction.

I'm completely new to Python, having only taken a very basic course on edX: https://courses.edx.org/courses/course-v1:Microsoft+DAT208x+2T2016/info
I'm trying to do some stuff with pandas, some simple DataFrames stuff, importing csv files.

I'm using a Python console in Spyder, which came with Anaconda. I have no idea what any of this means, but apparently it was the easiest way for a newbie to get started according to the internet, but that's a different subject.

I basically have two lists, and I'm trying to "map" (I think) a column in the first one according to the data in the second one.

Python code:
import pandas as pd

data=pd.read_csv("***import.csv",index_col=0)
CC=pd.read_csv("***Country List.csv",index_col=0)

for i in range(0,6908):
        data['Reporter Code'].loc[i]=CC.loc[ data['Reporter Code'].loc[1]][0]
I'm getting the "A value is trying to be set on a copy of a slice from a DataFrame", which is very common from what I've seen, but I really don't get how to fix it.

data has five columns, so in order to avoid using .loc I tried masking the column I need to a new list RC=data['Reporter Code']

The thing is, when I try to change ONE data point as in


Python code:
data['Reporter Code'].loc[17]='whatever'
I get the error code, but the change goes through. Yet when I run the for loop to change all of them, I get the same error, but I'm locked out of the thing. I can write anything I want, but it doesn't register as code. This may be due to my complete lack of knowledge, there's probably a button I can press to fix it or something, but so far I've been closing the console, opening a new one and rewriting all the code.

Can someone help me? Is the situation clear enough?

So far this has been kind of fun. The last programming I did was some Free Pascal several years ago, and I'd forgotten how great it felt to see your code working. I'd love to be able to continue using Python.

Any help would be appreciated, thanks in advance.

edit: I wasn't being "locked out", the command was just taking a very long while, due to the error popping up every single time it tried to overwrite a value. After dinner I came back to a list full of "JOHN", I am now running with the correct code to see if it works after a while. There's probably a way to check in which loop cycle it is, but I haven't found it so far. There we go, took 7 minutes, but successfully changed all the numbers into the corresponding string.
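One way to avoid both the warning and the slow row-by-row loop is to do the lookup for the whole column at once. A minimal sketch, assuming `CC` is indexed by the numeric code and its first column holds the text to substitute; the tiny inline tables and the column names 'Country' and 'Value' stand in for the real CSV files, whose contents aren't shown in the post.

```python
import pandas as pd

# Hypothetical stand-ins for the two CSV files.
data = pd.DataFrame({"Reporter Code": [4, 8, 4], "Value": [1.0, 2.0, 3.0]})
CC = pd.DataFrame({"Country": ["Afghanistan", "Albania"]}, index=[4, 8])

# Series.map looks each code up in the index of the lookup Series,
# replacing the whole column in one step instead of 6908 assignments.
lookup = CC.iloc[:, 0]
data["Reporter Code"] = data["Reporter Code"].map(lookup)

# Setting a single cell: one .loc call with both the row label and the
# column name, rather than data['Reporter Code'].loc[row] = ...
data.loc[2, "Value"] = 30.0
print(data)
```

The warning in the original comes from chaining two indexing operations (`data['Reporter Code']` and then `.loc[i]`); pandas can't always tell whether the intermediate object is a copy, so it warns on every assignment, which is a large part of why the loop took minutes.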

Mr. Nemo fucked around with this message at 02:58 on Apr 10, 2016

rvm
May 6, 2013

Nippashish posted:

Get pycharm's pro version and use a deployment + remote interpreter.

Pydev uses the same codebase for remote debugging and happens to be free, btw.

Monkey Fury
Jul 10, 2001

OnceIWasAnOstrich posted:

Let's keep this code golf rolling.

Python code:
>>> import ast
>>> k = 10
>>> ast.literal_eval('[{}]'.format(', '.join(['[]' for _unused in range(k)])))
[[], [], [], [], [], [], [], [], [], []]

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

BigRedDot posted:

I can count the times I've used the standard interactive console _ in the last 20 years on zero fingers.

Well since you don't use it, obviously nobody does! You have opened my eyes.

Cingulate
Oct 23, 2012

by Fluffdaddy
Let's not have a fight over something stupid I said.

Dominoes
Sep 20, 2007

Would partial unpacking be a good language feature? That would resolve this. ie:

Python code:
def func():
    return 1, 2, 3


one, two = func()  # one = 1, two = 2

# as opposed to the current way:
one, two, _ = func()
# or
one, two = func()[:2]
You'd have to have some way to resolve assigning the entire tuple vs unpacking; could maybe be as a decorator in normal python.

Or per this stack overflow page.

Python code:
one, two, *rest = func()
Another var name suggested beyond '_' and 'rest' is 'ignore'.

Dominoes fucked around with this message at 14:37 on Apr 10, 2016

Cingulate
Oct 23, 2012

by Fluffdaddy

Dominoes posted:

Would partial unpacking be a good language feature? That would resolve this. ie:

Python code:
def func():
    return 1, 2, 3


one, two = func()  # one = 1, two = 2

# as opposed to the current way:
one, two, _ = func()
# or
one, two = func()[:2]
You'd have to have some way to resolve assigning the entire tuple vs unpacking; could maybe be as a decorator in normal python.

Or per this stack overflow page.

Python code:
one, two, *rest = func()
Another var name that was suggested besides _ and rest is ignore.
Only works (comfortably) with continuous slices though. What if you want only the second and second to last of 7 outputs?

Dominoes
Sep 20, 2007

Cingulate posted:

Only works (comfortably) with continuous slices though. What if you want only the second and second to last of 7 outputs?
Wouldn't work as you point out, but perhaps one way to deal with that is design functions so vars that are likely to be left out are after ones most frequently used; like currying, but with output instead of input.

Do y'all have unused variables besides during unpacking?

I think the best way to do this is the last code I posted, but here's another:

Python code:
from toolz import take

one, two = take(2, func())


Dominoes fucked around with this message at 14:40 on Apr 10, 2016
