  • Locked thread
Harriet Carker
Jun 2, 2009

I am programming a simple game. At the end of the game, I want to give the user an option to play again or exit the program. My logic is as follows: The while loop enters, and if "n" is input, "restart" is set to "n" and the next pass through the while loop should evaluate to false and run the else block. Somehow, this is not happening.

(1)
code:
def play_again():
	restart = input("Enter y to play again or n to quit: ")

restart = "y"			
while restart != "n":
	play_again()
else:
	print ("Goodbye!")
I know I can easily do this with

(2)
code:
while restart != "n":
	restart = input("Enter y to play again or n to quit: ")
else:
	print ("Goodbye!")
but I want to use a function so I can check to make sure the user actually input "y" or "n" such as follows (note - this block is working as intended. I just included it to illustrate why I'm using a function):

(3)
code:
def play_again():
	restart = input("Enter y to play again or n to quit: ")
	try: 
		x = str(restart)
	except ValueError:
		print ("Please enter y or n.")
		play_again()
	else:
		if str(restart) != "y" and str(restart) != "n":
			print ("Please enter y or n.")
			play_again()
Can anyone see why (1) is not working? I'm pretty new at this so it's probably something silly.

ShadowHawk
Jun 25, 2000

CERTIFIED PRE OWNED TESLA OWNER

dantheman650 posted:

I am programming a simple game. At the end of the game, I want to give the user an option to play again or exit the program. My logic is as follows: The while loop enters, and if "n" is input, "restart" is set to "n" and the next pass through the while loop should evaluate to false and run the else block. Somehow, this is not happening.

(1)
code:
def play_again():
	restart = input("Enter y to play again or n to quit: ")

restart = "y"			
while restart != "n":
	play_again()
else:
	print ("Goodbye!")
...
Can anyone see why (1) is not working? I'm pretty new at this so it's probably something silly.
This is python's scoping rules in one of its more simple forms. restart is a new name for a variable you are creating, private to the play_again function, and it's deleted at the end of the play_again function.

If you want to go outside the scope, you can use the global or nonlocal keywords.

edit: note that if you referenced restart inside play_again without reassigning it something, it would inherit from the parent scope.
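
A minimal sketch of the rules described above, in case it helps to see them side by side. The function names (assign_inside, read_inside, make_counter) are made up for illustration and are not from the original post. Assigning inside a function creates a private name, reading without assigning falls through to the outer scope, and nonlocal rebinds a variable of an enclosing function:

```python
restart = "y"  # module-level name


def assign_inside():
    # Assignment binds a NEW local name; the module-level restart is untouched.
    restart = "n"
    return restart


def read_inside():
    # No assignment in this function, so the lookup falls through
    # to the module-level restart.
    return restart


def make_counter():
    count = 0

    def bump():
        # nonlocal rebinds the variable of the enclosing function
        # instead of creating a new local one.
        nonlocal count
        count += 1
        return count

    return bump


local_value = assign_inside()   # the local copy, "n"
outer_value = read_inside()     # the module-level value, still "y"
bump = make_counter()
bump()
second = bump()                 # the count survives between calls
```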

Harriet Carker
Jun 2, 2009

ShadowHawk posted:

This is python's scoping rules in one of its more simple forms. restart is a new name for a variable you are creating, private to the play_again function, and it's deleted at the end of the play_again function.

If you want to go outside the scope, you can use the global or nonlocal keywords.

edit: note that if you referenced restart inside play_again without reassigning it something, it would inherit from the parent scope.

I thought this might be the case, and I've tried the same code while setting restart as a global variable. No dice - it runs exactly the same. Maybe I'm doing it incorrectly. Don't I just add

code:
global restart
anywhere before assigning it?

Harriet Carker fucked around with this message at 02:14 on Aug 7, 2014

ShadowHawk
Jun 25, 2000

CERTIFIED PRE OWNED TESLA OWNER

dantheman650 posted:

I thought this might be the case, and I've tried the same code while setting restart as a global variable. No dice - it runs exactly the same. Maybe I'm doing it incorrectly. Don't I just add

code:
global restart
anywhere before assigning it?
Yes, within the scope of the function inheriting the variable:

code:
#!/usr/bin/env python3

def play_again():
	global restart
	restart = input("Enter y to play again or n to quit: ")

restart = "y"			
while restart != "n":
	play_again()
else:
	print ("Goodbye!")

ShadowHawk
Jun 25, 2000

CERTIFIED PRE OWNED TESLA OWNER
Thinking it over, perhaps a better name for the global keyword might be inherit, similar to how lambda should be called make_function

Harriet Carker
Jun 2, 2009

ShadowHawk posted:

Yes, within the scope of the function inheriting the variable:

code:
#!/usr/bin/env python3

def play_again():
	global restart
	restart = input("Enter y to play again or n to quit: ")

restart = "y"			
while restart != "n":
	play_again()
else:
	print ("Goodbye!")

Thank you so much. This works perfectly. I had "global restart" outside of the function.

Jewel
May 2, 2009

Why was the first suggestion globals and not return values. I'm not in a position to explain the finer details of return values right now but you should look them up. It would allow you to have something similar to this:

Python code:
#!/usr/bin/env python3

def play_again():
	return input("Enter y to play again or n to quit: ")

restart = "y"			
while restart != "n":
	restart = play_again()
else:
	print ("Goodbye!")
Or in a better system (though probably flawed, I'm tired):

Python code:
#!/usr/bin/env python3

def play_again():
	while True: #Loop until a valid answer is given
		answer = input("Enter y to play again or n to quit: ")
		if answer == 'y': return True
		if answer == 'n': return False
		
		print("Invalid option, please enter only y or n")

game_in_progress = True

while (game_in_progress):
	#Do Your Game
	game_in_progress = play_again()

print("Goodbye!")
Also you should use raw_input, because in Python 2 input is just a wrapper around raw_input that also evals whatever you type, which is bad.

Edit: That's with python 2, I guess they changed it in python 3 otherwise you would have run into some problems? Namely entering "y" gives a NameError on python 2.

Jewel fucked around with this message at 02:37 on Aug 7, 2014

ShadowHawk
Jun 25, 2000

CERTIFIED PRE OWNED TESLA OWNER

Jewel posted:

Why was the first suggestion globals and not return values. I'm not in a position to explain the finer details of return values right now but you should look them up. It would allow you to have something similar to this:

Python code:
#!/usr/bin/env python3

def play_again():
	return input("Enter y to play again or n to quit: ")

restart = "y"			
while restart != "n":
	restart = play_again()
else:
	print ("Goodbye!")
Or in a better system (though probably flawed, I'm tired):
Yes, return values are better and more natural here but from reading his "I know it could be done this other way" part of the post I assumed he knew about them. That may not have been correct in retrospect.

Harriet Carker
Jun 2, 2009

ShadowHawk posted:

Yes, return values are better and more natural here but from reading his "I know it could be done this other way" part of the post I assumed he knew about them. That may not have been correct in retrospect.

I don't quite understand the technicalities, which is why I can't see why this works:

Jewel posted:


Or in a better system (though probably flawed, I'm tired):

Python code:
#!/usr/bin/env python3

def play_again():
	while True: #Loop until a valid answer is given
		answer = input("Enter y to play again or n to quit: ")
		if answer == 'y': return True
		if answer == 'n': return False
		
		print("Invalid option, please enter only y or n")

How does this while loop exit? Does return automatically exit the loop? I suppose it must.


Jewel posted:

Also you should use raw_input, because in Python 2 input is just a wrapper around raw_input that also evals whatever you type, which is bad.

Edit: That's with python 2, I guess they changed it in python 3 otherwise you would have run into some problems? Namely entering "y" gives a NameError on python 2.

Right, it's been changed for Python 3 (as far as I could discern).

ShadowHawk
Jun 25, 2000

CERTIFIED PRE OWNED TESLA OWNER

dantheman650 posted:

I don't quite understand the technicalities, which is why I can't see why this works:


How does this while loop exit? Does return automatically exit the loop? I suppose it must.


Right, it's been changed for Python 3 (as far as I could discern).
A return statement is a bit like a goto statement. It immediately exits the function and execution goes back to whatever called it. As a part of exiting the function any variables confined to the scope of the function will be deleted, and the function call will be replaced with a reference to the return value (which could be anything, like an integer or an object like a list you constructed inside the function).

Python code:
def foo():
    return 1
    hack_whitehouse() # will never run

x = foo()
print(x) # will only ever print 1
A static analysis tool will (should) warn that the foo() function above has "unreachable code"


Incidentally now would also be a good time to mention the continue and break statements, which act a bit like returns but are within for/while loops.
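
A small sketch of how continue and break differ from return, assuming a made-up helper called first_even_over (not from the thread). continue skips to the next pass of the loop, break leaves the loop but stays inside the function, and return leaves the function altogether:

```python
def first_even_over(numbers, threshold):
    """Return the first even number greater than threshold, or None."""
    found = None
    for n in numbers:
        if n % 2:
            # Odd number: skip the rest of this pass and move to the next item.
            continue
        if n > threshold:
            found = n
            # Leave the loop early; execution resumes after the for block.
            break
    # Unlike return, break lands us here, still inside the function.
    return found


result = first_even_over([1, 4, 7, 10, 12], 5)
nothing = first_even_over([1, 3, 5], 0)
```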

Harriet Carker
Jun 2, 2009

Thanks for the quick, clear, and concise response. My code is running perfectly now, and I implemented the suggestion of using return values instead of a global variable. Also, using a simple while loop to check if the user entered the correct input is so much better and easier than all that try/except stuff I was using before.

Jewel
May 2, 2009

dantheman650 posted:

How does this while loop exit? Does return automatically exit the loop? I suppose it must.

Yeah, "return", "break" both exit a loop (both "for" loops and "while" loops). "continue" is another keyword that skips the rest of the loop and returns to the top.

A proper game engine that draws to a window usually has a "while game_running and not window.is_closed()" with some form of sleep inside the loop so it only runs 60 times per second. That's a super simplified version but the base idea is there.
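
Roughly what that loop could look like, as a sketch only. run_game_loop, update and is_running are hypothetical names, and is_running stands in for the "game_running and not window.is_closed()" check since there is no real window here:

```python
import time


def run_game_loop(update, is_running, fps=60):
    """Call update() roughly fps times per second while is_running() is true."""
    frame_time = 1.0 / fps
    while is_running():
        frame_start = time.monotonic()
        update()
        # Sleep off whatever is left of this frame's time budget.
        elapsed = time.monotonic() - frame_start
        time.sleep(max(0.0, frame_time - elapsed))


# Usage: a stand-in "game" that stops itself after three frames.
frames = []
run_game_loop(lambda: frames.append(len(frames)), lambda: len(frames) < 3)
```

A real engine would usually also measure the time between frames and pass it to update(), but the sleep-the-remainder idea is the same.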

SurgicalOntologist
Jun 17, 2004

ShadowHawk posted:

Just a stab in the dark but do you have a namespace collision by defining your own stop() in the above?

Nope, I was testing in clean environments. In the end, it seems to me like subprocesses only play nice with the thread's main event loop. I thought it would be nice to avoid the main event loop and always make new ones, because otherwise I'm getting events bleeding from one test to the next, unfortunately. But other than that, things are working. It'll only be an issue for testing.

ShadowHawk
Jun 25, 2000

CERTIFIED PRE OWNED TESLA OWNER

SurgicalOntologist posted:

Nope, I was testing in clean environments. In the end, it seems to me like subprocesses only play nice with the thread's main event loop. I thought it would be nice to avoid the main event loop and always make new ones, because otherwise I'm getting events bleeding from one test to the next, unfortunately. But other than that, things are working. It'll only be an issue for testing.
I meant more colliding with the stop() method that asyncio defines on its event loops. I believe that's overrideable, in the same way one can override double underscore stuff like __contains__

edit: eg: https://docs.python.org/3/library/asyncio-eventloop.html?highlight=stop#asyncio.BaseEventLoop.stop

If you're defining a coroutine, and coroutines inherit from BaseEventLoop, and you redefine stop(), bad things will happen when stop is normally called.

ShadowHawk fucked around with this message at 03:34 on Aug 7, 2014

SurgicalOntologist
Jun 17, 2004

Sure that's overrideable (everything is), but I'm not subclassing anything, Server has no supers. Plus coroutines are just functions, they don't inherit from BaseEventLoop.

Ahz
Jun 17, 2001
PUT MY CART BACK? I'M BETTER THAN THAT AND YOU! WHERE IS MY BUTLER?!
I have a pandas DataFrame (roughly 3000 rows like below):

code:
df.index   q_id      total_paid choice_txt
3           176            6918          7
12          176               0          3
21          176            5053         10
30          176            5219          9
39          176            5622          8
48          176               0          7
57          176            4239          7
66          176            2811          7
75          176            5049          7
84          176            6797          3
93          176            1740          7
102         176            4747          9
111         176             882          8
120         176            5961          6
129         176            7959          6
138         176            7100          2
147         176            1565          2
156         176            5776          7
165         176            4385          1
174         176               0         10
183         176            1131          8
192         176           10586          7
201         176             920          7
210         176            1900          4
219         176            4436          1
228         176            8715          5
237         176            8248          7
246         176               0          6
255         176            7683         10
263         176             572          8
and on...
I am having trouble grouping it by value counts. Essentially I want to take this DataFrame and group it by counts of identical 'choice_txt', but also group 'total_paid' by value aggregates. In this case I want 3 aggregates (low set, hypothetically < 3000)(mid set, 3001-6000)(high set, 6001+)

In the example, 'choice_txt' is an identifier, not a count above. I would like to count how many identifiers exist for each one, in this case there are identifiers 1-10. In 3000 rows, I have one record for each identifier 1-10.

Grouped into a final result like this:
code:
q_id    total_paid  choice_txt            count             
176       low            10                 1  
          low            2                  3    
          low            3                  2    
          low            4                  1  
          low            6                  5               
          low            7                  4               
          low            8                  3               
          low            9                  2               
          mid            10                 1               
          mid            8                  1               
          mid            1                  1   
          high           10                 1  
          high           2                  3    
          high           3                  2    
          high           4                  1  
          high           6                  5               
          high           7                  4               
          high           8                  3               
          high           9                  2               
My work so far got me to where I can group it well by actual total paid value:
code:
sub_grouped_df = filtered_df.groupby([series_col, breakout_column, "choice_txt"]).count()['another_column']
q_id       total_paid  choice_txt
176        0               10            1
                           2             3
                           3             2
                           4             1
                           6             5
                           7             4
                           8             3
                           9             2
           56              10            1
           236             8             1
           455             1             1
           572             8             1
           609             8             1
           636             7             1
           826             10            1
...
176        9096            4             1
           9141            5             1
           9232            5             1
           9357            8             1
           9371            3             1
           9601            1             1
           9604            2             1
           9706            8             1
           9719            1             1
           10032           1             1
           10490           9             1
           10586           7             1
           10632           3             1
           10799           4             1
           12437           1             1
I have the grouping I need, except I need to take the 'total_paid' and group it by value (again, the low/mid/high sets mentioned above).

But I can't figure out how to evaluate the total_paid col by value and use that into the group by.


EDIT:
OK I have a solution that works, but it seems like a total hack:
Since I have 3 groups that I need to accumulate (low, mid, high), I separated my main data frame into 3 separate data frames by the low, mid, high criteria:
code:
    low_df = filtered_df.loc[filtered_df[breakout_column] <= low_high] 
    high_df = filtered_df.loc[filtered_df[breakout_column] > mid_high]  
    mid_df = filtered_df.drop(low_df.index)
    mid_df = mid_df.drop(high_df.index)
Then I groupby on each one to get the grouping I need:
code:
    sub_low_df = low_df.groupby([series_col, "choice_txt"])
    sub_mid_df = mid_df.groupby([series_col, "choice_txt"])
    sub_high_df = high_df.groupby([series_col, "choice_txt"])
Finally I sequentially combine my results into a dict for reporting.
Is this an acceptable solution or is there a better way to achieve the result?

SurgicalOntologist
Jun 17, 2004

Here: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.cut.html

Python code:
# include_lowest=True keeps rows where total_paid is exactly 0 in the 'low' bin;
# without it the first bin excludes its left edge and those rows become NaN
filtered_df.groupby(pd.cut(filtered_df['total_paid'], [0, 3000, 6000, 100000], labels=['low', 'mid', 'high'], include_lowest=True))
Not sure what you want to do with the groupby, can't quite follow. If you just want to categorize your data, rather than say sum each group, don't do a groupby, just make another column.

Python code:
filtered_df['category'] = pd.cut(filtered_df['total_paid'], [0, 3000, 6000, 100000], labels=['low', 'mid', 'high'], include_lowest=True)
E: I think I figured it out. You want the number of rows for every unique combination of low/mid/high category and choice_txt?

Python code:
filtered_df['category'] = pd.cut(filtered_df['total_paid'], [0, 3000, 6000, 100000], labels=['low', 'mid', 'high'], include_lowest=True)
filtered_df.groupby(['category', 'choice_txt']).count()

SurgicalOntologist fucked around with this message at 05:28 on Aug 7, 2014

Ahz
Jun 17, 2001
PUT MY CART BACK? I'M BETTER THAN THAT AND YOU! WHERE IS MY BUTLER?!
Well I'm pretty proud of myself for spending half a day fixing it myself, but your solution is much nicer than mine.

I still hate the pandas docs. It's pretty frustrating knowing exactly what I need and what steps to take, and then hitting a brick wall with syntax and libraries.

SurgicalOntologist
Jun 17, 2004

I agree completely. I feel like I learned Python basics in a day just by reading the docs. Same for many other libraries. Not so for pandas, everything is hidden away, even though I think I've read the docs front to back. Features are not that discoverable, e.g. I'm always confused by what's a method and what's a module-level function.

tef
May 30, 2004

-> some l-system crap ->

BeefofAges posted:

I consider metaclasses to be a pretty esoteric language feature, and what I expect most people to know about them is that you probably shouldn't use them unless you have a really good reason.

Asking about list comprehensions is fair game - they're a neat Python feature that people should be familiar with.

You could try asking if Python is pass-by-value or pass-by-reference. (trick question, it's not quite either - http://robertheaton.com/2014/02/09/pythons-pass-by-object-reference-as-explained-by-philip-k-dick/ ) I wouldn't expect people to answer this correctly, but it would be interesting to see them try to reason their way through it based on the Python behavior they're familiar with.

trick answer: it's pass by value, but all python objects live on the heap and variables only store pointers to them. :argh:
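
A quick sketch of what that looks like from the caller's side. The function names rebind and mutate are made up for the example. The function receives a copy of the pointer, so rebinding the parameter does nothing to the caller, while mutating the object it points at is visible everywhere:

```python
def rebind(items):
    # Points the local name at a brand new list; the caller's list is untouched.
    items = [0]


def mutate(items):
    # Changes the one list object that both caller and callee refer to.
    items.append(99)


data = [1, 2, 3]
rebind(data)
after_rebind = list(data)   # snapshot: unchanged by rebind
mutate(data)                # data itself now ends with 99
```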

the
Jul 18, 2004

by Cowcaster
Let's say I have a list of names, what would be the best way to determine whether these are male or female?

The easy but hard way would be to just display each name and select the gender, but I'm wondering if I could somehow find a list or something to compare them to. Any ideas?

Asymmetrikon
Oct 30, 2009

I believe you're a big dork!
You're probably gonna have to have user input anyway, given the number of names that are gender-neutral.

the
Jul 18, 2004

by Cowcaster

Asymmetrikon posted:

You're probably gonna have to have user input anyway, given the number of names that are gender-neutral.

Are there that many? Let's say I don't need to be exact, but like 95% accurate on an idea of how many males/females I have in this list.

Nippashish
Nov 2, 2005

Let me see you dance!

the posted:

Are there that many? Let's say I don't need to be exact, but like 95% accurate on an idea of how many males/females I have in this list.

Maybe check what kind of coverage you have from these lists: http://www.census.gov/genealogy/www/data/1990surnames/names_files.html

the
Jul 18, 2004

by Cowcaster
Brilliant, thanks.

the
Jul 18, 2004

by Cowcaster
My entire list of names is 1214

When I ran my first test with this list, I got 940 men and 732 women, or 1672 names, meaning I'm getting duplicates somewhere (i.e both male and female name).

I tried running a test to print out the duplicates, but I don't think this is working:

code:
for i in match_list:
	m = False
	f = False
	for j in mn:
		if string.lower(i[2]) == string.lower(j[0]):
			m = True
			males.append([i[1],''])
	for k in fn:
		if string.lower(i[2]) == string.lower(k[0]):
			f = True 
			females.append([i[1],''])
	if m == True and f == True:
		print i[2]
It looks like it's just spitting out every single name, and I can't figure out why.

edit: I think I figured out the problem. Turns out that people tend to have a lot of common names in both genders. "William" was showing up in the womens list, albeit way down at the bottom. I decided to only look at the top 500 male/female names instead. Hopefully it helps.

edit: Yep, ended up getting 865 males and 198 females

the fucked around with this message at 17:57 on Aug 7, 2014

EAT THE EGGS RICOLA
May 29, 2008

Use genderize.io

code:
GET http://api.genderize.io?name=steve
code:
{"name":"steve","gender":"male","probability":"0.99","count":1686}

vikingstrike
Sep 23, 2007

whats happening, captain

the posted:

My entire list of names is 1214

When I ran my first test with this list, I got 940 men and 732 women, or 1672 names, meaning I'm getting duplicates somewhere (i.e both male and female name).

I tried running a test to print out the duplicates, but I don't think this is working:

code:
for i in match_list:
	m = False
	f = False
	for j in mn:
		if string.lower(i[2]) == string.lower(j[0]):
			m = True
			males.append([i[1],''])
	for k in fn:
		if string.lower(i[2]) == string.lower(k[0]):
			f = True 
			females.append([i[1],''])
	if m == True and f == True:
		print i[2]

It looks like it's just spitting out every single name, and I can't figure out why.

edit: I think I figured out the problem. Turns out that people tend to have a lot of common names in both genders. "William" was showing up in the womens list, albeit way down at the bottom. I decided to only look at the top 500 male/female names instead. Hopefully it helps.

edit: Yep, ended up getting 865 males and 198 females

Python has a built in "set" type that you can use to find intersections, unions, etc. if you ever need to do this again in the future.
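
Something like this, as a sketch. The name lists here are tiny made-up stand-ins for the census files, and the variable names are assumptions, not what was in the original script:

```python
# Made-up stand-ins for the census name lists, already lowercased.
male_names = {"william", "james", "leslie", "robin"}
female_names = {"mary", "leslie", "robin", "linda"}

# The names you want to classify.
my_names = ["William", "Leslie", "Mary", "Zaphod"]
wanted = {name.lower() for name in my_names}

# Intersection: names that appear in both reference lists.
ambiguous = male_names & female_names

# Names of yours that are in exactly one list.
only_male = (wanted & male_names) - ambiguous
only_female = (wanted & female_names) - ambiguous

# Difference: names of yours that are in neither list.
unknown = wanted - male_names - female_names
```

Set membership is also a hash lookup, so it avoids the nested loops over every census row for every name.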

ShadowHawk
Jun 25, 2000

CERTIFIED PRE OWNED TESLA OWNER
My question is this -- is python smart enough with the list.copy() function to only duplicate the memory once the copy is actually modified, or does it do it ahead of time?

Python code:
class Stuff:
    def __init__(self, biglist):
        self.biglist = biglist.copy()

large_template = [large_x for large_x in (expensive_generator)]
stuff1 = Stuff(large_template)
stuff2 = Stuff(large_template)
As I understand it with the above there will be 3 copies of large_template in memory.

Python code:
class Stuff:
    def __init__(self, biglist):
        self.biglist = biglist.copy()

def setup():
    global stuff1, stuff2
    large_template = [large_x for large_x in (expensive_generator)]
    stuff1 = Stuff(large_template)
    stuff2 = Stuff(large_template)

setup()
As I understand it here there will be 3, but one will be destroyed after setup ends.

It seems to me the most efficient way to do this would be to only copy the list once things start munging it.
Python code:
class Stuff:
    def __init__(self, biglist):
        self.biglist = biglist

def setup():
    global stuff1, stuff2
    large_template = [large_x for large_x in (expensive_generator)]
    stuff1 = Stuff(large_template.copy())
    stuff2 = Stuff(large_template)

setup()
Here is the "ideal" way, correct?

ShadowHawk fucked around with this message at 04:53 on Aug 8, 2014

Trompe le Monde
Nov 4, 2009

Dunno if this is worth posting here but I'm doing some basic data analysis for a physics lab using numpy and have an array which I'm trying to raise to the fourth power but when I check the resultant array all the entries are -2147483648. What's the deal.

QuarkJets
Sep 8, 2008

Trompe le Monde posted:

Dunno if this is worth posting here but I'm doing some basic data analysis for a physics lab using numpy and have an array which I'm trying to raise to the fourth power but when I check the resultant array all the entries are -2147483648. What's the deal.

Looks like an integer overflow, it's a common issue when you're solving computational problems that involve large numbers. You can either try converting to int64 (if you need to use integers for whatever reason) or using floats

Trompe le Monde
Nov 4, 2009

Righto, switched the array to floats I guess? Anyway it works now, thanks a bunch.

Symbolic Butt
Mar 22, 2009

(_!_)
Buglord

ShadowHawk posted:

My question is this -- is python smart enough with the list.copy() function to only duplicate the memory once the copy is actually modified, or does it do it ahead of time?

This is kind of a subtle question. A list is a sequence of references to other objects. list.copy() does a shallow copy of the list, that means that a new list is immediately created but it does not recreate the objects, it just copies the references. This may or may not be smart enough for you.
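
To make the shallow copy concrete, here is a small sketch with a nested list standing in for the big template (the variable names are just for illustration). The outer list is duplicated right away, but the elements are shared:

```python
template = [[1, 2], [3, 4]]
shallow = template.copy()

# A new outer list exists immediately, there is no copy-on-write...
different_lists = shallow is not template

# ...but its slots refer to the very same element objects.
same_elements = shallow[0] is template[0]

# Adding to the copy only changes the copy's own sequence of references.
shallow.append([5, 6])

# Mutating a shared element is visible through both lists.
shallow[0].append(99)
```

So the memory cost of each copy is one reference per element, not a duplicate of every element.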

Symbolic Butt
Mar 22, 2009

(_!_)
Buglord

Trompe le Monde posted:

Righto, switched the array to floats I guess? Anyway it works now, thanks a bunch.

When you're working with numpy you have to be careful because you're trading a bunch of python conveniences for speed. numpy ints have a fixed width and can overflow (like 2147483647 + 1 becoming -2147483648 in an int32), and numpy floats have fixed precision too. In contrast, regular python ints never overflow because they have arbitrary precision (bignum). Regular python floats are still ordinary 64-bit doubles, though, so they have the same precision limits as numpy's float64.
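
You can see the wraparound without numpy. This sketch uses ctypes.c_int32 from the standard library as a stand-in for a 32-bit numpy integer (that substitution is my assumption, the original problem was in a numpy array), and wrap_int32 is a made-up helper name:

```python
import ctypes


def wrap_int32(value):
    """Reduce a Python int to what a signed 32-bit integer would hold."""
    # ctypes integer types do no overflow checking; they keep the low 32 bits.
    return ctypes.c_int32(value).value


biggest = 2147483647            # largest signed 32-bit value
as_python_int = biggest + 1     # Python ints just keep growing
as_int32 = wrap_int32(biggest + 1)

# A fourth power overflows quickly: 300**4 needs more than 32 bits.
fourth_power = 300 ** 4
fourth_power_int32 = wrap_int32(fourth_power)
```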

http://docs.scipy.org/doc/numpy/user/basics.types.html - I recommend studying those datatypes a bit to avoid some lovely bugs later.

QuarkJets
Sep 8, 2008

Trompe le Monde posted:

Righto, switched the array to floats I guess? Anyway it works now, thanks a bunch.

It depends on what you need to do. Floating point arithmetic is slightly inaccurate. For instance, you might see something like this:

Python code:
>>> 1.2-1.0
0.199999999999999996
For nearly all applications, this does not matter. For instance, if you're using Python to calculate something from measured data, the measurements themselves are likely going to have way more uncertainty than the little bit of uncertainty that you'd get from using floating point arithmetic. I use numpy for computational physics, and I haven't personally run into any situations where floating point arithmetic inaccuracy has mattered (although I can think of some scenarios where it would)

ShadowHawk
Jun 25, 2000

CERTIFIED PRE OWNED TESLA OWNER

tef posted:

Your question struck me as a bit of an X-Y problem, i.e "I'm using generators awkwardly, how do I mitigate it?". Generators are about streams of values, if you're only wanting the first item, something's a bit up.


Re: "multiple functions are referencing the same generator in an unpredictable order". Sharing things is painful! What your code seems to have is a series of shared data structures which contain the state of your simulation, a bunch of generators over this state, and then a bunch of accessors to return these generators one by one. The problem I can imagine is that if you switch generator mid way through, bad things can happen.

There are lots of different ways to model it, but a hacky but useful way to model it as a Project object. When you construct it, you tell it how many users, bugs, features you want, and how to link them. This will contain the globals, and let you simulate multiple projects in your code.
I kept the generators more or less as-is, but did implement the project class like you suggested. Again, thank you! Here is the result:



https://github.com/YokoZar/wine-model

duck hunt
Dec 22, 2010
I got a question today that has me kinda freaked out. I was asked how you would implement a hash table data structure in Python.

Not having to worry about making that kind of stuff is one of the reasons that I use Python. I tried, but couldn't come up with a working hash table.

Should I be so freaked out?

Nippashish
Nov 2, 2005

Let me see you dance!

duck hunt posted:

Should I be so freaked out?

No, you should be learning how to make a hash table.

floppo
Aug 24, 2005
apologies if this is too broad a question for this thread. I am interested in getting data from twitter - specifically a list of all the people following a specific account. My goal is to compare two such lists and see who is connected to who - this would require a second set of queries.

I know a little bit of python, but mostly from homework-type assignments that don't require getting data. I've found the twitter API website but I could use a bit more hand-holding. In short does anyone know a good guide to scraping data from twitter using python? I thought I would narrow my search to python stuff for now, but feel free to suggest alternatives if you know of them.

duck hunt
Dec 22, 2010

Nippashish posted:

No, you should be learning how to make a hash table.

Here's my first try today.

code:
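class Hashtable(object):
    def __init__(self, size):
        self.size = size
        # One bucket (a list) per slot, so colliding items can share a slot.
        self.idx = [[] for x in range(size)]

    def setItem(self, item):
        # Hash on the first character; modulo size keeps the index in range
        # (modulo size + 1 could produce an index one past the last bucket).
        hashKey = ord(item[0]) % self.size
        self.idx[hashKey].append([hashKey, item])

    def getItem(self, key):
        return self.idx[key]

    def pop(self, key):
        # Empty the bucket but keep the slot, so the other buckets don't shift.
        bucket = self.idx[key]
        self.idx[key] = []
        return bucket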
The only thing I might add to it later would be to implement linked lists for small hash tables where you end up with collisions.
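
For comparison, a minimal sketch of a key/value table with chaining, where each bucket is a small list of (key, value) pairs. This is one common textbook layout, not the way Python's dict works internally, and the class and method names are made up. It leans on the built-in hash() and never resizes:

```python
class ChainedHashTable:
    """Fixed-size hash table; colliding keys share a bucket list."""

    def __init__(self, size=8):
        self._buckets = [[] for _ in range(size)]

    def _bucket(self, key):
        # Python's % always returns a non-negative index for a positive size.
        return self._buckets[hash(key) % len(self._buckets)]

    def set(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)   # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)

    def pop(self, key):
        bucket = self._bucket(key)
        for i, (existing_key, value) in enumerate(bucket):
            if existing_key == key:
                del bucket[i]
                return value
        raise KeyError(key)
```

Lookups take the key itself rather than a bucket index, so the caller never needs to know how the hashing is done.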

A related question:

I want to implement the array list data structure without using Python's list type

code:
class ArrayList(object):
    def __init__(self):
        init_size = 10
        self.array = [None] * init_size
        self.listLength = 0
        self.arrayLength = init_size

    def append(self, item):
        if self.listLength == self.arrayLength:
            self.array += [None] * self.arrayLength
            self.arrayLength *= 2
        self.array[self.listLength] = item
        self.listLength += 1

    def getItem(self, idx):
        if idx >= self.listLength:
            raise Exception('index out of bounds')
        return self.array[idx]

    def setItem(self, idx, item):
        if idx >= self.listLength:
            raise Exception('index out of bounds')
        self.array[idx] = item
you can see that in the constructor, I'm using a Python list. What would be a better type to collect items? Set? Tuple? Some other class?
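
One option that shows up in data structures textbooks is a raw fixed-size array of object references from ctypes, which has no append or resize of its own. The sketch below assumes CPython, and the names (make_array, the capacity parameter, the underscore attributes) are my own rather than anything from the post:

```python
import ctypes


def make_array(capacity):
    """Return a fixed-size low-level array of Python object references."""
    return (ctypes.py_object * capacity)()


class ArrayList:
    def __init__(self, capacity=10):
        self._capacity = capacity
        self._length = 0
        self._array = make_array(capacity)

    def __len__(self):
        return self._length

    def append(self, item):
        if self._length == self._capacity:
            self._resize(2 * self._capacity)
        self._array[self._length] = item
        self._length += 1

    def _resize(self, new_capacity):
        # Allocate a bigger array and copy the existing references across.
        new_array = make_array(new_capacity)
        for i in range(self._length):
            new_array[i] = self._array[i]
        self._array = new_array
        self._capacity = new_capacity

    def __getitem__(self, idx):
        if not 0 <= idx < self._length:
            raise IndexError('index out of bounds')
        return self._array[idx]

    def __setitem__(self, idx, item):
        if not 0 <= idx < self._length:
            raise IndexError('index out of bounds')
        self._array[idx] = item
```

A set or tuple would not really help: sets are unordered and tuples are immutable, so neither behaves like a block of writable slots.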

  • Locked thread