  • Locked thread
Rohaq
Aug 11, 2006

the posted:

I'm running a sci/numerical python program in Canopy right now. It takes about 4 hours to run. Afterwards I need to manipulate the data in the console (like check specific values and whatnot). Just a bunch of 41x41x41 arrays and such.

Is there any way I can *save* all the computed data so I can just instantly load it up again to work with it? That would save me literal hours of work in having to run this program again if I have to shut down my computer for some reason.
Sounds like you want to be using the dump and load functions of the pickle module: http://docs.python.org/2/library/pickle.html
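
A minimal sketch of the dump/load round trip, assuming the results fit in an ordinary Python object. The `results` dict and the file name are made up for illustration and are not from the original program.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for whatever the 4-hour run produces.
results = {"grid_size": 41, "steps": 60, "ey_center": [0.0, 0.5, 1.0]}

# Write to a temp directory so the sketch does not clutter the working dir.
path = os.path.join(tempfile.mkdtemp(), "results.pkl")

# Pickle files must be opened in binary mode.
with open(path, "wb") as f:
    pickle.dump(results, f, protocol=pickle.HIGHEST_PROTOCOL)

# Later, e.g. after a reboot, load the object back.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == results)  # True
```

Only unpickle files you created yourself; loading a pickle from an untrusted source can run arbitrary code.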

evensevenone
May 12, 2001
Glass is a solid.
Pickle, yaml, or json are all pretty easy ways to achieve this. You'll want to make an object that contains all the data, then dump it to a file. Json and yaml are better if you might want to read the data in other languages/platforms. Pickle is faster and produces smaller files, so if you've got a shitload of data that might make sense.

Emacs Headroom
Aug 2, 2003
For stuff that is well structured like json, json is (much) faster in my experience.

Pickle is really slow for large-ish things (even cPickle); maybe someone knows more about the details but I assume it has something to do with how pickle does the object instantiation.

Nippashish
Nov 2, 2005

Let me see you dance!
Assuming you're using numpy, it has its own file format (see numpy.save/numpy.savez).

Don't put numpy matrices in json files :ughh:
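
A rough sketch of the numpy.savez route, assuming the results are plain numpy arrays. The array names `ex` and `ey` and the file name are placeholders, and random data stands in for the real computation.

```python
import os
import tempfile

import numpy as np

# Placeholder data: two 41x41x41 arrays like the ones described above.
ex = np.random.rand(41, 41, 41)
ey = np.random.rand(41, 41, 41)

path = os.path.join(tempfile.mkdtemp(), "fields.npz")

# savez stores several named arrays in a single .npz archive.
np.savez(path, ex=ex, ey=ey)

# Later, in a fresh console session, load them back by name.
with np.load(path) as saved:
    ex_loaded = saved["ex"]
    ey_loaded = saved["ey"]

print(ex_loaded.shape)  # (41, 41, 41)
```

For a single array, numpy.save and numpy.load with a .npy file work the same way.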

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.
I'm becoming more interested in TDD, but I'm largely clueless because most of what I use Python for is async with gevent. Anyone have any recommendations for starting points with testing concurrent clusterfuck applications?

Dominoes
Sep 20, 2007

Lesson in reinventing the wheel today: I created a function that allows date addition and subtraction using modulus and floor division. Turns out the datetime function already does that! It was a good learning experience.

Dominoes fucked around with this message at 02:37 on Apr 21, 2013

Rohaq
Aug 11, 2006
JSON and YAML are great for structured data, numpy has its own format as said, but pickle works with objects as a whole; if your files are meant to work exclusively with your script, pickle is probably the best guarantee that your object remains the same once it's been saved and retrieved.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

Dominoes posted:

Lesson in reinventing the wheel today: I created a function that allows date addition and subtraction using modulus and floor division.

And chances are you got it wrong.

Dominoes
Sep 20, 2007

Suspicious Dish posted:

And chances are you got it wrong.
There is some error, but it was acceptable for what I used it for. No more than 2 days of error for the first few years.

QuarkJets
Sep 8, 2008

Dominoes posted:

Lesson in reinventing the wheel today: I created a function that allows date addition and subtraction using modulus and floor division. Turns out the datetime function already does that! It was a good learning experience.

And quite well, at that!

MySQLdb even converts its own datetime format into the Python datetime format. Very handy, although inserting a datetime with MySQLdb requires converting it to a string first.
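
For reference, a short sketch of the date arithmetic the datetime module already provides. The dates are arbitrary examples.

```python
from datetime import date, timedelta

start = date(2013, 4, 21)

# Adding a timedelta handles month lengths for you.
later = start + timedelta(days=45)
print(later)  # 2013-06-05

# Subtracting two dates gives a timedelta.
gap = date(2013, 12, 25) - start
print(gap.days)  # 248

# Leap years are handled too: 2012 had a February 29.
print(date(2012, 2, 28) + timedelta(days=2))  # 2012-03-01

# isoformat() gives a 'YYYY-MM-DD' string if one is needed elsewhere.
print(later.isoformat())  # 2013-06-05
```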

Dominoes
Sep 20, 2007

Technique question: I have two very similar, lengthy bits of code (pulling financial data from two different sources; one uses json, the other csv). There are common replaceable items. For example, where one function uses 'data[i][4]', the other might use 'data[n][day]['date']'. I'd pass '[n][day]' and '[i][4]' as parameters to the new function, reusing the code instead of having it exist twice. Logically, it seems easy to combine these into a common function, but simply turning them into arguments doesn't seem to work.

Option 1: Keep the functions separate due to this issue. Don't try to reuse the code

Option 2: I've read about using exec(), but it seems like a messy hack, and I can't get it working.

Option 3: A Python solution designed for this scenario that I haven't found. Someone earlier helped me with a similar problem using __getattribute__ with %s, but I can't seem to get it working in this scenario, and it also feels like a weird hack.

What's the preferred way to handle this?

example:
Python code:
data = ['a', 'b', 'c', 'd']
input1 = '[1]'
input2 = '[2]'

def func(data, input):
    return ''.join(data, input) #pseudocode
    
func(data, input1)
func(data, input2)

Dominoes fucked around with this message at 18:55 on Apr 21, 2013

Nippashish
Nov 2, 2005

Let me see you dance!

Dominoes posted:

What's the preferred way to handle this?

Python code:
data = ['a', 'b', 'c', 'd']
input1 = 1
input2 = 2

def func(data, input):
    return data[input]
    
func(data, input1)
func(data, input2)

Dominoes
Sep 20, 2007

Nippashish posted:

Python code:
data = ['a', 'b', 'c', 'd']
input1 = 1
input2 = 2

def func(data, input):
    return data[input]
    
func(data, input1)
func(data, input2)
What about this?
Python code:
data1 = ['a', 'b', 'c', 'd']
data2 = [['a', 'b', 'c', 'd'],['w', 'x', 'y', 'z']]
input1 = '[1]'
input2 = '[0][1]'

def func(data, input):
    return ''.join(data, input) #pseudocode
    
func(data1, input1)
func(data2, input2)
edit: Actually it looks like your solution works. Just need to pass the arguments as lists/tuples and account for all possible slots.

Python code:
data1 = ['a', 'b', 'c', 'd']
data2 = [['a', 'b', 'c', 'd'],['w', 'x', 'y', 'z']]
input1 = (1, 0)
input2 = (1, 2)

def func(data, input):
    return data[input[0]][input[1]]
    
func(data1, input1)
func(data2, input2)

Dominoes fucked around with this message at 19:42 on Apr 21, 2013

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed
Python code:
data1 = ['a', 'b', 'c', 'd']
data2 = [['a', 'b', 'c', 'd'],['w', 'x', 'y', 'z']]
input1 = [1]
input2 = [0, 1]

def func(data, input):
    for key in input:
        data = data[key]
    return data

func(data1, input1) # 'b'
func(data2, input2) # 'b'
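
The same loop can also be written with functools.reduce and operator.getitem. This is only an alternative spelling of the idea above; `get_nested` and the `record` dict are names invented for this sketch.

```python
from functools import reduce
from operator import getitem


def get_nested(data, keys):
    """Follow each key in turn, like data[k0][k1]...[kn]."""
    return reduce(getitem, keys, data)


data1 = ['a', 'b', 'c', 'd']
data2 = [['a', 'b', 'c', 'd'], ['w', 'x', 'y', 'z']]

print(get_nested(data1, [1]))     # b
print(get_nested(data2, [0, 1]))  # b

# Works for mixed list/dict nesting too (made-up record layout).
record = {"quotes": [{"date": "2013-04-19", "close": "10.5"}]}
print(get_nested(record, ["quotes", 0, "close"]))  # 10.5
```

An empty key list returns the data unchanged, which matches the behaviour of the explicit loop.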

Dominoes
Sep 20, 2007

Thanks Plork - that works too.

Last one! I think this can be solved with an if statement asking which input format you're entering (as an additional parameter) and having two separate return lines, but I'm wondering if there's a cleaner way.

Python code:
data1 = ['a', 'b', 'c', 'd']
data2 = [['a', 'b', 'c', 'd'],['w', 'x', 'y', 'z']]
input1 = '[1]'
input2 = '[0][n]'

def func(data, input):
    for n in range(3):
        return ''.join(data, input) #pseudocode
    
func(data1, input1)
func(data2, input2)

Dominoes fucked around with this message at 20:09 on Apr 21, 2013

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
Please share more about your actual problem. This seems like overengineering at the highest level.

Dominoes
Sep 20, 2007

Suspicious Dish posted:

Please share more about your actual problem. This seems like overengineering at the highest level.
It's similar to the example I posted, but more complicated. I'm looking at historical stock data from Yahoo and Tradeking. I'm working with the Yahoo data as a .csv, and Tradeking as a .json phrase.

It looks like I have a working solution using Nippa's and Plork's examples. I used an if statement as described to sort the last example I posted. Here's the current implementation:

Python code:
eval(data, parameters, n, (n, 'date'), (n, 'close'), symbols[n], 'tk')

def eval(data, parameters, len_source, date_loc, price_loc, symbol, _type):
    failed = False
    result = []
    for n2 in parameters:
        start_date = date.today() - timedelta(days = n2.days_start)
        end_date = date.today() - timedelta(days = n2.days_end)
        
        #Checks for weekend dates; uses nearest Friday instead
        if date.weekday(start_date) == 5: 
            start_date -= timedelta(days = 1)
        if date.weekday(start_date) == 6:
            start_date -= timedelta(days = 2)
        if date.weekday(end_date) == 5:
            end_date -= timedelta(days = 1)
        if date.weekday(end_date) == 6:
            end_date -= timedelta(days = 2)
        
        count = 0 #for breaking loop once both dates determined, to save time
        #loops through the days, checking if dates match. Find a way around this?
        for n3 in range(len(data[len_source])): 
            if _type == 'tk':
                if data[date_loc[0]][n3][date_loc[1]] == str(start_date):
                    start_value = float(data[price_loc[0]][n3][price_loc[1]])
                    count += 1
                if data[date_loc[0]][n3][date_loc[1]] == str(end_date):
                    end_value = float(data[price_loc[0]][n3][price_loc[1]])
                    count += 1
                if count == 2:
                    break
            elif _type == 'yf':
                if data[date_loc[0]][n3][date_loc[1]] == str(start_date):
                    start_value = float(data[price_loc[0]][price_loc[1]])
                    count += 1
                if data[date_loc1[0]][n3][date_loc1[1]] == str(end_date):
                    end_value = float(data[price_loc[0]][price_loc[1]])
                    count += 1
                if count == 2:
                    break
        
        change = ((end_value - start_value) / start_value) * 100

        print('\n', symbol, n2.name)
        print ("change: " + str(change))
        print("start:", start_value, "end:", end_value)
        print("start:", start_date, "end:", end_date)
        
        if not n2._floor <= change <= n2._ceil:
            failed = True
            print (symbol, "failed for", n2.name)
            break

    if failed == False:
        result.append(symbol)
    return result

QuarkJets
Sep 8, 2008

Agreed; I have no idea what you're asking for at this point. What is this most recent Python code supposed to do?

e: To me, nothing in your pseudo-code looks anything like the code that you just posted :psyduck:

Dominoes
Sep 20, 2007

QuarkJets posted:

Agreed; I have no idea what you're asking for at this point. What is this most recent Python code supposed to do?
Nippa and Plork posted examples that I turned into a solution. I was wondering if there's a clean way to implement variables in code similar to the .join and %s abilities of strings.

I'm now curious if there's a way to clean up the duplicate 'if' statements in a situation like this.

Dominoes fucked around with this message at 20:43 on Apr 21, 2013

QuarkJets
Sep 8, 2008

Dominoes posted:

I'm not asking anything at this point; Nippa and Plork posted examples that I turned into a solution. I was wondering if there's a clean way to implement variables in code similar to the .join and %s abilities of strings.

This is terrible, don't go looking for this. It's probably not what you actually want to do.

e: You asked for clean, what you did is way cleaner than trying to pull variables from strings and then loop over them or whatever

QuarkJets fucked around with this message at 20:44 on Apr 21, 2013

Nippashish
Nov 2, 2005

Let me see you dance!

Dominoes posted:

I was wondering if there's a clean way to implement variables in code similar to the .join and %s abilities of strings.

This is never a good idea. Don't do this. Avoid wanting to do this.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
Why do you have variables named n2 and n3?

Dominoes
Sep 20, 2007

Suspicious Dish posted:

Why do you have variables named n2 and n3?
Iterators.

Dominoes fucked around with this message at 22:35 on Apr 21, 2013

QuarkJets
Sep 8, 2008

Dominoes posted:

Iterators.

Don't do this:

code:
if failed == False:
    result.append(symbol)
Instead:

code:
if not failed:
    result.append(symbol)
...

And instead of a failure flag, you could just return when your failure condition is met. Plus it looks like result is never longer than length==1, so couldn't you just scrap it entirely? Like this :

Python code:
eval(data, parameters, n,
     (n, 'date'), (n, 'close'),
     symbols[n], 'tk')

def eval(data, parameters, len_source,
         date_loc, price_loc, symbol, _type):
    for n2 in parameters:
        # some code

        if not n2._floor <= change <= n2._ceil:
            print(symbol, "failed for", n2.name)
            return None
    return symbol
If eval returns None, then eval failed, otherwise you've got your symbol object...

...

Wait, symbol isn't ever used or modified anywhere in the code! Can't this just return True or False?

QuarkJets fucked around with this message at 22:53 on Apr 21, 2013

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

Dominoes posted:

Iterators.

Name them something better. I also see a lot of issues with your code; use more variables. Like, I see that some of your code references date_loc1, which AFAICT doesn't exist. I don't know much about your data structures, but you should be able to use zip() or something instead of the range.
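
A sketch of what iterating without range() might look like. The real data layout isn't shown in the thread, so the `rows`, `dates` and `closes` structures below are guesses for illustration only.

```python
from datetime import date

# Assumed layout 1: a list of dicts, one per trading day.
rows = [
    {"date": "2013-04-18", "close": "10.10"},
    {"date": "2013-04-19", "close": "10.50"},
]

# Loop over the rows themselves instead of over range(len(rows)).
close_by_date = {row["date"]: float(row["close"]) for row in rows}

# Assumed layout 2: two parallel lists. zip() pairs them up.
dates = ["2013-04-18", "2013-04-19"]
closes = ["10.10", "10.50"]
close_by_date_zipped = {day: float(close) for day, close in zip(dates, closes)}

# With a dict, finding a date is a lookup rather than a scan.
start_date = date(2013, 4, 18)
end_date = date(2013, 4, 19)
start_value = close_by_date[str(start_date)]  # str(date) is 'YYYY-MM-DD'
end_value = close_by_date[str(end_date)]
print(start_value, end_value)  # 10.1 10.5
```

Building the dict once also removes the need for the counter that breaks out of the loop after both dates are found.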

Dominoes
Sep 20, 2007

QuarkJets posted:

Don't do this:

code:
if failed == False:
    result.append(symbol)
Instead:

code:
if not failed:
    result.append(symbol)
...

And instead of a failure flag, you could just return when your failure condition is met. Plus it looks like result is never longer than length==1, so couldn't you just scrap it entirely? Like this :

Python code:
eval(data, parameters, n,
     (n, 'date'), (n, 'close'),
     symbols[n], 'tk')

def eval(data, parameters, len_source,
         date_loc, price_loc, symbol, _type):
    for n2 in parameters:
        # some code

        if not n2._floor <= change <= n2._ceil:
            print(symbol, "failed for", n2.name)
            return None
    return symbol
If eval returns None, then eval failed, otherwise you've got your symbol object...

...

Wait, symbol isn't ever used or modified anywhere in the code! Can't this just return True or False?
Good catches. Didn't know about the 'if not failed' boolean logic. You're right about the append being unnecessary; I was mixing up code in this function and in the one that calls it. Made your change to the return logic. 'symbol' is passed as an argument. It looks like you're right that I could turn this function into a True/False return. Not sure if it's better; I'll think about it. I'd still want to pass the symbol name for the debugging prints.

edit: I like your True/False return suggestion more than returning the symbol name. Changed. Removed the "failed" variable.

Suspicious Dish posted:

Name them something better. I also see a lot of issues with your code; use more variables. Like, I see that some of your code references date_loc1, which AFAICT doesn't exist. I don't know much about your data structures, but you should be able to use zip() or something instead of the range.
The tutorials I learned from would use statements like "for parameter in parameters" or "for day in range". I don't like them because I have the phrase "parameter(s)" and "day(s)" several other places in the program, and want to make it easy to distinguish. The short names also make the code easier to read, although harder for someone else to interpret. I'm open to changing and suggestions on alternate ways of doing this. "loc1" was a typo. I'll look up zip().

Dominoes fucked around with this message at 02:22 on Apr 22, 2013

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Dominoes posted:

The tutorials I learned from would use statements like "for parameter in parameters" or "for day in range". I don't like them because I have the phrase "parameter(s)" and "day(s)" several other places in the program, and want to make it easy to distinguish.
This means two things:

  1. Your functions are too long and mix too many concerns, so you're not able to properly leverage variable scope
  2. Your variable names are not descriptive in the first place

Both of these things also make it harder to unit test your code, which starts the death spiral of "this code is unreadable and I don't want to touch it because I might break something."

Dominoes posted:

The short names also make the code easier to read, although harder for someone else to interpret. I'm open to changing and suggestions on alternate ways of doing this. "loc1" was a typo.
Also harder for you to interpret once you take a few weeks off from this code, which honestly is the bigger concern to most developers.
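
One way to read the "too many concerns" point: pull the small calculations out into their own functions so each can be named and tested separately. A sketch, with helper names invented here rather than taken from the posted program:

```python
from datetime import date, timedelta


def previous_weekday(day):
    """Move a Saturday or Sunday back to the preceding Friday."""
    # weekday(): Monday is 0, Friday is 4, Saturday is 5, Sunday is 6.
    offset = max(0, day.weekday() - 4)  # Sat -> 1, Sun -> 2, weekdays -> 0
    return day - timedelta(days=offset)


def percent_change(start_value, end_value):
    """Percentage change from start_value to end_value."""
    return (end_value - start_value) / start_value * 100


# 2013-04-21 was a Sunday, so it maps to Friday 2013-04-19.
print(previous_weekday(date(2013, 4, 21)))  # 2013-04-19
print(percent_change(50.0, 75.0))           # 50.0
```

Inside each small function, plain names like `day` no longer collide with anything else, which takes care of the naming worry as well.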

Dominoes
Sep 20, 2007

Replaced all instances of "n" and "i" iterators with names that make sense in both my programs.
Do y'all usually include __repr__ functions in classes containing instances?

Dominoes fucked around with this message at 02:22 on Apr 22, 2013

the
Jul 18, 2004

by Cowcaster
So I have a 3d-array ey that increments over i,j,k. I also have an array eytimes that's 4 dimensions, I want it to be the 3d component of ey plus a counter that goes from 0-59 (because ey evolves in a while loop that counts 60 times).

Python code:
eytimes[i,j,k,counter] = ey[i,j,k] + counter
So, I want to plot all of the values on top of each other, like... I want to plot [i,0,0,1] and [i,0,0,2], etc... all on the same graph versus x. How would I do this?

I had

Python code:
pylab.plot(x,eytimes[:,half,half,:])
But that doesn't work

edit: fixed it

the fucked around with this message at 06:36 on Apr 22, 2013

JetsGuy
Sep 17, 2003

science + hockey
=
LASER SKATES

the posted:

So I have a 3d-array ey that increments over i,j,k. I also have an array eytimes that's 4 dimensions, I want it to be the 3d component of ey plus a counter that goes from 0-59 (because ey evolves in a while loop that counts 60 times).

Python code:
eytimes[i,j,k,counter] = ey[i,j,k] + counter
So, I want to plot all of the values on top of each other, like... I want to plot [i,0,0,1] and [i,0,0,2], etc... all on the same graph versus x. How would I do this?

I had

Python code:
pylab.plot(x,eytimes[:,half,half,:])
But that doesn't work
I'm assuming you have some 3D array of values (say x, y, t), and 60 of those? If that's the case, you may wanna check out the "meshgrid" functions and how to plot those. It may be more useful for you. If you're just concerned with the first value over "counter" (where y=t=0 in my example), here's a few things:

1) Stop using pylab, I keep telling you to stop using pylab. :v:

2) From what you've described here, it doesn't look like eytimes[:,half,half,:] would be a 1D array, which pyplot.plot() is expecting for y.

3) Maybe I'm misunderstanding how you're constructing eytimes, but those indices don't make sense to me here. Counter should be the "lead" one I think. As in:

code:
In [4]: num.zeros([2,3,3,3])
Out[4]: 
array([[[[ 0.,  0.,  0.],
         [ 0.,  0.,  0.],
         [ 0.,  0.,  0.]],

        [[ 0.,  0.,  0.],
         [ 0.,  0.,  0.],
         [ 0.,  0.,  0.]],

        [[ 0.,  0.,  0.],
         [ 0.,  0.,  0.],
         [ 0.,  0.,  0.]]],


       [[[ 0.,  0.,  0.],
         [ 0.,  0.,  0.],
         [ 0.,  0.,  0.]],

        [[ 0.,  0.,  0.],
         [ 0.,  0.,  0.],
         [ 0.,  0.,  0.]],

        [[ 0.,  0.,  0.],
         [ 0.,  0.,  0.],
         [ 0.,  0.,  0.]]]])
Anyway, my advice here is to not shortcut it at the plot command, and instead build the data array you're trying to get from "eytimes[:,half,half,:]" on its own. You're probably getting something like this:

code:
In [8]: x[:,:,0,0]
Out[8]: 
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])
So you'll either have to do a couple of appends, or rethink how you cast eytimes. I think that may be the better solution.
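
To make the shape issue concrete, a small sketch assuming eytimes is laid out as (i, j, k, counter) with 41 grid points and 60 time steps, as described in the question. Zeros stand in for the real field values.

```python
import numpy as np

n, steps = 41, 60
half = n // 2

# Placeholder for the real data, indexed as eytimes[i, j, k, counter].
eytimes = np.zeros((n, n, n, steps))

# Fixing j and k but keeping both i and counter leaves a 2D array.
line_slice = eytimes[:, half, half, :]
print(line_slice.shape)  # (41, 60)

# Fixing the counter as well gives one 1D curve per time step,
# which could then be passed to a plot call one at a time.
for counter in range(steps):
    y = eytimes[:, half, half, counter]

print(y.shape)  # (41,)
```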

JetsGuy
Sep 17, 2003

science + hockey
=
LASER SKATES
Goddamnit. Beaten while writing my post again.

Dren
Jan 5, 2001

Pillbug
Dominoes, you were talking about having data from two sources. One dataset is in json the other is in csv. Do the two data sources provide you the same data? If so, you should look at creating your own type, call it StockData. When you ingest data, load it into a StockData object. The custom code for how to read in each type of data will go into those load functions. Your code will access the StockData object. This way you don't have to write nutty custom accessor stuff (like the code you have).

This is just a suggestion, you might have a need for the way your stuff is right now.
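
A rough sketch of what such a type could look like. The thread only supplies the name StockData; the field names ("date"/"close" for the JSON, "Date"/"Close" for the CSV) and the method names are assumptions made for this example, not the actual Yahoo or Tradeking formats.

```python
import csv
import io
import json
from collections import namedtuple

Quote = namedtuple("Quote", ["date", "close"])


class StockData:
    """Source-independent container: callers never see json vs csv."""

    def __init__(self, symbol, quotes):
        self.symbol = symbol
        self.close_by_date = {q.date: q.close for q in quotes}

    @classmethod
    def from_json(cls, symbol, text):
        # Assumed layout: a JSON list of {"date": ..., "close": ...} records.
        records = json.loads(text)
        return cls(symbol, [Quote(r["date"], float(r["close"])) for r in records])

    @classmethod
    def from_csv(cls, symbol, text):
        # Assumed layout: a header row with "Date" and "Close" columns.
        reader = csv.DictReader(io.StringIO(text))
        return cls(symbol, [Quote(r["Date"], float(r["Close"])) for r in reader])

    def close_on(self, day):
        """Closing price for a 'YYYY-MM-DD' date string."""
        return self.close_by_date[day]


tk = StockData.from_json("XYZ", '[{"date": "2013-04-19", "close": "10.5"}]')
yf = StockData.from_csv("XYZ", "Date,Close\n2013-04-19,10.5\n")
print(tk.close_on("2013-04-19"), yf.close_on("2013-04-19"))  # 10.5 10.5
```

The evaluation code then only ever calls close_on(), so the 'tk'/'yf' branches collapse into the two loader methods.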

Houston Rockets
Apr 15, 2006

Is it possible to have my setup.py download a binary file and include it with the module?

I'm considering hacking it by having setup.py just download the file using urllib before setup() is invoked, but I wanted to know if there was a better way.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
Don't do that. Your package's payload should have all the stuff it needs. Offline installation should be supported.

Vulture Culture
Jul 14, 2003

I was never enjoying it. I only eat it for the nutrients.

Suspicious Dish posted:

Don't do that. Your package's payload should have all the stuff it needs. Offline installation should be supported.
Sometimes this isn't possible due to licensing issues (think of the video driver or MS corefont installers in most Linux distros)

For basically all other instances though, agreed.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe
Why would the license allow downloading it over the internet as part of the install process? Like, corefonts never supported that. MS pulled it after people started extracting things from the .cab file.

Please do not install video drivers from setup.py.

FoiledAgain
May 6, 2007

I want to be able to draw some simple lines and circles on screen, as a way of visualizing some data that my program outputs. Essentially I just want to draw a whole bunch of number lines from 1-100 with a few annotations on each line. I have no experience doing anything visual other than making some figures in matplotlib. What's a good package to use? The official docs only seem to include Tkinter/turtle, and then recommendations for wxPython, PyQt, or PyGTK. I haven't used any of these before. I'm willing to learn new stuff, but this is a program that I want other people to use, and my target audience is academic colleagues who are afraid of code so I want to avoid making them download anything other than Python if that's possible.

Masa
Jun 20, 2003
Generic Newbie
I suck at Pyplot and Matplotlib and the documentation and examples never seem to help me, is there some simple way that I'm missing to show the same labels for the yticks on both the left and right sides of the graph?

accipter
Sep 12, 2003

Masa posted:

I suck at Pyplot and Matplotlib and the documentation and examples never seem to help me, is there some simple way that I'm missing to show the same labels for the yticks on both the left and right sides of the graph?

This is the example you are looking for: http://matplotlib.org/examples/api/two_scales.html

You will have to copy the limits, ticks, and labels over with:
code:
ax2.set_ylim(*ax1.get_ylim())
ax2.set_yticks(ax1.get_yticks())
ax2.set_ylabel(ax1.get_ylabel())
edit: And another example: http://matplotlib.org/examples/api/fahrenheit_celsius_scales.html
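
Put together, the twin-axis approach might look like the sketch below. The plotted numbers are made up, and the Agg backend is selected only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this for on-screen use
import matplotlib.pyplot as plt

fig, ax1 = plt.subplots()
ax1.plot([0, 1, 2, 3], [10, 20, 15, 30])
ax1.set_ylabel("value")

# A second y-axis sharing the same x-axis; its ticks are drawn on the right.
ax2 = ax1.twinx()
ax2.set_yticks(ax1.get_yticks())
ax2.set_ylabel(ax1.get_ylabel())
# Limits are copied last, since setting explicit ticks can widen them.
ax2.set_ylim(*ax1.get_ylim())
```

If the right-hand labels only need to mirror the left ones, `ax1.tick_params(axis="y", right=True, labelright=True)` on a single axes may be enough without a twin axis.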

accipter fucked around with this message at 23:24 on Apr 22, 2013

accipter
Sep 12, 2003

FoiledAgain posted:

I want to be able to draw some simple lines and circles on screen, as a way of visualizing some data that my program outputs. Essentially I just want to draw a whole bunch of number lines from 1-100 with a few annotations on each line. I have no experience doing anything visual other than making some figures in matplotlib. What's a good package to use? The official docs only seem to include Tkinter/turtle, and then recommendations for wxPython, PyQt, or PyGTK. I haven't used any of these before. I'm willing to learn new stuff, but this is a program that I want other people to use, and my target audience is academic colleagues who are afraid of code so I want to avoid making them download anything other than Python if that's possible.

Are you looking to create a static image, or provide some interactive capabilities? If you are interested in a static image I would just create it with matplotlib.
