Python information and short questions megathread.

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Python information and short questions megathread.

SelfOM: Jun 15, 2010

JetsGuy posted:

Echoing matplotlib. It loving rules, and I've even used it in graphs for my professional academic journal articles.

I heartily endorse it.

Pretty much the first thing I do when I get a new machine is Python 2.7, NumPy, SciPy, PyFits and matplotlib.

I also recently got turned on to how awesome iPython really is. I'm really annoyed at myself for resisting using it for so long!

I recently switched from doing a lot of data analysis in R to IPython + the libraries you mentioned. One of the nice features is the interactive cluster functionality. Also, the notebook option is amazing if you haven't tried it yet. Both of these features require ZeroMQ, and the latter also requires Tornado. Here is an example of it in use: http://healthyalgorithms.com/2012/02/09/powells-method-for-maximization-in-pymc/. It would be cool if the IPython notebook had a nice export-to-HTML function, though.

Also check out: http://pandas.pydata.org/. It's a really great data munging package.

# ¿ Feb 26, 2012 06:19


SelfOM: Jun 15, 2010

I keep finding myself using this pattern with apply, where I pass in a second pandas Series or DataFrame as an additional argument. The other objects are usually the same size, but the operation can't be done simply by matrix multiplication (or at least not at first glance). I also kind of like being able to reference a position in another object from inside the current apply by using the index, i.e. x.name in the example below, even though that costs a lookup on top of passing in the other DataFrame or Series.

Python code:

import pandas as pd
import numpy as np

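# df1 is a pd.DataFrame
# s1 and s2 are pd.Series

def apply_func(x, series1, series2, tol=1e-10, maxiter=50):
    # Assumption: "offset" is the first extra series (it was never defined
    # in the snippet as posted), and the mask uses tol on both sides.
    offset = series1
    mask = x > tol
    cur_beta = (x[mask] / np.exp(offset[mask])).sum()
    cur_beta = np.log(cur_beta / x.shape[0])
    # Newton-Raphson
    for i in range(maxiter):
        mu = np.exp(cur_beta + offset)
        # x.name is the label of the column currently being processed
        denom = 1 + mu * series2[x.name]
        # More irrelevant stuff.

df1.apply(apply_func, args=(s1, s2))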

For optimization, should I be dropping down to C/Cython and using pointers in for loops rather than apply?
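
For anyone unfamiliar with the pattern, here is a minimal sketch of how the extra arguments and x.name behave in apply, followed by a vectorized version of the same toy calculation. The data and the names row_offset, col_scale and scaled_mean are made up for illustration; this is not the model from the post, and whether an iterative Newton-Raphson loop can be vectorized the same way depends on the parts that were left out there.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two columns, four rows.
df1 = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [2.0, 4.0, 6.0, 8.0]})
row_offset = pd.Series([0.5, 0.5, 0.5, 0.5], index=df1.index)  # aligned with the rows
col_scale = pd.Series({"a": 10.0, "b": 100.0})                  # keyed by column label


def scaled_mean(x, offset, scale):
    # With the default axis=0, x is one column of df1 as a Series,
    # and x.name is that column's label.
    return (x - offset).mean() * scale[x.name]


# Extra positional arguments are forwarded through args=.
result = df1.apply(scaled_mean, args=(row_offset, col_scale))

# Same calculation without apply: subtract along the rows, reduce per
# column, then let pandas align the per-column scale by label.
vectorized = df1.sub(row_offset, axis=0).mean() * col_scale
```

When the per-column work reduces to aligned arithmetic like this, the vectorized form is usually worth trying before reaching for Cython.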

# ¿ Jun 20, 2013 16:51

SelfOM: Jun 15, 2010

Something like this works for me: https://gist.github.com/anonymous/5962369, so maybe it has something to do with what's going on in Button, since this works for me:

Python code:

test = Test()
test.something(20)
test.seg_button.push()

I'm having trouble with Cython memoryviews:

Python code:

import numpy as np
cimport numpy as np
cpdef double test(int[:]):
      ### Stuff ##

Used to work, now I get:

code:

cpdef double test(int[:]):
                  ^
Expected an identifier or literal

# ¿ Jul 10, 2013 00:56

SelfOM: Jun 15, 2010

I find myself using this pattern a lot for testing for numpy arrays as optional arguments. Is this the best way to do this?

Python code:

import numpy as np

def test_func(x, y=None):
    if getattr(y, 'shape', False):
        print('Array passed for y')
    else:
        print('No array passed for y')
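
One thing to watch with the getattr check: a 0-d array has shape (), which is falsy, so it gets treated as if nothing was passed. A minimal sketch of the difference, compared against an explicit `is None` test (the function names are just for illustration):

```python
import numpy as np


def shape_check(y=None):
    # The pattern from the post: relies on the truthiness of y.shape.
    return bool(getattr(y, "shape", False))


def none_check(y=None):
    # Explicit sentinel test: anything other than None counts as "passed".
    return y is not None


vector = np.zeros(3)        # shape (3,), truthy
scalar_array = np.array(5)  # 0-d array, shape (), falsy

checks = {
    "vector": (shape_check(vector), none_check(vector)),
    "0-d array": (shape_check(scalar_array), none_check(scalar_array)),
    "omitted": (shape_check(), none_check()),
}
```

If the function should also accept lists or scalars, calling np.asarray(y) after the `is None` test is a common follow-up.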

# ¿ Nov 18, 2013 19:56

SelfOM: Jun 15, 2010

Thermopyle posted:

Also Guido wants to bring something like mypy into core python. In fact, you can use mypy with type annotations right now!
Python code:
from typing import Iterator

def fib(n: int) -> Iterator[int]:
    a, b = 0, 1
    while a < n:
        yield a
        a, b = b, a+b
http://www.mypy-lang.org/tutorial.html

I really like this, but why not use syntax closer to Cython? I guess this is more Pythonic, but it would be great if it could all converge so that optionally statically typed Python could then be easily compiled with Cython. Cython is great because it already has integration with NumPy and all the C libraries.

# ¿ Dec 12, 2014 03:43

SelfOM: Jun 15, 2010

Does anyone know if you can initialize subplot axes dynamically in matplotlib, without knowing beforehand how many subplots you are going to have? I want to be able to use functions like ax.add_patch in convenience functions that just add plots sequentially. The other option I was thinking of is using holding objects that would call add_patch after everything is set up.
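
A minimal sketch of the holding-object idea, assuming it is acceptable to delay all drawing until the number of plots is known. DeferredFigure, add_plot and render are hypothetical names, not matplotlib API; the only matplotlib calls used are plt.subplots, Axes.add_patch and Artist.set_visible.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle


class DeferredFigure:
    """Collects drawing callbacks and builds the subplot grid at the end."""

    def __init__(self):
        self._draw_calls = []

    def add_plot(self, draw):
        # draw is any callable that takes a single Axes argument.
        self._draw_calls.append(draw)

    def render(self, ncols=2):
        n = len(self._draw_calls)
        nrows = max(1, math.ceil(n / ncols))
        fig, axes = plt.subplots(nrows, ncols, squeeze=False)
        flat = axes.ravel()
        for ax, draw in zip(flat, self._draw_calls):
            draw(ax)
        # Hide grid cells that did not receive a plot.
        for ax in flat[n:]:
            ax.set_visible(False)
        return fig, flat[:n]


pending = DeferredFigure()
for width in (1, 2, 3):
    # Bind width as a default so each callback keeps its own value.
    pending.add_plot(lambda ax, w=width: ax.add_patch(Rectangle((0, 0), w, 1)))
fig, used_axes = pending.render(ncols=2)
```

The convenience functions only ever see an Axes, so they do not need to know how many plots end up in the figure.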

# ¿ Jan 16, 2015 01:08

SelfOM: Jun 15, 2010

Is there a way to have variable axes in matplotlib, i.e. from 1-100 scaled 10:1, then from 100-110 scaled 1:1, then back to 10:1? I could scale the underlying data, but I want to avoid this.
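
One way this could be sketched, assuming a matplotlib version that supports set_xscale("function", ...) (3.1 or newer, as far as I know): describe the axis as a piecewise-linear mapping and let np.interp do the forward and inverse transforms. The knot values below are made up to match the 10:1 / 1:1 / 10:1 description; np.interp clamps outside the knots, so the axis limits should stay inside them.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Data coordinates and the axis positions they should map to:
# 0-100 is compressed 10:1, 100-110 is kept 1:1, 110-210 is compressed 10:1.
DATA_KNOTS = np.array([0.0, 100.0, 110.0, 210.0])
AXIS_KNOTS = np.array([0.0, 10.0, 20.0, 30.0])


def forward(x):
    # Data coordinate -> stretched axis coordinate.
    return np.interp(x, DATA_KNOTS, AXIS_KNOTS)


def inverse(u):
    # Stretched axis coordinate -> data coordinate.
    return np.interp(u, AXIS_KNOTS, DATA_KNOTS)


fig, ax = plt.subplots()
x = np.linspace(0, 210, 500)
ax.plot(x, np.sin(x / 10.0))  # the data itself is left unscaled
ax.set_xscale("function", functions=(forward, inverse))
ax.set_xlim(0, 210)
ax.set_xticks([0, 50, 100, 105, 110, 160, 210])  # ticks are in data units
```

The tick labels stay in data units, which keeps the x-axis labeling explicit even though the spacing changes between regions.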

# ¿ Apr 16, 2015 15:21

SelfOM: Jun 15, 2010

QuarkJets posted:

You want to squish the data around another set of data, basically? I don't think that you should do this for many reasons that are non-programmatic, such as it could cause misinterpretation of your results. Make a second graph of the range that you're interested in, you could even plot it in the same figure, just please don't go loving with axes in weird ways like this.

People who don't understand variable scaling of an axis aren't the audience I care about. I've done the latter, and the graphs already take up too much space.

# ¿ Apr 17, 2015 14:58

SelfOM: Jun 15, 2010

The banded idea is smart.

Here is something similar, where it scales differently in different regions along the x-axis (but there is no explicit x-axis labeling, which is a problem for me, and looking at the code, the data is scaled):
http://miso.readthedocs.org/en/fastmiso/_images/sashimi-plot-example.png

# ¿ Apr 17, 2015 23:28

SelfOM: Jun 15, 2010

ShadowHawk posted:

Python2 will always be what gets run in the "python" command because it is brain-dead moronic for a Linux distribution to override that and break user scripts. Only one distro to my knowledge is hostile enough to users to have done that (I think Arch?)

Meanwhile everyone else is smart enough to have "python" run python2 like existing scripts expect and "python3" run python3, even when 100% of the system components are python3.

Distro scripts, for what it's worth, should explicitly declare the version of python they expect (eg #!/usr/bin/python2.7 or #!/usr/bin/python3.4) -- this tells you what remains to be tested and ported when you the distro maker are considering dropping an older version of python (say, 3.3). Sometimes porting is as simple as changing that line and seeing if anything breaks.

Yep, it is definitely Arch.

# ¿ Apr 19, 2015 22:30

SelfOM: Jun 15, 2010

What is the best way to generate a sorted random int array of variable length in Cython (this array is generated in a loop, which I don't know how to type)? I'm trying to weave two arrays together, where switchover positions occur at the random indexes.
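
To make the goal concrete, here is a minimal sketch in plain NumPy rather than typed Cython. It assumes both arrays have the same length and that the switchover positions should be distinct; weave and its parameters are hypothetical names.

```python
import numpy as np


def weave(a, b, n_switches, rng):
    """Alternate between a and b, switching at sorted random positions."""
    a = np.asarray(a)
    b = np.asarray(b)
    # Distinct positions strictly inside the array, sorted ascending.
    candidates = np.arange(1, a.size)
    switches = np.sort(rng.choice(candidates, size=n_switches, replace=False))
    # For each index, count how many switch positions are <= that index.
    segment = np.searchsorted(switches, np.arange(a.size), side="right")
    # Even segments come from a, odd segments from b.
    return np.where(segment % 2 == 0, a, b), switches


rng = np.random.default_rng(0)
a = np.zeros(10, dtype=int)
b = np.ones(10, dtype=int)
woven, switches = weave(a, b, n_switches=3, rng=rng)
```

Because the length of switches only affects a searchsorted call, the variable-length part never needs an explicit typed loop here.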

# ¿ Feb 17, 2017 22:55

SelfOM: Jun 15, 2010

Rosalind posted:

My questions now are: 1. How the heck does anyone ever learn how to use this? I'm the first to admit that I'm no programming guru, but I have some experience with several different languages and actually getting started with Python makes absolutely no sense. Everything feels like the biggest clusterfuck of dependencies and "install X, Y, and Z to get W to work but to get X to work install A and B which require C to run on Windows." 2. Is there a simple, idiot-proof (because that's what I am apparently) guide to moving from R to Python for data analysis and hopefully eventually machine learning?

I do a lot of R and Python. For Windows I would use Anaconda: https://conda.io/miniconda.html It's not perfect, but I find myself using it for non-Python libraries as well. I don't recommend rpy2 at all for beginners; it's for importing R objects, and oftentimes you can just use subprocess and grab the output data instead. For R-equivalent tables, install pandas. Conda should handle the dependencies.

# ¿ Feb 17, 2017 23:11


SelfOM: Jun 15, 2010

Eela6 posted:

np.sort defaults to quicksort, also has mergesort and heapsort. For integers in a restricted range, counting sort should be fastest of all - if you're really burning for speed, you can use that. But this should be pretty fast.

Thanks, this is the solution I'm using at the moment.

# ¿ Feb 20, 2017 21:15
