  • Locked thread
Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Gothmog1065 posted:

What issue? I use variable1/2/3 all the time (or list_1, list_2, list3 and never had a problem).

There are actual technical issues, as in "you can't do that, it's a syntax error", and then there's "you shouldn't do that, it makes the code hard to read".

If you are writing a short function that obviously does something with two lists (because of the nature of whatever it does), then having variables named list1 and list2 is not a big deal. In any other situation you can give variables names like that, but you probably shouldn't.

I know this is tangential to the discussion at hand, but I want to make what I think is a useful stylistic point.


SurgicalOntologist
Jun 17, 2004

Down that road lies questions like "how can I iterate over integers in variable names" and eval.

OnceIWasAnOstrich
Jul 22, 2006

SurgicalOntologist posted:

Down that road lies questions like "how can I iterate over integers in variable names" and eval.

Who needs eval?

Python code:
for list_ in ({**globals(),**locals()}['list{}'.format(x)] for x in range(3)):
        print(list_)
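(The non-joke version of the same loop keeps the lists in a container instead of numbered globals; a sketch:)

Python code:
```python
# keep related lists in a dict (or a plain list of lists)
lists = {
    "list0": [1, 2],
    "list1": [3, 4],
    "list2": [5, 6],
}
for x in range(3):
    list_ = lists["list{}".format(x)]
    print(list_)
```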

Dominoes
Sep 20, 2007

Is there a clean way to update and merge timeseries in Pandas? For example, I have data_current, which is a timeseries of dates and values from October 1 - October 10; I have data_future, which is in the same format, from October 7 - October 20. I would like to merge the two into a timeseries from October 1 - October 20. The overlapping values would have to match, or raise an error. (Or you could specify a default, like the first series value, the second series value, NA etc.)

edit: Solved with updated_data = data_current.combine_first(data_future)
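Note that combine_first alone silently prefers the caller's values on overlaps; a sketch of adding the match-or-raise check on top of it (toy series standing in for data_current / data_future):

Python code:
```python
import pandas as pd

# toy stand-ins for the real data
data_current = pd.Series([1.0, 2.0, 3.0],
                         index=pd.date_range("2015-10-01", periods=3))
data_future = pd.Series([3.0, 4.0],
                        index=pd.date_range("2015-10-03", periods=2))

# verify that the overlapping timestamps agree before merging
overlap = data_current.index.intersection(data_future.index)
if not data_current.loc[overlap].equals(data_future.loc[overlap]):
    raise ValueError("overlapping values disagree")

updated_data = data_current.combine_first(data_future)
```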

Dominoes fucked around with this message at 21:31 on Sep 15, 2015

ahmeni
May 1, 2005

It's one continuous form where hardware and software function in perfect unison, creating a new generation of iPhone that's better by any measure.
Grimey Drawer

Hadlock posted:

OpenCV just got official Python 3 support in June of this year. There's enough big legacy projects out there that still require 2.7, and I haven't run in to anything that explicitly requires 3 yet. Async sounds interesting though.

Only thing I've found so far is JupyterHub, which is a nice shared implementation of the Jupyter/iPython notebook. I've got a pull request pending to fix domain username support and will be hassling my DevOps group shortly to join in the pandas / bokeh fun.

qntm
Jun 17, 2009
I'm working on a library for finite state machines. When you make a finite state machine you have to supply an alphabet of symbols, which in this case is a set of hashable values (they get used as keys in a dict). But I would also like there to be a special "anything else" value which you can add to your alphabet. So if your alphabet is {"a", "b", "c", "d", fsm.anything_else}, and you pass "e" into your FSM, then the library treats that as "anything else" and follows the transition you selected for that special value.

Problem is, what should I set the value fsm.anything_else to? I can't set it to "e" because that might be part of the user's chosen alphabet. I can't set it to None for the same reason. In fact the alphabet could in theory legitimately contain any hashable value. Users don't care, of course, unless there's a clash, because they'll use the symbol, not the value. Is the best approach really to just use a very large integer which nobody is ever likely to run into by accident?

Nippashish
Nov 2, 2005

Let me see you dance!

qntm posted:

Problem is, what should I set the value fsm.anything_else to?

Make an empty class called AnythingElse and set it to an instance of that.
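A sketch of that sentinel idea (the transition table and names are hypothetical):

Python code:
```python
class AnythingElse:
    """Empty marker class; an instance compares equal only to itself."""

anything_else = AnythingElse()

# hypothetical transition table for one FSM state
transitions = {"a": 1, "b": 2, anything_else: 99}

def next_state(symbol):
    # fall back to the sentinel for any symbol outside the alphabet
    return transitions[symbol if symbol in transitions else anything_else]
```

No user-supplied hashable value can ever collide with it, because equality falls back to identity.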

Kuule hain nussivan
Nov 27, 2008

I am trying to make a GUI with Python using Tkinter. This is also the first thing I've made in Python so I'm 100% sure this is a retarded question.

I have a main function which sets up the window and widgets for the UI. I have a button that calls a function, which injects text into a Field in the UI. If I do...

def foo():
    baa.insert(INSERT, "This is text")

def main():
    boo = Button(command=foo())
    baa = Text()

I get a complaint in foo that baa is an unresolved reference. If I do it the other way around, boo complains that foo is an unresolved reference. Is there some sort of forward declaration in Python, or is there something else I'm completely missing?

Asymmetrikon
Oct 30, 2009

I believe you're a big dork!
First of all, looking at the Tkinter docs, command is a callback, so it expects a function; you're passing it the result of calling foo.

I don't know Tkinter, so I don't know if there's a way of passing in variables at call time, but you can use a closure to make a function with baa bound correctly, like so:

code:
def mkFoo(text):
  def foo():
    text.insert(INSERT, "This is text")
  return foo

def main():
  baa = Text()
  foo = mkFoo(baa)
  boo = Button(command=foo)
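The function-vs-call distinction can be seen without Tkinter at all; a sketch with a stub Button (hypothetical) that just stores whatever `command` receives:

Python code:
```python
class FakeButton:
    """Stand-in for Tkinter's Button: just stores the callback."""
    def __init__(self, command=None):
        self.command = command

field = []  # stand-in for a Text widget

def make_foo(widget):
    def foo():
        widget.append("This is text")
    return foo

boo = FakeButton(command=make_foo(field))  # pass the function, don't call it
boo.command()  # Tkinter would invoke this when the button is clicked
```

With real Tkinter, `command=lambda: baa.insert(INSERT, "This is text")` is the usual shorthand for binding arguments at call time.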

tef
May 30, 2004

-> some l-system crap ->

qntm posted:

I'm working on a library for finite state machines. When you make a finite state machine you have to supply an alphabet of symbols, which in this case is a set of hashable values (they get used as keys in a dict). But I would also like there to be a special "anything else" value which you can add to your alphabet. So if your alphabet is {"a", "b", "c", "d", fsm.anything_else}, and you pass "e" into your FSM, then the library treats that as "anything else" and follows the transition you selected for that special value.

Problem is, what should I set the value fsm.anything_else to? I can't set it to "e" because that might be part of the user's chosen alphabet. I can't set it to None for the same reason. In fact the alphabet could in theory legitimately contain any hashable value. Users don't care, of course, unless there's a clash, because they'll use the symbol, not the value. Is the best approach really to just use a very large integer which nobody is ever likely to run into by accident?

foo = object()

>>> foo = object()
>>> bar = object()
>>> foo == bar
False
>>> foo is bar
False
>>> any(foo == x for x in [1,2,3,"a","b","c", None, bar])
False

Kuule hain nussivan
Nov 27, 2008

Asymmetrikon posted:

First of all, looking at the Tkinter docs, command is a callback, so it expects a function; you're passing it the result of calling foo.

I don't know Tkinter, so I don't know if there's a way of passing in variables at call time, but you can use a closure to make a function with baa bound correctly, like so:

code:
def mkFoo(text):
  def foo():
    text.insert(INSERT, "This is text")
  return foo

def main():
  baa = Text()
  foo = mkFoo(baa)
  boo = Button(command=foo)
Yeah, that was a mistake on my part. Luckily, it looks like my original problem was just a fart with the IDE, since moving the main method to the top didn't cause any problems the second time. So no need for closures. Now my only problem is that the sqlite query seems to have trouble with an input string that's more than 1 character long.

Edit: Nevermind, got it to work. Changing the parameter to a list rather than a single parameter did the trick.
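For reference, that sqlite3 symptom is most likely the classic string-as-parameter-sequence mistake; a minimal sketch:

Python code:
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT)")

# a bare string is itself a sequence of characters, so sqlite3 sees
# five bindings for one placeholder and complains
got_error = False
try:
    conn.execute("INSERT INTO words VALUES (?)", "hello")
except sqlite3.ProgrammingError:
    got_error = True

# wrap the parameter in a tuple or list instead
conn.execute("INSERT INTO words VALUES (?)", ("hello",))
rows = conn.execute("SELECT word FROM words").fetchall()
```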

Kuule hain nussivan fucked around with this message at 17:20 on Sep 17, 2015

Gothmog1065
May 14, 2009
Okay, more of a theoretical question than a straight code question, but I'm still working on the Skynet game I started ([url=https://gist.github.com/gothmog1065/449da65f1320fb6390c1]Here is my ultimate revised code[/url] and Here is the game itself).

Now if you just play the "triple star" scenario, you'll see there's 3 gateways the "enemy" can go to. One of the achievements is to complete the game (trap the enemy so it can't get to any node) with 50 links available. I've gotten it close, but my current code leaves 41 links left, which means you have to trap the enemy on one of the stars and not let it out. My question is how do you predict where the enemy is going well enough to accomplish that? Playing out with my current code, it leaves three "wasted" link breakages, the rest stop them directly at the gateway.

I guess I'm trying to problem solve it now.

One way I could think of is to break certain links (the links on the rings that are only between two nodes) at the ends of the "stars", and when the virus goes around a star and has to come back, break the other side so it's trapped on the outer ring of a star, but I'm not sure how I'd go about determining where to break.
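One way to quantify "where the enemy is going" is a breadth-first search from the agent's node: cut next to whichever gateway it can reach in the fewest hops. A sketch over a hypothetical toy graph (node numbers made up, not from the actual game map):

Python code:
```python
from collections import deque

def hops_from(links, start):
    """Shortest hop count from start to every node over undirected links."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

# toy graph: agent at node 0, gateways at nodes 3 and 4
links = [(0, 1), (1, 2), (2, 3), (1, 4)]
dist = hops_from(links, 0)
nearest_gateway = min([3, 4], key=dist.get)
```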

FoiledAgain
May 6, 2007

I'm experiencing some confusion/frustration with PyQt.

I have a QDialogBox that returns a Corpus object, like this:

code:
dialog = CorpusLoadDialog(self, self.settings)
result = dialog.exec_()
if result:
    corpus = dialog.corpus
The Corpus has an attribute .inventory. This is an Inventory object, which itself is a subclass of QAbstractTableModel. But when I try to call any inherited methods, I get a RuntimeError, e.g.

code:
corpus.inventory.headerDataChanged.emit(1,1,1)
RuntimeError: super-class __init__() of type Inventory was never called
As far as I can tell, Python is just lying to me here. I really do call super(), and the call works. Relevant code looks like this:

code:
class Corpus(object):

    def __init__(self):
        self.inventory = Inventory()
        self.inventory.headerDataChanged.emit(1,1,1) #does not raise an Error

class Inventory(QAbstractTableModel):
    def __init__(self):
        super().__init__() #Here's the crucial line!
As far as I can tell, the super() call works, because I can call Inventory's inherited methods back up inside the Corpus.__init__() method without a problem. However, as soon as the Corpus gets returned from the QDialogBox, all bets are off, as I described at the beginning of the post. Trying to call an inherited method raises the RuntimeError claiming I never called super().

How can the super() call get "nullified" like this? Am I using super() incorrectly?

I should also mention that the Inventory class has .data(), .headerData(), .columnCount(), and .rowCount(), i.e. all the methods you're required to implement when you subclass QAbstractTableModel. If I call super() a second time after getting the Corpus back from the QDialogBox, then things partially work (but not completely, because the QTableView that gets this model refuses to display headers). In any case, calling super() twice feels really suspicious, and I don't think it's an acceptable work-around.

tef
May 30, 2004

-> some l-system crap ->

FoiledAgain posted:

How can the super() call get "nullified" like this? Am I using super() incorrectly?

super takes args in 2.x: https://docs.python.org/2/library/functions.html#super

in the current version of python, you can use the no-arg form: https://docs.python.org/3/library/functions.html#super "The zero argument form only works inside a class definition, as the compiler fills in the necessary details to correctly retrieve the class being defined, as well as accessing the current instance for ordinary methods."

my advice is to do SuperClass.__init__(self) explicitly and just avoid super altogether unless you're doing multiple inheritance

chutwig
May 28, 2001

BURLAP SATCHEL OF CRACKERJACKS

Are you working in Python 2 or 3? You're manually inheriting from object which implies Python 2, but calling super() without arguments which implies (really requires) Python 3.

My best guess is there's some weird inheritance diamond stuff going on. What happens if you change the super call to super(Inventory, self).__init__()?

FoiledAgain
May 6, 2007

chutwig posted:

Are you working in Python 2 or 3? You're manually inheriting from object which implies Python 2, but calling super() without arguments which implies (really requires) Python 3.

My best guess is there's some weird inheritance diamond stuff going on. What happens if you change the super call to super(Inventory, self).__init__()?

I'm using 3.4, but I started on 2.7, so I'm in the habit of inheriting from object.
super(Inventory, self).__init__() has the same effect as plain super()

tef posted:

my advice is to do SuperClass.__init__(self) explicitly and just avoid super altogether unless you're doing multiple inheritance

Calling QAbstractTableModel.__init__(self) doesn't work either. Same RuntimeError.

FoiledAgain fucked around with this message at 21:55 on Sep 20, 2015

FoiledAgain
May 6, 2007

I think I'm on to something. I just discovered that a pickled copy of the Corpus is saved before the DialogBox returns. A little googling suggests that you can't reliably pickle some Qt objects (possibly including QAbstractTableModel). Does anyone know anything about this?
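That diagnosis would fit: unpickling the object can skip the Qt-side initialisation, which matches the "super-class __init__() was never called" error. A common workaround, sketched here with a threading.Lock standing in for the unpicklable Qt model (names hypothetical), is to drop the attribute from the pickled state and rebuild it on load:

Python code:
```python
import pickle
import threading

class Corpus:
    def __init__(self):
        self.words = ["a", "b"]
        # threading.Lock() stands in for an unpicklable Qt object here
        self.lock = threading.Lock()

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["lock"]              # leave the unpicklable part out
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.lock = threading.Lock()   # rebuild it fresh on unpickle

restored = pickle.loads(pickle.dumps(Corpus()))
```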

FoiledAgain
May 6, 2007

I fixed the problem I was describing in my last two posts, but in the process came across another weird Qt thing. When trying to display my TableView headers, I initially had this code:

code:
def headerData(self, row_or_col, orientation, role=None):
        try:
            if orientation == Qt.Horizontal:
                return self.column_data[row_or_col]
            elif orientation == Qt.Vertical:
                return self.column_data[row_or_col]
        except KeyError:
            return QVariant()
But no headers would show up. I changed the code very minimally to include a comparison, and suddenly everything works:

code:
def headerData(self, row_or_col, orientation, role=None):
        try:
            if orientation == Qt.Horizontal and role == Qt.DisplayRole:
                return self.column_data[row_or_col]
            elif orientation == Qt.Vertical and role == Qt.DisplayRole:
                return self.column_data[row_or_col]
        except KeyError:
            return QVariant()
Why is this? How come checking for Qt.DisplayRole makes the headers visible? It's not like I'm actually setting the Role, I'm just doing a comparison.

hooah
Feb 6, 2006
WTF?
I'm trying to wrap my head around DEAP (documentation), and ran across an error I don't really understand when trying to do a simple onemax genetic algorithm. Here's my setup code:
Python code:
import random

from deap import tools
from deap import base
from deap import creator
from deap import algorithms


def evaluate(individual):
    return sum(individual),


def select(population):
    return population


def randBinList(n):
    return [random.randint(0, 1) for _ in range(1, n+1)]

IND_SIZE = 20
POP_SIZE = 100

creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("attr_bin", randBinList)
#toolbox.register("attr_float", random.random)
toolbox.register("individual", tools.initRepeat, creator.Individual,
                 toolbox.attr_bin, n=IND_SIZE)
toolbox.register("mate", tools.cxOnePoint)
toolbox.register("mutate", tools.mutFlipBit, indpb=0.2)
toolbox.register("select", select)
toolbox.register("evaluate", evaluate)

pop = [toolbox.individual() for _ in range(POP_SIZE)]
When I run this, I get a TypeError on the last line: "randBinList() missing 1 required positional argument: 'n'". I get that randBinList isn't getting its argument, but I don't understand why. I thought the initRepeat should take care of that?

chutwig
May 28, 2001

BURLAP SATCHEL OF CRACKERJACKS

hooah posted:

When I run this, I get a TypeError on the last line: "randBinList() missing 1 required positional argument: 'n'". I get that randBinList isn't getting its argument, but I don't understand why. I thought the initRepeat should take care of that?
It's because when you supply n=IND_SIZE, that's going to get passed as an argument to tools.initRepeat and not to randBinList. If that register were turned into a method invocation, it would look like
Python code:
tools.initRepeat(creator.Individual, toolbox.attr_bin, n=IND_SIZE)
The callable that you supply to tools.initRepeat looks like it needs to take no arguments or supply defaults for any arguments. If you register randBinList with an argument for n, it works as expected:
Python code:
toolbox.register("attr_bin", randBinList, n=IND_SIZE)

nonathlon
Jul 9, 2004
And yet, somehow, now it's my fault ...
So what do people use for testing within IPython notebooks? Do people do testing within IPython notebooks? I use them for laying out scientific analyses, so it's fairly important to "get things right". On one hand, you could argue that anything substantial should be rolled out into an external module with its own testing, but with science we're always trying new analyses and bespoke / ad hoc techniques, so there's always going to be something new in there.

hooah
Feb 6, 2006
WTF?

chutwig posted:

It's because when you supply n=IND_SIZE, that's going to get passed as an argument to tools.initRepeat and not to randBinList. If that register were turned into a method invocation, it would look like
Python code:
tools.initRepeat(creator.Individual, toolbox.attr_bin, n=IND_SIZE)
The callable that you supply to tools.initRepeat looks like it needs to take no arguments or supply defaults for any arguments. If you register randBinList with an argument for n, it works as expected:
Python code:
toolbox.register("attr_bin", randBinList, n=IND_SIZE)

That makes sense, but after changing the attr_bin registration (and leaving the individual registration as is), the framework seems to be calling randBinList twice, so I get a population that has individuals which are made up of 20 lists of 20 1s/0s apiece. I changed the individual's register to take n=1 and left the attr_bin register at 20, but that made each individual a list that contains a single list which then contains the 20 binary elements.
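The nesting falls out of what initRepeat does; a simplified reimplementation (not DEAP's actual source) makes it visible:

Python code:
```python
import random

def init_repeat(container, func, n):
    """Essentially what DEAP's tools.initRepeat does (simplified)."""
    return container(func() for _ in range(n))

def rand_bin_list(n):
    return [random.randint(0, 1) for _ in range(n)]

# registering rand_bin_list with n=20 *and* repeating 20 times nests lists:
nested = init_repeat(list, lambda: rand_bin_list(20), 20)

# generate one attribute per call and let initRepeat supply the repetition:
individual = init_repeat(list, lambda: random.randint(0, 1), 20)
```

In DEAP terms, the usual fix is an attribute generator that returns one value per call, e.g. `toolbox.register("attr_bit", random.randint, 0, 1)`, keeping `n=IND_SIZE` on the individual registration; worth double-checking against the DEAP docs.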

Dominoes
Sep 20, 2007

I'm looking for advice on cleaning a pandas timeseries. I currently have financial data in minute increments. Sometimes the data includes every minute, sometimes it goes in increments of a few minutes at a time. The cleaned data needs to include an entry for every minute, during set hours on set days.

I can populate data for the missing minutes using this (as NaN, backfill, forward fill, etc.):
Python code:
data.asfreq('1Min')
However, this also fills the large chunks of time I'm not interested in.

I can then filter for the relevant times and dates using this:
Python code:
for timestamp in data.index:
    if not in_range(timestamp):
        data.drop(timestamp, inplace=True)
Where in_range is a function that evaluates whether each time is in the range. However, this currently takes an unacceptably large amount of time to run through the data. I could try to optimize my in_range function, but I suspect there's a cleaner way to do this.

The Pandas docs on Timeseries' and Missing data seem to point in the right direction, but I'm unable to find a built-in solution.
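A vectorized sketch, assuming pandas' between_time and dayofweek cover the "set hours on set days" requirement (toy data below):

Python code:
```python
import numpy as np
import pandas as pd

# toy minutely data spanning a full week
idx = pd.date_range("2015-09-21", "2015-09-27", freq="1min")
data = pd.Series(np.arange(len(idx), dtype=float), index=idx)

minutely = data.asfreq("1min")                      # fill in missing minutes
in_hours = minutely.between_time("09:30", "16:00")  # keep the set hours
on_days = in_hours[in_hours.index.dayofweek < 5]    # keep Mon-Fri only

earliest = min(on_days.index.time)
latest = max(on_days.index.time)
```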

Dominoes fucked around with this message at 17:03 on Sep 23, 2015

Emacs Headroom
Aug 2, 2003
I'd probably do it a super lazy way, like

Python code:
times = [start_t + datetime.timedelta(minutes=i) for i in range(something)]
full_series = pd.Series(np.zeros(len(times)), index=times)
full_series[partial_series.index] = partial_series
edit: er, just replace 'times' with the minutes of the times you do actually care about and it should work
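For what it's worth, reindex does the same fill-to-a-target-index step in one call; a sketch with hypothetical names:

Python code:
```python
import pandas as pd

# 'wanted' = the minutes you actually care about (hypothetical)
wanted = pd.date_range("2015-09-24 09:30", periods=5, freq="1min")
partial_series = pd.Series([1.0, 2.0], index=wanted[[0, 3]])

full_series = partial_series.reindex(wanted)  # NaN where no data existed
filled = full_series.fillna(0.0)              # or .ffill() / .bfill()
```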

Emacs Headroom fucked around with this message at 06:35 on Sep 24, 2015

QuarkJets
Sep 8, 2008

outlier posted:

So what do people use for testing within IPython notebooks? Do people do testing within IPython notebooks? I use them for laying out scientific analyses, so it's fairly important to "get things right". On one hand, you could argue that anything substantial should be rolled out into an external module with its own testing, but with science we're always trying new analyses and bespoke / ad hoc techniques, so there's always going to be something new in there.

Just use git or some other version control software. I don't see a lot of advantage to using IPython notebooks for scientific analyses

ahmeni
May 1, 2005

It's one continuous form where hardware and software function in perfect unison, creating a new generation of iPhone that's better by any measure.
Grimey Drawer

outlier posted:

So what do people use for testing within IPython notebooks? Do people do testing within IPython notebooks? I use them for laying out scientific analyses, so it's fairly important to "get things right". On one hand, you could argue that anything substantial should be rolled out into an external module with its own testing, but with science we're always trying new analyses and bespoke / ad hoc techniques, so there's always going to be something new in there.

I'm always for more testing when possible. I've seen one or two people roll IPython Nose into their notebooks as a collapsed cell and it seems to be a nice and tidy way to keep things tested.

QuarkJets posted:

Just use git or some other version control software. I don't see a lot of advantage to using IPython notebooks for scientific analyses

You're right in that IPython is not version control, but it's also great when used with it. There is definitely an advantage to distributing your scientific analyses with built in data display via nice plotting systems like Bokeh. Plenty of examples in A Gallery Of Interesting IPython Notebooks.

QuarkJets
Sep 8, 2008

I never liked using Mathematica notebooks because of the whole fakeness of having to get a perfect-working state and then copying it all into a new notebook to make it look nice, and that's on top of a bunch of manual formatting. The whole thing feels hokey. At that point it feels like you'd save time just saving the output that you need and writing a Latex document. I imagine that IPython notebooks suffer from the same problems, but I guess I've never used them so I don't really know.

I mean I guess it works well if you're talking about < 100 lines of code, but that seems more relevant to a class room than to a lab. Unless the objective is to just use IPython notebooks for your frontend presentation on top of a bunch of backend code, but again I don't really understand the point of that. I'm not saying it's bad or stupid or whatever, I just don't understand it

Cingulate
Oct 23, 2012

by Fluffdaddy
At the point I'm at, doing science in Python without using iPython notebooks seems like using Python without list comprehensions. Sure, you still got a useful package, but why??

nonathlon
Jul 9, 2004
And yet, somehow, now it's my fault ...

QuarkJets posted:

I never liked using Mathematica notebooks because of the whole fakeness of having to get a perfect-working state and then copying it all into a new notebook to make it look nice, and that's on top of a bunch of manual formatting. The whole thing feels hokey. At that point it feels like you'd save time just saving the output that you need and writing a Latex document. I imagine that IPython notebooks suffer from the same problems, but I guess I've never used them so I don't really know.

I mean I guess it works well if you're talking about < 100 lines of code, but that seems more relevant to a class room than to a lab. Unless the objective is to just use IPython notebooks for your frontend presentation on top of a bunch of backend code, but again I don't really understand the point of that. I'm not saying it's bad or stupid or whatever, I just don't understand it

I use IPython notebooks for science & analysis a lot ... and I'm in two minds about it. It's a very good way to document your analysis workflow and show results to colleagues. On the other hand, it's not the best development environment and once you start doing major complex code in the notebook, the cracks start to show. I really need to use that function I wrote in the other notebook ... uh, better cut and paste ...

QuarkJets
Sep 8, 2008

e: ^^^ Okay, yeah, that's basically how I felt about using Mathematica notebooks. It feels good for small analyses but messy for more complicated projects. So if I'm just going to analyze an output file real quick then that makes sense to put in a notebook, but if I need to generate a bunch of plots for a .tex paper that are all sort of formatted in the same way then it'd actually be faster to just write a function to do that

Cingulate posted:

At the point I'm at, doing science in Python without using iPython notebooks seems like using Python without list comprehensions. Sure, you still got a useful package, but why??

Can you elaborate, maybe? I mean I get that it's cool to be able to make plots and notes in the same window, and that you can make it look nice with some formatting. So it's basically useful for just the data analysis portion of a project, yeah?

In my line of work I'm either writing code that will run on a supercomputer, code that's complicated enough to be put in its own module, or both. A notebook isn't really a good IDE, and I have PyCharm for that anyway. I could use a notebook during analysis, I guess, but the rest of my workflow is already in PyCharm... so it feels easier to just write a script in PyCharm that outputs content directly to latex. What am I missing?

QuarkJets fucked around with this message at 10:04 on Sep 24, 2015

Cingulate
Oct 23, 2012

by Fluffdaddy
A lot of what scientists do is share analyses - at every stage, including preliminary stages. A notebook is self-documenting, can be directly interacted with by the receiver, and can be easily annotated. And a lot of data-science code isn't really so complicated that you need a real IDE.

Sure, there are steps that might better be done in a real IDE, but eventually, when you HAVE that module, you can still import it in the notebook. And if you HAVE crunched the numbers, you can still visualize (and document) it in the notebook

For me, another enormous benefit is how easily I can use our servers to crunch numbers and plot them all over a spotty SSH connection - I do a lot of work in airports, on trains, etc. Sure, you can run your regular Python session in e.g. a screen. But that way you can't easily plot the results. It's trivial to reconnect to a notebook, look at previous results and pick up right where you left off.

In R, an analogous workflow exists with the rmarkdown/knitr packages for a reason - it's simply a great way to do data science.

Cingulate
Oct 23, 2012

by Fluffdaddy

QuarkJets posted:

e: ^^^ Okay, yeah, that's basically how I felt about using Mathematica notebooks. It feels good for small analyses but messy for more complicated projects. So if I'm just going to analyze an output file real quick then that makes sense to put in a notebook, but if I need to generate a bunch of plots for a .tex paper that are all sort of formatted in the same way then it'd actually be faster to just write a function to do that
Or just directly share the notebook.

Dumlefudge
Feb 25, 2013
I am fetching data from the API on a number of devices, each of which have a list of network nodes that they are associated with, in the following format
code:
# response 1
{
	# other fields omitted
	'nodes': [
		{
			'ip': '1.2.3.4'
			# more values here
		},
		{
			'ip':  '2.3.4.5'
		}
	]
}

# response 2
{
	'nodes': [
		{
			'ip': '1.2.3.4'
		}
	]
}
I want to take all the entries in 'nodes' and combine them into a single list, where no duplicates exist (items are considered duplicates if their 'ip' values are equal).

My first thought was to iterate over the list being built, as I inspect the content of each incoming dictionary, checking for a match.
However, that doesn't seem like the right way to go about it, since I end up iterating over a constantly-growing list.
Is there a cleaner way to approach this problem?

nonathlon
Jul 9, 2004
And yet, somehow, now it's my fault ...

ahmeni posted:

I'm always for more testing when possible. I've seen one or two people roll IPython Nose into their notebooks as a collapsed cell and it seems to be a nice and tidy way to keep things tested.

That is so cool and so useful. Thanks!

'ST
Jul 24, 2003

"The world has nothing to fear from military ambition in our Government."

Dumlefudge posted:

My first thought was to iterate over the list being built, as I inspect the content of each incoming dictionary, checking for a match.
However, that doesn't seem like the right way to go about it, since I end up iterating over a constantly-growing list.
Is there a cleaner way to approach this problem?
Just add each value for "ip" into a running set: https://docs.python.org/3/library/stdtypes.html?highlight=set#set-types-set-frozenset

Something like
Python code:
node_values = set()
for node_obj in response['nodes']:
    node_values.add(node_obj['ip'])
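If the whole node dicts need to survive (not just the IPs), a dict keyed by 'ip' dedupes in one pass; a sketch with a hypothetical responses list:

Python code:
```python
responses = [
    {"nodes": [{"ip": "1.2.3.4", "port": 80}, {"ip": "2.3.4.5"}]},
    {"nodes": [{"ip": "1.2.3.4"}]},
]

unique = {}
for response in responses:
    for node in response["nodes"]:
        unique.setdefault(node["ip"], node)   # first occurrence wins

merged = list(unique.values())
```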

Emacs Headroom
Aug 2, 2003

outlier posted:

I use IPython notebooks for science & analysis a lot ... and I'm in two minds about it. It's a very good way to document your analysis workflow and show results to colleagues. On the other hand, it's not the best development environment and once you start doing major complex code in the notebook, the cracks start to show. I really need to use that function I wrote in the other notebook ... uh, better cut and paste ...

As my analysis / modeling evolves, I end up moving code from inside the notebook to a library in Python that gets imported into the notebook (and re-used by other notebooks). The library is the time to add unit tests, write good docstrings, etc.

If you're doing data engineering in industry, the library can also be a good reference to base your streaming / hadoop / spark / whatever version on as well.

Proteus Jones
Feb 28, 2013



Emacs Headroom posted:

As my analysis / modeling evolves, I end up moving code from inside the notebook to a library in Python that gets imported into the notebook (and re-used by other notebooks). The library is the time to add unit tests, write good docstrings, etc.

If you're doing data engineering in industry, the library can also be a good reference to base your streaming / hadoop / spark / whatever version on as well.

I don't use notebooks (really I do different stuff with my programs), but I've found creating libraries has been very useful to me, since there have been some classes I've used over and over again in different projects. I've found that while it adds a little more time to planning and a bit more effort programming and documenting when creating it for the first time, it's saved me oh so much time as opposed to cutting and pasting code and hammering it to fit.

**kwargs are my bestest friends.

QuarkJets
Sep 8, 2008

Cingulate posted:

Or just directly share the notebook.

That doesn't really work with anyone who's over 50. Old people want a white paper. And it certainly doesn't work if I want to publish a journal article

e: The notebook approach feels like it's the "code until it works" approach that scientific code is so ill-known for. Would you disagree?

QuarkJets fucked around with this message at 19:36 on Sep 24, 2015

Cingulate
Oct 23, 2012

by Fluffdaddy

QuarkJets posted:

That doesn't really work with anyone who's over 50. Old people want a white paper. And it certainly doesn't work if I want to publish a journal article
Yes, but so does everything that's not "a word document named manuscript_final-version_b_2014_reallyfinal_mkII_d.docx".

And you can easily export the notebook to a PDF.

QuarkJets posted:

e: The notebook approach feels like it's the "code until it works" approach that scientific code is so ill-known for. Would you disagree?
Big difference: everyone can see the (potentially bad) code you used to get to the results.

Cingulate fucked around with this message at 01:08 on Sep 25, 2015


pmchem
Jan 22, 2010


Cingulate posted:

Big difference: everyone can see the (potentially bad) code you used to get to the results.

I agree with QJets on this issue. If you work in a large team with diverse ages then trying to get people to adopt, or even look at, ipython nb's is a hopeless endeavor. Other programmers that are not primarily Python guys can open up .py source in vim or emacs but not ipynb files. Ipynb only really works when you're in a herd of other people that also use it. Even then, I wouldn't want to use it for large scientific projects when work on a remote machine will be done with regular py and not ipynb.
