Python information and short questions megathread.

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Python information and short questions megathread.

brosmike: Jun 26, 2009

Jam2 posted:

In what sort of environment do you guys code Python? I've been juggling notepad++ and the Python shell, pasting code across and evaluating it to see when it breaks. Is there a better way? I watched a Google lecture on Python. The speaker utilized Terminal and executed Python in that way. How would I do this in Windows? Should I work in OS X?

My editor of choice is currently PyCharm, but for smaller projects I usually switch between a python command line and notepad++. Note that you generally don't have to copy-paste code - just open the python command line from wherever your script is and you can do "import myscript" from within the python shell. You can then do "reload(myscript)" if you make changes and don't want to restart the whole shell. (On Python 3, reload is no longer a builtin: use importlib.reload(myscript) on 3.4 and later, or imp.reload(myscript) on earlier 3.x versions.)

If you want to do this, make sure that the "normal functionality" of your script is wrapped in a block that looks like this:

code:

if __name__ == '__main__':
    #your main code

If you do this, your script will still do what you want when you run it directly, but it won't run the main code when you import it into the shell.
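As a rough sketch of that layout (the file name myscript.py and the functions shout and main are made up for illustration), the script might look like this:

```python
# myscript.py (hypothetical file name)

def shout(text):
    """Helper you might want to try out from the interactive shell."""
    return text.upper() + "!"


def main():
    # The script's "normal functionality" goes here.
    print(shout("hello"))


if __name__ == "__main__":
    # True only when the file is run directly (python myscript.py),
    # not when it is imported with "import myscript".
    main()
```

From the shell you could then do "import myscript" and call myscript.shout("test") without main() running.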

Jam2 posted:

How do I import a text file and go about parsing the contents? Do I need to import modules to do this? I want to be able to locate certain patterns of data, create lists, strings, and assign values to variables.

You can import modules to make parsing easier if the files are formatted in particular ways (for example, there's a separate module for making it easy to read CSV files, and there are several libraries for XML parsing), but if you just want to access the text of a file line by line, the easiest way is:

code:

with open(my_filename) as my_file:
    for line in my_file:
        #line is a string, do what you want with it

This assumes that you're using a python version new enough for the with statement: it is available by default from Python 2.6, and in 2.5 via "from __future__ import with_statement".
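For the "locate certain patterns" part, here is a minimal sketch, assuming Python 3 and a made-up file of name=value lines. The file contents, the regular expression and the variable names are all illustrative, not anything from your data:

```python
import os
import re
import tempfile

# Create a small sample file so the example is self-contained.
sample = "width=10\n# a comment line\nheight=25\nnot a setting\n"
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    my_filename = tmp.name

pattern = re.compile(r"^(\w+)=(\d+)$")  # lines that look like name=123
settings = {}   # parsed name -> integer value
skipped = []    # lines that did not match the pattern

with open(my_filename) as my_file:
    for line in my_file:
        line = line.strip()  # drop the trailing newline
        match = pattern.match(line)
        if match:
            settings[match.group(1)] = int(match.group(2))
        else:
            skipped.append(line)

os.remove(my_filename)  # clean up the sample file
print(settings)
```

The re module is only needed because this example looks for a pattern; plain string methods like split() and startswith() are often enough.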

Jam2 posted:

I've been using Python 3. Is this a bad idea? Should I stick with a version that's been in production for a while? The Facebook puzzle robot accepts Python 2.5.2. A visit to the Python 2.5.2 page informs that this version has been replaced by 2.5.5. What is a newbie to do?

Python 2.5.5 is at this point pretty old, and you should avoid it in favor of 3.1 or at least 2.7 where possible. Obviously there are some situations - apparently, like the Facebook puzzle robot - where you don't have control over this. The general consensus is to use 3.1 whenever you can, and 2.7 when you need compatibility with a non-python-3-compatible library (that is, most of them).

# ¿ Dec 20, 2010 02:01

brosmike: Jun 26, 2009

zarg posted:

Awesome. Thanks for your help on the stuff above this! I was already reading the python docs stuff. Tons of nice concise info there; it's great.

For the example I quote above, what will happen if myDictionary is a list-of-lists like I'm aiming to do? Will it create a new list with the same values, simply sorted? Or does it take advantage of the mutability of lists and just shuffle it in place? Again, doesn't matter in the least for me. I'm just a curious dude.

edit: To clarify, I understand that tuple is a different data type from list. When I say "the same values" I just mean the same terms in a different order.

There are a few things going on here; in general, the documentation is the best place to get answers like this. In Python 2, dict.items() is going to give you a shallow copy of the dict's list of key-value pairs (in Python 3 it returns a view instead, and list(my_dict.items()) gives you the copy). Since it's a copy, modifying the list returned by items() (say, by appending a new (key,value) pair) will not affect the dict in any way. But since it's a shallow copy, modifying one of the values of the dict (which are also lists) through the returned value of items() (say, by appending something to the list at my_dict_items[0][1]) will affect the contents of the dict.

The sorted() call has the same effect - it generates a new shallow copy of the original list and sorts that, it doesn't sort the original list in place. If you want to sort in-place, you would want to use my_dict_items.sort() instead of sorted(my_dict_items).
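A small sketch of the shallow-copy behaviour, with made-up data. It assumes Python 3, where items() returns a view rather than a list, so the list() call is what makes the copy:

```python
my_dict = {"b": [3, 1], "a": [2]}

# list() gives a new list of (key, value) tuples; the value lists are shared.
my_dict_items = list(my_dict.items())

# Changing the copied list itself does not touch the dict...
my_dict_items.append(("c", [9]))

# ...but mutating a value reached through the copy does.
first_key, first_value = my_dict_items[0]
first_value.append(7)

# sorted() returns a new list and leaves its argument alone.
in_order = sorted(my_dict_items)

# list.sort() reorders the existing list in place and returns None.
my_dict_items.sort()

print(my_dict)
print(in_order)
```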

# ¿ Mar 17, 2011 12:26

brosmike: Jun 26, 2009

In python as in most other languages, the "or" operator just operates on two boolean expressions. Let's take a look at your original if statement with some clarifying parentheses to make it clear why this matters to you. When you tell python

code:

if counter >= len(list1) or len(list2):

...it sees...

code:

if (counter >= len(list1)) or (len(list2)):

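Since len(list2) is nonzero (and therefore true) for any non-empty list, that condition passes whenever list2 has anything in it. To do what you want, you have a few options (including the sort of clunky one you mentioned already). You could do: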

code:

if counter >= len(list1) or counter >= len(list2):

or you could do

code:

if counter >= min(len(list1), len(list2)):

Edit: beaten like gently caress, and also yes this is checking greater than
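To see the difference concretely, here is a small sketch with made-up values where the original condition and the intended one disagree:

```python
list1 = [1, 2, 3, 4, 5]
list2 = [1, 2]
counter = 0

# Parsed as (counter >= len(list1)) or (len(list2)).
# len(list2) is 2, which is truthy, so the whole thing is truthy
# even though counter is smaller than both lengths.
original = bool(counter >= len(list1) or len(list2))

# The two corrected forms agree with each other.
explicit = counter >= len(list1) or counter >= len(list2)
with_min = counter >= min(len(list1), len(list2))

print(original, explicit, with_min)
```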

# ¿ Mar 28, 2011 20:15

brosmike: Jun 26, 2009

Hughmoris posted:

Ok, I'm making progress but I need to write a loop that will continue to read each line and do something with that line, until there are no more lines. What is a good way to write a loop to break when it reaches the end of the text file?

Python makes this very easy for you:

code:

with open("my_file.txt") as myfile:
    for line in myfile:
        #do whatever

# ¿ Mar 29, 2011 03:06

brosmike: Jun 26, 2009

Python also lets you just do

code:

'a'.isalpha()

# ¿ Mar 29, 2011 14:29

brosmike: Jun 26, 2009

MeramJert posted:

I have a question that's more a general algorithm question than a Python question. So if I had a very large grid, and randomly moving particles were spread out all across it, what would be a good way to select 1 particle and tell which other particles are within a certain distance of the selected one? The only way I can think of involves checking the position of every particle in the whole system to see if they are in range, but there has to be a better way to do it.

One idea might be to keep track of a mapping from chunks of your grid to individual particles. So say your grid went from (0,0) to (100,100) - you might keep a mapping from 10x10 squares to a set of the particles contained within each square. Keeping track of this would only require a constant-time addition per particle to your physics tick process, but if your particles are generally not clustered together, it would significantly reduce the number of particles you need to check (just check those from the 10x10 squares near the target).

You can be cleverer than this by allowing for variable-sized partitions (i.e., not all of your squares are 10x10), using smaller ones where particles are more densely gathered. This is much harder, especially if the "dense areas" move over time, but might be more effective if you have particles clustered together all the time.
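Here is a minimal sketch of the fixed-size version of that idea. The cell size, the particle ids and the function names are all assumptions for illustration, and it rebuilds the mapping from scratch rather than updating it per tick:

```python
from collections import defaultdict

CELL = 10  # side length of each square chunk (assumed value)


def cell_of(x, y):
    """Map a position to the integer coordinates of its chunk."""
    return (int(x // CELL), int(y // CELL))


def build_grid(particles):
    """Map each chunk to the set of particle ids inside it."""
    grid = defaultdict(set)
    for pid, (x, y) in particles.items():
        grid[cell_of(x, y)].add(pid)
    return grid


def neighbours(pid, particles, grid, radius):
    """Return ids of particles within radius of particle pid."""
    px, py = particles[pid]
    cx, cy = cell_of(px, py)
    reach = int(radius // CELL) + 1  # how many chunks out we must look
    found = set()
    for gx in range(cx - reach, cx + reach + 1):
        for gy in range(cy - reach, cy + reach + 1):
            for other in grid.get((gx, gy), ()):
                if other == pid:
                    continue
                ox, oy = particles[other]
                if (ox - px) ** 2 + (oy - py) ** 2 <= radius ** 2:
                    found.add(other)
    return found


particles = {"a": (5, 5), "b": (12, 6), "c": (95, 95), "d": (5, 14)}
grid = build_grid(particles)
near_a = neighbours("a", particles, grid, radius=10)
print(sorted(near_a))
```

In a real simulation you would probably update the mapping as each particle moves (remove its id from the old chunk's set, add it to the new one) instead of calling build_grid every tick.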

# ¿ May 13, 2011 03:17

brosmike: Jun 26, 2009

So, to clarify, you want to REQUIRE callers to use keyword arguments for every parameter, regardless of whether it's optional? That is, something like:

code:

#valid:
mygeneratedfunction(myfirstparam=a, mysecondparam=b)

#not valid:
mygeneratedfunction(a, b)

If that's the case, you could do something like:

code:

def generate_function(param_names, param_optional, body_function):
    def impl(**kwargs):
        num_used_params = 0
        params = []
        # Find the parameters we want to pass on in the order we want to pass them
        for param, optional in zip(param_names, param_optional):
            if param in kwargs:
                num_used_params += 1
                params.append(kwargs[param])
            elif optional:
                params.append(None)
            else:
                raise ValueError("Parameter {} is required".format(param))
                    
        # Make sure there weren't any we don't want
        if num_used_params != len(kwargs):
            invalid_params = [p for p in kwargs.keys() if p not in param_names]
            raise ValueError("The following parameters are invalid: {!s}".format(invalid_params))
            
        #Pass it on to the real implementation
        return body_function(*params)
    return impl

Disclaimer: I haven't tested this. But it should give you the right general idea.

Edit: This doesn't add your function to the global namespace on its own, of course - it's probably cleaner to add it to a dict from function names to these generated functions than to just shove them in the global namespace, unless there's some reason you absolutely can't do that.
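As a sketch of the dict approach, here is a condensed variation of the function above (repeated only so the example runs on its own) plus a made-up body function and registry. The names describe, generated and "greet" are hypothetical:

```python
def generate_function(param_names, param_optional, body_function):
    # Condensed variation of the sketch above, not a drop-in replacement.
    def impl(**kwargs):
        unknown = [p for p in kwargs if p not in param_names]
        if unknown:
            raise ValueError("The following parameters are invalid: {!s}".format(unknown))
        params = []
        for param, optional in zip(param_names, param_optional):
            if param in kwargs:
                params.append(kwargs[param])
            elif optional:
                params.append(None)  # optional and omitted
            else:
                raise ValueError("Parameter {} is required".format(param))
        return body_function(*params)
    return impl


def describe(name, greeting):
    # Hypothetical body function; greeting may arrive as None.
    return "{}, {}".format(greeting or "Hello", name)


# Registry from function names to generated functions.
generated = {
    "greet": generate_function(["name", "greeting"], [False, True], describe),
}

result = generated["greet"](name="zarg")
print(result)
```

Because impl only accepts **kwargs, a positional call such as generated["greet"]("zarg") fails with a TypeError before any of the checks run.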

brosmike fucked around with this message at 23:27 on May 16, 2011

# ¿ May 16, 2011 23:18

brosmike: Jun 26, 2009

Janin posted:

Use eval() -- it's ugly, but not as ugly as mucking around in internal Python implementation details.

I wouldn't really call using **kwargs "mucking around in internal Python implementation details". It's possible that an eval call could open up a pretty serious security hole, depending on where the data is coming from. Also,

Stabby McDamage posted:

Obviously I could use string processing to write the python itself, but I think it's more interesting to build the objects dynamically, that way the machine-readable spec can live right in the code.

# ¿ May 17, 2011 03:24

brosmike: Jun 26, 2009

Gothmog1065 posted:

As someone who is about two months ahead of you learning from scratch, use IDLE until you're comfortable, then use something fancier. A lot of the features of a good programming environment (Can't remember the term off the top of my head) didn't make sense until you actually know the code.

Modern Pragmatist posted:

IDLE or just running python from the command line are really good for hacking away when first getting started to understand the basic datatypes, how to call methods etc. As soon as you start writing your own classes or basically doing any real programming, I would recommend avoiding IDLE. I personally use VIM but IMHO anything is better than IDLE.

These are both good pieces of advice. I'll add in that even once you start doing fancier stuff, IDLE and the python command shell are great as REPLs to quickly test out very small ideas before you include them in your bigger project.

As for which IDE to switch to once you're a bit more comfortable with the language, I'm a fan of JetBrains' PyCharm. I think it's the best mix of intuitiveness and powerful features I've seen in a Python IDE. PyDev is another common suggestion, and it's not too bad, but I found it to be slightly clunkier and significantly worse at things like code completion than PyCharm. That said, it is totally free - PyCharm isn't (unless you're using it for a class or an "active" open source project).

Vim and Emacs also tend to come up when people ask this question. They're both extremely powerful and are really useful tools to be familiar with, and once you become good at using them they are both extremely efficient. That said, they also both have pretty steep learning curves and tend to require a lot of customization to bring them from "good" to "extremely effective". I'd say that either of their learning curves is significantly steeper than Python's itself, and would probably avoid them until/unless you find yourself wanting more than something like PyCharm or PyDev can offer.

e: vvvvvvvvvvvvvvvv I totally forgot about Pyscripter, but I used that before I tried PyCharm and thought it was slightly better than PyDev. Still not as good as PyCharm.

brosmike fucked around with this message at 04:12 on May 23, 2011

# ¿ May 23, 2011 03:32

brosmike: Jun 26, 2009

quaunaut posted:

Any ideas why that authenticate could be failing? I'm open to anything, and showing you absolutely any of my code(save my settings.py file unedited, of course). The oauth library I'm using is here.

Unlikely, but are you using a custom value for AUTHENTICATION_BACKENDS in your settings.py? Failing that, I'd say your best bet is to put a breakpoint on the authenticate() call and step through that - a cursory glance at its source code suggests that something pretty weird is happening if it's not returning None but also not setting the backend attribute.

# ¿ Jun 2, 2011 00:31

brosmike: Jun 26, 2009

FoiledAgain posted:

Why is this happening?

code:

>>> i, j = 2,3
>>> n = list()
>>> n.append([[i+n,j] for n in [-1,0,+1]])
>>> print n
1
>>> n = list()
>>> n = [[i+n,j] for n in [-1,0,+1]]
>>> print n
[[1, 3], [2, 3], [3, 3]]

The behavior is weird, but documented (see footnote 1 of the Python expressions reference). It evidently behaves as you want in Python 3.0 and above.

e: In case it wasn't clear: The n in the list comprehension gets leaked in both cases, overriding n being set to the list. In the second case, n is immediately reset to be a list again once the comprehension is done - in the first case, no such luck.
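For comparison, a sketch of the same first case under Python 3, where the comprehension's loop variable lives in its own scope and does not leak:

```python
i, j = 2, 3
n = list()
# In Python 3 the n inside the comprehension is local to the
# comprehension, so the outer n is still the list afterwards.
n.append([[i + n, j] for n in [-1, 0, +1]])
print(n)
```

On Python 2, giving the loop variable a name that doesn't clash with anything else (say, k instead of n) sidesteps the problem.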

# ¿ Jun 8, 2011 00:25

brosmike: Jun 26, 2009

Stoatbringer posted:

Apart from the indentation, which I consider to be dangerous nonsense which will only end in tears. I don't mind using it, but the old-school part of my brain is always screaming "One slip of the auto-formatter, or accidentally deleting a tab will break everything and nobody will ever know why! Oh woe, woe unto the poor sod who has to maintain this code in five years time!"

How exactly is that any different from the risk that you (or your auto-formatter) could accidentally delete a }?

# ¿ Jun 18, 2011 01:14

brosmike: Jun 26, 2009

dis astranagant posted:

If nothing else, most any text editor or ide worth a drat can tell you how your parens/braces/brackets match up.

I don't see how this is better than using indentation, whereupon any human eye worth a drat can tell how your code blocks match up.

dis astranagant posted:

And many languages that use them will throw an error if you have a stray one.

I think this is pretty much a neutral trade-off; it's true that a deleted tab in a python script is more likely to result in an error not caught til runtime than a deleted brace in a C program, but giving whitespace semantic meaning also allows you to eliminate errors from things like missing semicolons that mark statement endings. (Those would often be caught as syntax errors, but then, so would most instances of deleting a tab in a python script)

# ¿ Jun 18, 2011 01:54

brosmike: Jun 26, 2009

Unknownmass posted:

I am new to python and getting back into programming after a few years. My question is I have a tab-delimited text file with indices and then grouping of data. What would be the best way to import these, and rank them? Also if possible even import them as separate groups. I have been trying to use numpy but have not had too much luck so far. Thanks

What you describe is a bit vague, but probably pretty easy to do (you probably don't need to bother with numpy for the importing). Can you give us an actual example of the format you're trying to read? Telling us what you mean by "separate groups" of data, as well as how you want to rank the data, would help us help you.

# ¿ Jun 24, 2011 02:59

brosmike: Jun 26, 2009

Metroid48 posted:

The other main option is to use forward slashes, since Windows is fine with either.

Don't do this if you can avoid it - some windows programs (cmd.exe among them) will try to deal with forward slashes sometimes, but the behavior is inconsistent between programs, difficult to predict, and subject to change between versions.
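One way to avoid hand-typing separators at all is to let the standard library build the path. A small sketch, using ntpath (the Windows flavour of os.path) so the result is the same whichever OS runs it; the directory and file names are made up:

```python
import ntpath  # the Windows implementation behind os.path

# A raw string keeps backslashes literal, so "\n" and "\t" are not escapes.
base = r"C:\new\tables"

# join() inserts the Windows separator for you.
full = ntpath.join(base, "2011", "report.txt")
print(full)
```

On an actual Windows machine, os.path.join does the same thing, so normal code would import os.path rather than ntpath.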

# ¿ Dec 12, 2011 21:21

brosmike: Jun 26, 2009

Your issue is probably that your input includes NaNs, which are explicitly defined as being unordered with respect to other floats. Try it:

code:

>>> float('nan') < 1.0
False
>>> float('nan') >= 1.0
False

The sorted function cannot do anything sane with input which is defined to be unsortable, and essentially returns garbage. See this Stack Overflow post for a more in-depth explanation.
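If the NaNs can't be avoided upstream, two possible workarounds are sketched below with made-up values: either drop them before sorting, or use a sort key that never compares a NaN against a real number.

```python
import math

values = [3.0, float("nan"), 1.0, 2.0]

# Option 1: drop the NaNs before sorting.
clean = sorted(v for v in values if not math.isnan(v))

# Option 2: keep them, but push them to the end. The key's first element
# separates NaNs from real numbers; NaNs get a dummy 0.0 as the second
# element, so no comparison ever involves a NaN.
nans_last = sorted(
    values,
    key=lambda v: (math.isnan(v), 0.0 if math.isnan(v) else v),
)

print(clean)
print(nans_last)
```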

# ¿ Jan 5, 2015 08:15

brosmike: Jun 26, 2009

When I write code against your library, I want it to be trivial for a maintainer to identify things like "is this operation safe for my UI thread", "when did this information come from", and "is it safe to assume that the value I got in the outer function scope is the same as the one two layers deeper". This makes me prefer that most of the data be in uninteresting (possibly immutable) data bags, and that the operations doing network activity that output the uninteresting models be distinct functions with searchable names. This also makes it less likely that I will have to do unnecessary wrapping to make my code against your model sanely testable.
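To make that concrete, here is a rough sketch of the shape I mean, assuming Python 3.7+ for dataclasses. UserRecord, fetch_user_record, the URL path and the injected http_get and clock callables are all invented for illustration and don't correspond to any real library:

```python
import json
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class UserRecord:
    """Plain immutable data bag: reading it never touches the network."""
    user_id: int
    name: str
    fetched_at: float  # when the data was obtained


def fetch_user_record(user_id, http_get, clock):
    """The one place that does I/O, with a name that is easy to search for.

    http_get and clock are injected callables (assumptions for this
    sketch) so tests can pass in fakes instead of a real network.
    """
    payload = json.loads(http_get("/users/{}".format(user_id)))
    return UserRecord(user_id=user_id, name=payload["name"], fetched_at=clock())


# Usage with stand-ins for the network and the clock.
record = fetch_user_record(7, lambda url: '{"name": "example"}', lambda: 1000.0)
print(record)
```

A caller holding a UserRecord knows it is safe to read on a UI thread and can see from fetched_at how stale it is, while anything that might block is confined to fetch_user_record.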

# ¿ Aug 26, 2015 16:19
