  • Locked thread
QuarkJets
Sep 8, 2008

I'm not sure if this is a short question or not... but here goes. I'm a physicist, and coding is generally something we do and not something that we're taught, so I'm hoping that someone with more knowledge of the tools available can tell me whether there's a better way to solve my problem.

I have two sets of run numbers, and each run number comes with a list of event numbers. I want to check overlap between the two runs so that I don't look at the same event twice, but I want to look at every event at least once. For example

RunStream1
Run 152000 has event numbers (1, 2, 3, 41, 45)
Run 152234 has event numbers (1, 2, 14, 15, 20)
.....

RunStream2
Run 152000 has event numbers (30, 31, 32, 34, 45)
Run 156000 has event numbers (blah blah blah)
.....

A single RunStream is guaranteed to not have run/event duplicates, but there is no such guarantee between runstreams; in this short example, run 152000 event 45 is in both sets. I have upwards of billions of events distributed across hundreds of runs, so checking run/event overlap between the two runstreams becomes a pretty large task. I'm working on a server, not a supercomputer, so memory limitations are a concern.

Right now for overlap checking I use a dictionary where the run numbers determine the keys and the value of each key is a set() filled with event numbers. I only fill the dictionary/sets when looking at RunStream1, and I only check for overlap when looking at RunStream2. Dictionaries and sets are both hash tables, so checking for membership is fast and easy, as is adding new runs and events.

Is there a more effective way that might use less memory while maintaining at least the speed of a hash table?
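For reference, a stripped-down sketch of the dict-of-sets approach I'm describing (run and event numbers are made up):

```python
# Map each run number from RunStream1 to its set of event numbers
seen = {}

def record(run, events):
    # Called while walking RunStream1
    seen.setdefault(run, set()).update(events)

def is_duplicate(run, event):
    # Called while walking RunStream2
    return event in seen.get(run, ())

record(152000, [1, 2, 3, 41, 45])
record(152234, [1, 2, 14, 15, 20])
```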

QuarkJets
Sep 8, 2008

In Python 2.6 I want to take a dictionary and convert all of the keys, which are strings with various cases, to lowercase strings. Right now I am doing it like this:

code:

dict_low = dict((k.lower(), v) for k, v in olddict.items()) 


I just want to change the keys to lowercase, not create a second copy of the dictionary in memory. Does this code do that, and how can I check that this is the case?
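One way to check, for what it's worth: the comprehension does build a second dict object, but the values themselves are shared by reference rather than copied, which `is` can confirm (toy dict below):

```python
olddict = {'Foo': [1, 2, 3], 'bar': [4, 5]}
dict_low = dict((k.lower(), v) for k, v in olddict.items())

# A new dict is created, but the (potentially large) values are not
# copied; both dicts hold references to the same objects:
assert dict_low['foo'] is olddict['Foo']
assert dict_low['bar'] is olddict['bar']
```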

QuarkJets
Sep 8, 2008

yaoi prophet posted:

Honestly, that's an odd enough request that I'm curious as to why exactly you want to do this. I'm not saying it's wrong, I'm just curious.

I am taking dictionaries as input and trying to read values with specific keys, but there are sometimes inconsistencies in capitalization. Sometimes the dictionaries can be large, but I do have plenty of memory available; I just want to write effective code that doesn't waste time and memory making huge dictionary copies

So I think that what I will do is check whether a key is already lower case, and if it isn't then I'll create a new lowercase key with the old value. That would be better than a full dictionary copy, I think
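Something like this sketch is what I have in mind (assuming no two keys collide when lowercased):

```python
def lower_keys_in_place(d):
    # Rebind only the keys that aren't already lowercase; pop() moves
    # the value by reference, so nothing large gets copied
    for k in [k for k in d if not k.islower()]:
        d[k.lower()] = d.pop(k)
    return d

d = lower_keys_in_place({'Foo': 1, 'bar': 2, 'BAZ': 3})
```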

QuarkJets
Sep 8, 2008

Emacs Headroom posted:

It would probably only be better if you have a relatively low proportion of non-lowercase keys and making new keys is a rare operation. You can't "change" a key, since that would also change its hash value; you can only make new keys and delete old ones. So it might end up being six of one, half a dozen of the other when choosing between the solutions. You can always profile with your data to see if there's a winner, just to make sure.

Yeah, having to create a lowercase key is relatively uncommon (most come lowercased already). I suppose that I could run some tests and find out for sure whether one way or the other is actually faster on average for my data

QuarkJets fucked around with this message at 02:02 on Feb 12, 2013

QuarkJets
Sep 8, 2008

Thinking deeper, I should specify that the dictionary values are actually numpy arrays each with 1k to 1B entries. There are maybe only 100 keys in the dictionaries, really it's these arrays that are large.

When I create a new key and give it the same value as the old key, then a reference gets passed and I am not actually creating any new arrays in memory, correct? So I don't really even need to worry about deleting the old keys since minimal memory is used by two keys both pointing to the same array
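Quick sanity check of the reference behavior (a plain list stands in for the numpy array; the semantics are the same):

```python
d = {'OldKey': list(range(1000))}   # stand-in for a big numpy array
d['oldkey'] = d['OldKey']           # binds a second key to the same object

assert d['oldkey'] is d['OldKey']   # one array, two keys, no copy
d['oldkey'][0] = -1
assert d['OldKey'][0] == -1         # mutation is visible through either key
```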

QuarkJets
Sep 8, 2008

The Insect Court posted:

Unless you're going to be vastly increasing the number of keys, it shouldn't be an issue. That said, you can pretty easily do:
code:
toBeDeleted = []
for k, v in list(myDict.items()):  # snapshot, since we mutate myDict below
    if transform(k) != k:          # e.g. transform = str.lower
        myDict[transform(k)] = v
        toBeDeleted.append(k)
for k in toBeDeleted:
    del myDict[k]
As long as a value in the dictionary has some key attached to it, it shouldn't get GC'ed. And getting rid of the old keys means you can iterate through the dictionary without having to write code to check the case of the keys.

Excellent, thanks guys!

QuarkJets
Sep 8, 2008

What's the general feel regarding Python 2 vs Python 3? None of the computers at the place where I work have Python 3 installed, and I know that it's not backwards compatible, so isn't every new Python 2 script just going to create problems in the future when Python 2 eventually gets abandoned? Python 3.x doesn't come with our redhat installs, and getting IT to install it for us would be a pain. Is this worth it?

I've been slowly ramping up my effort to get people in my workplace to switch from MATLAB to numpy/scipy, but if this is creating headaches for a Python 2.X to 3.X switchover in the future then I'd like to make the switch happen sooner rather than later

aeverous posted:

What do you guys use for your Python work, I'm currently using Notepad++ with the pyNPP plugin but earlier this year I used VS2010 for a C# project and gently caress if I didn't get really spoiled by the code completion. I've looked around at Python IDEs and they all look a bit crap except PyCharm which is pretty expensive. Are there any free/OSS Python IDEs with really solid code completion and a built in dark theme?

Vim

Although recently I tried Spyder 2 on my Windows box (comes with the Pythonxy package) and it worked really really great, so I may start using that more.

QuarkJets
Sep 8, 2008

Thanks for the feedback, guys

Modern Pragmatist posted:

In my experience, 2to3 works pretty well for converting python2 code to be compatible with python3. The biggest hurdle is the bytes/str/unicode change, but as long as you don't do much work with strings it should be pretty painless to migrate.

Also, as far as a Matlab replacement, I'm pretty sure most projects are now python3 compatible.

That's good news; it sounds like I don't need to worry about this too much and can just make a casual python3 software request without banging on doors

QuarkJets
Sep 8, 2008

JetsGuy posted:

Same. I avoided lambda for a long time because it just seemed lazy and dumb to me. Then I programmed in python for a few years and understood.

Can you help me understand? Because lambda seems lazy and dumb to me, and while I do use Python a lot I wouldn't say that I'm more than average-skilled, so I'm authentically interested in learning more about the parts of Python that I don't get to see day-to-day. Wouldn't it be better to write a clean and documented def in case someone else has to use your code in the future?

I've been following along, but I still can't think of a circumstance where a lambda is easier to understand than a def

QuarkJets fucked around with this message at 08:10 on Mar 13, 2013

QuarkJets
Sep 8, 2008

I don't really understand how to spot whether I may be writing spaghetti or ravioli code. I want to be a good python programmer, so I have read PEP 8 and I try to write readable but efficient code. Would anyone mind posting examples of python spaghetti/ravioli code as examples of how not to code? With no formal training but years of experience, I am worried about bad habits that I may not even realize I have

QuarkJets
Sep 8, 2008

I have used Qt and wxPython on Redhat and Windows. I couldn't say that one was necessarily better than the other, so if wxPython has Mac problems then you may as well go to Qt

QuarkJets
Sep 8, 2008

DARPA posted:

If you're new to both python and programming I recommend Tkinter. It's incredibly easy to learn, handles data binding simply, and comes bundled with the standard python installation. Using the widgets from ttk makes things look nicer than Tk's reputation lets on. Just make sure the python ttk package is installed and imported. Then all it takes is changing your widget definitions like Button(...) to ttk.Button(...) and you'll get a better themed look.

wx and Qt are great, but they're also a lot to take in for very little gain when you're just learning and want to make something that adds two numbers.

That said, all of the Tkinter examples that I've seen look like poo poo, so think of it as a good educational tool only

QuarkJets
Sep 8, 2008

Dominoes posted:

Thanks for the GUI advice dudes - I'm going to use QT.

IDLE is the only program I know of that runs uncompiled python scripts, other than a command line. What do you recommend instead? I'll try PyNPP.

Spyder 2 does that also, I think

QuarkJets
Sep 8, 2008

Popper posted:

This is basically how people are taught classes.
Think like this:

etc...

Card should be a class too.

There are faster ways to do this with dicts and namedtuples but get a class implementation working first.

The assignment makes it sound like he hasn't been told about classes yet (specifically, it says to use functions), so he shouldn't use a class (even if it would be objectively better and was my first thought, too).

QuarkJets
Sep 8, 2008

I have something like this:

code:
#myfile.py
class file(dict):
    #some file attributes
    #and for fun let's say that the first 10MB of the file is stored in memory, 
    #so this becomes relatively time-consuming with multiple files

##

#mydir.py
import myfile
class dir(dict):
    #some dir attributes
    #Also contains a bunch of file objects such that self[filename] = file()

##

#myhost.py
import mydir
class host(dict):
    #some host attributes
    #Also contains a bunch of dir objects such that self[dirname] = dir()

#example usage:
#x = myhost.host("127.0.0.1") 
#y = x.get("127.0.0.1")
#if y is not None:
#	z = y.get("/home/someuser/somedir/")
#	if z is not None:
#		fi = z.get("somefile.txt")
#		#do something
These are three separate files; myhost is a dictionary of mydir objects, and mydir is a dictionary of myfile objects. This isn't the actual project, it's just an example analogous to the project that I'm working on.

So here's my question: are there any dangers to making each class threaded? I'm new to threading and haven't experienced the difficulties of creating a threaded program before. So for instance, class host creates a list of dir() objects, and class dir creates a list of file() objects, so I could theoretically thread each of these and experience a speed increase, so long as I'm careful about waiting for the dictionary-filling and class-creating operations to complete before declaring the thread complete (i.e. use join() to make sure that everything is filled before accessing any of the dictionaries)?

But since this is a lot of file I/O, will this not benefit as much from threading?

QuarkJets
Sep 8, 2008

Suspicious Dish posted:

You seem to have a fundamental misunderstanding of threading. I'd lay it to rest until you pick up more of the basics of CS. Threading is a fairly complex subject in and of itself, so it's something to tackle after you understand more of Python and more about memory and processes and everything.

I'm self-taught and have been coding for nearly a decade now (scientific coding, a means to an end), but I don't know anything about threading aside from whatever I've read on the Internet. I'm well-versed in dealing with memory, but I haven't really learned any actual computer science

QuarkJets
Sep 8, 2008

Suspicious Dish posted:

So, you can't make objects threaded. Threads are like additional programs that run in the same memory space (in Linux, that's literally the only difference from forked processes — they get their own PID and everything)

The issue with this is that if one thread reads memory and then another thread writes memory, you end up with inconsistency issues between the two processes. Since Python's objects are refcounted, this can be disastrous: one object's refcount is at 1, Thread 1 reads it, Thread 2 increments it, and then Thread 1 decrements it to 0 and destroys the object, leaving Thread 2 with some freed memory. Boom.

To account for this, Python introduced something called the Global Interpreter Lock, or GIL, and a thread must take it when it wants to deal with Python objects. All Python threads wait on this lock until it becomes available, and the performance of the GIL is quite poor when combined with how kernel scheduling works.

Note that the GIL is only taken when something wants to interact with Python, so a thread can be waiting on I/O (read, poll), and another thread can come in and do its stuff. Some libraries like NumPy also release the GIL when some kinds of calculations go on, since they've been extremely careful to make all their special calculation objects thread-safe and such.

So, your directory system, it depends on what method you want to thread. If you want to thread the filling of each host object, perhaps through readdir (through os.walk or os.listwhatever), then each thread is going to be hanging in I/O, waiting to grab the lock, and it's unlikely you'll see much of a speedup, but try it out and profile. Computers are so complex that I can make a guess as to some performance thing based on everything I know, but I can't be 100% sure at all, and have been proven wrong before.

Note that you might have bugs when you start to thread! That's OK. Note that there still can be race conditions and thread bugs in your Python code — the GIL doesn't prevent those.

Some people use what's called a job thread model, where they have one thread dedicated to Graphics, one thread dedicated to disk I/O, one thread dedicated to Sound, etc. This is usually a good model for video games because most APIs used there, especially ones like OpenGL, are in no way thread-safe.

I tend to use what's called a worker thread model, where threads are spawned off to do a very specific task (read this PNG file from disk, decode the bytes) — where I share pretty much nothing except some starting task data, and the result data when the task is complete. It's effectively a separate process without the overhead of IPC.

Oh boy, that was probably a bit too much. If you read all this and have any questions on it, feel free to ask more questions. It's complex stuff.

This is all easy to understand, from the way you've described it. Thanks.

My project consists of a step in which specific files are read in and then a second step in which calculations are performed using data from those files, but the files and the data in the files are not being changed (i.e. results are stored in new variables and saved elsewhere on the disk, MySQL insertions are performed, etc). So I could set up a Queue for the read-in (which shouldn't benefit much due to a file I/O bottleneck) and another queue for the operations, and before queuing operations I would just need to wait for the read-in queue to empty, right? Alternatively, if the file I/O queue really doesn't improve speed at all, then I could just read everything in normally and then set up a queue for processing the data, so long as I'm careful about not changing the data that is being operated on.
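To make sure I understand it, a bare-bones sketch of the two-step queue idea (all names and data are hypothetical; the file read and the calculation are replaced with stand-ins):

```python
import threading
import queue  # named Queue in Python 2

def reader(paths, work_queue):
    # Stand-in for the file read-in step
    for path in paths:
        work_queue.put('data from %s' % path)
    work_queue.put(None)  # sentinel: nothing more to read

def worker(work_queue, results):
    # Stand-in for the calculation step; never mutates its input
    while True:
        item = work_queue.get()
        if item is None:
            break
        results.append(item.upper())

work_queue = queue.Queue()
results = []
t1 = threading.Thread(target=reader, args=(['a.txt', 'b.txt'], work_queue))
t2 = threading.Thread(target=worker, args=(work_queue, results))
t1.start(); t2.start()
t1.join(); t2.join()
```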

I'm very comfortable with coding in Python and using Python, I've just never done Computer Science, which I see as "understanding what the code is making the computer actually do." I have a lot of experience in C++ as well, so I know all about memory management and reference passing and how memory management in Python works differently (i.e. Python basically uses something analogous to smart pointers: a chunk of memory can only be cleaned up after the number of active references to it becomes 0, all done automatically). I'm just really clueless about threading and multiprocessing, and I'm not well-versed in the "guts" of Python, just the high-level stuff.

AKA there is a gap between what many of the hard science programs teach (a bare-bones introduction to scientific computing) and what is actually needed in the real world of hard science (actual computer science knowledge for real and efficient scientific computing), and individually overcoming this gap is what I've been trying to do for the last few years. This July I'll become eligible for free remote-learning university courses through my employer, I already have plans to get some intro-level CS courses under my belt in the hope of building a better understanding of what is going on under the hood.

Thern posted:

You can release the GIL? I need to investigate this further as I have something that is very I/O bound. And Multiprocessing is a bit of hack I feel.

According to this page the GIL is always unlocked when doing I/O. I don't know whether this helps you.

QuarkJets fucked around with this message at 20:58 on Mar 26, 2013

QuarkJets
Sep 8, 2008

[i for i in u] will create a list that is just all of the elements in u. If u is list1, then list3 will be filled with the same values as list1 with that line. You don't appear to be using enumerate, so instead of keeping track of an index and incrementing it you could just:

code:
    list3 = [i for i in u]
    for i in xrange(len(u)):
        list3[i] = u[i] + b[i]
    return list3
Or the far more awesome one-line way:

code:
    return [u[i]+b[i] for i in xrange(len(u))]
This generates an index i that loops over the length of u. For each value of i, a new entry in the list is created that is equal to u[i]+b[i].

This isn't great though because if b is shorter than u then the code will bomb out
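If the length mismatch matters, zip sidesteps the crash (though it silently truncates to the shorter list, which may or may not be what you want):

```python
def add_lists(u, b):
    # zip() stops at the end of the shorter sequence, so no IndexError
    # when b is shorter than u
    return [x + y for x, y in zip(u, b)]
```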

QuarkJets
Sep 8, 2008

ARACHNOTRON posted:

why is Python so weird??

What if I did something with site or added the main package location to sys.path in every module? I don't really want to dick around with PYTHONPATH right now because I'm only doing test stuff.

C and other languages work in a similar way; would you prefer it if Python/C/etc recursively searched your entire file system every time that you tried to import something?

Setting your PYTHONPATH only takes a second. Just point it to the directory where you're currently testing things. It would also be way faster than any of the workarounds that you're considering
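If you'd rather not touch environment variables at all while testing, the same thing can be done per-script (the directory name here is made up):

```python
import sys

# Prepend a directory to the module search path for this process only;
# equivalent in effect to putting it on PYTHONPATH
sys.path.insert(0, '/home/me/testing')
# import mymodule  # would now be found in that directory first
```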

QuarkJets
Sep 8, 2008

ARACHNOTRON posted:

It would be nice if it searched back until it didn't find an init file. I don't know.

I have only done Java packaging and it kinda just works, especially with JARs :(

Ah, but Python also supports importing files that reside in directories without an __init__.py file. And you might have more than one area holding .py files that you want to use, which would ultimately necessitate some sort of PATH variable anyway

(Java uses a PATH-like variable, too; it was probably set for you by your IDE if you're working in Windows. I think it's CLASSPATH? I haven't used Java in a long time)

QuarkJets
Sep 8, 2008

Social Animal posted:

So I learned the basics of Python from code academy and now I really want to start a small project to practice. I was thinking of a page I can upload/download files from sort of like my own ghetto megaupload. What's a good place to start? I looked at Flask and it looks like I can use this but I'm hitting a brick wall. When it comes to coding I am pretty bad (probably why I was attracted to Python's easy to use/read syntax.) The problem is it feels like frameworks are a whole different language in themselves, and I'm pretty lost. Can anyone please recommend me a good tutorial or path I should take to really start learning? Or recommend a good beginner's project I can start with?

I really want to get into coding but I get overwhelmed easily.

Have you completed the Python Challenge yet? That's a really good first project

QuarkJets
Sep 8, 2008

Social Animal posted:

This is the only page I could find:

http://www.pythonchallenge.com/

Is this the one? If so love that page design.

That's the one. It was made in 2005, so not exactly the height of web design

QuarkJets
Sep 8, 2008

BeefofAges posted:

Wow, those are crappy function names.

It's pretty similar to the old atoi and atof functions that C has

QuarkJets
Sep 8, 2008

Is there a handy but thorough list of advantages gained by using new-styled classes in Python 2.X? I know a few of the advantages, like being able to user super, but I'd like to have a whole list handy to show to a coworker
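Off the top of my head, the super() advantage looks like this with a new-style class (the class names are made up; the explicit super(Child, self) form is the Python 2 spelling):

```python
class Base(object):  # inheriting from object makes this new-style in 2.x
    def name(self):
        return 'Base'

class Child(Base):
    def name(self):
        # super() only works with new-style classes in Python 2
        return 'Child->' + super(Child, self).name()

result = Child().name()  # 'Child->Base'
```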

QuarkJets
Sep 8, 2008

dedian posted:

I wrote a little script this weekend to calculate and store md5 hashes of files in a list of directories I point the script to, for the purpose of finding duplicates. I fully realize there's lots of tools to do this already, and do it better, but I wanted to write something myself as I get more familiar with Python (I keep meaning to go through Learn python the hard way or other tutorials :D). Does any of this look like the completely wrong way of doing things? It seems like the for loops don't need to be as nested... somehow? List comprehensions, or.. something? Anyway, just something dumb to poke holes in on a Monday morning :) (I'm happy at least that it works :))

http://pastie.org/7577472

MD5 is great, but if you're like me then you worry that your hashes might flag file duplicates when none exist (i.e. two unique files can produce the same hash). If that's the case, then you want your hash strings to be as long as possible so as to minimize the likelihood of collisions. You'd only need to change 1-2 lines in order to use sha256 or sha512 instead of md5 (hashlib supports both). You could also check the file size in bytes when checking for duplicates; if two files have the same hash but different file sizes, then they are definitely unique files, no?
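A sketch of the swap (the function name is made up; hashlib really does support all of these, and reading in chunks keeps memory flat on big files):

```python
import hashlib

def file_digest(path, algo='sha256', chunk_size=1 << 20):
    # Any algorithm hashlib knows works here: 'md5', 'sha256', 'sha512', ...
    h = hashlib.new(algo)
    with open(path, 'rb') as f:
        # Read in 1 MB chunks so huge files never sit in memory whole
        for block in iter(lambda: f.read(chunk_size), b''):
            h.update(block)
    return h.hexdigest()
```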

QuarkJets
Sep 8, 2008

Scaevolus posted:

Unless you have a habit of storing outputs of MD5 collision creators, you will never see an MD5 collision on your filesystem.

File sizes are a good optimization, though -- if a file has a unique size, there's no need to bother reading the entire thing to hash it.

You probably won't see a collision on your filesystem with md5, you mean. You can trivially implement any of the sha-2 hash functions for much better collision resistance, and it's not really costing you anything to do so, so why not?

Checking for file sizes will make the probability basically zero and will make the duplicate checking much faster, but there's always that one in a gazillion chance that two unique files with the same file size will also have the same hash...
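The size check can even go first, before any hashing at all (a sketch; the helper name is made up):

```python
import os
from collections import defaultdict

def duplicate_candidates(paths):
    # Only files sharing a byte size can possibly be duplicates, so a
    # file with a unique size never needs to be read, let alone hashed
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)
    return [group for group in by_size.values() if len(group) > 1]
```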

QuarkJets
Sep 8, 2008

Haystack posted:

Given that dedian would need to have a folder with about 10 billion billion files in it before he'd have a decent chance of having single MD5 collision, he'd probably be better off spending his time shielding his computer from cosmic radiation :v:

Hey I agree with you in that the probability is basically zero and he doesn't need to worry, I just like effortless solutions that teach people new things :shrug:

QuarkJets
Sep 8, 2008

My girlfriend is interested in learning more web development skills. Maintaining a web site is a small facet of her job, so it actually does have some applicability to what she's doing. For instance, she knows how to use CSS and javascript. She knows that I've been using Python for a long time, and she has asked me if Python is a useful language to learn for web development purposes. I wasn't sure what to tell her; I've never done any web stuff.

Searching around on the web gives links to people talking about how awesome django is as a web framework, but I don't really understand what django does. Is it only useful for creating web applications with forms and the like or can it also just be used to make nice-looking web sites in a relatively easy way?

QuarkJets
Sep 8, 2008

I feel like I'm typing 'self' a whole lot when I write classes. Using self.method() when I need to call internal methods, self.object to access internal variables, etc. This looks ugly to me; is there a way to prettify it?

QuarkJets
Sep 8, 2008

yaoi prophet posted:

Python code:
import inspect

class Magic(object):
    def __init__(self, x):
        self.x = x

    def foo(self):
        exec ""
        magic_locals()
        print "foo says x is %d" % x

    def bar(self):
        exec ""
        magic_locals()
        print "calling foo!"
        foo()

def magic_locals():
    stack = inspect.stack()
    locals_ = stack[1][0].f_locals
    obj = locals_["self"]
    for attr in dir(obj):
        locals_[attr] = getattr(obj, attr)


magic = Magic(9)
magic.bar()
I don't think this will actually work in Python 3 since it apparently depends on exec "" forcing the locals storage to be in a state where inspect can modify it, which it might not on Python 3. Also if you use this in anything I hate you.

That's even uglier than just using 'self' everywhere

QuarkJets
Sep 8, 2008

Dominoes posted:

I'm making a binary because I'd like to distribute this program to coworkers. I'd prefer the program to just work. There's got to be something I can add to my setup.py, or a way to get the proper .dll manually into the program's folder.

Do you have the Visual Studio 2010 redistributable package on the computer that you're using to create the binary?

QuarkJets
Sep 8, 2008

Dominoes posted:

No, but I have various 'Microsoft Visual C++ (year), x86 and x64 Redistrbutables' under the uninstall-a-program dialog.

Check if there are any keys in HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\VisualStudio\10.0\VC\VCRedist\

And if you don't have it installed, then you might try installing it on the machine that you're using to create the executable; at some point the Visual Studio compiler is getting called (since you're on Windows). It's not totally clear to me how cx_freeze works in Windows

QuarkJets
Sep 8, 2008

I am using paramiko to sftp get a large file from a remote server.

code:
import paramiko

ssh = paramiko.SSHClient()
ssh.load_system_host_keys()
ssh.connect(ip_addr)
sftp = ssh.open_sftp()
sftp.get('/tmp/test.txt', '/home/user/test.txt', callbackfunction)


The implementation works great, but keyboard interrupts cause weird behavior. One of two things happen:

A) Paramiko becomes a zombie and I have to kill the python process by hand

B) Paramiko catches the keyboard interrupt and raises a paramiko.SFTPError with the text "Garbage packet received."

Has anyone else had this experience? Can I somehow prevent A) from happening?

QuarkJets
Sep 8, 2008

Dominoes posted:

Lesson in reinventing the wheel today: I created a function that allows date addition and subtraction using modulus and floor division. Turns out the datetime function already does that! It was a good learning experience.

And quite well, at that!

MySQLdb even converts its own datetime format into the Python datetime format. Very handy, although inserting a datetime with MySQLdb requires converting it to a string first

QuarkJets
Sep 8, 2008

Agreed; I have no idea what you're asking for at this point. What is this most recent Python code supposed to do?

e: To me, nothing in your pseudo-code looks anything like the code that you just posted :psyduck:

QuarkJets
Sep 8, 2008

Dominoes posted:

I'm not asking anything at this point; Nippa and Plork posted examples that I turned into a solution. I was wondering if there's a clean way to implement variables in code similar to the .join and %s abilities of strings.

This is terrible, don't go looking for this. It's probably not what you actually want to do.

e: You asked for clean, what you did is way cleaner than trying to pull variables from strings and then loop over them or whatever

QuarkJets fucked around with this message at 20:44 on Apr 21, 2013

QuarkJets
Sep 8, 2008

Dominoes posted:

Iterators.

Don't do this:

code:
 if failed == False:
        result.append(symbol)
Instead:

code:
if not failed:
        result.append(symbol)
...

And instead of a failure flag, you could just return when your failure condition is met. Plus it looks like result is never longer than length==1, so couldn't you just scrap it entirely? Like this :

Python code:
eval(data, parameters, n,
     (n, 'date'), (n, 'close'),
     symbols[n], 'tk')

def eval(data, parameters, len_source,
         date_loc, price_loc, symbol, _typ):
    for n2 in parameters:
        #some code
        if not n2._floor <= change <= n2._ceil:
            print(symbol, "failed for", n2.name)
            return None
    return symbol
If eval returns None, then eval failed, otherwise you've got your symbol object...

...

Wait, symbol isn't ever used or modified anywhere in the code! Can't this just return True or False?

QuarkJets fucked around with this message at 22:53 on Apr 21, 2013

QuarkJets
Sep 8, 2008

When you're done with an object that takes up a lot of memory (for instance, a 1k by 1k by 1k numpy array), is it considered better practice to delete it with del or just leave it for garbage collection?
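For context, my understanding of what del actually does (a plain list stands in for the array; same refcounting rules apply):

```python
import sys

big = list(range(100000))        # stand-in for a large numpy array
alias = big                      # a second reference to the same object
refcount = sys.getrefcount(big)  # counts big, alias, and the call's own arg
del big                          # unbinds one name; the data survives...
still_there = alias[0]           # ...because alias still references it
del alias                        # last reference gone: CPython frees it here
```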

QuarkJets
Sep 8, 2008

^^^ Because he's probably in Windows and the Windows command line is ghetto as hell

dantheman650 posted:

I am completely new to programming and am playing around with Python after learning some basics on CodeAcademy. I tried using Notepad++ on a friend's recommendation but it turns out getting Python code to run from it is a pain. The OP of this thread is mostly going over my head and the tutorial on setting up VIM is much more advanced and complicated than anything I need at this point. What is a good, simple IDE to write and run Python code? The wiki has a huge list and I don't know which to pick.

I really, really like Spyder 2. It comes with Pythonxy, which is basically a big executable full of additional python libraries (numpy and others) and IDEs. Pythonxy is aimed toward people who want to make a switch from MATLAB to Python on Windows systems. It's also free. Even if you're not interested in any of the computational stuff, it's still a great starting point simply because it gives you the option of installing a bunch of different IDEs (so that you can try a bunch of them out and then keep whichever one you like best) and a bunch of extra libraries to play with (although they're all optional components that you can add later).

Spyder 2 is as simple as you want, but it's also incredibly easy to run your code in it. It comes with a command-line interpreter that has all of the normal features you'd expect from a well-developed command-line interpreter (such as tab completion), but there's also a keyboard shortcut for just running code in a fresh window. I suggest trying it

QuarkJets
Sep 8, 2008

BeefofAges posted:

First make it work, then make it fast (if you need to), then make it pretty (if you need to).

If by "pretty" you mean "readable", then shouldn't that be part of "make it work"? Most scientific programming is done in the style of "I'm going to get this to work; I don't care if it's fast or readable", and it's actually a huge problem when a change to the code needs to be made but the entire house of cards falls apart, because the code has turned into a black box and no one knows what makes it work
