|
ARACHNOTRON posted:pymongo docs apparently lied to me. try m['_id']?
|
# ? Apr 12, 2013 16:29 |
|
Yeah, that worked out. So I will get this sucker started up for real in 20 minutes code:
Smarmy Coworker fucked around with this message at 17:09 on Apr 12, 2013 |
# ? Apr 12, 2013 16:39 |
|
ARACHNOTRON posted:Okay this code worked with a different program and does not work with this one. The only difference is I already had a list of the _id's I wanted (there were only 25 as opposed to 2500) and could create a DB connection in the M class. I ask cuz it seems like you're probably already getting back all your data when you access the documents with the mlist cursor, and you don't really do separate, linked collections in mongo unless you're running into the 16 MB document size limit. If you need or really want your data in separate collections, you can reduce the overhead of iterating that cursor by putting a projection on the find operation so that the documents it returns only contain _id.
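A minimal sketch of the projection idea, assuming `collection` is a pymongo `Collection`. The helper name `iter_ids` is made up for illustration and is not from the code being discussed:

```python
def iter_ids(collection):
    """Yield only the _id of every document in the collection.

    `collection` is assumed to be a pymongo Collection (hypothetical here).
    The second argument to find() is the projection: {"_id": 1} asks the
    server to send back documents that contain nothing but _id.
    """
    for doc in collection.find({}, {"_id": 1}):
        yield doc["_id"]
```

Each document coming off the cursor is then a tiny dict instead of the full record, which is where the saving would come from.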
|
# ? Apr 12, 2013 17:48 |
|
I am. That's a pretty good point and I will do that for future modules, but for this specific application there is only a need to fill the db with random sample data until next week, I believe. I might as well just stop it and fix that, though! It cannot hurt. I had reports as a field initially but since they will be collected every hour, forever, it will definitely hit the size limit eventually.
|
# ? Apr 12, 2013 19:10 |
|
Man, I just had one of those moments where my mind boggled a bit. I have been putting off implementing Celery in a few projects because I thought it would be a huge endeavor to get working. It's not. It's loving easy to install and implement. And it seems to work well so far. It's just one of those 'THE FUTURE IS NOW' kind of things.
|
# ? Apr 13, 2013 10:51 |
|
I'm hoping someone can help me with a Qt / Designer question. The example code I've included shows how I've been setting up the code, which may not be correct. I'm trying to get code that lives outside the classes that work with Qt directly to change aspects of the GUI. I've set up example code showing the structure of my programs. My goal is to get the function btn_about2, in the main program (or another class), to change the main window's statusbar. I've included two windows in this example to clarify how my GUI code is structured. Python code:
Dominoes fucked around with this message at 15:12 on Apr 13, 2013 |
# ? Apr 13, 2013 14:54 |
|
Ok, I'm kinda a python newbie, but does Pandas have a sort method that isn't totally hosed? I have a pandas dataframe (read from a CSV) structured like this:
index  year  month  day  str1  str2  int
1      2008  12     1    x     c     7
2      2008  1      12   x     x     9
3      . . .
and I want to sort it by date, so year->month->day ascending. Ok, df = df.sort_index(by = ['year', 'month', 'day']), done, right? Not so fast, that gives me an ostensibly sorted array, but the indices are the same, and now all out of order, so I have no way to tell which row is the first and which is the last. How do I reindex this so I can be sure row i+1 has a later date (or at least the same date) as row i?
|
# ? Apr 14, 2013 03:10 |
|
I'm receiving a json string from an API call to my broker that looks like this:code:
code:
|
# ? Apr 14, 2013 13:24 |
|
That's the Python dictionary representation of the JSON string. What would you otherwise expect to happen with the returned JSON? I mean, it's now fairly easy to do float(json.loads(returned_json).get('quote', {}).get('AAPL', 0)), you could even stick it in a namedtuple that matches the data structure, but strictly speaking it's your job to convert the data to something that you can use. json.loads just converts it to something accessible for more Python code. geonetix fucked around with this message at 13:39 on Apr 14, 2013 |
# ? Apr 14, 2013 13:37 |
|
Pudgygiant posted:I'm receiving a json string from an API call to my broker that looks like this: Seems fairly helpful. It turns a string containing json into a python dict which looks syntactically fairly similar but you can now access it directly like a normal dict. Python code:
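For what it's worth, a minimal sketch of that conversion. The payload below is invented for illustration (the broker's actual response isn't shown here), so treat the key names and the price as assumptions:

```python
import json

# Hypothetical payload; the real broker response is not reproduced here.
returned_json = '{"quote": {"AAPL": "429.80"}}'

data = json.loads(returned_json)   # JSON text -> ordinary Python dict

# Chained .get() calls fall back to a default instead of raising KeyError
# when a key is missing.
price = float(data.get("quote", {}).get("AAPL", 0))
```

After `json.loads`, `data` behaves like any other dict, so `data["quote"]["AAPL"]` works too if you would rather get an exception on a missing key.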
|
# ? Apr 14, 2013 13:42 |
|
Thanks, that's exactly what I was looking for.
|
# ? Apr 14, 2013 15:47 |
|
KaiserBen posted:Ok, I'm kinda a python newbie, but does Pandas have a sort method that isn't totally hosed? I have a pandas dataframe (read from a CSV) structured like this: Remove the index column from the table. When pandas saves a CSV file, I'm pretty sure it writes the row index into the file by default, which is probably where that column came from.
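A rough sketch of sorting and then renumbering the rows, assuming a current pandas install. The sample rows are made up, and `sort_values` is the present-day spelling; in the 2013-era pandas from the question the equivalent call was `sort_index(by=...)`:

```python
import pandas as pd

# Made-up sample rows shaped like the table in the question.
df = pd.DataFrame({
    "year": [2008, 2008, 2007],
    "month": [12, 1, 6],
    "day": [1, 12, 30],
    "int": [7, 9, 3],
})

# Sort by the date columns, then discard the old row labels so the
# index runs 0, 1, 2, ... in the new sorted order.
df_sorted = df.sort_values(by=["year", "month", "day"]).reset_index(drop=True)
```

With `drop=True` the old index is thrown away rather than kept as an extra column, so row i+1 is never earlier than row i.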
|
# ? Apr 14, 2013 16:14 |
|
Dominoes posted:I'm hoping someone can help me with a QT / designer question. The example code I've included shows how I've been setting up the code, which may not be correct. I'm trying to get programs that are not in classes that work with QT directly to change aspects of the GUI. I've set up example code showing the structure of my programs. My goal is to get the function btn_about2, in the main program (or another class) to change the main window's statusbar. I've included two windows in this example, to clarify how my gui code's structured. code:
Method 2: using os.path.dirname(__file__). This works in the uncompiled script, but the compiled program loads the GUI, then glitches out with the following error: Dominoes fucked around with this message at 00:38 on Apr 15, 2013 |
# ? Apr 15, 2013 00:22 |
|
Read this: http://docs.python.org/2/library/os.html#os-file-dir Use os.getcwd() Edit: sorry didn't notice what you were doing exactly. You want http://docs.python.org/2/library/os.path.html os.path.dirname() Dren fucked around with this message at 00:54 on Apr 15, 2013 |
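A hedged sketch of one common workaround for the compiled case. It assumes the freezing tool sets `sys.frozen` (py2exe, cx_Freeze and PyInstaller all do); the helper name `app_dir` is made up:

```python
import os
import sys


def app_dir(script_path):
    """Return the directory the program lives in.

    Assumption: a frozen build has sys.frozen set. In that case __file__
    may be missing or misleading, so use the executable's location instead.
    """
    if getattr(sys, "frozen", False):
        return os.path.dirname(os.path.abspath(sys.executable))
    return os.path.dirname(os.path.abspath(script_path))
```

In the main script this would be called as `app_dir(__file__)`, and the result joined with the name of whatever data file needs loading.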
# ? Apr 15, 2013 00:48 |
|
I wrote a little script this weekend to calculate and store md5 hashes of files in a list of directories I point the script to, for the purpose of finding duplicates. I fully realize there's lots of tools to do this already, and do it better, but I wanted to write something myself as I get more familiar with Python (I keep meaning to go through Learn python the hard way or other tutorials ). Does any of this look like the completely wrong way of doing things? It seems like the for loops don't need to be as nested... somehow? List comprehensions, or.. something? Anyway, just something dumb to poke holes in on a Monday morning (I'm happy at least that it works ) http://pastie.org/7577472 dedian fucked around with this message at 13:15 on Apr 15, 2013 |
# ? Apr 15, 2013 13:12 |
|
Django modelling question: I want to make a single row table to store settings in, the reason being that I want to have certain settings to have restricted values, otherwise I'd just make a multirow table of key/values. Is there any way to restrict a table to a single row, and have it appear cleanly in the admin console i.e. don't allow admins to add more useless rows?
|
# ? Apr 15, 2013 13:51 |
|
dedian posted:Does any of this look like the completely wrong way of doing things? It seems like the for loops don't need to be as nested... somehow? List comprehensions, or.. something? Anyway, just something dumb to poke holes in on a Monday morning (I'm happy at least that it works ) Several little issues come to mind, but they have little to do with python, except maybe this one: Instead of repeating Python code:
Python code:
I would make the database connection only once in the code, and make sure it actually uses the global constant. I make this mistake sometimes too: defining a nice constant somewhere for a filename, and then hard-coding the value in the code anyway. Furthermore, I would have something like: Python code:
For the database, I would not distinguish path and filename in this code, as you only seem to use the full path all the time. I would also add a UNIQUE constraint on the full path column, which would accelerate the check for existence. I would also add a non-unique index on the md5 column. My name aint Jerry fucked around with this message at 15:57 on Apr 15, 2013 |
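A sketch of what those database suggestions could look like, assuming sqlite3 is the backend. The table name, column names and helper functions are all invented for illustration and are not taken from the pastie:

```python
import sqlite3

DB_NAME = ":memory:"  # assumption: swap in a real filename for actual use


def open_db(db_name=DB_NAME):
    """Open the database once and make sure the schema exists."""
    conn = sqlite3.connect(db_name)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " full_path TEXT UNIQUE,"  # UNIQUE is backed by an index, so lookups are fast
        " md5 TEXT)"
    )
    # Non-unique index: duplicate files share an md5 by definition.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_files_md5 ON files (md5)")
    return conn


def add_file(conn, full_path, md5):
    """Insert a row; return False if the path is already recorded."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO files VALUES (?, ?)", (full_path, md5))
        return True
    except sqlite3.IntegrityError:
        return False
```

The connection is opened in one place and passed around, and the UNIQUE constraint does the "have I seen this path already?" check instead of a separate SELECT.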
# ? Apr 15, 2013 15:47 |
|
dedian posted:I wrote a little script this weekend to calculate and store md5 hashes of files in a list of directories I point the script to, for the purpose of finding duplicates. I fully realize there's lots of tools to do this already, and do it better, but I wanted to write something myself as I get more familiar with Python (I keep meaning to go through Learn python the hard way or other tutorials ). Does any of this look like the completely wrong way of doing things? It seems like the for loops don't need to be as nested... somehow? List comprehensions, or.. something? Anyway, just something dumb to poke holes in on a Monday morning (I'm happy at least that it works ) Consider using argparse to make your command line interface both more useful and easier to maintain: http://pymotw.com/2/argparse/index.html You can use glob to help build DIR_LIST based on user input: http://pymotw.com/2/glob/index.html
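A minimal sketch of combining the two, under the assumption that the script takes directory names or glob patterns as positional arguments. The function name `parse_dirs` and the argument name `patterns` are made up:

```python
import argparse
import glob
import os


def parse_dirs(argv=None):
    """Parse directory patterns from the command line and expand them."""
    parser = argparse.ArgumentParser(description="Hash files to find duplicates.")
    parser.add_argument("patterns", nargs="+",
                        help="directories or glob patterns, e.g. ~/pics/*")
    args = parser.parse_args(argv)

    dir_list = []
    for pattern in args.patterns:
        # glob.glob returns [] when nothing matches, so bad input is skipped.
        for path in glob.glob(os.path.expanduser(pattern)):
            if os.path.isdir(path):
                dir_list.append(path)
    return dir_list
```

Calling `parse_dirs()` with no argument reads `sys.argv`, and argparse supplies `--help` and the usage errors for free.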
|
# ? Apr 15, 2013 16:03 |
|
Awesome, thanks for taking a look! That all makes sense. It's been quite a while since I've touched any code, so this all helps :-)
|
# ? Apr 15, 2013 16:10 |
|
dedian posted:I wrote a little script this weekend to calculate and store md5 hashes of files in a list of directories I point the script to, for the purpose of finding duplicates. I fully realize there's lots of tools to do this already, and do it better, but I wanted to write something myself as I get more familiar with Python (I keep meaning to go through Learn python the hard way or other tutorials ). Does any of this look like the completely wrong way of doing things? It seems like the for loops don't need to be as nested... somehow? List comprehensions, or.. something? Anyway, just something dumb to poke holes in on a Monday morning (I'm happy at least that it works ) MD5 is great, but if you're like me then you worry that your hashes might detect file duplicates when none exist (IE two unique files can produce the same hash). If that's the case, then you want your hash strings to be as long as possible so as to minimize the likelihood of collisions. You'd only need to change 1-2 lines in order to use sha256 or sha512 instead of md5 (hashlib supports both). You could also check for the file size in bytes when checking for duplicates; if two files have the same hash but different file sizes, then they are probably unique files, no?
|
# ? Apr 15, 2013 18:28 |
|
I'm working on an assignment where I have to take a file containing a list of numbers, apply a radix sort to them, and output the list into a new file. I have the code for applying a radix sort to a list of numbers but I'm stuck on how I get the list of numbers from the file and how to output them into a new file. This is the code I have right now, and I don't understand why this doesn't allow me to perform the radix_sort method on the file. The error it returns is "object of type 'file' has no len()"code:
|
# ? Apr 16, 2013 02:24 |
|
QuarkJets posted:MD5 is great, but if you're like me then you worry that your hashes might detect file duplicates when none exist (IE two unique files can produce the same hash). If that's the case, then you want your hash strings to be as long as possible so as to minimize the likelihood of collisions. You'd only need to change 1-2 lines in order to use sha256 or sha512 instead of md5 (hashlib supports both). You could also check for the file size in bytes when checking for duplicates; if two files have the same hash but different file sizes, then they are probably unique files, no? File sizes are a good optimization, though -- if a file has a unique size, there's no need to bother reading the entire thing to hash it.
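Both ideas can be sketched together: group by size first, then hash only the files that share a size. This is an illustration rather than dedian's script; the function names are made up and sha256 is simply the default chosen here:

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in chunks so large files don't have to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(paths):
    """Return lists of paths whose size and digest both match."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # unique size: cannot be a duplicate, skip hashing
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[file_digest(path)].append(path)
        groups.extend(g for g in by_hash.values() if len(g) > 1)
    return groups
```

Switching hash functions is just a matter of passing a different `algorithm` name to `hashlib.new`, e.g. `"md5"` or `"sha512"`.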
|
# ? Apr 16, 2013 02:32 |
|
Qwertyiop25 posted:I'm working on an assignment where I have to take a file containing a list of numbers, apply a radix sort to them, and output the list into a new file. I have the code for applying a radix sort to a list of numbers but I'm stuck on how I get the list of numbers from the file and how to output them into a new file. This is the code I have right now, and I don't understand why this doesn't allow me to perform the radix_sort method on the file. The error it returns is "object of type 'file' has no len()" What you have is a file object, what you need is a list of numbers. You need to read each line of the file, convert it to a number, add that number to a list, then you can sort that list.
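Those steps could look something like this, assuming the input file holds one integer per line (the assignment's actual file format isn't shown). The function names are made up:

```python
def read_numbers(path):
    """Read one integer per line, skipping blank lines."""
    with open(path) as f:
        # int() tolerates the trailing newline on each line.
        return [int(line) for line in f if line.strip()]


def write_numbers(path, numbers):
    """Write one integer per line."""
    with open(path, "w") as f:
        for n in numbers:
            f.write("%d\n" % n)
```

The list returned by `read_numbers` has a `len()`, so it can be handed straight to the sorting function, and the sorted result goes to `write_numbers`.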
|
# ? Apr 16, 2013 02:44 |
|
Ok I added a part to read the file and convert it into a list of integer values but now when I run the program, it runs forever without printing anything or bringing up an error. This is what I have now code:
Qwertyiop25 fucked around with this message at 05:01 on Apr 16, 2013 |
# ? Apr 16, 2013 04:45 |
|
Qwertyiop25 posted:Ok I added a part to read the file and convert it into a list of integer values but now when I run the program, it runs forever without printing anything or bringing up an error. I don't see a break statement in there (but there is a return). Also, you don't use "random_list". Are you using Python 2 or 3? accipter fucked around with this message at 05:07 on Apr 16, 2013 |
# ? Apr 16, 2013 05:03 |
|
accipter posted:Also, you don't use "random_list". Thank you, this was the problem. I changed random_list to list1 everywhere but there. I could have sworn I used the replace all button but I guess not.
|
# ? Apr 16, 2013 05:11 |
|
The reason that it's an infinite loop is that at the end you append each item from radix_list to radix_list again, doubling the size of the list. len_radix_list is not recalculated in the loop, so it becomes very unlikely that your exit condition len(new_list[0]) == len_radix_list will ever be met (without thinking about it too much, I guess only if exactly half your input numbers have at least one fewer digit than the other half). There are a number of issues here, but I would advise you to start by deleting all the lines from random_list = [] down and rethinking what you're trying to do there. It seems like you think the contents of radix_list are being permanently consumed by your loop, which they are not.
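For comparison, here is a sketch of a least-significant-digit radix sort that rebuilds the list on each pass instead of appending to it. It is not the code from the post, and it assumes non-negative integers:

```python
def radix_sort(numbers, base=10):
    """Return a new sorted list of non-negative integers (LSD radix sort)."""
    result = list(numbers)  # work on a copy; the input is left untouched
    if not result:
        return result
    largest = max(result)
    place = 1
    while place <= largest:
        buckets = [[] for _ in range(base)]
        for n in result:
            buckets[(n // place) % base].append(n)
        # Rebuild the list from the buckets rather than appending to it,
        # so its length never changes between passes.
        result = [n for bucket in buckets for n in bucket]
        place *= base
    return result
```

The loop ends once `place` has moved past the highest digit of the largest number, so the exit condition does not depend on the length of any list.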
|
# ? Apr 16, 2013 05:14 |
|
Scaevolus posted:Unless you have a habit of storing outputs of MD5 collision creators, you will never see an MD5 collision on your filesystem. You probably won't see a collision on your filesystem with md5, you mean. You can trivially implement any of the sha-2 hash functions for much better collision resistance, and it's not really costing you anything to do so, so why not? Checking for file sizes will make the probability basically zero and will make the duplicate checking much faster, but there's always that one in a gazillion chance that two unique files with the same file size will also have the same hash...
|
# ? Apr 16, 2013 05:47 |
|
Given that dedian would need to have a folder with about 10 billion billion files in it before he'd have a decent chance of having a single MD5 collision, he'd probably be better off spending his time shielding his computer from cosmic radiation.
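That figure can be sanity-checked with the usual birthday approximation. The sketch below assumes hashes behave like uniformly random values (so it says nothing about deliberately constructed collisions), and the function name is made up:

```python
import math


def files_for_collision(bits, probability=0.5):
    """Approximate number of random hashes needed for a collision.

    Birthday approximation: n ~ sqrt(2 * N * ln(1 / (1 - p))),
    where N = 2**bits is the number of possible hash values.
    """
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - probability)))


md5_files = files_for_collision(128)     # MD5 has a 128-bit digest
sha256_files = files_for_collision(256)  # SHA-256 has a 256-bit digest
```

For a 50% chance with MD5 this comes out at roughly 2 x 10^19 files, the same order of magnitude as "10 billion billion".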
|
# ? Apr 16, 2013 06:14 |
|
Haystack posted:Given that dedian would need to have a folder with about 10 billion billion files in it before he'd have a decent chance of having single MD5 collision, he'd probably be better off spending his time shielding his computer from cosmic radiation Hey I agree with you in that the probability is basically zero and he doesn't need to worry, I just like effortless solutions that teach people new things
|
# ? Apr 16, 2013 06:58 |
|
dedian posted:I wrote a little script this weekend to calculate and store md5 hashes of files in a list of directories I point the script to, for the purpose of finding duplicates. I fully realize there's lots of tools to do this already, and do it better, but I wanted to write something myself as I get more familiar with Python (I keep meaning to go through Learn python the hard way or other tutorials ). Does any of this look like the completely wrong way of doing things? It seems like the for loops don't need to be as nested... somehow? List comprehensions, or.. something? Anyway, just something dumb to poke holes in on a Monday morning (I'm happy at least that it works ) You used 'dir' as the name of a temporary variable to store a directory name. 'dir' is the name of a built-in function in Python (not a reserved keyword, so the code still runs), but using it as a variable name shadows the built-in and is best avoided.
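A quick way to check what kind of name `dir` is, and what shadowing it does. This sketch uses Python 3 module names (`builtins`), and the function `list_names` is made up purely to show the failure:

```python
import builtins
import keyword

print(keyword.iskeyword("dir"))  # reserved word?
print(hasattr(builtins, "dir"))  # built-in name?


def list_names(obj):
    dir = "some/path"  # local variable shadows the built-in dir()
    try:
        return dir(obj)  # now tries to call a string
    except TypeError as exc:
        return exc
```

Inside `list_names` the built-in is unreachable by its usual name, which is the practical reason to pick something like `directory` or `dirname` instead.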
|
# ? Apr 16, 2013 16:31 |
|
dedian posted:Request for feedback.. You define DB_NAME, but never use it to open the database.
|
# ? Apr 16, 2013 19:16 |
|
Aren't SHA-1 and friends actually faster than MD5 though?
|
# ? Apr 16, 2013 19:23 |
|
accipter posted:You define DB_NAME, but never use it to open the database. Thanks, I did notice that after I posted. I've implemented the changes that folks have suggested here (but still working on file size) and those changes have definitely sped things up and made the code easier to read. Thanks to everyone, again! Eventually it'd be nice to use this in a distributed fashion (so I don't think to myself "Do I have duplicate stuff stashed on some random box that I don't know about?") but we'll see. Forcing myself to do some tutorials would probably be a better use of my time at this point
|
# ? Apr 16, 2013 20:53 |
|
OK, yet another unicode question. code:
edit: ah I forgot I can't print those symbols in a code block. '& #664;' == 'ʘ' FoiledAgain fucked around with this message at 02:51 on Apr 17, 2013 |
# ? Apr 17, 2013 02:49 |
|
FoiledAgain posted:Since str(s) does nothing but return s.symbol Python code:
|
# ? Apr 17, 2013 05:04 |
|
My girlfriend is interested in learning more web development skills. Maintaining a web site is a small facet of her job, so it actually does have some applicability to what she's doing. For instance, she knows how to use CSS and javascript. She knows that I've been using Python for a long time, and she has asked me if Python is a useful language to learn for web development purposes. I wasn't sure what to tell her; I've never done any web stuff. Searching around on the web gives links to people talking about how awesome django is as a web framework, but I don't really understand what django does. Is it only useful for creating web applications with forms and the like or can it also just be used to make nice-looking web sites in a relatively easy way?
|
# ? Apr 17, 2013 08:41 |
|
QuarkJets posted:My girlfriend is interested in learning more web development skills. Maintaining a web site is a small facet of her job, so it actually does have some applicability to what she's doing. For instance, she knows how to use CSS and javascript. She knows that I've been using Python for a long time, and she has asked me if Python is a useful language to learn for web development purposes. I wasn't sure what to tell her; I've never done any web stuff. What Django and other frameworks basically do is take HTTP requests and serve up HTML in response. The in-browser presentation part of web programming has almost nothing to do with Django directly; all it really has there is an HTML templating language. It's also pretty heavyweight, so starting out with something like Bottle or Flask (which are micro-frameworks that do less stuff but more simply) might be a better idea.
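To make the "take an HTTP request, serve up HTML" part concrete, here is a bare-bones sketch using only the standard library's wsgiref. It is neither Django nor Flask, just the layer those frameworks sit on top of; the function names and port 8000 are arbitrary choices:

```python
import html
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Tiny WSGI app: take a request, return an HTML page."""
    path = environ.get("PATH_INFO", "/")
    # Escape the path so user-supplied text can't inject markup.
    body = "<html><body><h1>Hello from {}</h1></body></html>".format(
        html.escape(path))
    payload = body.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]


def run(port=8000):
    """Serve the app locally until interrupted."""
    make_server("localhost", port, app).serve_forever()
```

Calling `run()` and visiting http://localhost:8000/about would show the page. A framework adds URL routing, templates, form handling and database access on top of this, but how the page looks is still down to HTML and CSS.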
|
# ? Apr 17, 2013 08:59 |
|
Plorkyeran posted:
It was that easy, was it? OK thanks. A little bit of research found the __unicode__ attribute which I really should have looked for to begin with.
|
# ? Apr 17, 2013 09:52 |
|
QuarkJets posted:My girlfriend is interested in learning more web development skills. Maintaining a web site is a small facet of her job, so it actually does have some applicability to what she's doing. For instance, she knows how to use CSS and javascript. She knows that I've been using Python for a long time, and she has asked me if Python is a useful language to learn for web development purposes. I wasn't sure what to tell her; I've never done any web stuff. "It depends." I'm sure you know this already, but for the sake of clarity: To make nice looking websites, you don't need any languages other than HTML / CSS. To make nice looking database driven websites, you'll need some sort of server language. Python is one of them, and like all the other languages one could use, there are a bunch of MVC frameworks that help you do common tasks like url routing, validating form input, getting data from the DB, displaying data in templates, etc. Two of the biggies in python-land are Flask and Django. They are both good but have different "philosophies". That said, depending on her needs, something like WordPress (shudder) might be better than diving into a development framework.
|
# ? Apr 17, 2013 16:05 |