|
Karthe posted:I'm parsing a CSV and have about 24,000 of these queries I need to run: Does SQLite support views? Because maybe the solution to this is not to do it at all. Also, if you're hitting SQLite's limits (it's not hard, it's just a babby SQL for quick and nifty tasks), maybe upgrade to something more industrial like MySQL or something?
|
# ? May 11, 2013 06:19 |
|
|
duck monster posted:Does SQLite support views? Because maybe the solution to this is not to do it at all. SQLite is surprisingly powerful if you heed its limits: single user and your data set fits into memory. It can be much faster than MySQL (lol at embedding that in an android app!) for the stuff that Karthe wants. Also, Karthe: indices aren't free; they slow down inserting. SQLite can't switch an index off, so you might want to drop the index before the bulk inserts and recreate it afterwards. However, I'm 99% sure your 24000 inserts can be replaced by one properly constructed join using a table based on id1, id2, and w, but you'll need to give some more info about the structure of your data. In addition, you should be aware of sqlite3's parameter handling: "select * from table where id1 = ?" with a query parameter is far more readable (and safer) than constructing a string for each query.
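A minimal sketch of the bulk-insert advice above, using Python's built-in sqlite3 module. The table name `links`, the index name, and the generated rows are assumptions for illustration, since the thread never shows Karthe's actual schema; only the column names id1, id2, and w come from the post.

```python
import sqlite3

# Hypothetical schema: the real tables are not shown in the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE links (id1 INTEGER, id2 INTEGER, w REAL)")

# Stand-in for the ~24,000 rows parsed out of the CSV.
rows = [(i, i + 1, 0.5 * i) for i in range(24000)]

# One transaction and one statement; values are bound through "?" placeholders
# instead of being pasted into the SQL string.
with conn:
    conn.executemany("INSERT INTO links (id1, id2, w) VALUES (?, ?, ?)", rows)

# Build the index once, after the bulk load, rather than updating it per insert.
conn.execute("CREATE INDEX idx_links_id1 ON links (id1)")

count = conn.execute("SELECT COUNT(*) FROM links").fetchone()[0]
match = conn.execute("SELECT id2, w FROM links WHERE id1 = ?", (10,)).fetchone()
print(count, match)
```

Whether this beats a single join depends on the real data layout, which the post says is still unknown.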
|
# ? May 11, 2013 15:58 |
|
I'm trying to delete dictionary entries based on values. I'm receiving "RuntimeError: dictionary changed size during iteration" errors despite making a copy first. Any ideas? Python 3. Python code:
Dominoes fucked around with this message at 21:36 on May 11, 2013 |
# ? May 11, 2013 21:25 |
|
You aren't actually making a copy, just another reference to the same dict. Do something like using the copy module or a dictionary comprehension to move all the items into a new dict.
|
# ? May 11, 2013 21:35 |
|
You aren't making a copy. mydict_copy = mydict means that you have two names pointing at the same object. Dictionaries do have a copy method that you can use to create a copy of them. That said, you don't actually want a copy of the dictionary in this case; you want a separate list of its keys, which you can get with mydict.keys() (on Python 3 that returns a view rather than a list, so use list(mydict.keys())). You can then iterate over that list and do whatever horrible things you must to that poor dictionary, you monster.
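A small sketch of the difference between a second name and a real copy, followed by the key-list approach described above. The dictionary contents are made up for illustration.

```python
mydict = {"a": 0, "b": 1, "c": 0}

alias = mydict            # a second name for the same object, not a copy
snapshot = mydict.copy()  # a separate (shallow) copy

print(alias is mydict)     # True
print(snapshot is mydict)  # False

# list(...) freezes the keys up front, so deleting inside the loop is safe.
for key in list(mydict.keys()):
    if mydict[key] == 0:
        del mydict[key]

# The alias sees the deletions; the copy does not.
print(mydict, alias, snapshot)
```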
|
# ? May 11, 2013 21:39 |
|
breaks posted:You aren't making a copy. mydict_copy = mydict means that you have two names pointing at the same object. OnceIWasAnOstrich posted:You aren't actually making a copy, just another reference to the same dict. Thanks. Replacing the first line with the following worked. code:
Python code:
Dominoes fucked around with this message at 22:13 on May 11, 2013 |
# ? May 11, 2013 22:07 |
|
Dominoes posted:Thanks. Replacing the first line with the following worked. Because you don't want to be modifying an item as you iterate over it like that, it's called a concurrent modification. Instead, you generate a list of keys which will not be modified and iterate over that then delete any entry whose value for key is whatever you're looking for in the actual dictionary. Hope that makes sense.
|
# ? May 11, 2013 22:47 |
|
The problem is that you are modifying the object you are iterating over. To do what you want to do, delete entries from a dictionary while iterating over it, you'll need to make use of the .keys(), .values(), or .items() methods. In Python 2 these methods create new lists containing keys, values, and key-value tuples; in Python 3 they return live views of the dictionary, so wrap the call in list() to get a separate list. So if you do something like: Python code:
I encourage you to carefully read this: http://docs.python.org/2/library/stdtypes.html#typesmapping Take note of the iterkeys, itervalues and iteritems methods. When might you use them instead of keys, values, and items?
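The snippet from this post is not preserved, so the following is only a sketch of the idea as described: it reproduces the error by deleting while iterating over the dict itself, then avoids it by iterating over a list snapshot of the items. The function names and sample data are hypothetical.

```python
def delete_zeros_unsafely(d):
    """Deletes while iterating over the dict itself; this raises RuntimeError."""
    for key in d:
        if d[key] == 0:
            del d[key]


def delete_zeros_safely(d):
    """Iterates over a list snapshot of the items, so deleting from d is fine."""
    for key, value in list(d.items()):
        if value == 0:
            del d[key]


scores = {"a": 0, "b": 1, "c": 0}

try:
    delete_zeros_unsafely(dict(scores))  # run on a throwaway copy
    error = None
except RuntimeError as exc:
    error = str(exc)

delete_zeros_safely(scores)
print(error)
print(scores)
```

On Python 2 the `list(...)` call is redundant but harmless, because `.items()` already returns a list there.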
|
# ? May 11, 2013 22:50 |
|
You might as well just write mydict = {k:v for k,v in mydict.iteritems() if v.objectvalue != 0} (that's .items() instead of .iteritems() on Python 3) and avoid this whole business of copying and deleting.
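A runnable version of that one-liner, as a sketch. The `Entry` class is a hypothetical stand-in for whatever objects the original dictionary holds; only the attribute name `objectvalue` comes from the post.

```python
class Entry:
    """Hypothetical stand-in for the objects stored in the dictionary."""

    def __init__(self, objectvalue):
        self.objectvalue = objectvalue


mydict = {"a": Entry(0), "b": Entry(3), "c": Entry(0), "d": Entry(7)}

# Build a new dict holding only the entries to keep, then rebind the name.
# .items() works on Python 2.7 and 3; .iteritems() exists only on Python 2.
mydict = {k: v for k, v in mydict.items() if v.objectvalue != 0}

remaining = sorted(mydict)
print(remaining)
```

Note that this rebinds `mydict` to a new object, so any other name still pointing at the old dictionary keeps seeing the unfiltered entries.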
|
# ? May 11, 2013 22:53 |
|
Nippashish posted:You might as well just write mydict = {k:v for k,v in mydict.iteritems() if v.objectvalue != 0} and avoid this whole business of copying and deleting. Seconded. Deleting stuff from a dictionary in python is sort of weird unless you have memory constraints keeping you from making a copy.
|
# ? May 11, 2013 22:57 |
|
Dren posted:I encourage you to carefully read this: http://docs.python.org/2/library/stdtypes.html#typesmapping Nippashish posted:You might as well just write mydict = {k:v for k,v in mydict.iteritems() if v.objectvalue != 0} and avoid this whole business of copying and deleting. Qt issue: Anyone know how to pull data from input text and combo boxes? I've been trying to solve this one for weeks with no luck, and it's stopped development of the program. I can't find anything in a search that describes the problem, and most of what I find about text and combo boxes implies I'm doing it correctly. For text boxes, I receive a None result, and for combo boxes I receive the default selection, no matter what the current one is. Code: code:
Dominoes fucked around with this message at 16:20 on May 12, 2013 |
# ? May 11, 2013 23:14 |
|
Malcolm XML posted:SQLite is surprisingly powerful if you heed its limits: single user and your data set fits into memory. It can be much faster than MySQL (lol at embedding that in an android app!) for the stuff that Karthe wants. Oh don't get me wrong, I adore SQLite. I use it more often than MySQL or PostgreSQL (which I've made peace with, although I still prefer MySQL's brute force), at least in development. There's a reason it's the most common database on the planet. But it *has* got limitations that become painfully obvious when you try and force too much data into it (i.e. my attempt at fitting a large astronomy dataset into it, which promptly set the whole drat thing on fire). However, if used within its limits, it's continuously surprising just how powerful it is.
|
# ? May 13, 2013 01:57 |
|
Hey y'all, I've got a stylistic/speed question regarding importing your own classes and modules. I'm working on a project that includes a class (let's call it class A). In order to make the rest of the file more readable and easy to edit, I want to tear this off into another file A.py. So I get how to do that and import it into main.py. However, I am trying to determine the "best" practice of importing modules (like say numpy) into both files. Right now, I'm doing the standard import numpy as np in both files. PEP8's guide seems to suggest that this is the way to do things. However, it seems to me that this may cause the program to go slower, and may cause unforeseen issues. Wouldn't it be better to import numpy once and have a way to tell the imported class A to use the imports? OR is it better in the long run to have class A always have the needed imports in case I need to use class A for anything else? Isn't this technically keeping 2 numpys open when the program runs? Eventually, this could get into N versions of numpy where N is the number of files I break this up into. Sorry if this was a little convoluted. I'm trying to determine what's going on.
|
# ? May 13, 2013 16:17 |
|
From a unit testing perspective, each file should contain all the imports it needs to test the classes defined in that file. Anything else is going to be a trainwreck once you start actually writing test cases.
|
# ? May 13, 2013 16:19 |
|
Even if you import numpy twice, it's still just happening twice while loading your program. This shouldn't have any noticeable speed impact. It's not like you're reimporting numpy over and over in a loop.
|
# ? May 13, 2013 16:42 |
|
BeefofAges posted:Even if you import numpy twice, it's still just happening twice while loading your program. This shouldn't have any noticeable speed impact. It's not like you're reimporting numpy over and over in a loop. Correct me if I'm wrong, but the second time aren't you just adding a name to the namespace of the module you're in at the time? That's essentially no work.
|
# ? May 13, 2013 16:49 |
|
Hammerite posted:Correct me if I'm wrong, but the second time aren't you just adding a name to the namespace of the module you're in at the time? That's essentially no work. That's kinda what I'm wondering. If main.py looks like: Python code:
Python code:
is it making a numpy isolated into each file (like say a main.np and a classA.np)? If this program gets big, there could be a lot of Numpys!
|
# ? May 13, 2013 17:01 |
|
There's only ever one instance of each module.
|
# ? May 13, 2013 17:13 |
|
If you're worried about optimization you're infinitely better off spending your time using an actual profiler.
|
# ? May 13, 2013 17:18 |
|
Modules are cached in sys.modules.
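A quick way to see that cache at work, as a sketch. It uses the standard-library json module instead of numpy so it runs anywhere; the same check applies to numpy imported from several files.

```python
import sys

import json           # first import: the module is loaded and cached
import json as js2    # second import: only binds another name to the cached module

same_object = js2 is json
cached = sys.modules["json"] is json

print(same_object, cached)
```

So an `import numpy as np` line in both main.py and A.py gives each file its own name `np`, but both names refer to the one module object held in `sys.modules`.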
|
# ? May 13, 2013 17:18 |
|
OK, great, thanks guys. So it *is* just extending the namespace then, not loading a separate instance. Haystack posted:If you're worried about optimization you're infinitely better off spending your time using an actual profiler. I'm sorry, could you clarify this?
|
# ? May 13, 2013 17:41 |
|
Sorry, I was a little more curt than I should have been. It's often more trouble than it's worth to try to optimize your code as you develop it. Developers often find that they get a lot more mileage simply developing without worrying about fiddly optimization. Towards the end, they run their code through a profiler and directly see which areas are actually bottlenecks in their code. As it so happens, Python ships with an excellent profiling tool called cProfile. You use it from the command line like this: code:
That said, there's nothing wrong with trying to learn the best practices before you get in too deep with your codebase, so your original question is perfectly valid.
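A minimal sketch of cProfile in use. From the command line the usual form is `python -m cProfile -s cumulative yourscript.py`, where `yourscript.py` is a placeholder; the example below drives the same profiler from inside a script. The function `slow_sum` is made up purely to give the profiler something to measure.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive loop so there is something to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


profiler = cProfile.Profile()
profiler.enable()
result = slow_sum(200000)
profiler.disable()

# Collect the report into a string, sorted by cumulative time, top 5 entries.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

The report lists call counts and time per function, which is usually enough to spot where a program actually spends its time.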
|
# ? May 13, 2013 18:09 |
|
Haystack posted:Optimization I think one of the reasons new programmers get caught in this trap is that they often see discussions online where people talk about things like whether it's more efficient to use {} or dict() to instantiate an empty dictionary. Or maybe they see a didactic code snippet with a comment that explains, oh, we're doing this in a strange way because it's 20% faster. They start to think that's something every developer thinks about constantly as they write their code. What you don't realize, JetsGuy, is that these little bits and pieces of knowledge are part of a large bank of knowledge you accrete over years and years of practice. So maybe at one point you were debugging a tight inner loop and you found that using dict() was responsible for a slowdown that caused stutter in your UI and {} fixed it up. You might mention that in another code fragment that uses {} in a loop that could represent a slowdown, but it doesn't mean that every time you write a loop you pore over every single instruction you've written in every loop to see if there's a faster alternative. So yes, people talk about these things, and once you learn about the performance implications of {} and dict() you can choose which one to use every time and it'll be faster, but these kinds of optimizations are things you should learn as a result of writing lots of code, not something you should figure out before writing any code at all. Write code that does what you need it to do in the most straightforward way possible. If it's too slow for your purposes or taste, only then should you go back and intentionally look for optimizations. Over time, after doing this many times over, you'll write more efficient code naturally.
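For anyone curious about the {} versus dict() example, a sketch of how to measure it yourself with the standard timeit module rather than taking a number on faith. The iteration count is arbitrary, and the timings will vary by machine and Python version.

```python
import timeit

# Time 200,000 creations of an empty dict, once per spelling.
literal_time = timeit.timeit("{}", number=200000)
call_time = timeit.timeit("dict()", number=200000)

print("literal {}: %.4f s" % literal_time)
print("dict()    : %.4f s" % call_time)
```

Either way the difference per call is tiny, which is the point of the post: it only matters if a profiler says that line is the bottleneck.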
|
# ? May 13, 2013 18:29 |
|
Thanks so much guys, you pretty much nailed what my mindset was. I'll try to just write what works *first*.
|
# ? May 13, 2013 19:52 |
|
First make it work, then make it fast (if you need to), then make it pretty (if you need to).
|
# ? May 14, 2013 03:46 |
|
BeefofAges posted:First make it work, then make it fast (if you need to), then make it pretty (if you need to). If by "pretty" you mean "readable" then shouldn't that be part of "make it work?" Most scientific programming is done in the style of "I'm going to get this to work, I don't care if it's fast or readable", and it's actually a huge problem when a change to the code needs to be made but the entire house of cards falls apart because the code has turned into a black box and no one knows what makes it work.
|
# ? May 14, 2013 08:09 |
|
QuarkJets posted:If by "pretty" you mean "readable" then shouldn't that be part of "make it work?" Most scientific programming is done in the style of "I'm going to get this to work, I don't care if it's fast or readable", and it's actually a huge problem when a change to the code needs to be made but the entire house of cards falls apart because the code has turned into a black box and no one knows what makes it work I agree with you here, but that's just because I do scientific programming (as you apparently remember). I have in the last few years really made an effort to make my code easier to read and use for the next grad/postdoc/prof who may want to use the code. In the past, it was a long line of procedural code which could fall apart easily if someone didn't really get the code. Now, I try to put all the "guts" into classes and methods at the top, and clearly delineate each piece of the code. It not only helps for the (rare) times I'll need to create a new class and use inheritance, but it also makes editing the code a lot easier, both for me and for future users who want to customize it. Not ideal, I know, but a LOT better than the majority of the scientific code I read, which is largely comment-less code with a ridiculous methodology. I can't tell you how many times I've tried to read a colleague's code and just cried bitter tears. Of course, that was more because it was the she-bitch IDL.
|
# ? May 14, 2013 14:59 |
|
Nah, I always aim for readable. I'll take readability over performance most of the time. I meant more along the lines of code where you read it and say 'wow, that's elegant'. Speaking of scientific programming, most of the scientists I know either have never heard of version control, or have heard of it but never tried it. I think we really need to start teaching scientists software engineering. We can probably stop this derail and talk about Python though. There's a nice 'common misconceptions' thread going on over on reddit: http://www.reddit.com/r/Python/comments/1e8xw5/common_misconceptions_in_python/
|
# ? May 14, 2013 16:11 |
|
BeefofAges posted:Speaking of scientific programming, most of the scientists I know either have never heard of version control, or have heard of it but never tried it. I think we really need to start teaching scientists software engineering. Yeah, in the coding horrors thread, I shocked a bunch of people a few weeks ago talking about this. Some of the younger scientists use version control, but yeah, largely it's not really used. The exception is if you're part of a huge collaboration (e.g. LIGO) where a good VCS is needed. I've been trying to start getting into git. Largely right now I'm just using it for my plotter. I wish setting up the gui was a little more straightforward (it's not working on my system). I also am currently having difficulty figuring out how to properly visualize changes and such. The gui version would (hopefully) be better about that. Git looks great though, aside from my being a newbie.
|
# ? May 14, 2013 16:49 |
|
My old labmate would put data and scripts into version control, so he knew exactly what data and what computation was used to make exactly which plots. I never took it that far; I use poor man's version control (dated folders) for data and scripts, and just put the library-ish stuff in git. But yes, unit tests and vc are two things that are sadly not pushed hard enough in scientific computing.
|
# ? May 14, 2013 16:58 |
|
Emacs Headroom posted:I use poor man's version control (dated folders) for data and scripts This is what I do. I have a "tools" folder that has the "recent" version. Then I have a folder with the previous versions that are dated. But yeah, I wanna use git, but it seems like git has a bit of a learning curve in visualizing the changes.
|
# ? May 14, 2013 17:00 |
|
What do you mean when you say "visualizing the changes"? Do you want diffs between versions? Does this do what you want? code:
|
# ? May 14, 2013 17:05 |
|
Dren posted:What do you mean when you say "visualizing the changes"? Do you want diffs between versions? Yes and no. Right now, I've only been committing changes to the master, so this is ok. However, when I start determining a branch I can't see this being useful I guess? I just downloaded SmartGit, and it may be more what I want... I'll update.
|
# ? May 14, 2013 17:12 |
|
One more non-Python post can't hurt: Use gitk. Always. Use the --all argument if you want to see all branches, and it's virtually always a good idea to run it in the background with &. I cannot get any work done without running gitk --all & and periodically refreshing the window after I make/push some commits.
|
# ? May 14, 2013 17:16 |
|
JetsGuy posted:Yes and no. Right now, I've only been committing changes to the master, so this is ok. However, when I start determining a branch I can't see this being useful I guess? Typically, git log --name-status should be enough. I'm not sure how you envision a branch workflow going but the typical pattern is that you want to go work on some feature so you branch. Then you go work on that feature for a while and all your work happens in the branch. When you're done you merge the changes back into master and delete the branch. Once you merge back, doing git log --name-status on master will give you the full history of everything, including what happened on the branch. To bring us back to Python, I liked the snippets in that reddit thread for iterating over a list using a window: Python code:
and Python code:
code:
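The snippets referred to above are not preserved here, so this is only a sketch of one common way to iterate over a list with a window, not necessarily what the reddit thread showed. The function names are made up, and both functions assume a list (or other sequence) rather than a one-shot iterator.

```python
from itertools import islice


def pairwise_window(items):
    """Window of size 2: zip the list against itself shifted by one."""
    return list(zip(items, items[1:]))


def sliding_window(items, size):
    """Window of any size: zip together `size` views, each starting one later."""
    iterators = [islice(items, start, None) for start in range(size)]
    return list(zip(*iterators))


print(pairwise_window([1, 2, 3, 4]))
print(sliding_window([1, 2, 3, 4, 5], 3))
```

Because zip stops at the shortest input, a list shorter than the window simply yields no windows.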
|
# ? May 14, 2013 17:55 |
|
JetsGuy posted:Yes and no. Right now, I've only been committing changes to the master, so this is ok. However, when I start determining a branch I can't see this being useful I guess? You might not have come upon a case where branching would be terribly useful, especially if you're working by yourself. Even if you're just working in the master branch forever, that's still a big improvement over not using version control at all so long as you make regular commits
|
# ? May 14, 2013 19:13 |
|
Is it considered poor form to use the fact that loop variables are still set after the loop? I have to repeatedly loop over elements of a dictionary and unset an element at each iteration, but can't unset in the loop because that's not allowed. So I came up with Python code:
|
# ? May 14, 2013 21:25 |
|
Hammerite posted:Is it considered poor form to use the fact that loop variables are still set after the loop? Python code:
|
# ? May 14, 2013 21:42 |
|
Misogynist posted:Why are you using an inner loop when each run through the loop will execute 0 or 1 times? At each step in the process, at least one of the elements of the dictionary should satisfy the if clause (otherwise the dictionary shall be considered badly-formed by definition). However, I don't know which one(s).
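Since the code in question is not preserved, here is a guess at the shape of the problem, strictly as a sketch: repeatedly find some entry that satisfies a condition, stop looking, and delete it once the loop is over. The dependency-style data, the condition, and the name `removal_order` are all assumptions made for illustration. It relies on the loop variable staying bound after `break`, and uses `for ... else` to catch the badly-formed case.

```python
def removal_order(deps):
    """Remove entries one at a time; an entry is removable once none of its
    prerequisites are still in the dictionary (hypothetical condition)."""
    remaining = dict(deps)  # work on a copy so the caller's dict is untouched
    order = []
    while remaining:
        for name, prereqs in remaining.items():
            if not any(p in remaining for p in prereqs):
                break  # `name` stays bound after the loop ends
        else:
            # The loop finished without break: nothing satisfied the condition.
            raise ValueError("badly-formed dictionary: no removable entry")
        del remaining[name]  # safe here, the iteration is already over
        order.append(name)
    return order


print(removal_order({"c": {"a", "b"}, "b": {"a"}, "a": set()}))
```

The `else` branch makes the "badly-formed by definition" case an explicit error instead of silently reusing a stale loop variable.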
|
# ? May 14, 2013 21:51 |
|
|
This seems like a job for a heapsort.
|
# ? May 14, 2013 22:22 |