|
ShadowHawk posted: Very briefly, your original implementation did this: Simply for my own self-development at this point, I have been looking at some implementations of hashmaps. Is it fair to say that the most basic implementations are exactly as efficient as my original linear travel through a list - and by extension, some implementations have some voodoo that allows the recovery of key/values with a lot less travel? Say I wanted to educate myself: is there another term I should be adding to my Google search other than 'hash map/table implementations'? Most of what I come up with seems pretty linear and similar to what was causing my initial performance issues.
|
# ? Oct 2, 2014 19:50 |
|
The RECAPITATOR posted:I have been looking at some implementations of hashmaps. Is it fair to say that the most basic implementations are exactly as efficient as my original linear travel through a list - and by extension, some implementations have some voodoo that allows the recovery of key/values with a lot less travel? The magic voodoo you're talking about is the whole point of a hashmap. The wiki page ( http://en.wikipedia.org/wiki/Hash_table ) is a pretty good overview if you can avoid losing the forest for the trees.
|
# ? Oct 2, 2014 20:05 |
|
The RECAPITATOR posted:Is it fair to say that the most basic implementations are exactly as efficient as my original linear travel through a list - and by extension, some implementations have some voodoo that allows the recovery of key/values with a lot less travel? Dynamic hash tables (or more generally, when the keys aren't known ahead of time in constructing the hash function) have worst-case performance that's the same as fully traversing a list. These are rare situations, though, and only happen when every item is assigned the same hash value. Looking up by a key is usually constant-time. Of course, if you know the hash function that's used, it isn't very difficult to trigger this worst-case behavior by creating a lot of keys that perfectly collide. This is why Python now randomizes the hash of strings every time the interpreter is started.
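To make the average case and the worst case concrete, here is a minimal sketch of a hash map that uses separate chaining. It is a teaching toy, not how `dict` is implemented: the class name `ChainedHashMap`, the fixed bucket count, and the `hash_func` parameter are all assumptions made for illustration, and there is no resizing. With a reasonable hash function each lookup only scans one short bucket; with a hash function that returns the same value for every key, everything lands in one bucket and lookup degrades to the linear list scan described above.

```python
class ChainedHashMap(object):
    """Toy hash map: a fixed list of buckets, each bucket a list of (key, value) pairs."""

    def __init__(self, bucket_count=8, hash_func=hash):
        self._buckets = [[] for _ in range(bucket_count)]
        self._hash = hash_func  # swappable so a bad hash can be demonstrated

    def _bucket(self, key):
        # The hash picks one bucket; only that bucket is ever scanned.
        return self._buckets[self._hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for index, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[index] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        raise KeyError(key)

    def longest_chain(self):
        # Worst-case number of entries a single lookup might have to scan.
        return max(len(bucket) for bucket in self._buckets)


well_spread = ChainedHashMap(bucket_count=64)
all_collide = ChainedHashMap(bucket_count=64, hash_func=lambda key: 0)
for number in range(64):
    well_spread.put(number, str(number))
    all_collide.put(number, str(number))

print(well_spread.longest_chain())  # short chains: lookups stay cheap
print(all_collide.longest_chain())  # one long chain: lookups are a linear scan
```

Real implementations also grow the bucket array as items are added so chains stay short, which is what keeps lookups close to constant time on average.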
|
# ? Oct 2, 2014 21:24 |
|
Lysidas posted:If you're having trouble porting old code that doesn't play nicely with the distinction between bytes and text, it was already broken and you didn't know it. "Whoops, it used to work fine but now someone dared to spell their name correctly and where the hell is this UnicodeDecodeError coming from" nope i know python 3 devs want to believe this is true universally, but over here in this land of posix apis and network protocols, we were doing ok
|
# ? Oct 2, 2014 22:00 |
|
I need to scrape some additional product details - but am having trouble extracting the data into a format where it would be clean enough to use in a spreadsheet.code:
code:
Is there a way to extract this data such that the information would be cleanly exported into a row -- something like this: I tried using Trim as well as a regex to remove whitespace, but I don't understand it well enough to be able to accomplish what I am looking to do. It doesn't appear to just be spaces or tabs, but also carriage returns or something.
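On the whitespace part of the question, one possible approach (a sketch in Python 3, with the hypothetical helper names `clean_text` and `to_csv_row`) is to skip the regex entirely: `str.split()` with no arguments splits on any run of whitespace, including tabs, carriage returns, and newlines, so joining the pieces back with a single space flattens each field onto one line. The sample strings below are made up, not taken from the page being scraped.

```python
import csv
import io


def clean_text(raw):
    """Collapse every run of whitespace (spaces, tabs, \\r, \\n) into one space."""
    return " ".join(raw.split())


def to_csv_row(fields):
    """Return one CSV-formatted line built from the cleaned fields."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow([clean_text(field) for field in fields])
    return buffer.getvalue()


# Made-up scraped fragments with the kind of stray whitespace described above.
scraped = ["  Weight:\r\n\t 12 lbs", "Color:\t Red \n"]
print(to_csv_row(scraped))
```

Writing through the `csv` module rather than joining on commas by hand means any field that itself contains a comma or quote is escaped correctly for a spreadsheet.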
|
# ? Oct 2, 2014 22:28 |
|
Shmoogy posted:I need to scrape some additional product details - but am having trouble extracting the data into a format where it would be clean enough to use in a spreadsheet. I would highly recommend that you consider using lxml.html instead of BeautifulSoup for scraping information. Python code:
code:
|
# ? Oct 2, 2014 23:55 |
|
I know we already went with this python2 vs python3 debate before in YOSPOS but I just want to add my anecdote. I've been working with this considerably big system made in 2004 python2 and a lot of problems stem from the way python2 deals with encoding. I'm pretty sure the guy at some point gave up trying to deal with it and just made all the databases be in SQL_ASCII. I sincerely believe this wouldn't have happened if it was python3.
|
# ? Oct 3, 2014 00:32 |
|
accipter posted:I would highly recommend that you consider using lxml.html instead of BeautifulSoup for scraping information. This is excellent, but it makes me feel quite stupid: I thought I understood most of it but could not get it to work on a different site (Wayfair). Do my comments in the code look correct? code:
code:
Shmoogy fucked around with this message at 03:49 on Oct 3, 2014 |
# ? Oct 3, 2014 03:46 |
|
I don't have time to fully answer your question, but here are a few comments:
Good luck.
|
# ? Oct 3, 2014 05:12 |
|
I need to generate a string consisting of non-ASCII UTF-8 characters. I want to then create files with these strings as filenames. I haven't got a clue how to generate a string of random non-ASCII UTF-8 characters, and my googling is yielding very little. Any help would be greatly appreciated. Thanks
|
# ? Oct 3, 2014 19:44 |
|
Naive way: generate a bunch of random integers in between 128 and 0x1f000 or so, then call chr on each number to get the corresponding character. You can restrict the range if you want, or tweak the random distribution to usually get characters in certain blocks, or anything else you'd like, but that's the general idea/framework that I'd use. (No code tags because vBulletin double-escapes things) quote:>>> from random import randrange (Pedantic note: there is no such thing as a UTF-8 character. There are Unicode characters, and UTF-8 is one of the ways that you can encode those characters to byte sequences.)
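A minimal Python 3 sketch of that naive approach follows. The function name `random_non_ascii` and its default bounds are illustrative assumptions, and skipping the surrogate range (U+D800 to U+DFFF) is an addition to the idea above: those code points cannot be encoded as UTF-8, so a string containing one would fail as soon as it is encoded.

```python
from random import randrange


def random_non_ascii(length, low=128, high=0x1F000):
    """Return `length` random characters with code points in [low, high)."""
    chars = []
    while len(chars) < length:
        code_point = randrange(low, high)
        if 0xD800 <= code_point <= 0xDFFF:
            continue  # surrogates are not encodable as UTF-8, so draw again
        chars.append(chr(code_point))
    return "".join(chars)


name = random_non_ascii(12)
print(repr(name))
print(name.encode("utf-8"))  # the UTF-8 byte sequence for those characters
```

Many code points in that range are unassigned or are control characters, so most fonts will show boxes for a good share of the output; narrowing `low` and `high` to a specific block gives more readable names.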
|
# ? Oct 3, 2014 20:06 |
|
loving awesome Lysidas. Thanks. I've been banging my head on the wall all morning over this. e: when I try print u'\u1D7AA' in the interpreter, it gives me ᵺA and not the mathematical symbol. What's up with that?
|
# ? Oct 3, 2014 20:13 |
|
'\u1D7AA' is a two-character string consisting of the following:code:
code:
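The underlying rule is that `\u` takes exactly four hex digits, while `\U` takes exactly eight. A short sketch that shows the difference (written for Python 3; the variable names are just for illustration):

```python
import unicodedata

short_escape = u"\u1D7AA"     # \u reads four hex digits: U+1D7A, then a literal "A"
long_escape = u"\U0001D7AA"   # \U reads eight hex digits: the single code point U+1D7AA

for label, text in (("\\u1D7AA", short_escape), ("\\U0001D7AA", long_escape)):
    print(label, "has", len(text), "character(s):")
    for ch in text:
        # The second argument is a fallback for code points without a name.
        print("   U+%04X" % ord(ch), unicodedata.name(ch, "<unnamed>"))
```

So the escape to use for anything above U+FFFF is the eight-digit `\U` form, zero-padded on the left.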
|
# ? Oct 3, 2014 20:19 |
|
I've just been reading this: http://www.sttmedia.com/unicode-basiclingualplane Do you think it's worth testing anything above the basic multilingual plane when files are being created by a bunch of regular shmoes with windows boxes in an unremarkable office environment? It seems to me that anything above the BMP is going to be some specialized academic stuff.
|
# ? Oct 3, 2014 20:45 |
|
Are there any illegal unicode characters for filenames you should test for? I know on Windows particularly strange things happen if you make a window title start with two unicode right-to-left control characters. I imagine funny things might happen if you name a file that and then a program naively opens that file and uses it as the window title.
|
# ? Oct 3, 2014 20:52 |
|
Cultural Imperial posted:Do you think it's worth testing anything above the basic multilingual plane In my opinion, always. Whether you care about failures is a different issue, but you should at least know how things will behave.
|
# ? Oct 3, 2014 20:54 |
|
Cultural Imperial posted:I've just been reading this: http://www.sttmedia.com/unicode-basiclingualplane You might not hit them in this use case, but there are emojis outside the BMP that are in common use due to support on social networks.
|
# ? Oct 4, 2014 19:19 |
|
So yesterday I was running this script against a windows filesystem that was mounted to my Unix workstation. The script generated what I would guess are illegal Unicode characters in ntfs and would crap out. I stuck in a pass statement to keep the script going. I think, given the limitations of ntfs, it's probably going to be a waste of time to test above the BMP.
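As an alternative to a bare pass, a sketch along these lines records which names failed and why, so the run keeps going but the failures are not lost. The helper name `try_create` is hypothetical, and which exceptions a given filesystem raises is an assumption: `OSError` covers rejections by the OS or filesystem, and `ValueError` covers names Python itself refuses to encode or pass through.

```python
import os
import tempfile


def try_create(directory, names):
    """Try to create one small file per name; return (created, failed) lists."""
    created, failed = [], []
    for name in names:
        path = os.path.join(directory, name)
        try:
            with open(path, "w", encoding="utf-8") as handle:
                handle.write("test\n")
        except (OSError, ValueError) as exc:
            failed.append((name, exc))  # keep the reason instead of discarding it
        else:
            created.append(name)
    return created, failed


with tempfile.TemporaryDirectory() as scratch:
    candidates = ["plain.txt", u"caf\u00e9.txt", u"\U0001F600.txt"]
    created, failed = try_create(scratch, candidates)
    print("created:", [ascii(name) for name in created])
    for name, exc in failed:
        print("failed:", ascii(name), "->", exc)
```

Pointing `directory` at the mounted share instead of a temporary directory would show which names that particular filesystem rejects.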
|
# ? Oct 4, 2014 19:39 |
|
I am trying to use PyCharm with Git for version control. My setup is as follows. I have a Dropbox folder that I am using as my 'remote repository'. I want to check out the code on my work computer (to a folder on the C drive on my work computer), work on the code and commit changes to the local copy of the code. When I am done for the day I would like to push the changes made to the dropbox folder so that I can work on the code from home (check out the code to my home computer). I can successfully pull the code from the Dropbox folder to my local computer. I can successfully make changes to the code and make commits to the local version of the code. However when I try and push the commits to the Dropbox folder I run into problems and the following error message is displayed: I don't know what I am doing wrong (and I don't have a huge understanding of Git) and was hoping someone could help me out.
|
# ? Oct 5, 2014 03:33 |
|
Jose Cuervo posted:I don't know what I am doing wrong (and I don't have a huge understanding of Git) and was hoping someone could help me out. I don't know if this is actually related to your problem, but don't keep git repos in Dropbox. It can corrupt your repos and just isn't the correct model. Put your repo up somewhere accessible like github or bitbucket and use that as your remote.
|
# ? Oct 5, 2014 04:28 |
|
Having Apple bludgeoning my Python module directories to death (and homebrew repos.. seriously wtf apple?) everytime I upgrade my os is getting pretty loving old.
|
# ? Oct 5, 2014 05:02 |
|
I'm following a YouTube tutorial and trying to learn objects and classes. I am using Python 2.7; I believe the tutorial is using Python 3. I am executing this code: code:
code:
|
# ? Oct 5, 2014 05:26 |
|
edit: nevermind. I think this is weird code. Maybe it's not in context with your tutorial, but as presented I don't like it. Thermopyle fucked around with this message at 05:44 on Oct 5, 2014 |
# ? Oct 5, 2014 05:39 |
|
Shouldn't Person.population, just under __init__ be self.population?
|
# ? Oct 5, 2014 05:47 |
|
They're using Person.population as a static variable held by the Person class, and "totalPop" as a static function. I don't like it either but I can't think of a better way to do this right now, hm. Edit: I think you can decorate "totalPop" with @staticmethod and it'd work, but don't quote me on that.
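For reference, a sketch of what the `@staticmethod` version might look like. The tutorial's code isn't reproduced here, so the method name `total_pop` and the `name` attribute are assumptions. The relevant difference between versions: in Python 2, a function defined in a class body without `self` and called as `Person.total_pop()` raises a `TypeError` about an unbound method, while Python 3 treats it as a plain function. With the decorator the call works under both.

```python
class Person(object):
    population = 0  # class attribute: one counter shared by every instance

    def __init__(self, name):
        self.name = name
        Person.population += 1  # update the shared counter, not a per-instance copy

    @staticmethod
    def total_pop():
        # No self or cls argument: callable on the class in Python 2.7 and 3.
        return Person.population


alice = Person("Alice")
bob = Person("Bob")
print(Person.total_pop())
```

Writing `self.population += 1` in `__init__` instead would create a separate `population` attribute on each instance and leave the class-level counter at zero.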
|
# ? Oct 5, 2014 05:51 |
|
duck monster posted:Having Apple bludgeoning my Python module directories to death (and homebrew repos.. seriously wtf apple?) everytime I upgrade my os is getting pretty loving old. Are you using virtualenvs?
|
# ? Oct 5, 2014 06:53 |
|
Hughmoris posted:code Ok, I'm back at my PC. You're using Python 2.7 to run Python 3 code and that code just won't work under Python 2.7. I recommend either switching to Python 3 (best option), or finding a different tutorial. There's things you can do to make that code work (look into using the classmethod decorator), but that's probably above your head at the moment.
|
# ? Oct 5, 2014 17:18 |
|
Really, it's just a badly designed class. A Person class should be about a person. Population is a separate concern, and should be represented by separate code. E.g.Python code:
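A minimal sketch of that separation, assuming hypothetical names (`Population`, `add`, `total`) rather than anything from the tutorial: `Person` knows only about one person, and a second class owns the collection and the count.

```python
class Person(object):
    """Describes a single person and nothing else."""

    def __init__(self, name):
        self.name = name


class Population(object):
    """Owns the collection of people; the count is derived from it."""

    def __init__(self):
        self.people = []

    def add(self, person):
        self.people.append(person)

    def total(self):
        return len(self.people)


town = Population()
town.add(Person("Alice"))
town.add(Person("Bob"))
print(town.total())
```

One practical upside is that two populations can exist side by side without sharing a counter, which a class-level `Person.population` cannot do.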
|
# ? Oct 5, 2014 17:43 |
|
Jewel posted:Edit: I think you can decorate "totalPop" with @staticmethod and it'd work, but don't quote me on that. The code structure issue dudes are talking about is because you're combining characteristics of a set of instances with the class that defines the instance. It's better to keep the set of instances as a list and leave the class for the instance only. An example of the @classmethod thing Thermopyle mentioned would be a method like this to create a new person: Python code:
Python code:
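A sketch of the `@classmethod` idea as an alternate constructor. The method name `from_string` and the "name, age" input format are made up for illustration. The first argument `cls` is the class the method was called on, so subclasses get instances of themselves back.

```python
class Person(object):
    def __init__(self, name, age):
        self.name = name
        self.age = age

    @classmethod
    def from_string(cls, text):
        """Build a person from text like 'Ada, 36' (hypothetical format)."""
        name, age = text.split(",")
        return cls(name.strip(), int(age))  # cls, not Person, so subclasses work


class Student(Person):
    pass


ada = Person.from_string("Ada, 36")
print(ada.name, ada.age)
print(type(Student.from_string("Bob, 20")).__name__)
```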
Dominoes fucked around with this message at 19:07 on Oct 5, 2014 |
# ? Oct 5, 2014 18:46 |
|
I am not much of a coder, but I'm trying to hack together a Python script that parses a JSON file and sends a SQL API call to the online PostGIS database CartoDB. Python code:
code:
Edit: Instead of building a list and then converting to a tuple, I tried building a string piece by piece for 'values' using the repr() of each value and the str() representation when I pass the SQL function. This seems to work, but is it the best way to do it? Python code:
Tigren fucked around with this message at 02:56 on Oct 6, 2014 |
# ? Oct 5, 2014 21:37 |
|
Thanks for the replies on the People and Population example. I'll look those over and try and find a better tutorial for objects in Python 2.7
|
# ? Oct 6, 2014 03:06 |
|
Hughmoris posted:Thanks for the replies on the People and Population example. I'll look those over and try and find a better tutorial for objects in Python 2.7
|
# ? Oct 6, 2014 03:17 |
|
KICK BAMA KICK posted:Think Python Like a Computer Scientist is very good and written for Python 2; here's the chapter on classes and objects. This person speaketh the truth. This book is how I learned programming after being away from it for 20 years.
|
# ? Oct 6, 2014 03:25 |
|
Tigren posted:Instead of building a list and then converting to a tuple, I tried building a string piece by piece for 'values' using the repr() of each value and the str() representation when I pass the SQL function. This seems to work, but is it the best way to do it? Maybe something like this: Python code:
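A different angle on the same question, offered as a hedged sketch rather than a drop-in fix: when a DB-API driver is available, placeholders let the driver handle quoting, which avoids leaning on `repr()` to produce valid SQL literals. The example uses the standard library's `sqlite3` as a stand-in for PostGIS, and the helper `insert_row`, the table, and the sample row are all hypothetical. It does not apply directly if the SQL has to be sent as one finished string over an HTTP API.

```python
import sqlite3


def insert_row(connection, table, row):
    """Insert a dict as one row, passing the values as bound parameters."""
    columns = sorted(row)
    placeholders = ", ".join("?" for _ in columns)
    # Table and column names cannot be bound as parameters, so they must
    # come from trusted code, never from the parsed input.
    sql = "INSERT INTO {} ({}) VALUES ({})".format(
        table, ", ".join(columns), placeholders
    )
    connection.execute(sql, [row[column] for column in columns])


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE places (name TEXT, lat REAL, lon REAL)")
insert_row(connection, "places", {"name": "O'Brien's Pub", "lat": 37.77, "lon": -122.42})
print(connection.execute("SELECT name, lat, lon FROM places").fetchall())
```

The apostrophes in the sample name are the kind of value that tends to break hand-built SQL strings, and they pass through here without any manual escaping.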
|
# ? Oct 6, 2014 03:56 |
|
e: nevermind
Space Kablooey fucked around with this message at 20:52 on Oct 6, 2014 |
# ? Oct 6, 2014 18:29 |
|
Cultural Imperial posted:So yesterday I was running this script against a windows filesystem that was mounted to my Unix workstation. The script generated what I would guess are illegal Unicode characters in ntfs and would crap out. I stuck in a pass statement to keep the script going. I think, given the limitations of ntfs, it's probably going to be a waste of time to test above the BMP. I'm sure it doesn't matter at all for what you're doing, but FYI this: Cultural Imperial posted:windows filesystem that was mounted to my Unix workstation The boxes are valid code points that are unassigned and/or have no glyph in that font, but they're stored correctly.
|
# ? Oct 6, 2014 18:39 |
|
Dominoes posted:Issue with iPython. When running scripts with long/time-consuming loops, it tends to hang. Ie if it periodically prints something, progress will stop. If I press ctrl+c, the script continues. (Normally ctrl+c kills the running script) This doesn't occur if using normal python; only IPython. Any ideas?
|
# ? Oct 6, 2014 19:55 |
|
Lysidas posted:wisdom and knowledge Thanks Lysidas!
|
# ? Oct 6, 2014 21:26 |
|
Dominoes posted:Solved, I think. This is caused by using autoreload, and causing errors while editing the code while the script's running. It's triggering silent autoreload errors that ctrl+c skips, I think. Autoreload is awesome, but it causes problems sometimes. And I always forget to check it when I'm having odd problems.
|
# ? Oct 6, 2014 22:41 |
|
Do you guys use virtualenv-burrito? I filed an issue about its use of ~/.bash_profile and wanted to get a second opinion: https://github.com/brainsik/virtualenv-burrito/issues/51
|
|
# ? Oct 6, 2014 22:46 |