  • Locked thread
The RECAPITATOR
May 12, 2006

Cursed to like terrible teams.

ShadowHawk posted:

Very briefly, your original implementation did this:

for every item in the list:
look through the list until I hit something

The new implementation is like this:

for every item in the list:
do a hash computation once and then check it

Simply for my own self-development at this point, I have been looking at some implementations of hashmaps. Is it fair to say that the most basic implementations are exactly as efficient as my original linear travel through a list - and by extension, some implementations have some voodoo that allows the recovery of key/values with a lot less travel?

Say I were into educating myself, is there another term I should be adding to my google search other than 'hash map/table implementations'? Most of what I come up with seems pretty linear and similar to what was causing me my initial performance issues.


Nippashish
Nov 2, 2005

Let me see you dance!

The RECAPITATOR posted:

I have been looking at some implementations of hashmaps. Is it fair to say that the most basic implementations are exactly as efficient as my original linear travel through a list - and by extension, some implementations have some voodoo that allows the recovery of key/values with a lot less travel?

The magic voodoo you're talking about is the whole point of a hashmap. The wiki page ( http://en.wikipedia.org/wiki/Hash_table ) is a pretty good overview if you can avoid losing the forest for the trees.
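To make the "voodoo" concrete, here is a minimal sketch of one common approach (separate chaining). The class name `ToyHashMap` and its methods are made up for illustration; real implementations such as Python's `dict` use different, more refined strategies and also grow the table as it fills up.

```python
class ToyHashMap:
    """Toy hash map using separate chaining (illustrative only)."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket_for(self, key):
        # The hash picks one bucket directly; no scan of the whole table.
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket_for(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite the existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        # Only the entries that share this key's bucket are compared.
        for existing_key, value in self._bucket_for(key):
            if existing_key == key:
                return value
        raise KeyError(key)


table = ToyHashMap()
for n in range(16):
    table.put(n, n * n)

print(table.get(5))
print([len(b) for b in table.buckets])
```

The only linear travel left is inside a single bucket, which stays short as long as the hash spreads keys evenly and the number of buckets keeps pace with the number of items.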

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug

The RECAPITATOR posted:

Is it fair to say that the most basic implementations are exactly as efficient as my original linear travel through a list - and by extension, some implementations have some voodoo that allows the recovery of key/values with a lot less travel?

Dynamic hash tables (or more generally, hash tables where the keys aren't known ahead of time when constructing the hash function) have worst-case lookup performance that's the same as fully traversing a list. That worst case is rare, though: it only happens when every item is assigned the same hash value. Looking up by a key is constant-time on average.

Of course, if you know the hash function that's used, it isn't very difficult to trigger this worst-case behavior by creating a lot of keys that perfectly collide. This is why Python now randomizes the hash of strings every time the interpreter is started.
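A rough way to see the collision worst case for yourself is to count equality checks while building a `dict`. This is only a sketch: `CountingKey` and `comparisons_to_build` are hypothetical names, and the exact counts depend on the interpreter, so none are quoted here.

```python
class CountingKey:
    """Key that counts equality checks; `collide` forces one shared hash."""

    eq_calls = 0

    def __init__(self, value, collide):
        self.value = value
        self.collide = collide

    def __hash__(self):
        # Every colliding key reports the same hash value.
        return 1 if self.collide else hash(self.value)

    def __eq__(self, other):
        CountingKey.eq_calls += 1
        return self.value == other.value


def comparisons_to_build(n, collide):
    """Build a dict of n distinct keys; return how many __eq__ calls it took."""
    CountingKey.eq_calls = 0
    d = {}
    for i in range(n):
        d[CountingKey(i, collide)] = i
    return CountingKey.eq_calls


spread = comparisons_to_build(200, collide=False)
collided = comparisons_to_build(200, collide=True)
print(spread, collided)
```

With well-spread hashes the dict rarely needs to compare keys at all; with every key colliding, each insert has to be compared against the keys already stored, which is the list-like behavior described above.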

tef
May 30, 2004

-> some l-system crap ->

Lysidas posted:

If you're having trouble porting old code that doesn't play nicely with the distinction between bytes and text, it was already broken and you didn't know it. "Whoops, it used to work fine but now someone dared to spell their name correctly and where the hell is this UnicodeDecodeError coming from"

nope :3: i know python 3 devs want to believe this is true universally, but over here in this land of posix apis and network protocols, we were doing ok

Shmoogy
Mar 21, 2007
I need to scrape some additional product details - but am having trouble extracting the data into a format where it would be clean enough to use in a spreadsheet.

code:
from bs4 import BeautifulSoup
import requests
import unittest, time, re


base_url = "http://www.lightingnewyork.com/product/john-richard-alexander-john-chandeliers-ajc-8535.html"
r = requests.get(base_url)
html = r.text
soup = BeautifulSoup(html, 'html.parser')

details = soup.find("div", {"class" : "pd_padding"})
#finding block of product details

with open('outputLightingNewYork.txt', 'a') as f:
    for detail in details:
        if detail.find('li'):
            f.write(re.sub('<[A-Za-z\/][^>]*>','',str(detail)))
            #strip html tags and write to file
The end result (part of it) is this:
code:


                                	SKU: AJC-8535

                                



Dimensions and Weight



                                        Width: 

                                        30.00  in.

                                        

                                    


                                        Height: 

                                        47.00  in.

                                        

                                    



Other Specifications



                                    Ships Via: Freight 

                                



Additional Details



                                	47"H 30"W 10 Light Chandelier

                                

									
(and a screenshot of how it looks when I pull the text file into excel)

Is there a way to extract this data such that the information would be cleanly exported into a row -- something like this:


I tried using Trim as well as a regex to remove whitespace, but I don't understand it well enough to be able to accomplish what I am looking to do. It doesn't appear to just be spaces or tabs, but also carriage returns or something.
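For the whitespace problem specifically, a small sketch using only the standard library: `\s` in a regex matches spaces, tabs, carriage returns and newlines alike, so collapsing runs of it and dropping empty lines leaves clean fields. The `raw` string below is a made-up stand-in for the scraped text, not the actual page output.

```python
import re

# Stand-in for the scraped text: mixed spaces, tabs, \n and \r\n.
raw = ("\n\n  \tSKU: AJC-8535\n\n   \r\n\nDimensions and Weight\n\n"
       "   Width: \n\n   30.00  in.\n\n")

# Collapse inner runs of whitespace on each line, then drop empty lines.
lines = [re.sub(r"\s+", " ", line).strip() for line in raw.splitlines()]
fields = [line for line in lines if line]

print(fields)
# One spreadsheet-style row: join the fields with a delimiter of your choice.
print(" | ".join(fields))
```

Note that this leaves "Width:" and "30.00 in." as separate fields, because they sit on separate lines in the input; pairing labels with values is easier to do from the HTML structure than from the flattened text.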

accipter
Sep 12, 2003

Shmoogy posted:

I need to scrape some additional product details - but am having trouble extracting the data into a format where it would be clean enough to use in a spreadsheet.

I would highly recommend that you consider using lxml.html instead of BeautifulSoup for scraping information.


Python code:
import re
import pprint

import lxml.html

#root = lxml.html.parse('http://www.lightingnewyork.com/product/john-richard-alexander-john-chandeliers-ajc-8535.html')
root = lxml.html.parse('./page.htm')

groups = root.xpath('//div[@class="pd_float"]/div[@class="detail_info"]')

info = dict()
for group in groups:
    name = group.xpath('h2/text()')[0]
    props = []
    for list_item in group.xpath('ul/li'):
        strings = list_item.xpath('text()') + list_item.xpath('a/text()')
        # Clean up the strings by removing excess white space
        strings = [re.sub('\s+', ' ', s).strip() for s in strings]
        prop = ''.join(strings)
        props.append(prop)

    info[name] = props

pprint.pprint(info)
Results:
code:
{'Additional Details': ['47"H 30"W 10 Light Chandelier',
                        'Lighting Fixed Lighting Chandeliers'],
 'Brand Information': ['Brand:John Richard',
                       'Collection:Alexander John',
                       'SKU: AJC-8535'],
 'Bulb Information': ['Bulbs Included: No',
                      'Primary Bulb(s): 10 x 60 watts B'],
 'Design Information': ['Category:Chandeliers',
                        'Finish:Hand-Painted',
                        'Material: Iron'],
 'Dimensions and Weight': ['Width: 30.00 in.', 'Height: 47.00 in.'],
 'Other Specifications': ['Ships Via: Freight']}
The best way to figure out the xpaths is to first read about them, and then inspect the elements with Chrome (in the right-click menu). I am sure there are even better ways out there.

Symbolic Butt
Mar 22, 2009

(_!_)
Buglord
I know we already went over this python2 vs python3 debate before in YOSPOS but I just want to add my anecdote. I've been working with a considerably big system written in Python 2 in 2004, and a lot of its problems stem from the way Python 2 deals with encoding. I'm pretty sure the guy at some point gave up trying to deal with it and just made all the databases be in SQL_ASCII. :gonk:

I sincerely believe this wouldn't have happened if it was python3.

Shmoogy
Mar 21, 2007

accipter posted:

I would highly recommend that you consider using lxml.html instead of BeautifulSoup for scraping information.


Python code:
import re
import pprint

import lxml.html

#root = lxml.html.parse('http://www.lightingnewyork.com/product/john-richard-alexander-john-chandeliers-ajc-8535.html')
root = lxml.html.parse('./page.htm')

groups = root.xpath('//div[@class="pd_float"]/div[@class="detail_info"]')
		

info = dict()
for group in groups:
    name = group.xpath('h2/text()')[0]
    props = []
    for list_item in group.xpath('ul/li'):
        strings = list_item.xpath('text()') + list_item.xpath('a/text()')
        # Clean up the strings by removing excess white space
        strings = [re.sub('\s+', ' ', s).strip() for s in strings]
        prop = ''.join(strings)
        props.append(prop)

    info[name] = props

pprint.pprint(info)
Results:
code:
The best way to figure out the xpaths is to first read about them, and then inspect the elements with Chrome (in the right-click menu). I am sure there are even better ways out there.

This is excellent.. but makes me feel quite stupid as I thought I understood most of it but could not get it to work on a different site (wayfair)

Do my comments in the code look correct?

code:
root = lxml.html.parse('http://www.lightingnewyork.com/product/john-richard-alexander-john-chandeliers-ajc-8535.html')
#root = lxml.html.parse('./page.htm')

groups = root.xpath('//div[@class="pd_float"]/div[@class="detail_info"]')
                #large (product detail floating) box         detail column - each bunch of text is a div class= detail_info
		# I tried to recreate this, as I can see the class detail_info. When I inspect element and copy xpath, I get this //*[@id="info_prod"] - where do I click on in the source to actually get the //pd float / detail?

info = dict()
for group in groups:
    name = group.xpath('h2/text()')[0]
            #setting names of each  point of text (brand information, dimensions weight, etc)
    props = []
    for list_item in group.xpath('ul/li'):
        #iterate through every item in group, elements are UL and LI (lists)
        strings = list_item.xpath('text()') + list_item.xpath('a/text()')
            #grabbing the LI text and ?? (href descript text?)

        strings = [re.sub('\s+', ' ', s).strip() for s in strings]
        # Clean up the strings by removing excess white space
        #substitute with regex new line replace with space, and strip is trim, for each letter in each item(string)
        prop = ''.join(strings)
        #combining each of the previous breakups into a single entry?
        props.append(prop)
        #adding entry into the list we created above the loop

    info[name] = props
    #setting dictionary key with name and the text descript

pprint.pprint(info)
#not sure how to actually use this, if I just do pprint(info) I get an error, I can't find out how to write this to a text file or excel in a single row
I tried to alter it to grab product details from wayfair - random product.. but my understanding of the code and xpath is fairly weak, it seemed so straight forward in the article and when accipter did it.

code:
root = lxml.html.parse('http://www.wayfair.com/Jaipur-Rugs-Foundations-By-Chayse-Dacoda-Capri-Abstract-Rug-FC11-JPU3196.html')
#root = lxml.html.parse('./page.htm')

groups = root.xpath('//*[@id="yui-main"]/div/div/div/div/div')

info = dict()
for group in groups:
    name = group.xpath('//*[@id="yui-main"]/div/div/div/div/div/table/tbody/tr/td/strong/text()')
    props = []
    for list_item in group.xpath('tr/td'):
        strings = list_item.xpath('text()') + list_item.xpath('a/text()')
        # Clean up the strings by removing excess white space
        strings = [re.sub('\s+', ' ', s).strip() for s in strings]
        prop = ''.join(strings)
        props.append(prop)

    info[name] = props

pprint.pprint(info)

Shmoogy fucked around with this message at 03:49 on Oct 3, 2014

accipter
Sep 12, 2003
I don't have time to fully answer your question, but here are a few comments:
  1. My organization of groups/properties might not be appropriate for all websites. I am not sure how to expand what I did to work universally.
  2. When selecting an XPath, I try to use specific features (rather than counting down 'div's). For example:
    code:
    root.xpath('//div[@class="details_body tab_body js-tab-body js-details-body"]')
    
    would be a good way to select the container of the information
  3. Using pprint was just to show you what I got. You would probably want to store everything in a list along with the URL and other information, and then write it to a CSV.
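
The CSV suggestion in point 3 could look roughly like the sketch below. It assumes Python 3 and the standard `csv` module; the `info` dict is a shortened, hand-typed copy of the result shape printed earlier, and `to_row` and the URL are placeholders invented for illustration.

```python
import csv
import io

# Hypothetical scraped result, shaped like the `info` dict printed earlier.
info = {
    "Dimensions and Weight": ["Width: 30.00 in.", "Height: 47.00 in."],
    "Other Specifications": ["Ships Via: Freight"],
}
url = "http://example.com/product.html"  # placeholder URL


def to_row(url, info):
    """Flatten one product into a dict: one column per 'name: value' property."""
    row = {"url": url}
    for props in info.values():
        for prop in props:
            name, sep, value = prop.partition(":")
            if sep:  # skip properties that are not 'name: value' pairs
                row[name.strip()] = value.strip()
    return row


rows = [to_row(url, info)]
fieldnames = sorted({key for row in rows for key in row})

# Swap io.StringIO() for open("products.csv", "w", newline="") to write a file.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Each product becomes one row, and `DictWriter` leaves a cell blank when a product lacks a given property, which is what a spreadsheet import usually wants.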

Good luck.

namaste friends
Sep 18, 2004

by Smythe
I need to generate a string consisting of non-ASCII UTF-8 characters. I then want to create files with these strings as filenames. I haven't got a clue how to generate a string of random non-ASCII UTF-8 characters and my googling is yielding very little. Any help would be greatly appreciated. Thanks

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug
Naive way: generate a bunch of random integers in between 128 and 0x1f000 or so, then call chr on each number to get the corresponding character. You can restrict the range if you want, or tweak the random distribution to usually get characters in certain blocks, or anything else you'd like, but that's the general idea/framework that I'd use.

(No code tags because vBulletin double-escapes things)

quote:

>>> from random import randrange
>>> def random_char():
...     return chr(randrange(128, 0x1f000))
...
>>> ''.join(random_char() for _ in range(10))
'\U0001d2c6\U00018c75쪙\U000199a7\U0001dd50\U0001e5e8\U000124af\U00019f59𝗺Ḝ'

(Pedantic note: there is no such thing as a UTF-8 character. There are Unicode characters, and UTF-8 is one of the ways that you can encode those characters to byte sequences.)
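
One caveat with the naive range: it includes the surrogate code points U+D800 to U+DFFF, which Python 3 refuses to encode as UTF-8, so a filename containing one would fail. A sketch that re-draws in that case follows; the function names and the `seed` parameter are my own additions for illustration.

```python
import random

SURROGATES = range(0xD800, 0xE000)  # reserved for UTF-16; not encodable as UTF-8


def random_char(rng, low=128, high=0x1F000):
    """Return one random non-ASCII character, re-drawing if a surrogate comes up."""
    while True:
        code_point = rng.randrange(low, high)
        if code_point not in SURROGATES:
            return chr(code_point)


def random_name(length, seed=None):
    """Build a random string; pass a seed to make a failing case repeatable."""
    rng = random.Random(seed)
    return "".join(random_char(rng) for _ in range(length))


name = random_name(10, seed=1)
encoded = name.encode("utf-8")  # would raise UnicodeEncodeError on a surrogate
print(len(name), len(encoded))
```

The result can still contain unassigned code points, which is fine for stress-testing filenames but will often render as boxes.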

namaste friends
Sep 18, 2004

by Smythe
loving awesome Lysidas. Thanks. I've been banging my head on the wall all morning over this.

e: when I try to print

print u'\u1D7AA'

in the interpreter, it gives me ᵺA and not the mathematical symbol. What's up with that?

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug
'\u1D7AA' is a two-character string consisting of the following:
code:
>>> import unicodedata
>>> [unicodedata.name(c) for c in '\u1D7AA']
['LATIN SMALL LETTER TH WITH STRIKETHROUGH', 'LATIN CAPITAL LETTER A']
The \uxxxx escape sequences require exactly four hexadecimal digits after the \u. For code points over 0xFFFF, you have to use the longer escape sequence format \U00xxxxxx, which takes exactly eight hexadecimal digits:

code:
>>> [unicodedata.name(c) for c in '\U0001D7AA']
['MATHEMATICAL SANS-SERIF BOLD ITALIC SMALL ALPHA']

namaste friends
Sep 18, 2004

by Smythe
I've just been reading this: http://www.sttmedia.com/unicode-basiclingualplane

Do you think it's worth testing anything above the basic multilingual plane when files are being created by a bunch of regular shmoes with windows boxes in an unremarkable office environment?

It seems to me that anything above the BMP is going to be some specialized academic stuff.

ShadowHawk
Jun 25, 2000

CERTIFIED PRE OWNED TESLA OWNER
Are there any illegal unicode characters for filenames you should test for?

I know on Windows particularly strange things happen if you make a window title start with two unicode right-to-left control characters. I imagine funny things might happen if you name a file that and then a program naively opens that file and uses it as the window title.

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug

Cultural Imperial posted:

Do you think it's worth testing anything above the basic multilingual plane

In my opinion, always. Whether you care about failures is a different issue, but you should at least know how things will behave.

supercrooky
Sep 12, 2006

Cultural Imperial posted:

I've just been reading this: http://www.sttmedia.com/unicode-basiclingualplane

Do you think it's worth testing anything above the basic multilingual plane when files are being created by a bunch of regular shmoes with windows boxes in an unremarkable office environment?

It seems to me that anything above the BMP is going to be some specialized academic stuff.

You might not hit them in this use case, but there are emojis outside the BMP that are in common use due to support on social networks.

namaste friends
Sep 18, 2004

by Smythe
So yesterday I was running this script against a windows filesystem that was mounted to my Unix workstation. The script generated what I would guess are illegal Unicode characters in ntfs and would crap out. I stuck in a pass statement to keep the script going. I think, given the limitations of ntfs, it's probably going to be a waste of time to test above the BMP.

Jose Cuervo
Aug 25, 2004
I am trying to use PyCharm with Git for version control. My setup is as follows. I have a Dropbox folder that I am using as my 'remote repository'. I want to check out the code on my work computer (to a folder on the C drive on my work computer), work on the code and commit changes to the local copy of the code. When I am done for the day I would like to push the changes made to the dropbox folder so that I can work on the code from home (check out the code to my home computer).

I can successfully pull the code from the Dropbox folder to my local computer. I can successfully make changes to the code and make commits to the local version of the code. However when I try and push the commits to the Dropbox folder I run into problems and the following error message is displayed:


I don't know what I am doing wrong (and I don't have a huge understanding of Git) and was hoping someone could help me out.

good jovi
Dec 11, 2000

'm pro-dickgirl, and I VOTE!

Jose Cuervo posted:

I don't know what I am doing wrong (and I don't have a huge understanding of Git) and was hoping someone could help me out.

I don't know if this is actually related to your problem, but don't keep git repos in Dropbox. It can corrupt your repos and just isn't the correct model. Put your repo up somewhere accessible like github or bitbucket and use that as your remote.

duck monster
Dec 15, 2004

Having Apple bludgeoning my Python module directories to death (and homebrew repos.. seriously wtf apple?) everytime I upgrade my os is getting pretty loving old.

:sigh:

Hughmoris
Apr 21, 2007
Let's go to the abyss!
I'm following a YouTube tutorial and trying to learn objects and classes. I am using Python 2.7; I believe the tutorial is using Python 3.

I'm executing this code:
code:
class Person:
	population = 0
	def __init__(self, name, age):
		self.name = name
		self.age = age
		print('{0} has been born!'.format(self.name))
		Person.population +=1
	def totalPop():
		print ('There are {0} people in the world'.format(Person.population))

p1 = Person('Johnny', 20)
p2 = Person('Mary', 30)
print(Person.totalPop())
and receiving this error:
code:
Traceback (most recent call last):
  File "C:\projects\python\Person.py", line 13, in <module>
    print(Person.totalPop())
TypeError: unbound method totalPop() must be called with Person instance as first argument (got nothing instead)
I believe I'm copying the tutorial example exactly but I am receiving an error where they are not. What am I overlooking?

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

edit: nevermind.

I think this is weird code. Maybe it's not in context with your tutorial, but as presented I don't like it.

Thermopyle fucked around with this message at 05:44 on Oct 5, 2014

namaste friends
Sep 18, 2004

by Smythe
Shouldn't Person.population, just under __init__ be self.population?

Jewel
May 2, 2009

They're using Person.population as a static variable held by the Person class, and "totalPop" as a static function. I don't like it either but I can't think of a better way to do this right now, hm.

Edit: I think you can decorate "totalPop" with @staticmethod and it'd work, but don't quote me on that.

BeefofAges
Jun 5, 2004

Cry 'Havoc!', and let slip the cows of war.

duck monster posted:

Having Apple bludgeoning my Python module directories to death (and homebrew repos.. seriously wtf apple?) everytime I upgrade my os is getting pretty loving old.

:sigh:

Are you using virtualenvs?

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell


Ok, I'm back at my PC. You're using Python 2.7 to run Python 3 code and that code just won't work under Python 2.7. I recommend either switching to Python 3 (best option), or finding a different tutorial.

There are things you can do to make that code work (look into using the classmethod decorator), but that's probably above your head at the moment.

Haystack
Jan 23, 2005





Really, it's just a badly designed class. A Person class should be about a person. Population is a separate concern, and should be represented by separate code. E.g.

Python code:
class Person(object):
	def __init__(self, name, age):
		self.name = name
		self.age = age
		print('{0} has been born!'.format(self.name))

class Population(object):
	def __init__(self):
		self.people = []
	def declare_pop(self):
		print ('There are {0} people in the world'.format(len(self.people)))

pop = Population()
pop.people.append(Person('Johnny', 20))
pop.people.append(Person('Mary', 30))
pop.declare_pop()
Not the most comprehensive example, but it gets the point across (also it should work in python 2.7)

Dominoes
Sep 20, 2007

Jewel posted:

Edit: I think you can decorate "totalPop" with @staticmethod and it'd work, but don't quote me on that.
Yea this is the solution. The prob is you need self as the first arg for any method unless you decorate it with @staticmethod. Your code works otherwise.

The code structure issue dudes are talking about is because you're combining characteristics of a set of instances with the class that defines the instance. It's better to keep the set of instances as a list and leave the class for the instance only.
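
For reference, a sketch of the @staticmethod fix applied to the original class. The method is renamed `total_pop` and returns the count instead of printing it (my changes, not the tutorial's); this version should behave the same under Python 2.7 and Python 3.

```python
class Person(object):
    population = 0  # shared by the class, not stored per instance

    def __init__(self, name, age):
        self.name = name
        self.age = age
        Person.population += 1

    @staticmethod
    def total_pop():
        # No self argument, so it can be called on the class itself.
        return Person.population


p1 = Person('Johnny', 20)
p2 = Person('Mary', 30)
print('There are {0} people in the world'.format(Person.total_pop()))
```

Returning the number also avoids a quirk in the original: `print(Person.totalPop())` prints the return value of a method that returns nothing, so it would show `None` after the message.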

An example of the @classmethod thing Thermopyle mentioned would be a method like this to create a new person:

Python code:
@classmethod
def from_dog(cls, name, age):
    people_years = age * 6
    return cls(name, people_years)

p1 = Person.from_dog('Sir fetchalot', 5)

One way to handle population:
Python code:
p1 = Person('Johnny', 20)
p2 = Person('Mary', 30)

people = [p1, p2]

population = len(people)
Pep8 nazi says use spaces instead of tabs, put an empty line between methods, and two lines after the class.

Dominoes fucked around with this message at 19:07 on Oct 5, 2014

Tigren
Oct 3, 2003
I am not much of a coder, but I'm trying to hack together a python script that parses a JSON file and sends a SQL API call to online PostGIS database CartoDB.

Python code:
from cartodb import CartoDBAPIKey, CartoDBException
import spot2carto_keys
import urllib
import json

# CartoDB Authentication
cartodb = CartoDBAPIKey(spot2carto_keys.API_KEY, spot2carto_keys.carto_domain)

#SPOT Authentication
url = urllib.urlopen('https://api.findmespot.com/spot-main-web/consumer/rest-api/2.0/public/feed/' +
        spot2carto_keys.spotAPIkey + '/message.json')

#Convert SPOT into usable format
json_data = json.loads(url.read())
data = json_data['response']['feedMessageResponse']['messages']['message']

#Find last added point
try:
    latest = cartodb.sql("SELECT MAX(unixtime) FROM pct_spot", False, True, 'csv')
    max = int(latest[6:-2])
except CartoDBException as e:
    print ("Error: ", e)

#Insert newer points
for messages in data:
    if (messages['unixTime'] > max):
        point_data = {
                'the_geom': 'ST_SetSRID(ST_MakePoint(%s, %s),4326)' % (messages['longitude'], messages['latitude']),
                'id': messages['id'],
                'longitude': messages['longitude'],
                'unixtime': messages['unixTime'],
                'messagetype': messages['messageType'].encode('utf-8'),
                'messengerid': messages['messengerId'].encode('utf-8'),
                'batterystate': messages['batteryState'].encode('utf-8'),
                'latitude': messages['latitude'],
                'hidden': messages['hidden'],
                'modelid': messages['modelId'].encode('utf-8')
                }
        keys = ','.join(['%s' % k for k in point_data.iterkeys()])
        values = []
        for v in point_data.itervalues():
            values.append(v)
        query = 'INSERT INTO pct_spot (%s) VALUES %s' % (keys, tuple(values))
        print cartodb.sql(query)
This yields the following SQL query:

code:
INSERT INTO pct_spot (messagetype,batterystate,latitude,unixtime,hidden,modelid,the_geom,id,longitude,messengerid) 
VALUES ('UNLIMITED-TRACK', 'GOOD', 48.66172, 1412475115, 0, 'SPOT3', 'ST_SetSRID(ST_MakePoint(-120.73102, 48.66172),4326)', 329199789, -120.73102, '0-2414146')
I used the list/tuple combo because the 'values' are a mixture of strings and numbers, but 'the_geom' value (ST_SetSRID) needs to be passed as a SQL function, not a string. I'm not quite sure how to go about this. Can anyone offer any tips? Again, I'm not much of a programmer, more of a dabbler, but I'm open to all suggestions.

Edit:

Instead of building a list and then converting to a tuple, I tried building a string piece by piece for 'values' using the repr() of each value and the str() representation when I pass the SQL function. This seems to work, but is it the best way to do it?

Python code:
        s = '('
        for x in point_data.iterkeys():
                if (s == '('):
                    s = s + repr(point_data[x])
                elif (x != 'the_geom'):
                    s = s + ',' + repr(point_data[x])
                else:
                    s = s + ',' + point_data[x]
        s = s + ')'

Tigren fucked around with this message at 02:56 on Oct 6, 2014

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Thanks for the replies on the People and Population example. I'll look those over and try and find a better tutorial for objects in Python 2.7

KICK BAMA KICK
Mar 2, 2009

Hughmoris posted:

Thanks for the replies on the People and Population example. I'll look those over and try and find a better tutorial for objects in Python 2.7
Think Python Like a Computer Scientist is very good and written for Python 2; here's the chapter on classes and objects.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

KICK BAMA KICK posted:

Think Python Like a Computer Scientist is very good and written for Python 2; here's the chapter on classes and objects.

This person speaketh the truth. This book is how I learned programming after being away from it for 20 years.

Symbolic Butt
Mar 22, 2009

(_!_)
Buglord

Tigren posted:

Instead of building a list and then converting to a tuple, I tried building a string piece by piece for 'values' using the repr() of each value and the str() representation when I pass the SQL function. This seems to work, but is it the best way to do it?

Python code:
        s = '('
        for x in point_data.iterkeys():
                if (s == '('):
                    s = s + repr(point_data[x])
                elif (x != 'the_geom'):
                    s = s + ',' + repr(point_data[x])
                else:
                    s = s + ',' + point_data[x]
        s = s + ')'

Maybe something like this:

Python code:
s = '({})'.format(', '.join(repr(v) if k != 'the_geom' else v 
                            for k, v in point_data.iteritems()))
But yours is more readable. :shrug:
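
Another option is to mark the SQL function explicitly instead of special-casing the 'the_geom' key. The sketch below is written for Python 3 and uses made-up helpers (`RawSQL`, `sql_literal`, `build_insert`); none of them are part of the CartoDB client. It also quotes strings by doubling embedded single quotes, which `repr()` does not do.

```python
class RawSQL(str):
    """Marker for text that should be inserted as SQL, not quoted as a string."""


def sql_literal(value):
    """Render a Python value as SQL text (hypothetical helper, not CartoDB API)."""
    if isinstance(value, RawSQL):
        return str(value)
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"  # double embedded quotes
    return repr(value)  # ints and floats only in this sketch


def build_insert(table, columns):
    """Build an INSERT statement from a list of (column, value) pairs."""
    names = ', '.join(name for name, _ in columns)
    values = ', '.join(sql_literal(value) for _, value in columns)
    return 'INSERT INTO {} ({}) VALUES ({})'.format(table, names, values)


columns = [
    ('the_geom', RawSQL('ST_SetSRID(ST_MakePoint(-120.73102, 48.66172),4326)')),
    ('id', 329199789),
    ('batterystate', 'GOOD'),
]
print(build_insert('pct_spot', columns))
```

Pasting values into SQL text is still fragile. If the client library offers parameterized queries, those would be the safer route; I don't know whether this one does.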

Space Kablooey
May 6, 2009


e: nevermind

Space Kablooey fucked around with this message at 20:52 on Oct 6, 2014

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug

Cultural Imperial posted:

So yesterday I was running this script against a windows filesystem that was mounted to my Unix workstation. The script generated what I would guess are illegal Unicode characters in ntfs and would crap out. I stuck in a pass statement to keep the script going. I think, given the limitations of ntfs, it's probably going to be a waste of time to test above the BMP.

I'm sure it doesn't matter at all for what you're doing, but FYI this:

Cultural Imperial posted:

windows filesystem that was mounted to my Unix workstation
was the source of your problem. NTFS can store non-BMP characters perfectly well. As far as I know, Windows has handled non-BMP characters correctly since Vista -- XP would treat the two surrogate code units as separate characters in a lot of situations.



The boxes are valid code points that are unassigned and/or have no glyph in that font, but they're stored correctly.
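
In case "two surrogate code units" is unclear: UTF-16, which Windows uses for filenames, stores a non-BMP character as a pair of 16-bit units. A small Python 3 sketch, using an arbitrary emoji as the example character:

```python
import struct

ch = '\U0001F600'  # an emoji outside the BMP

utf16 = ch.encode('utf-16-le')
# Unpack the bytes into little-endian 16-bit code units.
units = struct.unpack('<%dH' % (len(utf16) // 2), utf16)

print(len(ch))                  # number of code points in the string
print([hex(u) for u in units])  # UTF-16 code units
```

Software that treats each 16-bit unit as its own character sees two of them here, which is the XP-era behavior described above.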

Dominoes
Sep 20, 2007

Dominoes posted:

Issue with iPython. When running scripts with long/time-consuming loops, it tends to hang. Ie if it periodically prints something, progress will stop. If I press ctrl+c, the script continues. (Normally ctrl+c kills the running script) This doesn't occur if using normal python; only IPython. Any ideas?
Solved, I think. This is caused by using autoreload, and causing errors while editing the code while the script's running. It's triggering silent autoreload errors that ctrl+c skips, I think.

namaste friends
Sep 18, 2004

by Smythe

Lysidas posted:

wisdom and knowledge

Thanks Lysidas!

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Dominoes posted:

Solved, I think. This is caused by using autoreload, and causing errors while editing the code while the script's running. It's triggering silent autoreload errors that ctrl+c skips, I think.

Autoreload is awesome, but it causes problems sometimes. And I always forget to check it when I'm having odd problems.


fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
Do you guys use virtualenv-burrito? I filed an issue about its use of ~/.bash_profile and wanted to get a second opinion: https://github.com/brainsik/virtualenv-burrito/issues/51
