  • Locked thread
Dominoes
Sep 20, 2007

Lumpy posted:

You can write your own ModelManager to handle hooking up the ORM to a remote DB. We are in the process of writing a Django app that uses a custom ModelManager coupled with SQLAlchemy to use a (*shudder*) Azure Cloud MSSQL store for our data.
It looks like Django's model system is working well, and is a straightforward solution.

KICK BAMA KICK
Mar 2, 2009

Thanks for the recommendation of Think Python some time ago. Working through one of the exercises (using some syntax not yet covered in the text) I came up with this as part of my solution:
code:
def sorted_anagrams(anagram_dict, min_chars = 0, min_anagrams = 0):
    """Returns a tuple containing lists of anagrams, in descending order of list length.

    anagram_dict: A dictionary of the form generated by make_anagram_dict. Keys are strings of characters,
    alphabetically sorted; values are lists of anagrams made from those characters.

    min_chars, min_anagrams: If provided, excludes anagrams with fewer than the minimum number
    of characters or groups with fewer than the minimum number of anagrams."""

    return tuple(anagrams for characters, anagrams in sorted(anagram_dict.items(), key=lambda t: len(t[1]), reverse=True)
                 if len(characters) >= min_chars and len(anagrams) >= min_anagrams)
My question: is a one-liner comprehension like that "Pythonic" or is that trying too hard to be Pythonic? To me that was more natural than writing it out in the longer fashion but I could imagine someone else reading that and saying "No, dude, just break that down."

BeefofAges
Jun 5, 2004

Cry 'Havoc!', and let slip the cows of war.

To me that looks practically unreadable. The good thing is that the docstring explains what's going on. When I find an undocumented comprehension that looks like that buried in someone's code I want to strangle them.

QuarkJets
Sep 8, 2008

KICK BAMA KICK posted:

Thanks for the recommendation of Think Python some time ago. Working through one of the exercises (using some syntax not yet covered in the text) I came up with this as part of my solution:
code:
CODE
My question: is a one-liner comprehension like that "Pythonic" or is that trying too hard to be Pythonic? To me that was more natural than writing it out in the longer fashion but I could imagine someone else reading that and saying "No, dude, just break that down."

One-liners are fine if they're easy to comprehend, but this one is pretty complex. A one-liner is not good if it looks easily breakable, which this one does (even if it isn't).

Even if you wanted to keep this as a one-liner, I'd suggest breaking the logic into several more lines (it's already 2 lines, technically).

Plorkyeran
Mar 22, 2007

To Escape The Shackles Of The Old Forums, We Must Reject The Tribal Negativity He Endorsed
I'd pull the sorted(...) out of the comprehension, but otherwise think it's fine.

The problem isn't asking for the final result to be a tuple, btw (the example results are lists and returning a tuple makes zero sense there).

SirPablo
May 1, 2004

Pillbug

unixbeard posted:

I need to read a bunch of excel files in python, just reading no writing/creation. It seems like there are a few packages, xlrd and openpyxl, before I dive in does anyone have opinions or advice for/against either of them?

I needed to do some quick work and xlrd was slick, easy to learn.

Dren
Jan 5, 2001

Pillbug

KICK BAMA KICK posted:

Thanks for the recommendation of Think Python some time ago. Working through one of the exercises (using some syntax not yet covered in the text) I came up with this as part of my solution:
code:
def sorted_anagrams(anagram_dict, min_chars = 0, min_anagrams = 0):
    """Returns a tuple containing lists of anagrams, in descending order of list length.

    anagram_dict: A dictionary of the form generated by make_anagram_dict. Keys are strings of characters,
    alphabetically sorted; values are lists of anagrams made from those characters.

    min_chars, min_anagrams: If provided, excludes anagrams with fewer than the minimum number
    of characters or groups with fewer than the minimum number of anagrams."""

    return tuple(anagrams for characters, anagrams in sorted(anagram_dict.items(), key=lambda t: len(t[1]), reverse=True)
                 if len(characters) >= min_chars and len(anagrams) >= min_anagrams)
My question: is a one-liner comprehension like that "Pythonic" or is that trying too hard to be Pythonic? To me that was more natural than writing it out in the longer fashion but I could imagine someone else reading that and saying "No, dude, just break that down."

It's not that bad but I'd pull the sorted(...) step out of the comprehension and put it on its own line.

Computer viking
May 30, 2011
Now with less breakage.

Dren posted:

It's not that bad but I'd pull the sorted(...) step out of the comprehension and put it on its own line.

Even just reformatting would make it easier to read, e.g.
code:
return tuple(
  anagrams 
  for characters, anagrams 
  in sorted(anagram_dict.items(), key=lambda t: len(t[1]), reverse=True)
  if len(characters) >= min_chars and len(anagrams) >= min_anagrams
)
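
For comparison, here is a rough sketch of the other suggestion from the replies: pull the sorted(...) step out onto its own line and return a list rather than a tuple. The name by_size and the sample dictionary in the test are made up for illustration; the filtering logic is unchanged from the original.

```python
def sorted_anagrams(anagram_dict, min_chars=0, min_anagrams=0):
    """Return lists of anagrams, in descending order of list length."""
    # Step 1: order the (characters, anagrams) pairs by how many anagrams each has.
    by_size = sorted(anagram_dict.items(),
                     key=lambda item: len(item[1]),
                     reverse=True)
    # Step 2: keep only the groups that pass both minimums.
    return [anagrams
            for characters, anagrams in by_size
            if len(characters) >= min_chars and len(anagrams) >= min_anagrams]
```

Naming the intermediate result gives the reader a place to stop and check what the comprehension is iterating over.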

Pollyanna
Mar 5, 2005

Milk's on them.


I have an issue with Heroku. However, I can't replicate the error on my end. I suspect that something is erroring out on Heroku's end, but I can't get a stacktrace from them. All that happens is a 500 error.

I heard about the logging module, and was hoping that it could help me by just printing out a trace to the console, as if you set "-v" flag or something. How do I do this?

Modern Pragmatist
Aug 20, 2008

Pollyanna posted:

I have an issue with Heroku. However, I can't replicate the error on my end. I suspect that something is erroring out on Heroku's end, but I can't get a stacktrace from them. All that happens is a 500 error.

I heard about the logging module, and was hoping that it could help me by just printing out a trace to the console, as if you set "-v" flag or something. How do I do this?

My guess would be directory permissions since it seems that you're dynamically writing HTML files (?). Design decisions aside, you should be able to use the following to print your own stack trace after the IOError is encountered:

Python code:
import traceback

try:
    pass  # replace with the code that raises the error
except IOError:
    traceback.print_exc()
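
Since the question was about the logging module specifically, here is a minimal sketch of the same idea using logging instead of traceback. make_logger, write_page and the logger name are hypothetical stand-ins, not anything from the original app. As far as I know Heroku collects whatever the process writes to stdout/stderr into its log stream, so a StreamHandler on stderr should be enough, but treat that as an assumption to verify.

```python
import logging
import sys


def make_logger(stream=None):
    """Return a logger that writes to `stream` (stderr if not given)."""
    logger = logging.getLogger("heroku_debug")  # hypothetical logger name
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(sys.stderr if stream is None else stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def write_page(path, html, logger):
    """Hypothetical stand-in for the code that fails on the server."""
    try:
        with open(path, "w") as f:
            f.write(html)
        return True
    except IOError:
        # logger.exception logs at ERROR level and appends the traceback
        # of the exception currently being handled.
        logger.exception("could not write %s", path)
        return False
```

Calling make_logger() once at startup and passing the logger around keeps the handler from being attached more than once.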

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
Little test program:
code:
#!/usr/bin/env python


class CoolObject(object):
    def __init__(self, a=None, b=None):
        self.a = a
        self.b = b

    def __eq__(self, other):
        print('__eq__ called')
        if isinstance(other, self.__class__) and self.a == other.a and self.b == other.b:
            return True
        return False

the_list = []

original = CoolObject(a=1, b=2)
the_list.append(original)

duplicate = CoolObject(a=1, b=2)

print('original in the_list? %s' % (original in the_list))

print('duplicate in the_list? %s' % (duplicate in the_list))

print('duplicate NOT in the_list? %s' % (duplicate not in the_list))

print('original == duplicate? %s' % (original == duplicate))
Results:
code:
original in the_list? True
__eq__ called
duplicate in the_list? True
__eq__ called
duplicate NOT in the_list? False
__eq__ called
original == duplicate? True
Is __eq__ only called 3 times because it's just checking the reference in my first print()?

I have a function that generates a bunch of CoolObjects and returns a list, but as I'm generating them I don't want to add them to the list if a duplicate is already in there. Is overriding __eq__ the correct way to handle 'x in somelist' and 'x not in somelist' behavior?

emoji
Jun 4, 2004
C code:
list_contains(PyListObject *a, PyObject *el)
{
    Py_ssize_t i;
    int cmp;

    for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i)
        cmp = PyObject_RichCompareBool(el, PyList_GET_ITEM(a, i),
                                           Py_EQ);
    return cmp;
}
Note: If o1 and o2 are the same object, PyObject_RichCompareBool() will always return 1 for Py_EQ and 0 for Py_NE.

But you might want to define __hash__ and use sets if your objects are immutable.
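
A minimal sketch of the __hash__ plus set approach, assuming a and b never change after the object is created (if they do, hashing on them is unsafe). The identity shortcut in the note above is also why the first print in the test program shows no "__eq__ called". The helper name unique is made up for illustration.

```python
class CoolObject(object):
    def __init__(self, a=None, b=None):
        self.a = a
        self.b = b

    def __eq__(self, other):
        return (isinstance(other, self.__class__)
                and (self.a, self.b) == (other.a, other.b))

    def __ne__(self, other):
        # Python 2 does not derive != from ==, so define it explicitly.
        return not self == other

    def __hash__(self):
        # Objects that compare equal must hash equal.
        return hash((self.a, self.b))


def unique(objects):
    """Return the objects in order, skipping any equal to one already seen."""
    seen = set()
    result = []
    for obj in objects:
        if obj not in seen:  # average O(1), versus O(n) for a list
            seen.add(obj)
            result.append(obj)
    return result
```

Overriding __eq__ alone does work for 'x in somelist'; the set only matters once the list gets long enough for the linear scan to hurt.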

SurgicalOntologist
Jun 17, 2004

Regarding that question I asked last week when everyone was like, "You're looking for asynchronous I/O"- I've been reading a bunch on that, and I think I'm learning something. However, pretty much everything I can find--especially beginner guides--frames the issue as one of dealing with slow I/O processes. I have the opposite problem: an extremely fast input process. The use case is that either (a) something should run every time new data comes in, or (b) something needs to reference only the most recent piece of data (b will be far more common than a). I will never have a situation where I'm waiting for data to come in.

I don't doubt, of course, that asyncio can handle this (whether the proposed 3.4 library or the concept in general), I'm just having trouble thinking about it since everything I've read seems to be "how to elegantly get your program to block until your input comes in". Instead, I want to repeatedly ask for input (which requires calling a specific function over and over), but do so in the background without turning every program using this API into one big loop. There's plenty of jargon I haven't figured out yet, but I don't know where to look--I don't feel like I've found the trail. Any suggestions for further reading?

FoiledAgain
May 6, 2007

What is the reason for string.find() to return -1 on failure? That leads to this unexpected output:

code:
>>> x = 'vikings'
>>> if x.find('eggs'): print('found it!')
...
found it!
>>>

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
In case it's at position 0:

code:
>>> x = 'eggs over easy'
>>> if x.find('eggs'): print('found it!')
...
>>> 

Opinion Haver
Apr 9, 2007

fletcher posted:

In case it's at position 0:

code:
>>> x = 'eggs over easy'
>>> if x.find('eggs'): print('found it!')
...
>>> 

So why doesn't it return None if it's not found?

suffix
Jul 27, 2013

Wheeee!

Opinion Haver posted:

So why doesn't it return None if it's not found?

The code would still be wrong, since both 0 and None would be false. Better to find the bug early.

.find() isn't usually what you want. If you want to check if it's there, use 'in'. If you need to know the position, use .index(), which raises an exception if it's not there.
code:
if 'eggs' in x: print('found it!')
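
To put the three options side by side, here is a small sketch. The function name describe is made up for illustration. The underlying gotcha is that bool(-1) is True and bool(0) is False, so the raw result of .find() is never a safe truth test.

```python
def describe(haystack, needle):
    """Return (membership test, index-or-None, raw find result)."""
    found = needle in haystack  # plain yes/no check
    try:
        position = haystack.index(needle)  # raises ValueError if absent
    except ValueError:
        position = None
    return found, position, haystack.find(needle)  # find gives -1 if absent
```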

QuarkJets
Sep 8, 2008

FoiledAgain posted:

What is the reason for string.find() to return -1 on failure? That leads to this unexpected output:

code:
>>> x = 'vikings'
>>> if x.find('eggs'): print('found it!')
...
found it!
>>>

You shouldn't be using string.find() for substring checks anyway, use in. If you wanted to use string.find(), then you could instead check that the returned index is at least 0

code:
if x.find('eggs') >= 0: print('found it!')
find always returns an integer, which is nice.

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

SurgicalOntologist posted:

Regarding that question I asked last week when everyone was like, "You're looking for asynchronous I/O"- I've been reading a bunch on that, and I think I'm learning something. However, pretty much everything I can find--especially beginner guides--frames the issue as one of dealing with slow I/O processes. I have the opposite problem: an extremely fast input process. The use case is that either (a) something should run every time new data comes in, or (b) something needs to reference only the most recent piece of data (b will be far more common than a). I will never have a situation where I'm waiting for data to come in.

I don't doubt, of course, that asyncio can handle this (whether the proposed 3.4 library or the concept in general), I'm just having trouble thinking about it since everything I've read seems to be "how to elegantly get your program to block until your input comes in". Instead, I want to repeatedly ask for input (which requires calling a specific function over and over), but do so in the background without turning every program using this API into one big loop. There's plenty of jargon I haven't figured out yet, but I don't know where to look--I don't feel like I've found the trail. Any suggestions for further reading?

Can you give more details about the exact problem you're trying to solve? What is the network input used for? Is this a command line program or a GUI? What GUI framework?

SurgicalOntologist
Jun 17, 2004

Suspicious Dish posted:

Can you give more details about the exact problem you're trying to solve? What is the network input used for? Is this a command line program or a GUI? What GUI framework?

I'm interfacing with a motion capture device, in an application using pyglet (actually I'm writing more of a framework that interfaces this and other equipment together using pyglet, it's the same problem more or less but I'd like the resulting API to be clean if possible). The device supplies position, angle, velocity, etc that will be used to update objects on the screen. Most commonly, the 3D position will be projected onto the plane of the screen. Basically I'm using a motion capture device to make a big touchscreen.

The device has an API with this example code:

code:
import vrpn

def callback(userdata, data):
    print(userdata, " => ", data);

tracker=vrpn.receiver.Tracker("Tracker0@localhost")
tracker.register_change_handler("position", callback, "position")

while 1:
    tracker.mainloop()
I tried having the callback dispatch a pyglet event, and using the pyglet schedule interval function to call tracker.mainloop, but the program blocks as it repeatedly calls the callback. I think it's too fast, and there's always another event waiting so nothing gets to happen. What I'd like for it to do is keep the most recent data in an attribute as well as store a history, so there would be no need for

This is strange, just tested this now: once I register the event with pyglet, even calling tracker.mainloop once causes the program to block. Not sure why that would happen.

Edit: I figured it out. The device fills the buffer if mainloop isn't repeatedly called, and so when I'm testing in the command line as opposed to a script, there could easily be a lot of data waiting. But it wouldn't have actually blocked forever. So I guess I just need to call mainloop faster than the data is coming in, and my program won't block. (Lightbulb) All this callback stuff with pyglet is pretty much asynchronous I/O already, isn't it?
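
A rough sketch of the "keep the most recent data plus a history" idea, with no vrpn or pyglet dependency so it runs on its own. TrackerState and on_change are made-up names; the (userdata, data) callback signature is taken from the device example above, and the wiring in the trailing comments is an untested assumption.

```python
from collections import deque


class TrackerState(object):
    """Holds the newest sample and a bounded history of earlier ones."""

    def __init__(self, maxlen=1000):
        self.latest = None
        # A deque with maxlen drops the oldest entry automatically.
        self.history = deque(maxlen=maxlen)

    def on_change(self, userdata, data):
        # Cheap on purpose: store the sample and return right away so
        # mainloop() can drain the device buffer quickly.
        self.latest = data
        self.history.append(data)


# Assumed wiring, not tested against real hardware:
#   state = TrackerState()
#   tracker.register_change_handler("position", state.on_change, "position")
#   pyglet.clock.schedule_interval(lambda dt: tracker.mainloop(), 1 / 120.0)
# Drawing code then reads state.latest whenever it needs a position.
```

Keeping the callback trivial means case (b), reading only the most recent value, never has to wait on anything.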

SurgicalOntologist fucked around with this message at 17:14 on Jan 14, 2014

Opinion Haver
Apr 9, 2007

QuarkJets posted:

find always returns an integer, which is nice.

Yeah, but you could still do this:

code:
if x.nonefind('eggs') is not None: print('found it!')
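
str has no nonefind method; it is hypothetical here. A sketch of what it could look like as a plain helper function:

```python
def nonefind(haystack, needle):
    """Like str.find, but return None instead of -1 when needle is absent."""
    position = haystack.find(needle)
    return None if position == -1 else position
```

A bare `if nonefind(x, 'eggs'):` would still be wrong for a match at position 0, which is the point made earlier: the explicit `is not None` check is required either way.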

Dickbutt Ouroboros
Nov 13, 2002

handbandit?
Son of a bitch!

I have a short, stupid question I've come across. The exercise solution for this suggests using multiple if statements, but a while loop seemed like it would also sort of work. For some reason this piece of code will only run 3 times. As soon as heads or tails gets to two it stops. Is it possible to use the and inside of the while statement, or am I missing something?

I see why this method is bad in practice, as the for loop will execute the full 10,000 times even after the while conditions are met. I just want to know why the while statement isn't working.

code:
from random import randint
heads = 0
tails = 0
trials = 10000
numFlips = 0
for counter in range(0,trials):
    while (heads <= 1) and (tails <= 1):
        coinflip = randint(0,1)
        if coinflip == 1:
            heads = heads + 1
            numFlips = numFlips + 1
        else:
            tails = tails + 1
            numFlips = numFlips + 1
    
print heads
print tails
print "It takes an average of {} flips to see both heads and tails.".format(numFlips)  

ManoliIsFat
Oct 4, 2002

It's because you want an OR. If heads is 2 and tails is 0, your while condition will stop being true (2<=1 AND 0<=1 evaluates to FALSE), and thus will break out of the while loop.

Computer viking
May 30, 2011
Now with less breakage.

Could you also do something like this?
while any((heads, tails) <= 1):

I can't offhand remember if comparing to a list will do what I hope, or even if "any" is actually a Python function. (I don't have a PC at hand to test right now.)
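
For the record, any is a Python built-in, but comparing a tuple to an integer does not compare element by element (in Python 3 it raises TypeError). A sketch of a form that does work, with keep_flipping and limit as made-up names:

```python
def keep_flipping(heads, tails, limit=1):
    """True while at least one of the counters is still <= limit."""
    # The generator expression applies the comparison to each counter in turn.
    return any(count <= limit for count in (heads, tails))
```

It would be used as `while keep_flipping(heads, tails):`, which is equivalent to `(heads <= 1) or (tails <= 1)`.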

Computer viking fucked around with this message at 23:04 on Jan 14, 2014

Dickbutt Ouroboros
Nov 13, 2002

handbandit?
Son of a bitch!

Okay, I think I see what you're saying. While can take a boolean value as a trigger. I was looking to drop out of the loop when both statements evaluated to FALSE using the and, but it is seeing it as a single statement.

ManoliIsFat
Oct 4, 2002

handbandit posted:

Okay, I think I see what you're saying. While can take a boolean value as a trigger. I was looking to drop out of the loop when both statements evaluated to FALSE using the and, but it is seeing it as a single statement.
Ya, that's how while loops work. It checks the truth of the statement every time it runs. In English, you're used to casually saying "while heads and tails are less than 1, keep doing this loop", but that's not what your boolean is doing. Your bool is saying "if heads <= 1 AND tails <= 1, this returns true. All other combinations return false".

pre:
        
         T <= 1     T > 1
H <= 1     T          F
H > 1      F          F
you want your code to look like this:
code:
from random import randint
heads = 0
tails = 0
trials = 10000
numFlips = 0
for counter in range(0,trials):
    while (heads <= 1) or (tails <= 1):
        coinflip = randint(0,1)
        if coinflip == 1:
            heads = heads + 1
            numFlips = numFlips + 1
        else:
            tails = tails + 1
            numFlips = numFlips + 1
    
print heads
print tails
print "It takes an average of {} flips to see both heads and tails.".format(numFlips)  
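
One thing the version above still does not do is reset heads and tails between trials, so the while loop only does real work on the first pass of the for loop and the final number is a total, not an average. A sketch of one way to structure it, assuming "see both heads and tails" means each side has come up at least once (needed=1); the function names and the seed parameter are made up for illustration.

```python
import random


def flips_until_both(rng, needed=1):
    """Flip until heads and tails have each come up `needed` times."""
    heads = tails = flips = 0  # fresh counters for every trial
    while heads < needed or tails < needed:
        if rng.randint(0, 1) == 1:
            heads += 1
        else:
            tails += 1
        flips += 1
    return flips


def average_flips(trials=10000, needed=1, seed=None):
    """Average number of flips per trial over `trials` independent trials."""
    rng = random.Random(seed)
    total = sum(flips_until_both(rng, needed) for _ in range(trials))
    return total / float(trials)
```

With needed=1 the average should come out close to 3; needed=2 matches the `<= 1` condition used in the posts above.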

Dominoes
Sep 20, 2007

Hey dudes, I'm wondering if y'all know a way to serve a generated text-based file directly from a web server for download, without saving it as a file first.

I have this code, where HttpResponse is a Django object, and xml is an xml ElementTree.
Python code:
    xml = lowfly_code.drx_from_db(Notam) # Also saves 'test.xml'
    response = HttpResponse(FileWrapper(open('test.xml')), content_type='application/xml')
    response['Content-Disposition'] = 'attachment; filename=test.xml'
    return response
I've unsuccessfully experimented with using ET's tostring function.

OnceIWasAnOstrich
Jul 22, 2006

Dominoes posted:

I've unsuccessfully experimented with using ET's tostring function.

What has been unsuccessful? Are you unable to get a string from ElementTree, or unable to create a response with it? It is possible you are using ET.tostring() on the ElementTree instead of an Element.

Python code:
response_text = ElementTree.tostring(xml.getroot())

Dominoes
Sep 20, 2007

OnceIWasAnOstrich posted:

What has been unsuccessful? Are you unable to get a string from ElementTree, or unable to create a response with it? It is possible you are using ET.tostring() on the ElementTree instead of an Element.

Python code:
response_text = ElementTree.tostring(xml.getroot())
Hey, I think you found the problem. I can't test it now since my code's temporarily broken, but that's consistent with the ElementTree docs; I was indeed using it on a Tree.

edit: You nailed it brother.

Python code:
xml, filename = lowfly_code.drx_from_db(Notam)
texml = ET.tostring(xml.getroot(), encoding='unicode')
response = HttpResponse(FileWrapper(io.StringIO(texml)), content_type='application/xml')
response['Content-Disposition'] = 'attachment; filename="{0}"'.format(filename)
return response

Dominoes fucked around with this message at 19:30 on Jan 15, 2014

OnceIWasAnOstrich
Jul 22, 2006

I can't say I am that familiar with Django so FileWrapper may do something that I am not aware of (maybe it streams the file? but you aren't using a streaming HttpResponse), but it seems like wrapping a wrapper of a string is excessive. HttpResponse can be easily created with just a plain string since you are already keeping the whole thing in memory at some point.

Pollyanna
Mar 5, 2005

Milk's on them.


Has anyone here done Shift-JIS decoding/encoding? I have a .bin file encoded with Shift-JIS (Japanese text), and I want to read each byte and translate them to UTF-8 or UTF-16. What I was thinking of was looping through it with file.read(1), then decoding/encoding the selected bytes (if that makes any sense). Would this work? Cause so far I just get strings like '\x00\xac' and I don't know how to change that to readable text. Has this been done before? Should I be reading one or two bytes at a time? What final encoding should I use? How I do :saddowns:

Dominoes
Sep 20, 2007

OnceIWasAnOstrich posted:

I can't say I am that familiar with Django so FileWrapper may do something that I am not aware of (maybe it streams the file? but you aren't using a streaming HttpResponse), but it seems like wrapping a wrapper of a string is excessive. HttpResponse can be easily created with just a plain string since you are already keeping the whole thing in memory at some point.
Right again; it still works after removing the FileWrapper.
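
For reference, a sketch of what the simplified version might look like with both FileWrapper and StringIO gone. tree_to_text and xml_download_response are made-up names, the Django import sits inside the function only so the XML half can be run without Django installed, and encoding="unicode" assumes Python 3.

```python
import xml.etree.ElementTree as ET


def tree_to_text(tree):
    """Serialize an ElementTree to a str via its root Element."""
    # tostring() wants an Element, hence getroot().
    return ET.tostring(tree.getroot(), encoding="unicode")


def xml_download_response(tree, filename):
    """Build a download response straight from the in-memory XML text."""
    from django.http import HttpResponse  # assumes Django is installed
    response = HttpResponse(tree_to_text(tree),
                            content_type="application/xml")
    response["Content-Disposition"] = (
        'attachment; filename="{0}"'.format(filename))
    return response
```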

Luigi Thirty
Apr 30, 2006

Emergency confection port.

Is it possible to get the current Windows sound volume via the Windows API libraries? I'm good at Python but bad at Windows and I'm getting conflicting information on doing it from the Googletron.

BeefofAges
Jun 5, 2004

Cry 'Havoc!', and let slip the cows of war.

Luigi Thirty posted:

Is it possible to get the current Windows sound volume via the Windows API libraries? I'm good at Python but bad at Windows and I'm getting conflicting information on doing it from the Googletron.

https://stackoverflow.com/questions/18112457/python-change-windows-7-master-volume ?

pywin32 or https://pypi.python.org/pypi/WMI/ are probably your best bets if you can find the Windows APIs for audio volume. Apparently the Windows audio volume APIs are cryptic and poorly documented. Good luck!

RyceCube
Dec 22, 2003
I have a bunch of data that looks like this:

code:
{"Game":"Chess","title":"just for fun!","size":"2","entriesData":
["PLAYERNAME","IMAGEHERE"],"entryFee":1,"prizeSummary","gameId":"9436","tableSpecId":"1079","dateUpdated":13898
10809648,"dateCreated":1389659697294,"stack":235,"entryHTML":null}
(added linebreaks to prevent table from breaking)


with basically a bunch of entries one after another all on one line from the website.

I want to parse this data to get playername, game type, etc.

I know I should use the JSON library to accomplish this.

The page I get the code from has a bunch of HTML on it as well. Is it okay to use the json.load on the html, or should I strip that from it first?

I'm not really entirely sure where to begin solving this problem, and am a bit confused by the JSON documentation.

Any tips or hints would be greatly appreciated.

Luigi Thirty
Apr 30, 2006

Emergency confection port.

BeefofAges posted:

https://stackoverflow.com/questions/18112457/python-change-windows-7-master-volume ?

pywin32 or https://pypi.python.org/pypi/WMI/ are probably your best bets if you can find the Windows APIs for audio volume. Apparently the Windows audio volume APIs are cryptic and poorly documented. Good luck!

gently caress that, nevermind. I don't need to know the volume that badly.

Phiberoptik posted:

I have a bunch of data that looks like this:

code:
{"Game":"Chess","title":"just for fun!","size":"2","entriesData":
["PLAYERNAME","IMAGEHERE"],"entryFee":1,"prizeSummary","gameId":"9436","tableSpecId":"1079","dateUpdated":13898
10809648,"dateCreated":1389659697294,"stack":235,"entryHTML":null}
(added linebreaks to prevent table from breaking)


with basically a bunch of entries one after another all on one line from the website.

I want to parse this data to get playername, game type, etc.

I know I should use the JSON library to accomplish this.

The page I get the code from has a bunch of HTML on it as well. Is it okay to use the json.load on the html, or should I strip that from it first?

I'm not really entirely sure where to begin solving this problem, and am a bit confused by the JSON documentation.

Any tips or hints would be greatly appreciated.

You need to get it down to just the JSON data if you want to load it into a JSON library. The HTML will make it barf. You could try clever applications of .split() on the raw page to try to get just the JSON separated out.

Luigi Thirty fucked around with this message at 21:13 on Jan 15, 2014

BeefofAges
Jun 5, 2004

Cry 'Havoc!', and let slip the cows of war.

Phiberoptik posted:

I have a bunch of data that looks like this:

code:
{"Game":"Chess","title":"just for fun!","size":"2","entriesData":
["PLAYERNAME","IMAGEHERE"],"entryFee":1,"prizeSummary","gameId":"9436","tableSpecId":"1079","dateUpdated":13898
10809648,"dateCreated":1389659697294,"stack":235,"entryHTML":null}
(added linebreaks to prevent table from breaking)


with basically a bunch of entries one after another all on one line from the website.

I want to parse this data to get playername, game type, etc.

I know I should use the JSON library to accomplish this.

The page I get the code from has a bunch of HTML on it as well. Is it okay to use the json.load on the html, or should I strip that from it first?

I'm not really entirely sure where to begin solving this problem, and am a bit confused by the JSON documentation.

Any tips or hints would be greatly appreciated.

Is there a different API or endpoint you can call that will just give you the JSON without any HTML?

Trying to parse the JSON out of a bunch of HTML sounds like you're doing it wrong.

Dominoes
Sep 20, 2007

Phiberoptik posted:

I have a bunch of data that looks like this:

code:
{"Game":"Chess","title":"just for fun!","size":"2","entriesData":
["PLAYERNAME","IMAGEHERE"],"entryFee":1,"prizeSummary","gameId":"9436",
"tableSpecId":"1079","dateUpdated":13898
10809648,"dateCreated":1389659697294,"stack":235,"entryHTML":null}
(added linebreaks to prevent table from breaking)
You're on the right track. Splitting the data from the HTML will be the hard part. I've heard good things about Beautiful Soup.

Once you've isolated the data as a string, run json.loads() to turn it into a dict.
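
If the JSON really is embedded in the page text and no cleaner endpoint exists, one possible sketch is to let the json module find where each object ends, using JSONDecoder.raw_decode. extract_json_objects is a made-up name, and this is a blunt instrument: any brace-delimited chunk that happens to be valid JSON (an empty {} in a script, say) gets picked up too.

```python
import json


def extract_json_objects(text):
    """Return every JSON object that can be decoded out of `text`."""
    decoder = json.JSONDecoder()
    results = []
    pos = 0
    while True:
        start = text.find('{', pos)
        if start == -1:
            break
        try:
            # raw_decode parses one value starting at `start` and reports
            # the index where it stopped, ignoring whatever follows.
            obj, end = decoder.raw_decode(text, start)
        except ValueError:
            pos = start + 1  # not JSON here, try the next brace
            continue
        results.append(obj)
        pos = end
    return results
```

Each result is an ordinary dict, so the game type is entry["Game"], and so on.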

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug

Pollyanna posted:

Has anyone here done Shift-JIS decoding/encoding? I have a .bin file encoded with Shift-JIS (Japanese text), and I want to read each byte and translate them to UTF-8 or UTF-16. What I was thinking of was looping through it with file.read(1), then decoding/encoding the selected bytes (if that makes any sense). Would this work? Cause so far I just get strings like '\x00\xac' and I don't know how to change that to readable text. Has this been done before? Should I be reading one or two bytes at a time? What final encoding should I use? How I do :saddowns:

How big is the file? You shouldn't read individual bytes for this -- these character encodings are standard in Python and you shouldn't even begin to reimplement them. You can do this line by line or (if the file's small enough) by reading the entire file into memory.

I'm curious about the .bin extension -- does the file only contain Shift-JIS text? If so, you can probably do
Python code:
#!/usr/bin/env python3
from argparse import ArgumentParser

p = ArgumentParser()
p.add_argument('input_file')
p.add_argument('output_file')
args = p.parse_args()

with open(args.input_file, encoding='shift-jis') as i, open(args.output_file, 'w', encoding='utf-8') as o:
    for line in i:
        o.write(line)
EDIT: Note that this is a bad reimplementation of the iconv utility. You should probably just use that.
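
To illustrate why byte-at-a-time reading goes wrong: Shift-JIS characters are one or two bytes long, so single-byte reads can split a character in half. A minimal in-memory sketch, assuming the data is pure Shift-JIS text; the function name is made up.

```python
def sjis_bytes_to_utf8(raw):
    """Decode Shift-JIS bytes to text, then re-encode that text as UTF-8."""
    text = raw.decode('shift_jis')  # bytes -> str, whole characters at once
    return text.encode('utf-8')     # str -> bytes in the target encoding
```

For a file that is only partly text, the same call can be applied to whichever slice of bytes holds the string.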

Lysidas fucked around with this message at 22:14 on Jan 15, 2014

John DiFool
Aug 28, 2013

Phiberoptik posted:

I have a bunch of data that looks like this:

code:
{"Game":"Chess","title":"just for fun!","size":"2","entriesData":
["PLAYERNAME","IMAGEHERE"],"entryFee":1,"prizeSummary","gameId":"9436","tableSpecId":"1079","dateUpdated":13898
10809648,"dateCreated":1389659697294,"stack":235,"entryHTML":null}
(added linebreaks to prevent table from breaking)

...

The page I get the code from has a bunch of HTML on it as well. Is it okay to use the json.load on the html, or should I strip that from it first?

...

Are you sure there isn't any way to grab that data from the server without the HTML? If that's dynamic data on an HTML page then I wouldn't be surprised if the page uses JavaScript to load the JSON from some specific URL on the server.
