  • Locked thread
Tacos Al Pastor
Jun 20, 2003

Pretty basic question:

I'm using Python 2.7. How would I keep the zero for input 059 using the following code?

Python code:
txid = str(input('Enter the TXID value: '))
It gives me an invalid token error. I would really like to be able to either strip the leading 0, as this is a decimal number, or keep it somehow to pass to a function. It seems like input really doesn't like 0. I've tried converting to an integer as well, and no luck there.

SurgicalOntologist
Jun 17, 2004

In python 2.7 you should be using raw_input. input will convert 059 to an int and so the 0 is lost immediately. With raw_input you get a string which you can deal with however you want.
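
For reference, Python 2's input() evaluates whatever is typed as a Python expression. A leading 0 marks an octal literal there, and 9 is not an octal digit, which is where the invalid token error comes from. Below is a minimal sketch of handling the value once it arrives as a string. The helper name normalize_txid and its keep_leading_zeros flag are made up for illustration, and the function takes the string as an argument so it runs the same whether that string came from raw_input (Python 2) or input (Python 3).

```python
def normalize_txid(raw, keep_leading_zeros=True):
    """Validate a TXID typed by the user and return it as a string.

    `raw` is the text exactly as typed, e.g. what raw_input() returns
    in Python 2 or input() returns in Python 3.
    """
    text = raw.strip()
    if not text.isdigit():
        raise ValueError("TXID must contain only digits: {!r}".format(raw))
    if keep_leading_zeros:
        return text
    # int() on a *string* parses base 10, so the leading zero is harmless here.
    return str(int(text))


kept = normalize_txid("059")
stripped = normalize_txid("059", keep_leading_zeros=False)
print(kept, stripped)
```

In Python 2.7 the call site would be normalize_txid(raw_input('Enter the TXID value: ')).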

Tacos Al Pastor
Jun 20, 2003

SurgicalOntologist posted:

In python 2.7 you should be using raw_input. input will convert 059 to an int and so the 0 is lost immediately. With raw_input you get a string which you can deal with however you want.


Ahh ok. Makes sense. Thanks so much.

Dominoes
Sep 20, 2007

nvm!

Dominoes
Sep 20, 2007

Hey, so I was trying to build a module to make quick, easy plots earlier, and we had a bit of a discussion. Turns out there's no need - Sympy's plotting module does everything I was trying to build! You just make a sympy expression, import plot from sympy.plotting, enter 'plot(expr)' and you've got an appropriately-ranged plot with axes, with one short line of code.

Sympy is great in general, especially with jupyter qtconsole!


This, every time you make a new plot:
Python code:
import numpy as np
import matplotlib.pyplot as plt

tau = 2 * np.pi  # one full turn
x = np.linspace(-tau, tau, 1000)
y = np.sin(x) + 1
plt.plot(x, y)
plt.show()
# Plus more stuff to get axes and labels to show up
is replaced with this:

Python code:
# One-time setup
import sympy
from sympy.plotting import plot
sympy.var('x')  # creates the symbol x and injects it into the namespace
Python code:
plot(sympy.sin(x) + 1)

Dominoes fucked around with this message at 19:44 on May 26, 2016

onionradish
Jul 6, 2006

That's spicy.
This is a really stupid, should-be-simple question, and I'm angry that I even have to ask after wasting so much time searching on Google, StackOverflow and Python docs.

How do I decode email subject field values like "=?utf-8?Q?Happy=20Memorial=20Day=21?=" into text/unicode?

Python code:
import poplib, email

user = '***'
password = '***'
server = '***'

Mailbox = poplib.POP3(server)
Mailbox.user(user)
Mailbox.pass_(password)
numMessages = len(Mailbox.list()[1])
for i in range(numMessages):
    raw_email  = b"\n".join(Mailbox.retr(i+1)[1])
    parsed_email = email.message_from_string(raw_email)
    print(parsed_email.get('Subject'))  #  ?? WTF

Dominoes
Sep 20, 2007

Python code:
import email.header  # "import email" alone doesn't guarantee the submodule is loaded
decoded = email.header.decode_header("=?utf-8?Q?Happy=20Memorial=20Day=21?=")
decoded[0][0].decode('utf-8')

onionradish
Jul 6, 2006

That's spicy.

Dominoes posted:

Python code:
decoded = email.header.decode_header("=?utf-8?Q?Happy=20Memorial=20Day=21?=")
decoded[0][0].decode('utf-8')
Yes, exactly! Goddammit and thank you!

Edit: and ideally, should it be:

Python code:
# decoded = [('Happy Memorial Day!', 'utf-8')]
txt, encoding = email.header.decode_header(foo)[0]
txt.decode(encoding)
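
One wrinkle with unpacking only the first (text, encoding) pair: a header can be split into several encoded chunks, and in Python 3 an unencoded subject comes back as a plain str with encoding None, so .decode(encoding) would fail. A hedged sketch that sidesteps both cases by letting the standard library reassemble the pieces with email.header.make_header (the wrapper name decode_subject is made up here, and this assumes Python 3):

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Turn an RFC 2047 encoded header value into ordinary text.

    decode_header() splits the value into (bytes_or_str, charset) chunks;
    make_header() stitches them back together, and str() renders the
    result as unicode text.
    """
    return str(make_header(decode_header(raw_subject)))


print(decode_subject("=?utf-8?Q?Happy=20Memorial=20Day=21?="))
print(decode_subject("Plain subject"))
```

In the POP3 loop above this would be called as decode_subject(parsed_email.get('Subject')).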

onionradish fucked around with this message at 20:24 on May 27, 2016

Lumpy
Apr 26, 2002

La! La! La! Laaaa!



College Slice
I searched the thread a bit and didn't see any recent discussion on it, so I shall ask about pulling text from PDF documents. I found a lot of packages that do this, but would love to hear any personal experience with them if people have any. I don't need any sort of images or charts from the docs, just plain old text. Thanks!

Dren
Jan 5, 2001

Pillbug

Lumpy posted:

I searched the thread a bit and didn't see any recent discussion on it, so I shall ask about pulling text from PDF documents. I found a lot of packages that do this, but would love to hear any personal experience with them if people have any. I don't need any sort of images or charts from the docs, just plain old text. Thanks!

it will work with varying degrees of success. Depends on the PDF. Could be the pdf you think is text is actually a bunch of images, in which case you'll get nothing. Could be there is some formatting that the text extractor will inconsistently be tripped up by. pdf is a crazy format.

onionradish
Jul 6, 2006

That's spicy.

Lumpy posted:

I searched the thread a bit and didn't see any recent discussion on it, so I shall ask about pulling text from PDF documents. I found a lot of packages that do this, but would love to hear any personal experience with them if people have any. I don't need any sort of images or charts from the docs, just plain old text. Thanks!

Echoing Dren's response. I've used the pdfminer library -- actually the pdf2txt.py CLI script that comes with it -- to extract text from short document PDFs. The order of various text blocks is often mixed up if there is multi-column content, captions, headings or headers/footers. Line breaks are often "hard breaks" requiring re-wrapping of paragraph text. Almost all of that has to do with the PDF format itself. The CLI script has some parameters that can help rejoin blocks of text.

If you're dealing with relatively consistent types of PDFs, especially if they're short, the cleanup isn't too bad -- it's simpler than cleanup after OCR, for example -- but it all depends on the PDF.
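
To give a feel for the cleanup step, here is a small sketch of re-wrapping "hard break" text after extraction. It uses only the standard library and assumes the extractor leaves a blank line between paragraphs, which will not be true for every PDF. The function name rewrap is made up for illustration and is not part of pdfminer.

```python
import re
import textwrap


def rewrap(extracted, width=72):
    """Rejoin hard-wrapped lines into paragraphs, then wrap them again.

    Paragraphs are assumed to be separated by at least one blank line.
    """
    paragraphs = re.split(r"\n\s*\n", extracted.strip())
    # Collapse every run of whitespace (including newlines) to one space.
    joined = [" ".join(p.split()) for p in paragraphs]
    return "\n\n".join(textwrap.fill(p, width=width) for p in joined if p)


sample = "The quick brown\nfox jumps over\nthe lazy dog.\n\nSecond\nparagraph."
print(rewrap(sample))
```

Mixed-up column order or stray headers and footers need document-specific handling on top of this.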

Lumpy
Apr 26, 2002

La! La! La! Laaaa!



College Slice

Dren posted:

it will work with varying degrees of success. Depends on the PDF. Could be the pdf you think is text is actually a bunch of images, in which case you'll get nothing. Could be there is some formatting that the text extractor will inconsistently be tripped up by. pdf is a crazy format.


onionradish posted:

Echoing Dren's response. I've used the pdfminer library -- actually the pdf2txt.py CLI script that comes with it -- to extract text from short document PDFs. The order of various text blocks is often mixed up if there is multi-column content, captions, headings or headers/footers. Line breaks are often "hard breaks" requiring re-wrapping of paragraph text. Almost all of that has to do with the PDF format itself. The CLI script has some parameters that can help rejoin blocks of text.

If you're dealing with relatively consistent types of PDFs, especially if they're short, the cleanup isn't too bad -- it's simpler than cleanup after OCR, for example -- but it all depends on the PDF.

Thanks for your responses. I know that PDFs are no fun, and that I can only strive for a "least worst" solution, but anything better than manually copying and pasting thousands of docs is an improvement!

Dominoes
Sep 20, 2007

Giving a shout-out to Python 3.5's infix matrix multiplication (@). I started learning linear algebra after it was released, and used numpy/sympy with it to help. I just tried to do matrix mult on an older version of python, and it's a PITA! Messy, difficult-to-read, and error-prone when using dot funcs and methods.
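
To show the readability difference without pulling in numpy, here is a toy sketch. The Matrix class below is hypothetical and only exists to demonstrate the __matmul__ hook that the @ operator calls in Python 3.5 and later; numpy arrays support the same operator.

```python
class Matrix:
    """Tiny list-of-lists matrix, just enough to demonstrate the @ operator."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def dot(self, other):
        # Transpose the right-hand side so each column is easy to iterate.
        cols = list(zip(*other.rows))
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in cols]
             for row in self.rows]
        )

    # Python 3.5+ calls __matmul__ for "a @ b".
    __matmul__ = dot

    def __eq__(self, other):
        return self.rows == other.rows

    def __repr__(self):
        return "Matrix({!r})".format(self.rows)


A = Matrix([[1, 2], [3, 4]])
B = Matrix([[0, 1], [1, 0]])
C = Matrix([[2, 0], [0, 2]])

chained_calls = A.dot(B).dot(C)  # the method-call style
infix = A @ B @ C                # the same product with the infix operator
print(infix)
```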

Feral Integral
Jun 6, 2006

YOSPOS

So I'm working on a project that is going to soon require me to run long-running calculation jobs which are all viewed/edited/created through a django site. Whatever is going to run the jobs will ideally be able to run multiple jobs at a time without blocking other jobs, as well as be able to give status info about running jobs when asked for it.

Is there a good python framework with lots of included bells and whistles for this kind of thing anyone might recommend, or any suggestions from people that have tackled something similar?

edit: http://python-rq.org/docs/workers/ looks p cool

Feral Integral fucked around with this message at 17:19 on Jun 6, 2016

Space Kablooey
May 6, 2009


Feral Integral posted:

So I'm working on a project that is going to soon require me to run long-running calculation jobs which are all viewed/edited/created through a django site. Whatever is going to run the jobs will ideally be able to run multiple jobs at a time without blocking other jobs, as well as be able to give status info about running jobs when asked for it.

Is there a good python framework with lots of included bells and whistles for this kind of thing anyone might recommend, or any suggestions from people that have tackled something similar?

edit: http://python-rq.org/docs/workers/ looks p cool

We use rq for that and it works nicely. You can also take a look at celery, which is a bit more complex, but it seems more powerful in general.
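
rq needs a running Redis server and celery needs a message broker, so neither is shown here. As a rough illustration of the shape of the problem (submit a job without blocking, then ask for its status later), here is a sketch that swaps in the standard library's concurrent.futures instead. The names long_calculation and job_status are made up, and an in-process thread pool is not a substitute for a real task queue behind a Django site.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def long_calculation(n):
    """Stand-in for a slow job."""
    time.sleep(0.1)
    return sum(i * i for i in range(n))


def job_status(future):
    """Translate a Future's state into a label a status page could show."""
    if future.cancelled():
        return "cancelled"
    if future.running():
        return "running"
    if future.done():
        return "failed" if future.exception() is not None else "finished"
    return "queued"


executor = ThreadPoolExecutor(max_workers=2)
# submit() returns immediately, so several jobs can be in flight at once.
jobs = {name: executor.submit(long_calculation, n)
        for name, n in [("small", 10), ("large", 1000)]}

results = {name: future.result() for name, future in jobs.items()}  # waits
statuses = {name: job_status(future) for name, future in jobs.items()}
executor.shutdown()
print(results["small"], statuses)
```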

Stringent
Dec 22, 2004


Seconding celery. It's been extremely reliable and has done everything we've wanted out of the box.

SurgicalOntologist
Jun 17, 2004

I just came back to PyCharm for the first time in a few versions (been in Jupyter more lately) and it's super slow. Every time it does an autocomplete lookup it freezes up for 2-3 seconds and all cores of my CPU go to 100%. Running Fedora 23 on a beefy machine. Tried with the Oracle JRE and it didn't help. Yes, it's done indexing. Has anyone successfully troubleshooted a PyCharm performance issue?

Dominoes
Sep 20, 2007

Clean uninstall / reinstall. I had your issue once; this was my solution.

SurgicalOntologist
Jun 17, 2004

That's how I got the new version in the first place.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

SurgicalOntologist posted:

I just came back to PyCharm for the first time in a few versions (been in Jupyter more lately) and it's super slow. Every time it does an autocomplete lookup it freezes up for 2-3 seconds and all cores of my CPU go to 100%. Running Fedora 23 on a beefy machine. Tried with the Oracle JRE and it didn't help. Yes, it's done indexing. Has anyone successfully troubleshooted a PyCharm performance issue?

Have you tried increasing the heap size?

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb
Python on Windows question:

I did a `pip install whatever` and it created a file C:\Python35\Scripts\whatever

In a command prompt I try to run it:
code:
C:\Users\fletcher> whatever
'whatever' is not recognized as an internal or external command,
operable program or batch file.
I double checked and both C:\Python35\Scripts\ and C:\Python35\ are in my %PATH% and .PY is in my %PATHEXT%. I thought that would let me run `whatever` from anywhere, but that doesn't seem to be the case. Haven't had any issues like this on my Linux VMs with this module.

Currently to run it I have to do:
code:
C:\Users\fletcher> python C:\Python35\Scripts\whatever --this --that

Space Kablooey
May 6, 2009


Try installing whatever with pip3. Oh no wait, it actually installed. Dunno then.

accipter
Sep 12, 2003

fletcher posted:

Python on Windows question...

Currently to run it I have to do:
code:
C:\Users\fletcher> python C:\Python35\Scripts\whatever --this --that

Are you trying to run whatever.exe or whatever.py? Is whatever.py considered an executable? Look at the Python on Windows FAQ.

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

accipter posted:

Are you trying to run whatever.exe or whatever.py? Is whatever.py considered an executable? Look at the Python on Windows FAQ.

It's just called "whatever" and it is executable as far as I can tell

code:
C:\Users\fletcher>ls -la C:\Python35\Scripts
-rwxr-xr-x  1 fletcher Administrators   272 Oct 19  2015 whatever
And the first line of the file is:
code:
#!c:\python35\python.exe
Maybe it just can't handle the lack of a file extension?

OnceIWasAnOstrich
Jul 22, 2006

The shebang does nothing on windows without cygwin. If you are relying on %PATHEXT% your script needs to have the extension...

For .py direct execution on windows you either need a *nix runtime or your Python installation to have added itself as a file association for .py files.
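
A rough model of the lookup being described, written as a sketch rather than a description of cmd.exe's actual algorithm: for a bare command name, each %PATH% directory is tried with each %PATHEXT% extension appended, so a file with no extension never matches. The function name resolve_command and the exists parameter are made up for illustration, matching here is case-sensitive (unlike the real thing), and ntpath is used so the Windows-style paths behave the same on any platform.

```python
import ntpath
import os


def resolve_command(name, path_dirs, pathext, exists=os.path.isfile):
    """Return the first 'directory\\name + ext' that exists, else None."""
    for directory in path_dirs:
        for ext in pathext:
            candidate = ntpath.join(directory, name + ext)
            if exists(candidate):
                return candidate
    return None


path_dirs = [r"C:\Python35", r"C:\Python35\Scripts"]
pathext = [".COM", ".EXE", ".BAT", ".PY"]

# Simulated file systems: one with the extensionless script, one with .PY.
without_ext = {r"C:\Python35\Scripts\whatever"}
with_ext = {r"C:\Python35\Scripts\whatever.PY"}

missing = resolve_command("whatever", path_dirs, pathext, without_ext.__contains__)
found = resolve_command("whatever", path_dirs, pathext, with_ext.__contains__)
print(missing, found)
```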

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

OnceIWasAnOstrich posted:

The shebang does nothing on windows without cygwin. If you are relying on %PATHEXT% your script needs to have the extension...

For .py direct execution on windows you either need a *nix runtime or your Python installation to have added itself as a file association for .py files.

Ahh, got it. That makes sense! Thank you.

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug

OnceIWasAnOstrich posted:

The shebang does nothing on windows without cygwin. If you are relying on %PATHEXT% your script needs to have the extension...

For .py direct execution on windows you either need a *nix runtime or your Python installation to have added itself as a file association for .py files.

The contents of the shebang line are actually interpreted by the py launcher included with Python 3.3 and newer, and it will even parse Unix paths to try to guess which version of Python should be used:

https://www.python.org/dev/peps/pep-0397/ posted:

The launcher supports shebang lines referring to Python executables with any of the (regex) prefixes "/usr/bin/", "/usr/local/bin" and "/usr/bin/env *", as well as binaries specified without

For example, a shebang line of '#! /usr/bin/python' should work even though there is unlikely to be an executable in the relative Windows directory "\usr\bin". This means that many scripts can use a single shebang line and be likely to work on both Unix and Windows without modification.

The launcher will support fully-qualified paths to executables. While this will make the script inherently non-portable, it is a feature offered by Unix and would be useful for Windows users in some cases.

(Emphasis mine)

GameCube
Nov 21, 2006

Yeah, but that won't let you just type whatever at the command line, will it? Try python -m whatever.

It seems weird to me that whatever doesn't have an extension. What's the actual pip module?

GameCube fucked around with this message at 05:13 on Jun 11, 2016

Dominoes
Sep 20, 2007

GameCube posted:

Yeah, but that won't let you just type whatever at the command line, will it? Try python -m whatever.

It seems weird to me that whatever doesn't have an extension. What's the actual pip module?

'whatever' sounds like it should be a Haskell datatype!

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug

GameCube posted:

Yeah, but that won't let you just type whatever at the command line, will it? Try python -m whatever.

It seems weird to me that whatever doesn't have an extension. What's the actual pip module?

It's been a long time since I've run Python on Windows, but I thought the py launcher is now what's associated with the .py extension, and so it would be invoked when you double-click or run a .py file from the command line.

In other news, Python 3.6 will include a protocol for specifying that an object is "path-like", and support for this has been added to the builtin open function, among other things :dance: This isn't very exciting on its own, but the implications are quite nice given the number of libraries that have convenience I/O functions that take a filename as an argument:

code:
Python 3.6.0a1+ (default:8ed3880e94e5, Jun 11 2016, 12:26:38) 
[GCC 5.3.1 20160413] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> from pathlib import Path
>>> p = Path('data.csv')
>>> data = pd.read_csv(p)
>>> data.shape
(150, 99)
I've been heavily using pathlib to dynamically create timestamped data and output directories in some analysis code, so each run of a script will plot its figures and dump analysis results in e.g. script_name_20160611-124416, and it'll be really nice to not have to wrap every Path object in str when using matplotlib and pandas output methods.
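
A minimal sketch of that pattern, assuming Python 3.6 or newer so that open() accepts a Path directly. The helper name make_run_dir is made up, and a temporary directory stands in for the real output location.

```python
import os
import tempfile
from datetime import datetime
from pathlib import Path


def make_run_dir(base, script_name, now=None):
    """Create and return base/<script_name>_<YYYYmmdd-HHMMSS>."""
    now = now or datetime.now()
    run_dir = Path(base) / "{}_{:%Y%m%d-%H%M%S}".format(script_name, now)
    run_dir.mkdir(parents=True)
    return run_dir


with tempfile.TemporaryDirectory() as tmp:
    run_dir = make_run_dir(tmp, "script_name", datetime(2016, 6, 11, 12, 44, 16))
    out_file = run_dir / "results.txt"

    # No str() wrapper needed: open() understands path-like objects in 3.6+.
    with open(out_file, "w") as fh:
        fh.write("ok\n")

    dir_name = run_dir.name
    as_text = os.fspath(out_file)  # what path-aware libraries call under the hood
    contents = out_file.read_text()

print(dir_name)
```

Libraries that have not adopted the protocol yet still need str(path).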

GameCube
Nov 21, 2006

Huh. I never realized that the windows command prompt would launch documents with their associated handler if you just typed the document name at the command line. The point of the py launcher, as far as I can tell, is to select the correct Python interpreter based on the shebang line, though. If the file doesn't have a .py extension, Windows won't know what application to handle it with.

IAmKale
Jun 7, 2007

やらないか

Fun Shoe
I've known about the existence of Mock and MagicMock for a while but never actually used either for testing. I'm starting a new Rest Framework project now and am writing tests as I code, though, so I figured now would be a good time to get better-acquainted with using them.

Is this a good way to use these to write tests? https://gist.github.com/MasterKale/a3d0ac16a375e772cab48e1e5f1bf2ec

It was definitely easier to mock up properties than to insert and grab a user from the database, but it feels kinda cheaty. Maybe that's the point?
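
For anyone who hasn't opened the gist, the general idea looks something like the sketch below. This is not the gist's code: greeting_for is a hypothetical function under test, and the attribute names are only assumed to resemble a Django user. It uses unittest.mock from the Python 3 standard library.

```python
from unittest.mock import MagicMock


def greeting_for(user):
    """Hypothetical code under test: only touches two things on the user."""
    name = user.get_full_name()
    if user.is_staff:
        return "Welcome back, {} (staff)".format(name)
    return "Welcome back, {}".format(name)


# Stand-in for a user object: no database row, just the pieces the code reads.
user = MagicMock()
user.is_staff = True
user.get_full_name.return_value = "Ada Lovelace"

message = greeting_for(user)

# The mock also records how it was used, which can itself be asserted on.
user.get_full_name.assert_called_once_with()
print(message)
```

The trade-off is that the test now encodes assumptions about the real object's interface. If the real user class renames get_full_name, this test keeps passing.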

Apocadall
Mar 25, 2010

Aren't you the guitarist for the feed dogs?

I'm trying to make a ping logger in python but it won't log the network host unreachable error.

code:
#!/usr/bin/python
import os
import sys

fsock = open('googlePing.txt', 'w')
sys.stderr = fsock

command0 = os.system('echo "New ping started: $(date)" >> googlePing.txt')
command1 = os.system('ping -i 0.5 8.8.8.8 | while read pong; do echo "$(date): $pong"; done >> googlePing.txt')
I'm sure there is more than a few things I'm doing wrong.

theLamer
Dec 2, 2005
I put the lame in blame
that's more bash than python. You probably want to redirect stderr to stdout by doing '2>&1'

Apocadall
Mar 25, 2010

Aren't you the guitarist for the feed dogs?

theLamer posted:

that's more bash than python. You probably want to redirect stderr to stdout by doing '2>&1'

I added that to the end of the ping before the pipe and removed the fsock stuff, it's working now, thank you!
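
If you'd rather keep it in Python than in the shell, the equivalent of 2>&1 is stderr=subprocess.STDOUT. A minimal sketch follows, with the function name log_command made up for illustration and the timestamp format left at datetime's default rather than matching date's output.

```python
import subprocess
from datetime import datetime


def log_command(cmd, log_path):
    """Run cmd, appending each output line to log_path with a timestamp.

    stderr is merged into stdout, so error lines such as
    "Network is unreachable" end up in the same log.
    """
    with open(log_path, "a") as log:
        log.write("New run started: {}\n".format(datetime.now()))
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # same effect as 2>&1
            universal_newlines=True,   # read text lines rather than bytes
        )
        for line in proc.stdout:
            log.write("{}: {}".format(datetime.now(), line))
            log.flush()
        proc.stdout.close()
        return proc.wait()
```

For the ping case the call would be log_command(['ping', '-i', '0.5', '8.8.8.8'], 'googlePing.txt'), which runs until interrupted, just like the shell version.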

Dominoes
Sep 20, 2007

Troubleshooting sympy's fancy printing in Windows on jupyter qtconsole. Running sympy.init_printing enables fancy equation printing; in a terminal, this uses an ASCII hack; in qtconsole, it's fancier and prettier. In Ubuntu it works fine for me: it mostly works as-is, but needs a LaTeX installation to make some of the fancier stuff work. In Windows it worked mostly fine as well, but wouldn't print matrices; blank output instead. Installing MiKTeX fixed the problem, but it now takes about a second to print anything, and pulls up a few cmd prompt windows, then makes them disappear each time.

1 - How can I fix this?
2 - How does Jupyter interact with your LaTeX installation? In both Win and Ubuntu, it just knows it's there, and somehow uses it, despite LaTeX being offered in a number of separate software packages.

Dominoes fucked around with this message at 14:10 on Jun 14, 2016

hanyolo
Jul 18, 2013
I am an employee of the Microsoft Gaming Division and they pay me to defend the Xbox One on the Something Awful Forums
Is there a cleaner way of doing this?

Essentially what I'm eventually trying to do is have a .json file that I can load which will dynamically create a bunch of class instances with their saved values, and then when I want to save back to that file, read and update all the data from those classes back into the file. The below works but I feel there's probably a better way to do it? (especially if there are 20-30 objects inside the class)

code:
class Player:
        def __init__(self, x, y):
                self.x = x
                self.y = y

if __name__ == '__main__':
        savefile = {'player2': {'posx': 200, 'posy': 200}, 'player1': {'posx': 100, 'posy': 100}}

        print savefile

        people = {}
        for key in savefile:
                people[key] = Player(savefile[key]['posx'], savefile[key]['posy'])

        print people

        people['player1'].x = 300
        people['player1'].y = 300

        for key in people:
                savefile[key]['posx'] = people[key].x
                savefile[key]['posy'] = people[key].y

        print savefile

--- Print Output ---
{'player2': {'posx': 200, 'posy': 200}, 'player1': {'posx': 100, 'posy': 100}}
{'player2': <__main__.Player instance at 0x2b57bd3407a0>, 'player1': <__main__.Player instance at 0x2b57bd3407e8>}
{'player2': {'posx': 200, 'posy': 200}, 'player1': {'posx': 300, 'posy': 300}}

hanyolo fucked around with this message at 02:59 on Jun 23, 2016

SurgicalOntologist
Jun 17, 2004

You want to use JSONEncoder and JSONDecoder.

Python code:
In [1]: import json

In [2]: class Player:
   ....:     def __init__(self, x, y):
   ....:         self.x = x
   ....:         self.y = y
   ....:     @classmethod
   ....:     def from_dict(cls, dict_):
   ....:         return cls(dict_['posx'], dict_['posy'])
   ....:     def __repr__(self):
   ....:         return '{}({}, {})'.format(type(self).__name__, self.x, self.y)
   ....:     def as_dict(self):
   ....:         return {'posx': self.x, 'posy': self.y}

In [3]: player_decoder = json.JSONDecoder(object_hook=Player.from_dict)

In [4]: player_encoder = json.JSONEncoder(default=Player.as_dict)

In [5]: json_data = json.dumps([{'posx': 200, 'posy': 200}, {'posx': 100, 'posy': 100}])

In [6]: json_data
Out[6]: '[{"posy": 200, "posx": 200}, {"posy": 100, "posx": 100}]'

In [7]: players = player_decoder.decode(json_data)

In [8]: players
Out[8]: [Player(200, 200), Player(100, 100)]

In [9]: player_encoder.encode(players)
Out[9]: '[{"posy": 200, "posx": 200}, {"posy": 100, "posx": 100}]'
That may be a little more verbose than necessary but it's how I usually do it. Someone will probably come along with a better way. For example, if you make the attributes and dictionary keys match, you can probably make a shortcut involving the instance's __dict__, something like universal_encoder = JSONEncoder(default=attrgetter('__dict__')) (with attrgetter imported from operator).

By the way, this will be tricky if you want your JSON to be a dict of Players, since the decoder uses the object_hook for every dict. Which is why I used a list here. Since you are using keys like 'player1' and 'player2', a list is more appropriate in any case, as the key only contains ordinal information (i.e. you may as well represent "first player" as players[0] rather than players['player1']). If you want to use real names, there may be a way to get a dict of Players to work, but it may be easier to just use a name attribute/key within the object.

SurgicalOntologist fucked around with this message at 04:44 on Jun 23, 2016

hanyolo
Jul 18, 2013
I am an employee of the Microsoft Gaming Division and they pay me to defend the Xbox One on the Something Awful Forums
Thanks! That makes a bit more sense to me (still reading up on class decorators), and the stackexchange examples I saw used them, but I couldn't really figure out how to fit them to my use case.

The keys will be unique names or IDs, but previously I've used a name key within the object, so using a list definitely seems like a better idea. I intend to have the attributes and dictionary keys match as well; it just seems silly not to.

SurgicalOntologist
Jun 17, 2004

hanyolo posted:

Thanks! that makes a bit more sense to me (still reading up on class decorators), and the stackexchange examples I saw used them, but I couldn't really figure out how to fit them towards my use case

The idea of a classmethod is that, in contrast to a regular method, which takes the object instance as its first argument, it takes the class itself. So you would call it without a specific player in mind: Player.some_class_method() rather than some_player.some_regular_method().

99% of the time I use a classmethod it's for making an alternate object constructor. So in that example, you could create a player with Player(100, 100) or Player.from_dict({'posx': 100, 'posy': 100}). I think of it like an alternative to __init__.

The use case of an alternative constructor when trying to make a JSONDecoder is just as in my example, to make a constructor that takes a dict. Since that is what JSONDecoder needs: it calls whatever you passed in as object_hook when it encounters a JSON dict, passing it the decoded dict. Every time it encounters a dict, it calls Player.from_dict, giving you a Player rather than a dict.

If you wanted to use a dict of players, the solution would be something like this:

Python code:
def flexible_object_hook(dict_):
    if 'posx' in dict_ and 'posy' in dict_:
        return Player.from_dict(dict_)
    else:
        return dict_
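
Putting the pieces of this exchange together, here is a hedged round-trip sketch for the dict-of-players layout from the original question. It assumes the attribute names are changed to match the JSON keys (posx, posy) so the __dict__ shortcut works, and it uses json.loads and json.dumps with the same object_hook and default arguments rather than building encoder objects. The hook name player_hook is made up.

```python
import json
from operator import attrgetter


class Player:
    def __init__(self, posx, posy):
        self.posx = posx
        self.posy = posy


def player_hook(dict_):
    """Build a Player from dicts that look like one; pass others through."""
    if set(dict_) == {"posx", "posy"}:
        return Player(**dict_)
    return dict_


saved = ('{"player1": {"posx": 100, "posy": 100},'
         ' "player2": {"posx": 200, "posy": 200}}')

# The hook runs on the inner dicts first, then on the outer dict.
players = json.loads(saved, object_hook=player_hook)

players["player1"].posx = 300
players["player1"].posy = 300

# default= is called for anything json can't serialize, i.e. each Player.
resaved = json.dumps(players, default=attrgetter("__dict__"), sort_keys=True)
print(resaved)
```

With 20-30 attributes per object nothing here changes, since neither the hook nor the encoder lists the attributes one by one. The key-set check in player_hook is the part that would need adapting.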

SurgicalOntologist fucked around with this message at 05:10 on Jun 23, 2016
