Python information and short questions megathread.

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Python information and short questions megathread.

Tacos Al Pastor: Jun 20, 2003

Pretty basic question:

I'm using Python 2.7. How would I keep the zero for input 059 using the following code?

Python code:

txid = str(input('Enter the TXID value: '))

It gives me an invalid token error. I would really like to be able to either strip the leading 0, as this is a decimal number, or keep it somehow to pass to a function. It seems like input really doesn't like 0. I've tried converting to an integer as well and no luck there.

# ? May 20, 2016 20:01

SurgicalOntologist: Jun 17, 2004

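In Python 2.7 you should be using raw_input. input evaluates whatever you type as a Python expression, so 059 is parsed as an octal literal, and since 9 isn't an octal digit you get the invalid token error (a valid one like 057 would silently become the int 47). With raw_input you get a string which you can deal with however you want.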
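Something like this is a minimal sketch of treating the TXID as text instead of as a number. The helper name normalize_txid is made up for illustration, and the prompt lines are left as comments so the snippet runs without a terminal:

```python
def normalize_txid(text, keep_leading_zeros=True):
    """Return the TXID as a string, optionally without its leading zeros."""
    text = text.strip()
    if not text.isdigit():
        raise ValueError("TXID must contain only digits: {!r}".format(text))
    if keep_leading_zeros:
        return text
    # int() with an explicit base parses '059' as decimal 59, never as octal.
    return str(int(text, 10))


# Python 2: txid = normalize_txid(raw_input('Enter the TXID value: '))
# Python 3: txid = normalize_txid(input('Enter the TXID value: '))
print(normalize_txid('059'))         # 059
print(normalize_txid('059', False))  # 59
```

On Python 3 the plain input function already returns a string, so the same helper applies there unchanged.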

# ? May 20, 2016 20:11

Tacos Al Pastor: Jun 20, 2003

SurgicalOntologist posted:

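In Python 2.7 you should be using raw_input. input evaluates whatever you type as a Python expression, so 059 is parsed as an octal literal, and since 9 isn't an octal digit you get the invalid token error (a valid one like 057 would silently become the int 47). With raw_input you get a string which you can deal with however you want.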

Ahh ok. Makes sense. Thanks so much.

# ? May 20, 2016 20:16

Dominoes: Sep 20, 2007

nvm!

# ? May 26, 2016 05:24

Dominoes: Sep 20, 2007

Hey, so I was trying to build a module to make quick, easy plots earlier, and we had a bit of a discussion. Turns out there's no need - Sympy's plotting module does everything I was trying to build! You just make a sympy expression, import plot from sympy.plotting, enter 'plot(expr)' and you've got an appropriately-ranged plot with axes, with one short line of code.

Sympy is great in general, especially with jupyter qtconsole!

This, every time you make a new plot:

Python code:

import numpy as np
import matplotlib.pyplot as plt
tau = 2 * np.pi
x = np.linspace(-tau, tau, 1000)
y = np.sin(x) + 1
plt.plot(x, y)
plt.show()
# Plus more stuff to get axes and labels to show up

is replaced with this:

Python code:

# One-time setup
import sympy
sympy.var('x')  # creates the symbol x and injects it into the namespace
from sympy.plotting import plot

Python code:

plot(sympy.sin(x) + 1)

Dominoes fucked around with this message at 19:44 on May 26, 2016

# ? May 26, 2016 19:35

onionradish: Jul 6, 2006; That's spicy.

This is a really stupid, should-be-simple question and I'm angry that I even have to ask after wasting so much time searching on Google, StackOverflow and Python docs.

How do I decode email subject field values like "=?utf-8?Q?Happy=20Memorial=20Day=21?=" into text/unicode?

Python code:

import poplib, email

user = '***'
password = '***'
server = '***'

Mailbox = poplib.POP3(server)
Mailbox.user(user)
Mailbox.pass_(password)
numMessages = len(Mailbox.list()[1])
for i in range(numMessages):
    raw_email = b"\n".join(Mailbox.retr(i+1)[1])
    # On Python 3, retr() returns bytes, so use email.message_from_bytes here.
    parsed_email = email.message_from_string(raw_email)
    print(parsed_email.get('Subject'))  #  ?? WTF

# ? May 27, 2016 19:53

Dominoes: Sep 20, 2007

Python code:

import email.header
decoded = email.header.decode_header("=?utf-8?Q?Happy=20Memorial=20Day=21?=")
decoded[0][0].decode('utf-8')

# ? May 27, 2016 20:11

onionradish: Jul 6, 2006; That's spicy.

Dominoes posted:

Python code:

decoded = email.header.decode_header("=?utf-8?Q?Happy=20Memorial=20Day=21?=")
decoded[0][0].decode('utf-8')

Yes, exactly! Goddammit and thank you!

Edit: and ideally, should it be:

Python code:

# decoded = [('Happy Memorial Day!', 'utf-8')]
txt, encoding = email.header.decode_header(foo)[0]
txt.decode(encoding)
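One caveat with unpacking only the first element: decode_header returns one (text, charset) pair per chunk of the header, and the charset is None for parts that were never encoded, so txt.decode(encoding) can fail on a plain subject. A hedged sketch for Python 3, where make_header reassembles the chunks (decode_subject is a hypothetical helper name, not something from the posts above):

```python
from email.header import decode_header, make_header


def decode_subject(raw_subject):
    """Decode an RFC 2047 encoded header value into a text string."""
    # decode_header yields (chunk, charset) pairs; make_header joins them
    # and applies each chunk's own charset.
    return str(make_header(decode_header(raw_subject)))


print(decode_subject("=?utf-8?Q?Happy=20Memorial=20Day=21?="))  # Happy Memorial Day!
print(decode_subject("Plain subject"))                          # Plain subject
```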

onionradish fucked around with this message at 20:24 on May 27, 2016

# ? May 27, 2016 20:19

Lumpy: Apr 26, 2002; La! La! La! Laaaa!; College Slice

I searched the thread a bit and didn't see any recent discussion on it, so I shall ask about pulling text from PDF documents. I found a lot of packages that do this, but would love to hear any personal experience with them if people have any. I don't need any sort of images or charts from the docs, just plain old text. Thanks!

# ? May 31, 2016 17:01

Dren: Jan 5, 2001; Pillbug

Lumpy posted:

I searched the thread a bit and didn't see any recent discussion on it, so I shall ask about pulling text from PDF documents. I found a lot of packages that do this, but would love to hear any personal experience with them if people have any. I don't need any sort of images or charts from the docs, just plain old text. Thanks!

it will work with varying degrees of success. Depends on the PDF. Could be the pdf you think is text is actually a bunch of images, in which case you'll get nothing. Could be there is some formatting that the text extractor will inconsistently be tripped up by. pdf is a crazy format.

# ? May 31, 2016 22:41

onionradish: Jul 6, 2006; That's spicy.

Lumpy posted:

I searched the thread a bit and didn't see any recent discussion on it, so I shall ask about pulling text from PDF documents. I found a lot of packages that do this, but would love to hear any personal experience with them if people have any. I don't need any sort of images or charts from the docs, just plain old text. Thanks!

Echoing Dren's response. I've used the pdfminer library -- actually the pdf2txt.py CLI script that comes with it -- to extract text from short document PDFs. The order of various text blocks is often mixed up if there is multi-column content, captions, headings or headers/footers. Line breaks are often "hard breaks" requiring re-wrapping of paragraph text. Almost all of that has to do with the PDF format itself. The CLI script has some parameters that can help rejoin blocks of text.

If you're dealing with relatively consistent types of PDFs, especially if they're short, the cleanup isn't too bad -- it's simpler than cleanup after OCR, for example -- but it all depends on the PDF.

# ? Jun 2, 2016 03:09

Lumpy: Apr 26, 2002; La! La! La! Laaaa!; College Slice

Dren posted:

it will work with varying degrees of success. Depends on the PDF. Could be the pdf you think is text is actually a bunch of images, in which case you'll get nothing. Could be there is some formatting that the text extractor will inconsistently be tripped up by. pdf is a crazy format.

onionradish posted:

Echoing Dren's response. I've used the pdfminer library -- actually the pdf2txt.py CLI script that comes with it -- to extract text from short document PDFs. The order of various text blocks is often mixed up if there is multi-column content, captions, headings or headers/footers. Line breaks are often "hard breaks" requiring re-wrapping of paragraph text. Almost all of that has to do with the PDF format itself. The CLI script has some parameters that can help rejoin blocks of text.

If you're dealing with relatively consistent types of PDFs, especially if they're short, the cleanup isn't too bad -- it's simpler than cleanup after OCR, for example -- but it all depends on the PDF.

Thanks for your responses. I know that PDFs are no fun, and that I can only strive for a "least worst" solution, but anything better than manually copying and pasting thousands of docs is an improvement!

# ? Jun 2, 2016 16:43

Dominoes: Sep 20, 2007

Giving a shout-out to Python 3.5's infix matrix multiplication (@). I started learning linear algebra after it was released, and used numpy/sympy with it to help. I just tried to do matrix mult on an older version of python, and it's a PITA! Messy, difficult-to-read, and error-prone when using dot funcs and methods.
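For anyone curious what the operator hooks into: Python 3.5+ turns a @ b into a call to a.__matmul__(b). The sketch below uses a throwaway pure-Python Matrix class (an assumption for illustration, not numpy's implementation) so it runs with no installs; with 2-D numpy arrays, a @ b gives the same product as a.dot(b).

```python
class Matrix:
    """Tiny list-of-rows matrix, just enough to demonstrate the @ operator."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __matmul__(self, other):
        # Python 3.5+ calls this for `self @ other` (PEP 465).
        if len(self.rows[0]) != len(other.rows):
            raise ValueError("inner dimensions do not match")
        columns = list(zip(*other.rows))
        return Matrix(
            [[sum(a * b for a, b in zip(row, col)) for col in columns]
             for row in self.rows]
        )

    def __eq__(self, other):
        return isinstance(other, Matrix) and self.rows == other.rows

    def __repr__(self):
        return "Matrix({!r})".format(self.rows)


a = Matrix([[1, 2], [3, 4]])
b = Matrix([[0, 1], [1, 0]])
identity = Matrix([[1, 0], [0, 1]])

print(a @ b)             # Matrix([[2, 1], [4, 3]])
print(a @ b @ identity)  # chains left to right, no nested dot() calls
```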

# ? Jun 6, 2016 07:39

Feral Integral: Jun 6, 2006; YOSPOS

So I'm working on a project that is going to soon require me to run long-running calculation jobs which are all viewed/edited/created through a django site. Whatever is going to run the jobs will ideally be able to run multiple jobs at a time without blocking other jobs, as well as be able to give status info about running jobs when asked for it.

Is there a good python framework with lots of included bells and whistles for this kind of thing anyone might recommend, or any suggestions from people that have tackled something similar?

edit: http://python-rq.org/docs/workers/ looks p cool

Feral Integral fucked around with this message at 17:19 on Jun 6, 2016

# ? Jun 6, 2016 17:14

Space Kablooey: May 6, 2009

Feral Integral posted:

So I'm working on a project that is going to soon require me to run long-running calculation jobs which are all viewed/edited/created through a django site. Whatever is going to run the jobs will ideally be able to run multiple jobs at a time without blocking other jobs, as well as be able to give status info about running jobs when asked for it.

Is there a good python framework with lots of included bells and whistles for this kind of thing anyone might recommend, or any suggestions from people that have tackled something similar?

edit: http://python-rq.org/docs/workers/ looks p cool

We use rq for that and it works nicely. You can also take a look at celery, which is a bit more complex, but it seems more powerful in general.
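To make the shape of the problem concrete, here is a rough sketch of "submit jobs, keep going, ask for status later". It deliberately uses the standard library's concurrent.futures as a stand-in, not rq or celery, so it runs without Redis or a broker; long_calculation and job_status are hypothetical names. A real queue adds persistence and separate worker processes on top of this idea.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def long_calculation(n):
    """Hypothetical stand-in for a long-running job."""
    time.sleep(0.05)
    return sum(i * i for i in range(n))


def job_status(future):
    """Map a Future onto the kind of status string a site could display."""
    if future.done():
        return "failed" if future.exception() else "finished"
    return "started" if future.running() else "queued"


executor = ThreadPoolExecutor(max_workers=2)
jobs = {name: executor.submit(long_calculation, n)
        for name, n in [("small", 10), ("medium", 1000), ("large", 100000)]}

print({name: job_status(f) for name, f in jobs.items()})  # snapshot while running
executor.shutdown(wait=True)                              # block until all jobs end
print({name: job_status(f) for name, f in jobs.items()})
print(jobs["small"].result())
```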

# ? Jun 6, 2016 17:50

Stringent: Dec 22, 2004; image text goes here

Seconding celery. It's been extremely reliable and has done everything we've wanted out of the box.

# ? Jun 6, 2016 23:41

SurgicalOntologist: Jun 17, 2004

I just came back to PyCharm for the first time in a few versions (been in Jupyter more lately) and it's super slow. Every time it does an autocomplete lookup it freezes up for 2-3 seconds and all cores of my CPU go to 100%. Running Fedora 23 on a beefy machine. Tried with the Oracle JRE and it didn't help. Yes, it's done indexing. Has anyone successfully troubleshooted a PyCharm performance issue?

# ? Jun 7, 2016 20:45

Dominoes: Sep 20, 2007

Clean uninstall / reinstall. I had your issue once; this was my solution.

# ? Jun 7, 2016 20:55

SurgicalOntologist: Jun 17, 2004

That's how I got the new version in the first place.

# ? Jun 7, 2016 20:58

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

SurgicalOntologist posted:

I just came back to PyCharm for the first time in a few versions (been in Jupyter more lately) and it's super slow. Every time it does an autocomplete lookup it freezes up for 2-3 seconds and all cores of my CPU go to 100%. Running Fedora 23 on a beefy machine. Tried with the Oracle JRE and it didn't help. Yes, it's done indexing. Has anyone successfully troubleshooted a PyCharm performance issue?

Have you tried increasing the heap size?

# ? Jun 10, 2016 00:24

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

Python on Windows question:

I did a `pip install whatever` and it created a file C:\Python35\Scripts\whatever

In a command prompt I try to run it:

code:

C:\Users\fletcher> whatever
'whatever' is not recognized as an internal or external command,
operable program or batch file.

I double checked and both C:\Python35\Scripts\ and C:\Python35\ are in my %PATH% and .PY is in my %PATHEXT%. I thought that would let me run `whatever` from anywhere, but that doesn't seem to be the case. Haven't had any issues like this on my Linux VMs with this module.

Currently to run it I have to do:

code:

C:\Users\fletcher> python C:\Python35\Scripts\whatever --this --that

# ? Jun 10, 2016 00:27

Space Kablooey: May 6, 2009

~~Try installing whatever with pip3.~~ Oh no wait, it actually installed. Dunno then.

# ? Jun 10, 2016 21:02

accipter: Sep 12, 2003

fletcher posted:

Python on Windows question...

Currently to run it I have to do:
code:
C:\Users\fletcher> python C:\Python35\Scripts\whatever --this --that

Are you trying to run whatever.exe or whatever.py? Is whatever.py considered an executable? Look at the Python on Windows FAQ.

# ? Jun 10, 2016 23:41

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

accipter posted:

Are you trying to run whatever.exe or whatever.py? Is whatever.py considered an executable? Look at the Python on Windows FAQ.

It's just called "whatever" and it is executable as far as I can tell

code:

C:\Users\fletcher>ls -la C:\Python35\Scripts
-rwxr-xr-x  1 fletcher Administrators   272 Oct 19  2015 whatever

And the first line of the file is:

code:

#!c:\python35\python.exe

Maybe it just can't handle the lack of a file extension?

# ? Jun 10, 2016 23:50

OnceIWasAnOstrich: Jul 22, 2006

The shebang does nothing on windows without cygwin. If you are relying on %PATHEXT% your script needs to have the extension...

For .py direct execution on windows you either need a *nix runtime or your Python installation to have added itself as a file association for .py files.

# ? Jun 11, 2016 01:06

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

OnceIWasAnOstrich posted:

The shebang does nothing on windows without cygwin. If you are relying on %PATHEXT% your script needs to have the extension...

For .py direct execution on windows you either need a *nix runtime or your Python installation to have added itself as a file association for .py files.

Ahh, got it. That makes sense! Thank you.

# ? Jun 11, 2016 01:11

Lysidas: Jul 26, 2002; John Diefenbaker is a madman who thinks he's John Diefenbaker.; Pillbug

OnceIWasAnOstrich posted:

The shebang does nothing on windows without cygwin. If you are relying on %PATHEXT% your script needs to have the extension...

For .py direct execution on windows you either need a *nix runtime or your Python installation to have added itself as a file association for .py files.

The contents of the shebang line are actually interpreted by the py launcher included with Python 3.3 and newer, and it will even parse Unix paths to try to guess which version of Python should be used:

https://www.python.org/dev/peps/pep-0397/ posted:

The launcher supports shebang lines referring to Python executables with any of the (regex) prefixes "/usr/bin/", "/usr/local/bin" and "/usr/bin/env *", as well as binaries specified without

For example, a shebang line of '#! /usr/bin/python' should work even though there is unlikely to be an executable in the relative Windows directory "\usr\bin". This means that many scripts can use a single shebang line and be likely to work on both Unix and Windows without modification.

The launcher will support fully-qualified paths to executables. While this will make the script inherently non-portable, it is a feature offered by Unix and would be useful for Windows users in some cases.

(Emphasis mine)

# ? Jun 11, 2016 03:48

GameCube: Nov 21, 2006

Yeah, but that won't let you just type whatever at the command line, will it? Try python -m whatever.

It seems weird to me that whatever doesn't have an extension. What's the actual pip module?

GameCube fucked around with this message at 05:13 on Jun 11, 2016

# ? Jun 11, 2016 05:09

Dominoes: Sep 20, 2007

GameCube posted:

Yeah, but that won't let you just type whatever at the command line, will it? Try python -m whatever.

It seems weird to me that whatever doesn't have an extension. What's the actual pip module?

'whatever' sounds like it should be a Haskell datatype!

# ? Jun 11, 2016 05:19

Lysidas: Jul 26, 2002; John Diefenbaker is a madman who thinks he's John Diefenbaker.; Pillbug

GameCube posted:

Yeah, but that won't let you just type whatever at the command line, will it? Try python -m whatever.

It seems weird to me that whatever doesn't have an extension. What's the actual pip module?

It's been a long time since I've run Python on Windows, but I thought the py launcher is now what was associated with the .py extension and so would be invoked when you double-click or run a .py file from the command line.

In other news, Python 3.6 will include a protocol for specifying that an object is "path-like", and support for this has been added to the builtin open function, among other things :dance:

This isn't very exciting on its own, but the implications are quite nice given the number of libraries that have convenience I/O functions that take a filename as an argument:

code:

Python 3.6.0a1+ (default:8ed3880e94e5, Jun 11 2016, 12:26:38) 
[GCC 5.3.1 20160413] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> from pathlib import Path
>>> p = Path('data.csv')
>>> data = pd.read_csv(p)
>>> data.shape
(150, 99)

I've been heavily using pathlib to dynamically create timestamped data and output directories in some analysis code, so each run of a script will plot its figures and dump analysis results in e.g. script_name_20160611-124416, and it'll be really nice to not have to wrap every Path object in str when using matplotlib and pandas output methods.
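A sketch of that timestamped-directory pattern, assuming Python 3.6+ so that open() takes the Path directly. make_run_dir and the results.txt file name are made up for the example, and a temp directory stands in for the real output location:

```python
import os
import tempfile
from datetime import datetime
from pathlib import Path


def make_run_dir(base_dir, script_name, now=None):
    """Create and return base_dir/<script_name>_<YYYYmmdd-HHMMSS>."""
    now = now or datetime.now()
    stamp = now.strftime("%Y%m%d-%H%M%S")
    run_dir = Path(base_dir) / "{}_{}".format(script_name, stamp)
    run_dir.mkdir(parents=True)
    return run_dir


run_dir = make_run_dir(tempfile.mkdtemp(), "script_name",
                       datetime(2016, 6, 11, 12, 44, 16))
results = run_dir / "results.txt"

# Python 3.6+: open() accepts the Path directly; before that, use str(results).
with open(results, "w") as handle:
    handle.write("analysis output\n")

print(run_dir.name)        # script_name_20160611-124416
print(os.fspath(results))  # the plain string path, via the path-like protocol
```

os.fspath is the function libraries can call to accept either a str or a path-like object, which is what makes the pandas example above work without the str() wrapper.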

# ? Jun 11, 2016 17:46

GameCube: Nov 21, 2006

Huh. I never realized that the windows command prompt would launch documents with their associated handler if you just typed the document name at the command line. The point of the py launcher, as far as I can tell, is to select the correct Python interpreter based on the shebang line, though. If the file doesn't have a .py extension, Windows won't know what application to handle it with.

# ? Jun 11, 2016 21:33

IAmKale: Jun 7, 2007; やらないか; Fun Shoe

I've known about the existence of Mock and MagicMock for a while but never actually used either for testing. I'm starting a new Rest Framework project now and am writing tests as I code, though, so I figured now would be a good time to get better-acquainted with using them.

Is this a good way to use these to write tests? https://gist.github.com/MasterKale/a3d0ac16a375e772cab48e1e5f1bf2ec

It was definitely easier to mock up properties than to insert and grab a user from the database, but it feels kinda cheaty. Maybe that's the point?
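For readers who can't open the gist, a generic sketch of the idea (this is not the gist's code). can_edit and the attribute names are hypothetical, and unittest.mock is assumed, which ships with Python 3.3+:

```python
from unittest.mock import Mock


def can_edit(user, article):
    """Hypothetical permission check: staff or the article's author may edit."""
    return user.is_staff or article.author_id == user.id


# Plain attribute stand-ins instead of rows inserted into a test database.
author = Mock(id=1, is_staff=False)
stranger = Mock(id=2, is_staff=False)
admin = Mock(id=3, is_staff=True)
article = Mock(author_id=1)

print(can_edit(author, article))    # True
print(can_edit(stranger, article))  # False
print(can_edit(admin, article))     # True

# Mocks also record calls, which lets a test assert on interactions.
mailer = Mock()
mailer.send("welcome", to="user@example.com")
mailer.send.assert_called_once_with("welcome", to="user@example.com")
```

The "cheaty" feeling has a real basis: a bare Mock accepts any attribute, so a typo in the code under test still passes. Passing spec=SomeClass makes the mock reject attributes the real class doesn't have.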

# ? Jun 12, 2016 15:27

Apocadall: Mar 25, 2010; Aren't you the guitarist for the feed dogs?

I'm trying to make a ping logger in Python but it won't log the network host unreachable error.

code:

#!/usr/bin/python
import os
import sys

fsock = open('googlePing.txt', 'w')
sys.stderr = fsock

command0 = os.system('echo "New ping started: $(date)" >> googlePing.txt')
command1 = os.system('ping -i 0.5 8.8.8.8 | while read pong; do echo "$(date): $pong"; done >> googlePing.txt')

I'm sure there is more than a few things I'm doing wrong.

# ? Jun 13, 2016 02:11

theLamer: Dec 2, 2005; I put the lame in blame

that's more bash than python. You probably want to redirect stderr to stdout by doing '2>&1'

# ? Jun 13, 2016 02:35

Apocadall: Mar 25, 2010; Aren't you the guitarist for the feed dogs?

theLamer posted:

that's more bash than python. You probably want to redirect stderr to stdout by doing '2>&1'

I added that to the end of the ping before the pipe and removed the fsock stuff, it's working now, thank you!
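If you ever want the Python side to do the work instead of the shell, the same redirect can be expressed with subprocess. This is a sketch under assumptions: log_command is a hypothetical helper, and the demo runs a tiny child Python process that writes to stderr rather than a real ping, so it finishes on its own.

```python
import os
import subprocess
import sys
import tempfile
from datetime import datetime


def log_command(cmd, log_path):
    """Run cmd and append each output line, timestamped, to log_path."""
    with open(log_path, "a") as log:
        log.write("New run started: {}\n".format(datetime.now()))
        # stderr=subprocess.STDOUT is the equivalent of the shell's 2>&1.
        with subprocess.Popen(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT,
                              universal_newlines=True) as proc:
            for line in proc.stdout:
                log.write("{}: {}\n".format(datetime.now(), line.rstrip()))
    return proc.returncode


demo_cmd = [sys.executable, "-c",
            "import sys; sys.stderr.write('Network is unreachable\\n')"]
demo_log = os.path.join(tempfile.mkdtemp(), "googlePing.txt")
log_command(demo_cmd, demo_log)
print(open(demo_log).read())

# For the real thing (runs until interrupted):
# log_command(['ping', '-i', '0.5', '8.8.8.8'], 'googlePing.txt')
```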

# ? Jun 13, 2016 02:45

Dominoes: Sep 20, 2007

Troubleshooting sympy's fancy printing in Windows on jupyter qtconsole. Running sympy.init_printing enables fancy equation printing; in a terminal, this uses an ASCII hack; in qtconsole, it's fancier and prettier. In Ubuntu it works fine for me; mostly works as-is, but needs a LaTeX installation to make some of the fancier stuff work. In Windows it worked mostly fine as well, but wouldn't print matrices; blank output instead. Installing MiKTeX fixed the problem, but it now takes about a second to print anything, and pulls up a few cmd prompt windows, then makes them disappear each time.

1 - How can I fix this?
2 - How does Jupyter interact with your latex installation? In both Win and Ubuntu, it just knows it's there, and somehow uses it, despite Latex being offered in a number of separate software packages.

Dominoes fucked around with this message at 14:10 on Jun 14, 2016

# ? Jun 14, 2016 14:08

hanyolo: Jul 18, 2013; I am an employee of the Microsoft Gaming Division and they pay me to defend the Xbox One on the Something Awful Forums

Is there a cleaner way of doing this?

Essentially what I'm eventually trying to do is have a .json file that I can load which will dynamically create a bunch of class instances with their saved values, and then, when I want to save back to that file, read and update all the data from those classes back into the file. The below works but I feel there's probably a better way to do it? (especially if there are 20-30 objects inside the class)

code:

class Player:
        def __init__(self, x, y):
                self.x = x
                self.y = y

if __name__ == '__main__':
        savefile = {'player2': {'posx': 200, 'posy': 200}, 'player1': {'posx': 100, 'posy': 100}}

        print savefile

        people = {}
        for key in savefile:
                people[key] = Player(savefile[key]['posx'], savefile[key]['posy'])

        print people

        people['player1'].x = 300
        people['player1'].y = 300

        for key in people:
                savefile[key]['posx'] = people[key].x
                savefile[key]['posy'] = people[key].y

        print savefile

--- Print Output ---
{'player2': {'posx': 200, 'posy': 200}, 'player1': {'posx': 100, 'posy': 100}}
{'player2': <__main__.Player instance at 0x2b57bd3407a0>, 'player1': <__main__.Player instance at 0x2b57bd3407e8>}
{'player2': {'posx': 200, 'posy': 200}, 'player1': {'posx': 300, 'posy': 300}}

hanyolo fucked around with this message at 02:59 on Jun 23, 2016

# ? Jun 23, 2016 02:57

SurgicalOntologist: Jun 17, 2004

You want to use JSONEncoder and JSONDecoder.

Python code:

In [1]: import json

In [2]: class Player:
   ....:     def __init__(self, x, y):
   ....:         self.x = x
   ....:         self.y = y
   ....:     @classmethod
   ....:     def from_dict(cls, dict_):
   ....:         return cls(dict_['posx'], dict_['posy'])
   ....:     def __repr__(self):
   ....:         return '{}({}, {})'.format(type(self).__name__, self.x, self.y)
   ....:     def as_dict(self):
   ....:         return {'posx': self.x, 'posy': self.y}

In [3]: player_decoder = json.JSONDecoder(object_hook=Player.from_dict)

In [4]: player_encoder = json.JSONEncoder(default=Player.as_dict)

In [5]: json_data = json.dumps([{'posx': 200, 'posy': 200}, {'posx': 100, 'posy': 100}])

In [6]: json_data
Out[6]: '[{"posy": 200, "posx": 200}, {"posy": 100, "posx": 100}]'

In [7]: players = player_decoder.decode(json_data)

In [8]: players
Out[8]: [Player(200, 200), Player(100, 100)]

In [9]: player_encoder.encode(players)
Out[9]: '[{"posy": 200, "posx": 200}, {"posy": 100, "posx": 100}]'

That may be a little more verbose than necessary but it's how I usually do it. Someone will probably come along with a better way. For example, if you make the attributes and dictionary keys match you can probably make a shortcut involving each object's __dict__, something like universal_encoder = JSONEncoder(default=attrgetter('__dict__')), with attrgetter imported from operator.

By the way, this will be tricky if you want your JSON to be a dict of Players, since the decoder uses the object_hook for every dict. Which is why I used a list here. Since you are using keys like 'player1' and 'player2', a list is more appropriate in any case, as the key only contains ordinal information (i.e. you may as well represent "first player" as players[0] rather than players['player1']). If you want to use real names, there may be a way to get a dict of Players to work, but it may be easier to just use a name attribute/key within the object.

SurgicalOntologist fucked around with this message at 04:44 on Jun 23, 2016

# ? Jun 23, 2016 04:33

hanyolo: Jul 18, 2013; I am an employee of the Microsoft Gaming Division and they pay me to defend the Xbox One on the Something Awful Forums

Thanks! that makes a bit more sense to me (still reading up on class decorators), and the stackexchange examples I saw used them, but I couldn't really figure out how to fit them towards my use case

The keys will be unique names or IDs, but previously I've used a name key within the object, so using a list definitely seems like a better idea. I would intend to have the attributes and dictionary keys match as well, just seems silly not to.

# ? Jun 23, 2016 04:52

SurgicalOntologist: Jun 17, 2004

hanyolo posted:

Thanks! that makes a bit more sense to me (still reading up on class decorators), and the stackexchange examples I saw used them, but I couldn't really figure out how to fit them towards my use case

The idea of a classmethod is that, in contrast to a regular method, which takes the object instance as its first argument, it takes the class itself. So you would call it without a specific player in mind, Player.some_class_method() rather than some_player.some_regular_method().

99% of the time I use a classmethod it's for making an alternate object constructor. So in that example, you could create a player with Player(100, 100) or Player.from_dict({'posx': 100, 'posy': 100}). I think of it like an alternative to __init__.
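A small sketch of why the method receives cls instead of hard-coding Player: a subclass gets the alternate constructor for free. Wizard is a hypothetical subclass invented only for this example.

```python
class Player:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def from_dict(cls, dict_):
        # cls is whichever class the method was called on, not always Player.
        return cls(dict_['posx'], dict_['posy'])


class Wizard(Player):
    """Hypothetical subclass, only here to show what cls buys you."""


player = Player.from_dict({'posx': 100, 'posy': 100})
wizard = Wizard.from_dict({'posx': 5, 'posy': 7})

print(type(player).__name__, player.x, player.y)  # Player 100 100
print(type(wizard).__name__, wizard.x, wizard.y)  # Wizard 5 7
```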

The use case of an alternative constructor when trying to make a JSONDecoder is just as in my example, to make a constructor that takes a dict. Since that is what JSONDecoder needs: it calls whatever you passed in as object_hook when it encounters a JSON dict, passing it the decoded dict. Every time it encounters a dict, it calls Player.from_dict, giving you a Player rather than a dict.

If you wanted to use a dict of players, the solution would be something like this:

Python code:

def flexible_object_hook(dict_):
    if 'posx' in dict_ and 'posy' in dict_:
        return Player.from_dict(dict_)
    else:
        return dict_
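Put together with the original save file layout, a full round trip might look like the sketch below. It uses json.loads and json.dumps with the object_hook and default arguments instead of building decoder and encoder objects, which is an equivalent shortcut; the saved string stands in for the contents of the .json file.

```python
import json


class Player:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def from_dict(cls, dict_):
        return cls(dict_['posx'], dict_['posy'])

    def as_dict(self):
        return {'posx': self.x, 'posy': self.y}


def flexible_object_hook(dict_):
    # Only dicts that look like a player become Player objects.
    if 'posx' in dict_ and 'posy' in dict_:
        return Player.from_dict(dict_)
    return dict_


saved = ('{"player1": {"posx": 100, "posy": 100}, '
         '"player2": {"posx": 200, "posy": 200}}')

people = json.loads(saved, object_hook=flexible_object_hook)
people['player1'].x = 300
people['player1'].y = 300

# default= is called for any object json cannot serialise by itself.
print(json.dumps(people, default=Player.as_dict, sort_keys=True))
```

The hook runs on the inner dicts first, so by the time it sees the outer dict the values are already Player objects and the outer dict is returned untouched.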

SurgicalOntologist fucked around with this message at 05:10 on Jun 23, 2016

# ? Jun 23, 2016 05:05
