Python information and short questions megathread.

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

I made a simple shell script that will create a virtualenv, activate it, install requirements.txt, run unittests, and then tear down the virtualenv.

I set up a Jenkins job to run this script, but it's not finding any tests:

code:

+ python -m unittest -v my_company.test_thing

----------------------------------------------------------------------
Ran 0 tests in 0.000s

If I SSH into the jenkins machine and cd to the workspace for this job, I can run the script just fine and it executes the tests. Why isn't the jenkins job able to do the same?

# ? Sep 17, 2014 21:45

BeefofAges: Jun 5, 2004; Cry 'Havoc!', and let slip the cows of war.

I don't have an answer specifically for your question, but I'd like to note that there's a Jenkins virtualenv plugin that does all of the setup and dependency management for you.

# ? Sep 17, 2014 23:27

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

BeefofAges posted:

I don't have an answer specifically for your question, but I'd like to note that there's a Jenkins virtualenv plugin that does all of the setup and dependency management for you.

This one right? https://github.com/jenkinsci/shiningpanda-plugin

The last couple of plugins I have tried have been awful... I was hoping not to have to use a plugin unless absolutely necessary, since the quality is just so hit and miss. What I'm trying to do seems like it would be really simple without having to use a plugin. edit: Plus it'd be easy for other devs to run the job the way Jenkins is going to run it, since it's just a bash script that lives with the project in source control.

Have you used that shiningpanda one with success?

fletcher fucked around with this message at 00:04 on Sep 18, 2014

# ? Sep 18, 2014 00:01

fletcher: Jun 27, 2003; ken park is my favorite movie; Cybernetic Crumb

Figured it out. It was the PATH environment variable that Jenkins was using. It didn't have /usr/local/bin on it (where Python 2.7.8 is installed) so I think it was trying to use some older Python 2.6 in /usr/bin
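For anyone hitting the same thing: one quick way to confirm an interpreter mismatch is to have the job print which Python it is actually running and what PATH it sees. A minimal sketch, assuming you can add a step like this to the script Jenkins runs (the function name describe_interpreter is made up for illustration):

```python
import os
import sys


def describe_interpreter():
    """Return the interpreter path, version, and PATH entries of this process."""
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "path_entries": os.environ.get("PATH", "").split(os.pathsep),
    }


info = describe_interpreter()
print("python:", info["executable"], info["version"])
for entry in info["path_entries"]:
    print("PATH entry:", entry)
```

Running it once over SSH and once from the Jenkins job should make any difference in interpreter or PATH obvious.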

# ? Sep 18, 2014 00:56

suffix: Jul 27, 2013; Wheeee!

FoiledAgain posted:

Question about pickles and bytes. I have the following code, which works.
code:
class Unpickler(pickle.Unpickler):

    def __init__(self, file):
        self.data = io.BytesIO(file.read())
        super(pickle.Unpickler, self).__init__(self.data)

    def load(self):
        # This overrides the original Unpickler.load() function
        try:
            while 1:
                bite = self.data.read(1)
                dispatch[bite[0]](self)  # does something appropriate with the current byte

        except pickle._Stop as stopinst:
            #success!
            return stopinst.value
However, this minimally different load() function does not work:
code:
def load(self):
    try:
        for bite in self.data:
            dispatch[bite[0]](self)
    except pickle._Stop as stopinst:
        #success!
        return stopinst.value
The error I am getting is raised in pickle.py, and it says "Wrong protocol number: 67". My understanding is that for a bytes object b, b[0] is an int, so I'm guessing that the 67 is one of these ints that is not being treated correctly by pickle. But how come calling self.data.read(1) doesn't cause the same problem? Does the for-loop not iterate over chunks of the same size as what read(1) does?

IOBase objects like BytesIO are iterated line by line.

Python code:

>>> list(io.BytesIO(b'one\ntwo\nthree\n'))
[b'one\n', b'two\n', b'three\n']

So the first version reads in one byte, while the second reads up to a newline.

I'm curious what you are doing here though, since this doesn't look like a method you would normally need to override. If you need to control the unpickling of objects, maybe you could use __setstate__() instead?
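If you do want a for-loop that walks the stream one byte at a time, the two-argument form of iter() is one way to get it. A small sketch on made-up data (Python 3, where indexing a bytes object gives an int):

```python
import io

data = io.BytesIO(b"ab\ncd\n")

# Iterating the stream directly yields whole lines, newline included.
lines = list(data)

# Rewind, then call read(1) repeatedly until it returns b"" at end of file.
data.seek(0)
chunks = list(iter(lambda: data.read(1), b""))

print(lines)         # [b'ab\n', b'cd\n']
print(chunks[:3])    # [b'a', b'b', b'\n']
print(chunks[0][0])  # 97, the int value of b'a'
```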

# ? Sep 18, 2014 01:37

Hughmoris: Apr 21, 2007; Let's go to the abyss!

I need some help. I'm trying to parse a simple webpage:

http://www.nlm.nih.gov/medlineplus/abdominalpain.html

There is a summary of abdominal pain that I want to grab and print to console. The summary is after the div tag:

code:

<div id="tpsummary">stuff</div>

I've adjusted a script from a tutorial I've watched to try and snag that information but the only output I receive is "[]". What am I doing wrong? I'm assuming that I'm jacking up the regex since I know jack about re.

code:

import urllib
import re


webpage = urllib.urlopen('http://www.nlm.nih.gov/medlineplus/abdominalpain.html').read()

regex = '<div id="tpsummary">(.*)</div>'
pattern = re.compile(regex)

summary = re.findall(pattern, webpage)
print summary

# ? Sep 18, 2014 03:04

KICK BAMA KICK: Mar 2, 2009

HTMLParser might be more useful than doing that manually with a regex. Here's a very similar example.

If you ever do need to work out a regex, Regex101 is very helpful.
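A rough sketch of the HTMLParser approach for this kind of page, written for Python 3 (where the module is html.parser; in Python 2 it is HTMLParser). The class name DivTextExtractor and the sample HTML are made up for illustration; only the div id comes from the question:

```python
from html.parser import HTMLParser


class DivTextExtractor(HTMLParser):
    """Collect the text inside the first <div> with a given id, dropping tags."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target div
        self.done = False   # set once the target div has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done:
            return
        if self.depth:
            if tag == "div":
                self.depth += 1  # keep track of nested divs
        elif tag == "div" and dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == "div":
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)

    def text(self):
        # Join the collected pieces and collapse runs of whitespace.
        return " ".join("".join(self.parts).split())


sample = ('<div id="tpsummary"><p>Pain in the <b>belly</b>.</p> '
          '<div>More.</div></div><p>Footer</p>')
parser = DivTextExtractor("tpsummary")
parser.feed(sample)
print(parser.text())  # Pain in the belly. More.
```

For the real page you would feed it the decoded HTML from the download instead of the sample string.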

# ? Sep 18, 2014 03:14

Hughmoris: Apr 21, 2007; Let's go to the abyss!

I've made progress with this:

code:

from bs4 import BeautifulSoup
import urllib

webpage = urllib.urlopen('http://www.nlm.nih.gov/medlineplus/abdominalpain.html')

soup = BeautifulSoup(webpage) #no clue what this does, copied it from a tutorial
summary = soup.find("div", {"id":"tpsummary"})

print summary

That will print out the information I want but it includes all of the html tags within that <div>. Is there an easy way to remove the html tags from SUMMARY and print just the text?

# ? Sep 18, 2014 03:23

KernelSlanders: May 27, 2013; Rogue operating systems on occasion spread lies and rumors about me.

Just chiming in to echo that HTMLParser is amazing and you should use it.

e: ^^ BeautifulSoup is also very nice, although they are quite different in the way you use them.

# ? Sep 18, 2014 03:24

Lyon: Apr 17, 2003

Hughmoris posted:

I've made progress with this:
code:
from bs4 import BeautifulSoup
import urllib

webpage = urllib.urlopen('http://www.nlm.nih.gov/medlineplus/abdominalpain.html')

soup = BeautifulSoup(webpage) #no clue what this does, copied it from a tutorial
summary = soup.find("div", {"id":"tpsummary"})

print summary
That will print out the information I want but it includes all of the html tags within that <div>. Is there an easy way to remove the html tags from SUMMARY and print just the text?

Does beautiful soup support xpath? Google xpath and read the tutorial on w3school or whatever that place is. If beautiful soup doesn't support it switch to lxml maybe?

# ? Sep 18, 2014 03:42

BeefofAges: Jun 5, 2004; Cry 'Havoc!', and let slip the cows of war.

fletcher posted:

This one right? https://github.com/jenkinsci/shiningpanda-plugin

The last couple of plugins I have tried have been awful... I was hoping not to have to use a plugin unless absolutely necessary, since the quality is just so hit and miss. What I'm trying to do seems like it would be really simple without having to use a plugin. edit: Plus it'd be easy for other devs to run the job the way Jenkins is going to run it, since it's just a bash script that lives with the project in source control.

Have you used that shiningpanda one with success?

Yeah, that's the one I use. It works great.

# ? Sep 18, 2014 04:43

Hughmoris: Apr 21, 2007; Let's go to the abyss!

KICK BAMA KICK posted:

HTMLParser might be more useful than doing that manually with a regex. Here's a very similar example.

If you ever do need to work out a regex, Regex101 is very helpful.

Thanks for these. Regex has always kicked my rear end but I've spent some effort trying to learn it now, and it's still kicking my rear end.

As a simple experiment, I'm trying to pull the zipcode from this page with this code:
http://www.nlm.nih.gov/medlineplus/abdominalpain.html

code:

import re
import bs4
import urllib

webpage = urllib.urlopen('http://www.nlm.nih.gov/medlineplus/abdominalpain.html').read()

regex = '\d\d\d\d\d'
pattern = re.compile(regex)

result = re.match(pattern, webpage)
print result

I've tried every variation I can think of, and yet my RESULT still equals NONE when I print it. The regex expression is working on the regex website you linked.

Any ideas where I'm going wrong?

*EDIT
I modified the code and instead of using re.match(pattern, webpage) I use re.findall(pattern, webpage) and it works. Not sure I really understand the difference between re.match and re.findall

Hughmoris fucked around with this message at 05:15 on Sep 18, 2014

# ? Sep 18, 2014 05:06

emoji: Jun 4, 2004

I'm having a strange issue rounding numbers in pandas and exporting to dict.

my_series is the result of some aggregations on a dataframe with .astype('float') applied at the end.

Unrounded:

Python code:

my_series.unstack(level=0).to_dict()
{1410843600: {u'a': 0.081347309478284627,
  u'b': 0.035099699535645998,
  u'c': 0.61429595738869158,
  u'd': 0.019871619776017483,
  u'e': 0.24938541382136029},
 1410930000: {u'a': 0.074382538770821363,
  u'b': 0.039919586444572087,
  u'c': 0.59180547578020293,
  u'd': 0.017710128278766991,
  u'e': 0.27618227072563661}
...}

No arguments to np.round():

Python code:

np.round(my_series.unstack(level=0)).to_dict()
{1410843600: {u'a': 0.0,
  u'b': 0.0,
  u'c': 1.0,
  u'd': 0.0,
  u'e': 0.0},
 1410930000: {u'a': 0.0,
  u'b': 0.0,
  u'c': 1.0,
  u'd': 0.0,
  u'e': 0.0}
...}

Rounding to 2 decimal places (desired):

Python code:

np.round(my_series.unstack(level=0), 2).to_dict()
{1410843600: {u'a': 0.080000000000000002,
  u'b': 0.040000000000000001,
  u'c': 0.60999999999999999,
  u'd': 0.02,
  u'e': 0.25},
 1410930000: {u'a': 0.070000000000000007,
  u'b': 0.040000000000000001,
  u'c': 0.58999999999999997,
  u'd': 0.02,
  u'e': 0.28000000000000003}
...}

Some of the numbers are not being rounded as I would expect.
to_csv(), to_records(), etc. results in the correct rounding. Only to_dict() poses a problem. Is this a bug?

emoji fucked around with this message at 11:28 on Sep 18, 2014

# ? Sep 18, 2014 11:26

grate deceiver: Jul 10, 2009; Just a funny av. Not a redtext or an own ok.

Another dumb newbie question. I'm trying to learn how to structure projects properly and I can't get my head around how to manipulate variables across modules.

Let's say I have my main module, in which I have a list of objects I need to keep track of within the main program loop. Then in another module I have defined some functions that work on that list.

code:

from module import function

obj_list = [obj1, obj2, ...]

function()

code:

def function(a):
    from main import obj_list (???)
    for obj in obj_list:
        if obj.x == a:
            return True
    return False

Well, obviously this doesn't even work. I'm not entirely clear on how all this should look like, but importing something from main immediately seems like a Bad Idea. The function in question also gets called by other functions in the module, so I don't think I could effortlessly just move it to main.

Any good beginner tutorials on working with modules? The stuff I've read so far seems a bit too high-level for the kind of problems I'm encountering.

# ? Sep 18, 2014 13:25

Space Kablooey: May 6, 2009

See if this works:

Python code:

# main.py

from module import function

obj_list = [obj1, ...]

function(obj_list, x)

Python code:

# module.py

def function(obj_list, x):
    for obj in obj_list:
        if obj.x == x:
            return True
    return False

However, it sounds to me that you should wrap the obj_list and the functions with a class.
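Something along these lines is what I mean by wrapping it in a class. This is only a sketch: ObjectRegistry, Thing and has_x are placeholder names, not anything from your project.

```python
class ObjectRegistry:
    """Own the list of objects and the functions that work on it."""

    def __init__(self, objects=None):
        self.objects = list(objects or [])

    def add(self, obj):
        self.objects.append(obj)

    def has_x(self, value):
        # True if any tracked object has x equal to value.
        return any(obj.x == value for obj in self.objects)


class Thing:
    """Stand-in for whatever objects the main loop keeps track of."""

    def __init__(self, x):
        self.x = x


registry = ObjectRegistry([Thing(1), Thing(2)])
print(registry.has_x(2))  # True
print(registry.has_x(5))  # False
```

The main module then creates one registry and passes it around, so nothing in the other module needs to import from main.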

# ? Sep 18, 2014 14:22

Jose Cuervo: Aug 25, 2004

grate deceiver posted:

Another dumb newbie question. I'm trying to learn how to structure projects properly and I can't get my head around how to manipulate variables across modules.

Let's say I have my main module, in which I have a list of objects I need to keep track of within the main program loop. Then in another module I have defined some functions that work on that list.
code:
from module import function

obj_list = [obj1, obj2, ...]

function()
code:
def function(a):
    from main import obj_list (???)
    for obj in obj_list:
        if obj.x == a:
            return True
    return False
Well, obviously this doesn't even work. I'm not entirely clear on how all this should look like, but importing something from main immediately seems like a Bad Idea. The function in question also gets called by other functions in the module, so I don't think I could effortlessly just move it to main.

Any good beginner tutorials on working with modules? The stuff I've read so far seems a bit too high-level for the kind of problems I'm encountering.

Is there a reason you are not passing the list of objects to the function as a parameter?

In general though, if you have an object/function in a file named my_stuff.py and want to use it in another file called my_other_stuff.py, then you must have "import my_stuff" in the file "my_other_stuff.py".

# ? Sep 18, 2014 14:24

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. - Bertrand Russell

Hughmoris posted:

Thanks for these. Regex has always kicked my rear end but I've spent some effort trying to learn it now, and it's still kicking my rear end.

As a simple experiment, I'm trying to pull the zipcode from this page with this code:
http://www.nlm.nih.gov/medlineplus/abdominalpain.html
code:
import re
import bs4
import urllib

webpage = urllib.urlopen('http://www.nlm.nih.gov/medlineplus/abdominalpain.html').read()

regex = '\d\d\d\d\d'
pattern = re.compile(regex)

result = re.match(pattern, webpage)
print result
I've tried every variation I can think of, and yet my RESULT still equals NONE when I print it. The regex expression is working on the regex website you linked.

Any ideas where I'm going wrong?

*EDIT
I modified the code and instead of using re.match(pattern, webpage) I use re.findall(pattern, webpage) and it works. Not sure I really understand the difference between re.match and re.findall

http://stackoverflow.com/a/1732454

You might find that interesting.

# ? Sep 18, 2014 14:28

grate deceiver: Jul 10, 2009; Just a funny av. Not a redtext or an own ok.

Jose Cuervo posted:

Is there a reason you are not passing the list of objects to the function as a parameter?

Oh, right. Guess I just forgot in the confusion. However, this still doesn't work:

Python code:

# main.py

from module import function

obj_list = [obj1, ...]

function(obj_list, x)

Python code:

# module.py

function(obj_list, x):
    for obj in obj_list:
        if obj.x == a:
            return True
    return False

I'm getting "NameError: global name 'objects' is not defined". It looks like I still need to import something somewhere, or declare obj_list as global at some point.

HardDisk posted:

However, it sounds to me that you should wrap the obj_list and the functions with a class.

Yeah, the more I struggle with it, the more it looks like this might be the best approach.

# ? Sep 18, 2014 14:53

Space Kablooey: May 6, 2009

Python code:

# main.py
from module import function


class Foo:
    x = 1

obj_list = [Foo(), Foo()]

x = 2

print function(obj_list, x)

Python code:

# module.py

def function(obj_list, x):
    for obj in obj_list:
        if obj.x == x:
            return True
    return False

# ? Sep 18, 2014 15:06

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Thermopyle posted:

http://stackoverflow.com/a/1732454

You might find that interesting.

I stumbled upon that a while back during my ill-fated attempt to learn Perl. I have it bookmarked. I was actually thinking about it last night while I was trying to parse HTML using RegEx. The <center> cannot hold it is too late. That being said, after much cussing and frustration, I was finally able to parse what I wanted using RegEx. :smug:

# ? Sep 18, 2014 15:08

SurgicalOntologist: Jun 17, 2004

grate deceiver posted:

Oh, right. Guess I just forgot in the confusion. However, this still doesn't work:
Python code:
# main.py

from module import function

obj_list = [obj1, ...]

function(obj_list, x)
Python code:
# module.py

function(obj_list, x):
    for obj in obj_list:
        if obj.x == a:
            return True
    return False
I'm getting "NameError: global name 'objects' is not defined". It looks like I still need to import something somewhere, or declare obj_list as global at some point.

You don't have the name objects in your example code anywhere, so we can't troubleshoot your problem. You probably have a name you forgot to change somewhere.

The best solution really depends on the specifics though. What are these objects representing? Constants? Some sort of global program state you want to be available in many places?

# ? Sep 18, 2014 15:42

grate deceiver: Jul 10, 2009; Just a funny av. Not a redtext or an own ok.

Nevermind, I'm dumb. Turns out there's a ton of interdependencies all over the place that I need to untangle in order to move anything. :negative:

# ? Sep 18, 2014 15:47

SurgicalOntologist: Jun 17, 2004

SurgicalOntologist posted:

I have an idea for a mini-project and I'd like to run it by here to see what people think, and if there's a better way to accomplish this.

The problem: I have a separate virtualenv/conda environment for every project, I do a lot of my work in IPython Notebooks, and much of that is remote work. Since I have separate envs, I usually have to ssh in, check my tmux sessions for what servers are already running, possibly start up a new server, note what port it's on, and open up that port in the firewall before finally connecting. It's tedious and one of these days I'm bound to find myself wanting to get some work done without access to one of my authorized ssh keys.

The idea: Set up a web server, such that if you access the URL <server>/<env> (or perhaps <server>/ipython/<env>), you will be connected to the environment according to the URL. If a notebook is not already running, it will spin up.

Issues:

Framework? I've never done web stuff before, but the decorator notation of Flask looks nice. Are there other minimal frameworks I should consider?

Starting notebook servers. Does IPython have an API for this (apparently not) or do I have to shell out for lots of stuff?

When do I shut them down? Adding a home page at <server>/ipython to start and stop servers is probably a good option but that makes it much less minimal. Any downside to leaving them running?

Directing traffic. This is my biggest roadblock conceptually. Every Notebook server must run on a unique port. This means I must do some reverse proxy work, and furthermore I will have to change the redirection rules on the fly. That rules out all the easy options, I think. Are there any reverse proxy implementations in Python that expose an API? Or is there a better way...

Edit: just realized that nginx has a graceful restart control signal (HUP). This could work, but it would still be nice to decouple this app from the reverse proxy somehow. It would work for a purely local solution but would be hard to package as a stand-alone drop-in app for others to use. Still, it would be relatively easy to change the nginx configuration, gracefully restart it, and have Flask serve a redirect to the new address. Tempting.

Alternative: Write a shell script to add a location block to my nginx reverse proxy and create a new ipython profile with the same port. Call the shell script every time I make a new environment. Easy but not as fun.

Wondering if anyone has a more "elegant" idea before I start implementing dynamic implementations of the nginx config.

# ? Sep 18, 2014 16:00

Comrade Gritty: Sep 19, 2011; This Machine Kills Fascists

SurgicalOntologist posted:

Wondering if anyone has a more "elegant" idea before I start implementing dynamic implementations of the nginx config.

You could write a reverse proxy in Twisted pretty easily and give it whatever API you wanted.

# ? Sep 18, 2014 17:10

David Pratt: Apr 21, 2001

Thermopyle posted:

http://stackoverflow.com/a/1732454

You might find that interesting.

Came in to post this. Classic.

# ? Sep 18, 2014 17:17

accipter: Sep 12, 2003

Hughmoris posted:

*EDIT
I modified the code and instead of using re.match(pattern, webpage) I use re.findall(pattern, webpage) and it works. Not sure I really understand the difference between re.match and re.findall

To quote from the documentation:

quote:

Python offers two different primitive operations based on regular expressions: re.match() checks for a match only at the beginning of the string, while re.search() checks for a match anywhere in the string (this is what Perl does by default).
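To make that concrete, here is a small sketch on a made-up string (not the real page), using the same five-digit pattern:

```python
import re

page = "<p>Bethesda, MD 20894</p><p>Call 88888 or 12345</p>"
pattern = re.compile(r"\d{5}")

at_start = pattern.match(page)            # only tries position 0
first_hit = pattern.search(page).group()  # first match anywhere
all_hits = pattern.findall(page)          # every non-overlapping match

print(at_start)   # None, because the string starts with "<p>"
print(first_hit)  # 20894
print(all_hits)   # ['20894', '88888', '12345']
```

So re.match returned None on the page because the HTML does not begin with five digits, while re.findall scans the whole string.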

# ? Sep 18, 2014 17:22

TheOtherContraGuy: Jul 4, 2007; brave skeleton sacrifice

This is mostly a general design question.

I'm trying to clean up some code; about 100 lines of it are just me defining some parameters for several different webpages I make. Is it kosher for me to just pickle these presets?

# ? Sep 18, 2014 18:35

QuarkJets: Sep 8, 2008

TheOtherContraGuy posted:

This is mostly a general design question.

I'm trying to clean up some code; about 100 lines of it are just me defining some parameters for several different webpages I make. Is it kosher for me to just pickle these presets?

What's the purpose of pickling them? You're still going to have to have lines of code that define those parameters, more lines that pickle those parameters, and then you'll have to add lines of code to unpickle them. That's way messier than just putting your parameters into a separate .py and importing from it the specific parameters that you need.

# ? Sep 18, 2014 18:48

TheOtherContraGuy: Jul 4, 2007; brave skeleton sacrifice

QuarkJets posted:

What's the purpose of pickling them? You're still going to have to have lines of code that define those parameters, more lines that pickle those parameters, and then you'll have to add lines of code to unpickle them. That's way messier than just putting your parameters into a separate .py and importing from it the specific parameters that you need.

Good point. I'll do that.

# ? Sep 18, 2014 18:51

SurgicalOntologist: Jun 17, 2004

Steampunk Hitler posted:

You could write a reverse proxy in Twisted pretty easily and give it whatever API you wanted.

It all seems like such overkill for what I thought would be a simple personal-use webapp.

# ? Sep 18, 2014 19:07

KernelSlanders: May 27, 2013; Rogue operating systems on occasion spread lies and rumors about me.

kraftwerk singles posted:

I'm having a strange issue rounding numbers in pandas and exporting to dict.

my_series is the result of some aggregations on a dataframe with .astype('float') applied at the end.

Unrounded:
Python code:
my_series.unstack(level=0).to_dict()
{1410843600: {u'a': 0.081347309478284627,
  u'b': 0.035099699535645998,
  u'c': 0.61429595738869158,
  u'd': 0.019871619776017483,
  u'e': 0.24938541382136029},
 1410930000: {u'a': 0.074382538770821363,
  u'b': 0.039919586444572087,
  u'c': 0.59180547578020293,
  u'd': 0.017710128278766991,
  u'e': 0.27618227072563661}
...}
No arguments to np.round():
Python code:
np.round(my_series.unstack(level=0)).to_dict()
{1410843600: {u'a': 0.0,
  u'b': 0.0,
  u'c': 1.0,
  u'd': 0.0,
  u'e': 0.0},
 1410930000: {u'a': 0.0,
  u'b': 0.0,
  u'c': 1.0,
  u'd': 0.0,
  u'e': 0.0}
...}
Rounding to 2 decimal places (desired):
Python code:
np.round(my_series.unstack(level=0), 2).to_dict()
{1410843600: {u'a': 0.080000000000000002,
  u'b': 0.040000000000000001,
  u'c': 0.60999999999999999,
  u'd': 0.02,
  u'e': 0.25},
 1410930000: {u'a': 0.070000000000000007,
  u'b': 0.040000000000000001,
  u'c': 0.58999999999999997,
  u'd': 0.02,
  u'e': 0.28000000000000003}
...}
Some of the numbers are not being rounded as I would expect.
to_csv(), to_records(), etc. results in the correct rounding. Only to_dict() poses a problem. Is this a bug?

This appears to be working as intended. You told python you only cared about two decimal places and then complained when the 18th decimal place wasn't what you expected.

Basically, when you call round on 0.081347309478284627, you're asking for a number that is indistinguishable from 0.08 within floating point precision. That is exactly what you got. The problem is, 0.08 cannot be specified exactly in binary, just like 1/3 cannot be specified exactly in decimal. Try subtracting 0.08 from 0.080000000000000002: you should get 0, not 2E-18. If you want numbers represented exactly to two decimal places, use a decimal class.
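A short sketch of the same point in plain Python 3, using the built-in round() rather than np.round() and one of the values from the post:

```python
from decimal import Decimal

value = 0.081347309478284627
rounded = round(value, 2)

# The rounded float is the very same double as the literal 0.08.
same_double = rounded == 0.08
difference = rounded - 0.08
print(same_double, difference)  # True 0.0

# Asking for 17 significant digits exposes the binary approximation.
seventeen = "%.17g" % rounded
print(seventeen)                # 0.080000000000000002

# For display, format to two places instead of relying on repr().
display = "%.2f" % value
print(display)                  # 0.08

# For exact decimal values, go through Decimal (built from a string).
exact = Decimal(str(value)).quantize(Decimal("0.01"))
print(exact)                    # 0.08
```

The long tail in the to_dict() output comes from how the float is printed (17 significant digits), not from a different value being stored.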

# ? Sep 18, 2014 20:57

emoji: Jun 4, 2004

KernelSlanders posted:

This appears to be working as intended. You told python you only cared about two decimal places and then complained when the 18th decimal place wasn't what you expected.

Basically, when you call round on 0.081347309478284627, you're asking for a number that is indistinguishable from 0.08 within floating point precision. That is exactly what you got. The problem is, 0.08 cannot be specified exactly in binary, just like 1/3 cannot be specified exactly in decimal. Try subtracting 0.08 from 0.080000000000000002: you should get 0, not 2E-18. If you want numbers represented exactly to two decimal places, use a decimal class.

Yea, I figured this out after some investigation. It still seems weird that the other export functions round to two decimal places. Keeping with the Decimal class made my manipulations a bit messier since I'm combining disparate data sources. I ended up using string formatting since it's for display purposes and I wanted to minimize the size of the json sent over the network.

# ? Sep 18, 2014 22:33

Modern Pragmatist: Aug 20, 2008

Hughmoris posted:

Thanks for these. Regex has always kicked my rear end but I've spent some effort trying to learn it now, and it's still kicking my rear end.

As a simple experiment, I'm trying to pull the zipcode from this page with this code:
http://www.nlm.nih.gov/medlineplus/abdominalpain.html
code:
import re
import bs4
import urllib

webpage = urllib.urlopen('http://www.nlm.nih.gov/medlineplus/abdominalpain.html').read()

regex = '\d\d\d\d\d'
pattern = re.compile(regex)

result = re.match(pattern, webpage)
print result
I've tried every variation I can think of, and yet my RESULT still equals NONE when I print it. The regex expression is working on the regex website you linked.

Any ideas where I'm going wrong?

*EDIT
I modified the code and instead of using re.match(pattern, webpage) I use re.findall(pattern, webpage) and it works. Not sure I really understand the difference between re.match and re.findall

Are you only parsing data from Medline Plus? If so, why not use the API that the NIH provides (http://www.nlm.nih.gov/api/). They have a free API for pretty much all of their services.

# ? Sep 19, 2014 04:27

Hughmoris: Apr 21, 2007; Let's go to the abyss!

I have a simple, 9-line python regex script that I wrote for work. Currently, it will read a text file and print out matches defined by the user. I'd like to get it to the point where it can iterate over the contents of a folder and find matches in each text file in that folder.
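One possible shape for the folder version, as a sketch only: the function name find_matches, the *.txt filter and the demo files are assumptions, and it is written for Python 3 (pathlib).

```python
import re
import tempfile
from pathlib import Path


def find_matches(folder, pattern, glob="*.txt"):
    """Return {filename: [matches]} for each matching file in folder."""
    regex = re.compile(pattern)
    results = {}
    for path in sorted(Path(folder).glob(glob)):
        # errors="replace" keeps odd bytes from aborting the whole scan.
        text = path.read_text(encoding="utf-8", errors="replace")
        found = regex.findall(text)
        if found:
            results[path.name] = found
    return results


# Demo on a throwaway folder with two small files.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "a.txt").write_text("zip 20894 and 12345", encoding="utf-8")
    Path(tmp, "b.txt").write_text("nothing here", encoding="utf-8")
    demo = find_matches(tmp, r"\d{5}")

print(demo)  # {'a.txt': ['20894', '12345']}
```

Files without a match are left out of the result, which keeps the printed output short when most files are uninteresting.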

I'd like my coworkers to be able to use this script. The problem is that none of them know anything about programming, and none of them have Python installed on their systems. What would be an easy way to make this script available and easy to use for them? We have a shared network drive. I've read that there is a python-to-exe converter. How is that process? Since the script is so small, would it maybe be easier to write it in another language? Maybe in a language that runs in a web browser? I think our work computers are still rocking IE 8.

Modern Pragmatist posted:

Are you only parsing data from Medline Plus? If so, why not use the API that the NIH provides (http://www.nlm.nih.gov/api/). They have a free API for pretty much all of their services.

I wasn't aware they had an API. Thank you for this.

Hughmoris fucked around with this message at 14:54 on Sep 19, 2014

# ? Sep 19, 2014 14:28

Heavy_D: Feb 16, 2002; "rararararara" contains the meaning of everything, kept in simple rectangular structures

Having a bit of a problem with py2exe and a short python script I've written. The script raises a warning under some circumstances, and depending on the command line the program may crash when that warning is issued. The weird part about it is that it's actually tied to how you specify the executable name. Calling "fbxtomdl.exe" with the .exe extension present will exhibit the bug, but just calling "fbxtomdl" without an extension prints the warning and works fine.

The traceback is

pre:

Traceback (most recent call last):
  File "fbxtomdl.py", line 180, in <module>
  File "E:\Python33\lib\warnings.py", line 18, in showwarning
    file.write(formatwarning(message, category, filename, lineno, line))
  File "E:\Python33\lib\warnings.py", line 25, in formatwarning
    line = linecache.getline(filename, lineno) if line is None else line
  File "E:\Python33\lib\linecache.py", line 15, in getline
    lines = getlines(filename, module_globals)
  File "E:\Python33\lib\linecache.py", line 41, in getlines
    return updatecache(filename, module_globals)
  File "E:\Python33\lib\linecache.py", line 126, in updatecache
    with tokenize.open(fullname) as fp:
  File "E:\Python33\lib\tokenize.py", line 438, in open
    encoding, lines = detect_encoding(buffer.readline)
  File "E:\Python33\lib\tokenize.py", line 416, in detect_encoding
    encoding = find_cookie(first)
  File "E:\Python33\lib\tokenize.py", line 380, in find_cookie
    raise SyntaxError(msg)
SyntaxError: invalid or missing encoding declaration for '..\\..\\fbxtomdl\\fbxtomdl.exe'

From what I can tell, the issue is that in trying to report the file and line number for the warning, the code has tried to read the executable file looking for a magic cookie which tells Python the file encoding. Of course, the .exe header it finds is binary code, not valid utf-8, so it crashes. Apparently something magical happens when I leave the .exe off the end - my working theory is that it finds fbxtomdl.py instead and so doesn't have a problem. But I'm a bit out of my depth with picking through the standard library to diagnose things, especially when it's only occurring in a .exe file, so I can't use a python shell to debug the issue. Has anyone encountered something similar and maybe found a workaround?

Code can be found at http://tomeofpreach.wordpress.com/qmdl/fbxtomdl/ if that helps.

# ? Sep 22, 2014 01:56

David Pratt: Apr 21, 2001

Hughmoris posted:

I'd like my coworkers to be able to use this script.

You could put a python runtime on the network share, then have a batch file which calls the script against that runtime.

code:

\\share\scripts\my_cool_script.bat
\\share\scripts\my_cool_script.py
\\share\scripts\python_runtime\python27\...

batch file -> python_runtime\python27\python.exe my_cool_script.py

Python doesn't need to be installed-via-the-installer on the system running the script to work.

David Pratt fucked around with this message at 16:03 on Sep 22, 2014

# ? Sep 22, 2014 16:00

regularizer: Mar 5, 2012

I'm trying to figure out python syntax, and one of the codecademy exercises is to write a function that removes all duplicates of integers in a list, so if you have [1,1,2,2,3] it returns [1,2,3]. I first wrote:

code:

def remove_duplicates(ints):
    out = []
    for i in ints:
        if i not in out:
            out.append(i)
    return out

I'm trying to compress by putting the for and if statements together with the append command like I've seen done on stack overflow, so I tried this:

code:

def remove_duplicates(ints):
    out = []
    out = [out.append(i) for i in ints if i not in out]
    return out

Which doesn't work. How do I structure the code so that I can put everything together instead of using nested for/if statements?

# ? Sep 22, 2014 16:18

SurgicalOntologist: Jun 17, 2004

The problem is that out.append(i) modifies out in-place and returns None. In a comprehension, you need to use the value itself.

So you should try:

Python code:

out = [i for i in ints if i not in out]

However, this won't work either, because the comprehension on the right is not assigned to out until after it's constructed. So the not in check is not going to find anything. Your original is probably fine if you care about the order of the items in the list. Otherwise look into set.
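If the repeated "not in out" scan bothers you on long lists, a common variant keeps a set of what has been seen so far. A sketch, keeping your function name:

```python
def remove_duplicates(ints):
    """Drop repeated values while keeping first-seen order."""
    seen = set()
    out = []
    for i in ints:
        if i not in seen:
            seen.add(i)    # set membership checks are fast
            out.append(i)
    return out


ordered = remove_duplicates([3, 1, 1, 2, 3])
unordered = set([3, 1, 1, 2, 3])

print(ordered)            # [3, 1, 2]
print(sorted(unordered))  # [1, 2, 3]
```

The plain set() version is the one-liner, but it throws away the original order, which is why the second print sorts it.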

# ? Sep 22, 2014 16:33

KernelSlanders: May 27, 2013; Rogue operating systems on occasion spread lies and rumors about me.

SurgicalOntologist posted:

The problem is that out.append(i) modifies out in-place and returns None. In a comprehension, you need to use the value itself.

So you should try:
Python code:
out = [i for i in ints if i not in out]
However, this won't work either, because the comprehension on the right is not assigned to out until after it's constructed. So the not in check is not going to find anything. Your original is probably fine if you care about the order of the items in the list. Otherwise look into set.

There's also the numpy.unique function.

# ? Sep 22, 2014 18:05

Hughmoris: Apr 21, 2007; Let's go to the abyss!

David Pratt posted:

You could put a python runtime on the network share, then have a batch file which calls the script against that runtime.
code:
\\share\scripts\my_cool_script.bat
\\share\scripts\my_cool_script.py
\\share\scripts\python_runtime\python27\...

batch file -> python_runtime\python27\python.exe my_cool_script.py
Python doesn't need to be installed-via-the-installer on the system running the script to work.

Thanks for this. It is exactly what I'm looking for, but I'm having a bit of trouble getting it working. I have a script called hello_world.py. I have Python installed on my laptop at C:\Python27\python.exe. I want to make the script accessible to my coworkers on the network drive I:\Public\Scripts.

The batch file below does not appear to launch the script from the network drive:

code:

python_runtime\python27\python.exe hello_world.py

I substituted C:\ for "python runtime" and it executes from my computer but it doesn't execute from other computers because they don't have access to my c drive. Not sure if I'm explaining that right.

What am I missing?

Edit:
Do I need to install Python to the network drive? Not sure if the IT guys will be a fan of that. :smith:

Hughmoris fucked around with this message at 18:42 on Sep 22, 2014

# ? Sep 22, 2014 18:20
