  • Locked thread
Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

Cingulate posted:

Yup, it wasn't a realistic question - I assumed from the beginning that sum(list_of_str) would fail, I just wanted to know why, under-the-hood, it failed. Thanks for the answer too!

For the future, if for some reason you really do need to use sum to sum non-numbers, you can provide the starting value for the accumulator variable as an optional argument. (It still doesn't work to use strings.)

code:
>>> sum([(2, 3), (4, 5)], ())
(2, 3, 4, 5)
>>> sum(["bu", "tts"], "")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() can't sum strings [use ''.join(seq) instead]
>>>
I think it would make for clearer code to pass the start value as a keyword argument, to signal its purpose, but unfortunately you can't do that: sum() only accepts it positionally (at least as of Python 3.5).

Edison was a dick
Apr 3, 2010

direct current :roboluv: only
You could try reduce with operator.add too.
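
A minimal sketch of that suggestion, assuming Python 3, where reduce lives in functools. The helper name concat is made up for illustration, and for strings ''.join(seq) is still what the error message above recommends.

```python
from functools import reduce
import operator


def concat(items, start):
    """Fold operator.add over items, beginning from start (hypothetical helper)."""
    return reduce(operator.add, items, start)


print(concat([(2, 3), (4, 5)], ()))  # tuples concatenate
print(concat(["ab", "cd"], ""))      # also accepts strings, unlike sum()
print("".join(["ab", "cd"]))         # the usual way to join strings
```

Passing the start value also keeps the call safe on an empty list, since reduce would otherwise raise TypeError when given no items and no initial value.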

Plasmafountain
Jun 17, 2008

Hoping someone can help me out because I can't find anything applicable on Stack Exchange.

I have a list of numbers imported from a txt file that holds tabulated output in rows and columns. The numbers are currently stored as str elements in the list, which is ~16000 elements long.

How can I split this list such that:

The values in each row are assigned to one array row;
A new row is created at every 16th element where the table row ends and a new one begins.

I can't quite figure out how to get this, but what I am really after is the ability to look up the data in it with a simple data[row][col] in a later script. I'm really struggling with turning this long list into an organised array in a nice manner.

Plasmafountain fucked around with this message at 13:56 on Feb 3, 2016

sunaurus
Feb 13, 2012

Oh great, another bookah.

Zero Gravitas posted:

Hoping someone can help me out because I can't find anything applicable on Stack Exchange.

I have a list of numbers imported from a txt file that holds tabulated output in rows and columns. The numbers are currently stored as str elements in the list, which is ~16000 elements long.

How can I split this list such that:

The values in each row are assigned to one array row;
A new row is created at every 16th element where the table row ends and a new one begins.

I can't quite figure out how to get this, but what I am really after is the ability to look up the data in it with a simple data[row][col] in a later script. I'm really struggling with turning this long list into an organised array in a nice manner.

Am I understanding correctly that you want to split your big list into a list of smaller lists (16 elements each)?

organised_array = [long_list[i:i + 16] for i in range(0, len(long_list), 16)]

after this you can do organised_array[0][0] to get the first column of the first row.

If I misunderstood you then could you describe your input list a bit more? Also, if this works for you, but you don't understand how it works, then I'll be happy to explain.
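
Here is the same slicing idea as a small function, in case it helps. This is only a sketch: the name to_rows and the sample data are made up, and the float conversion assumes every element is a numeric string, as described in the question.

```python
def to_rows(flat_values, row_length=16):
    """Split a flat list of numeric strings into rows of floats.

    row_length=16 follows the table width given in the question.
    """
    return [
        [float(value) for value in flat_values[start:start + row_length]]
        for start in range(0, len(flat_values), row_length)
    ]


# Made-up sample data standing in for the ~16000 values read from the file.
long_list = [str(n * 0.5) for n in range(48)]
data = to_rows(long_list)
print(len(data), data[2][3])  # row 2, column 3
```

If the list length is not a multiple of 16, the last row simply comes out shorter, so it may be worth checking len(data[-1]) before relying on data[row][col].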

sunaurus fucked around with this message at 14:11 on Feb 3, 2016

Plasmafountain
Jun 17, 2008

It's a lot of numbers that are an output from a CFD program to generate some graphs and diagnose some issues with funky simulations I've been running at work.

That's essentially:

Organised = longlist(index i to i +16) for i in range of (0 to the end of the list also in steps of 16)

It works perfectly - I thought I had an issue with trying to come up with a way to turn them from strings to floats, but I've put that in further up the script where the numbers are read from the original reader into the longlist.

I dip my toe in the water every once in a while so my python is very very rusty. Thanks for helping me out!

huhu
Feb 24, 2006
I'm trying to install the beautifulsoup4 module, my first attempt at installing a module, and I'm totally lost reading online guides. Could someone point me to an idiot's guide to installing modules?

huhu fucked around with this message at 01:37 on Feb 4, 2016

fletcher
Jun 27, 2003

ken park is my favorite movie

Cybernetic Crumb

huhu posted:

I'm trying to install the beautifulsoup4 module, my first attempt at installing a module, and I'm totally lost reading online guides. Could someone point me to an idiot's guide to installing modules?

Did you try this one? http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-beautiful-soup

accipter
Sep 12, 2003

huhu posted:

I'm trying to install the beautifulsoup4 module, my first attempt at installing a module, and I'm totally lost reading online guides. Could someone point me to an idiot's guide to installing modules?

What OS? If you are on Windows, install Miniconda3, then open a terminal and run:
code:
conda install beautifulsoup4

huhu
Feb 24, 2006

accipter posted:

What OS? If you are on Windows, install Miniconda3, then open a terminal and run:
code:
conda install beautifulsoup4

That's about as idiot-level as you can get. Thanks!

Rocko Bonaparte
Mar 12, 2002

Every day is Friday!
Is there some way to have mock.patch raise an exception at the end of a with clause it's used in if it never patched anything? I wanted to particularly trap patch statements that don't do anything. I had refactored a few things in some unit tests, but not the mock.patch strings, and a bunch of stuff started screwing up. I would have nailed it pretty quickly if I had been notified mock.patch never had to actually make the patch on anything.
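
One possible approach, sketched under some assumptions: wrap mock.patch in a small context manager that checks whether the mock recorded any calls before the block exits. The wrapper name patch_and_require_use is hypothetical, the example uses unittest.mock from the standard library (the standalone mock package should behave the same way), and it only makes sense when patch creates a MagicMock rather than being handed a replacement object.

```python
from contextlib import contextmanager
from unittest import mock
import os


@contextmanager
def patch_and_require_use(target):
    """Hypothetical wrapper around mock.patch that fails if the mock went unused."""
    with mock.patch(target) as patched:
        yield patched
        # Reached only if the with-body finished without raising.
        if not patched.mock_calls:
            raise AssertionError("patch of %r was never used" % target)


with patch_and_require_use("os.getcwd") as fake_getcwd:
    fake_getcwd.return_value = "/tmp/example"
    print(os.getcwd())  # goes through the mock, so the check passes
```

A patch string left pointing at the old location after a refactor would still patch successfully but record no calls, which is the case this check is meant to surface.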

Cingulate
Oct 23, 2012

by Fluffdaddy
If anybody feels like optimising some code ...

I have two lists of strings (actually, one data frame with one column that can be used to subset it into two). I want to, as far as possible, match each element in list 1 to one as of yet unmatched element in list 2, where matching can be done if the two strings are the same. In the end, I want a list of length 2 tuples (list 1 and list 2 indices for pairs of matches), plus another list of integers (indices of unmatched members of list 1).
I can think of a ton of ways of going about this, but this is the least bad one I've come up with so far. Is there something more idiomatic?

code:
outs = []
for ii, m in enumerate(list_a):
    try:
        ind = list_b.index(m)
        outs.append((ii, ind))
        del list_b[ind]
    except ValueError:  # list.index raises ValueError when there is no match
        print("no match for {}".format(ii))

QuarkJets
Sep 8, 2008

Cingulate posted:

If anybody feels like optimising some code ...

I have two lists of strings (actually, one data frame with one column that can be used to subset it into two). I want to, as far as possible, match each element in list 1 to one as of yet unmatched element in list 2, where matching can be done if the two strings are the same. In the end, I want a list of length 2 tuples (list 1 and list 2 indices for pairs of matches), plus another list of integers (indices of unmatched members of list 1).
I can think of a ton of ways of going about this, but this is the least bad one I've come up with so far. Is there something more idiomatic?

code:
outs = []
for ii, m in enumerate(list_a):
    try:
        ind = list_b.index(m)
        outs.append((ii, ind))
        del list_b[ind]
    except ValueError:  # list.index raises ValueError when there is no match
        print("no match for {}".format(ii))

Are the strings unique, or is it possible to have duplicates? If there are no duplicates, or if you don't care about counting duplicates, then you could get better search performance using sets. Checking whether a string is in a set is a lot faster than checking whether a string is in a list

OnceIWasAnOstrich
Jul 22, 2006

You could pre-sort both lists and then search list_b[index_of_last_match+1:] on each iteration, so the search itself takes very little time (O(n), I believe) at the expense of an O(n log n) sort step, instead of your current O(n^2) search. That said, the suggestion of sets is better if you don't have duplicates in list_a, because you don't even need to search the set: you can use set difference operations to directly get the things that are in list_a but not in list_b. And if you do have duplicates in list_a, you don't seem to care about them in list_b, so list_b can be a set for the improved search performance.

OnceIWasAnOstrich fucked around with this message at 02:00 on Feb 5, 2016

Cingulate
Oct 23, 2012

by Fluffdaddy
There's tons of dupes actually, and about 25% of the runs trigger the except clause because there is no match (left).

It's not about performance - there are only a few thousand entries. It's about idiomatic Python. I'm not particularly convinced my code there is elegant.

I don't care so much for the actual strings, but for their list position - the strings are one of many aspects of a small database, and I actually want to match table rows, not the matching strings.

OnceIWasAnOstrich
Jul 22, 2006

Cingulate posted:

There's tons of dupes actually, and about 25% of the runs trigger the except clause because there is no match (left).

It's not about performance - there are only a few thousand entries. It's about idiomatic Python. I'm not particularly convinced my code there is elegant.

I don't care so much for the actual strings, but for their list position - the strings are one of many aspects of a small database, and I actually want to match table rows, not the matching strings.

If that's exactly what you need (you want both indices, and you only want one-to-one matches based on order in both lists, as opposed to the one-to-many or many-to-many matches that sound like they exist) and performance isn't an issue, it isn't bad. I see no reason to mess with it if it works. If the data is in some sort of relational database you could use database operations for this, especially if you are already using an ORM, but otherwise that would likely add complexity.

QuarkJets
Sep 8, 2008

Cingulate posted:

There's tons of dupes actually, and about 25% of the runs trigger the except clause because there is no match (left).

It's not about performance - there are only a few thousand entries. It's about idiomatic Python. I'm not particularly convinced my code there is elegant.

I don't care so much for the actual strings, but for their list position - the strings are one of many aspects of a small database, and I actually want to match table rows, not the matching strings.

Are they all of the rows in a table that you're comparing against, or some subset of rows? If the latter, then you should probably be using the ID numbers that you pull from the database instead of the indices of the list. If you're using SQL or something then there's probably an elegant way to do whatever you need in that

I guess I'm not a fan of creating a list of tuples, but without knowing more about what you're doing I can't really recommend something better. Maybe a simple class that you just use as a struct or something like that would be more idiomatic. There's certainly nothing wrong with creating a list of tuples, any argument that I make against it is just going to be opinion-based ideology

Begall
Jul 28, 2008
So I'm writing a Flask web app that includes offloading validation work on file uploads to child processes using the multiprocessing module. I'm currently only running the dev flask web server rather than anything proper, so I don't know if that impacts on this, but I've found that where I have uploaded the files and have offloaded the work to another process with code similar to the following:

Python code:
p2 = multiprocessing.Process(target=processor.start())
p2.start() 
the main flask thread will block until all the child processes have finished. But by putting print statements around the p2.start() call I can see that it returns virtually instantly, but if for example I put in a sleep(5) into the child thread there will be a 5 second delay before I see either the "before" timer or the "after" timer in the log, and the browser will not redirect to the destination page until those 5 seconds have passed. Is this something that would be fixed by switching to a real web server?
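
Judging only from the snippet as posted, one thing that may explain the blocking: target=processor.start() calls start() right away in the parent process and hands its return value to Process, so the slow work finishes before p2.start() is ever reached. A sketch of passing the callable itself instead, with validate as a hypothetical stand-in for processor.start:

```python
import multiprocessing
import time


def validate(path):
    """Hypothetical stand-in for the slow validation work."""
    time.sleep(0.2)


def make_worker(path):
    # Pass the function and its arguments separately. Writing
    # target=validate(path) would run validate() here, in the parent.
    return multiprocessing.Process(target=validate, args=(path,))


if __name__ == "__main__":
    began = time.monotonic()
    worker = make_worker("upload.bin")
    worker.start()
    print("start() returned after %.3f s" % (time.monotonic() - began))
    worker.join()
```

Whether this is the whole story depends on code not shown in the post, so treat it as something to rule out rather than a diagnosis.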

huhu
Feb 24, 2006

accipter posted:

What OS? If you are on Windows, install Miniconda3, then open a terminal and run:
code:
conda install beautifulsoup4
So beautifulsoup4 is installed but in the Python Shell when I try and import it it's not found. Is there a special command I have to use with conda to import it?

Edit:
Getting this:
code:
C:\Users\huhu>python -m pip install beautifulsoup4
Requirement already satisfied (use --upgrade to upgrade): beautifulsoup4 in c:\users\huhu\miniconda3\lib\site-packages
If I try and import it in the Python Shell it can't be found.

huhu fucked around with this message at 01:30 on Feb 6, 2016

accipter
Sep 12, 2003

huhu posted:

So beautifulsoup4 is installed but in the Python Shell when I try and import it it's not found. Is there a special command I have to use with conda to import it?

Edit:
Getting this:
code:
C:\Users\huhu>python -m pip install beautifulsoup4
Requirement already satisfied (use --upgrade to upgrade): beautifulsoup4 in c:\users\huhu\miniconda3\lib\site-packages
If I try and import it in the Python Shell it can't be found.

I am guessing that you now have multiple Python versions installed. Run the following commands; you should see something similar.
code:
C:\Users\accipter>python --version
Python 3.4.4 :: Continuum Analytics, Inc.

C:\Users\accipter>where python
C:\Users\accipter\AppData\Local\Continuum\Miniconda3\python.exe

huhu
Feb 24, 2006

accipter posted:

I am guessing that you now have multiple Python versions installed. Run the following commands; you should see something similar.
code:
C:\Users\accipter>python --version
Python 3.4.4 :: Continuum Analytics, Inc.

C:\Users\accipter>where python
C:\Users\accipter\AppData\Local\Continuum\Miniconda3\python.exe
code:
C:\Users\huhu>python --version
Python 3.5.1 :: Continuum Analytics, Inc.

C:\Users\huhu>
I did have Python 2.7 installed but I uninstalled it. Maybe not fully?

Edit: Forgot the second part.
code:
C:\Users\Travis>where python
C:\Users\Travis\Miniconda3\python.exe

huhu fucked around with this message at 01:43 on Feb 6, 2016

baka kaba
Jul 19, 2003

PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

Are you importing from 'bs4' instead of from 'BeautifulSoup'?

huhu
Feb 24, 2006
:suicide: The folder I was in was C:\Python34 but I'm using Python35 which is in a completely different location.
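
For anyone hitting the same mix-up, a quick check that can be run from inside the shell that fails to import: it reports which interpreter is running and where (if anywhere) that interpreter finds a module. The function name where_is is made up, and json is used only because it ships with every Python; swap in bs4 for the real case.

```python
import importlib.util
import sys


def where_is(module_name):
    """Return (interpreter path, module file or None) for the running Python."""
    spec = importlib.util.find_spec(module_name)
    return sys.executable, (spec.origin if spec else None)


print(where_is("json"))
```

If the first value is not the Miniconda3 python.exe that pip reported, the shell is using a different installation than the one the package went into.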

Plasmafountain
Jun 17, 2008

Is there a way to use python to open a windows program using command line arguments?

sunaurus
Feb 13, 2012

Oh great, another bookah.

Zero Gravitas posted:

Is there a way to use python to open a windows program using command line arguments?

Yeah, many ways.
This is probably what you're looking for:
https://docs.python.org/3.5/library/os.html#os.system
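
The os.system page itself points to the subprocess module as the more flexible option, so here is a minimal sketch with subprocess.run (available from Python 3.5). The helper name run_program is made up, and the example launches the Python interpreter only so that it runs anywhere; a Windows program path and its arguments would go in the same places.

```python
import subprocess
import sys


def run_program(executable, *args):
    """Run a program with arguments and return (exit code, captured stdout)."""
    completed = subprocess.run(
        [executable, *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
    )
    return completed.returncode, completed.stdout


code, output = run_program(sys.executable, "-c", "print('hello from child')")
print(code, output.strip())
```

Passing the arguments as a list avoids having to quote paths with spaces by hand.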

Begall
Jul 28, 2008

Begall posted:

So I'm writing a Flask web app that includes offloading validation work on file uploads to child processes using the multiprocessing module. I'm currently only running the dev flask web server rather than anything proper, so I don't know if that impacts on this, but I've found that where I have uploaded the files and have offloaded the work to another process with code similar to the following:

Python code:
p2 = multiprocessing.Process(target=processor.start())
p2.start() 
the main flask thread will block until all the child processes have finished. But by putting print statements around the p2.start() call I can see that it returns virtually instantly, but if for example I put in a sleep(5) into the child thread there will be a 5 second delay before I see either the "before" timer or the "after" timer in the log, and the browser will not redirect to the destination page until those 5 seconds have passed. Is this something that would be fixed by switching to a real web server?

So I found the answer to this myself: it seems that I need to use a package like Celery to let the files be validated in the background while Flask continues without being blocked; using multiprocessing alone does not seem to be enough.

So I've set up celery, and I managed to get it to the point where I could send simple test messages. However, when I try to run the celery worker after integrating it into my application after setting "CELERY_IMPORTS" to the appropriate modules, I run into issues.

This is my celery worker:

code:
#!flask/bin/python

from celery import Celery
worker = Celery('tasks', broker='redis://localhost:6379/0')
worker.conf.update(CELERY_IMPORTS = ("app"))
It is run from the command line with something like "celery -A myCelery worker --loglevel=info"

I have a main app folder, which includes an __init__ with the following code:

Python code:
from flask import Flask
from flask.ext.login import LoginManager
from flask.ext.sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config.from_object('config')
app.debug = True
db = SQLAlchemy(app)

lm = LoginManager()
lm.init_app(app)

from app import views, models
Even if I place the celery worker .py file in the app folder, in the same location as the __init__.py file, the celery worker will fail on the first import in the __init__ i.e. "from flask import Flask". The __init__ file obviously runs fine when it is called from the main application so the Flask lib can be found in that location. I should also note that it isn't just this file that causes an import issue. I have moved the celery worker around, and if I put it at a lower level then it fails to find the first import of any other file it is associated to. Is there a reason that when celery is run independently it is unable to locate anything? The only way I can get it to work is with the simplest use case where it does not reference any tasks outside of its own file.

huhu
Feb 24, 2006
If I have a variable and I want to check if it's either a non-empty string, an empty string, or was never actually created... how would I check if it was never actually created?

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

huhu posted:

If I have a variable and I want to check if it's either a non-empty string, an empty string, or was never actually created... how would I check if it was never actually created?

try/except?
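
Roughly like this, as a sketch. The functions scrape and describe are made-up stand-ins for the real scraping code; the point is only that reading a name that was never assigned raises NameError (or its subclass UnboundLocalError inside a function), which can be caught.

```python
def scrape(found):
    """Hypothetical scraper that only assigns result on some inputs."""
    if found:
        result = "Vater"
    return result  # raises UnboundLocalError when found is False


def describe(get_value):
    """Classify what a zero-argument callable produces."""
    try:
        value = get_value()
    except NameError:  # also catches UnboundLocalError
        return "never created"
    return "non-empty string" if value else "empty string"


print(describe(lambda: scrape(True)))
print(describe(lambda: ""))
print(describe(lambda: scrape(False)))
```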

sunaurus
Feb 13, 2012

Oh great, another bookah.

huhu posted:

If I have a variable and I want to check if it's either a non-empty string, an empty string, or was never actually created... how would I check if it was never actually created?

Just out of curiosity, why do you need this?

huhu
Feb 24, 2006

Illegal Move posted:

Just out of curiosity, why do you need this?

I'm scraping data from a language translator and depending on the inputs you give it: "father", "the father", "like father like son", you get something that's either a string, an empty string, or doesn't exist. There might be a more elegant way to deal with this but I'm just trying to get through writing my first program.

Thermopyle posted:

try/except?
I'll try that out.

huhu fucked around with this message at 00:53 on Feb 7, 2016

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

What do you mean doesn't exist? It sounds like maybe it gives you None, which is different.

Begall
Jul 28, 2008
I've seen similar with Python libraries that deal with Excel files: where a field is not populated, it simply does not appear in the library output. My approach is to set the variable to "" (or whatever is appropriate) and then only change the value if the corresponding field is in the file.

Hegel
Dec 17, 2009

huhu posted:

I'm scraping data from a language translator and depending on the inputs you give it: "father", "the father", "like father like son", you get something that's either a string, an empty string, or doesn't exist. There might be a more elegant way to deal with this but I'm just trying to get through writing my first program.

I'll try that out.

In what form is the translator giving you the output? a dictionary? a single output on one of many runs? Depending on how it's returning the maybe-string there are easier/simpler ways to do what you're talking about. For example, .get(key, default) on a dictionary pretty closely matches what you want. If the output is a list or some other collection, you could filter(lambda x: x is not None and len(x) > 0, <yourcollection>) or something like that could work.
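
To make both options concrete, here is a small sketch. The dictionary shape and the names lookup and drop_blanks are assumptions for illustration, since the actual translator output isn't shown.

```python
def lookup(results, key, default=""):
    """Fetch a translation from a dict, falling back to default if the key is absent."""
    return results.get(key, default)


def drop_blanks(values):
    """Keep only entries that are neither None nor empty."""
    return list(filter(lambda x: x is not None and len(x) > 0, values))


# Made-up sample data.
results = {"father": "Vater", "the father": ""}
print(lookup(results, "father"))
print(repr(lookup(results, "like father like son")))
print(drop_blanks(["Vater", "", None, "Sohn"]))
```

With a default of "" a missing key and an empty string look the same; passing default=None keeps the two cases distinguishable if that matters.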

Cingulate
Oct 23, 2012

by Fluffdaddy

QuarkJets posted:

Are they all of the rows in a table that you're comparing against, or some subset of rows? If the latter, then you should probably be using the ID numbers that you pull from the database instead of the indices of the list. If you're using SQL or something then there's probably an elegant way to do whatever you need in that
Well okay. I have two data frames (in truth, two subsets of one data frame) with words and a bunch of characteristics about these words (e.g., corpus frequency, meaning, morphology). I want to go through the first data frame word by word. For the first word, I want to find a member of the second data frame that is identical on one characteristic (morphology), somehow record that these can form a match, and then I want to do the same for the second word of the first data frame and see if there is a match with any of the so far unmatched members of the second data frame (that's why I'm using `del` in my original implementation). And so on.

Is that clearer?

E: maybe it'd be better to say, I want lists of unique pairings of two data frames, matched on one criterion so that in every list, each word from df1 is linked to at most one word from df2 and each word from df2 is linked to at most 1 word from df1, where some words will end up not being matched at all. Matching criterion is being identical in one string (word_from_df1["characteristic1"] == word_from_df2["characteristic1"]).
This is what my original code does, but it seems unidiomatic.
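
For comparison, one alternative sketch using plain lists of strings rather than data frames. The function name match_indices and the sample data are made up. It queues up the positions of each value in the second list and hands them out one at a time, so every index refers to the original, unmodified lists (with del, positions recorded after a deletion refer to the shortened list).

```python
from collections import defaultdict, deque


def match_indices(list_a, list_b):
    """Pair each item of list_a with one not-yet-used equal item of list_b.

    Returns (pairs, unmatched): pairs holds (index_in_a, index_in_b) tuples,
    unmatched holds the indices of list_a that found no partner.
    """
    positions = defaultdict(deque)
    for j, value in enumerate(list_b):
        positions[value].append(j)

    pairs, unmatched = [], []
    for i, value in enumerate(list_a):
        queue = positions.get(value)  # .get avoids creating empty entries
        if queue:
            pairs.append((i, queue.popleft()))
        else:
            unmatched.append(i)
    return pairs, unmatched


pairs, unmatched = match_indices(["a", "b", "a", "c"], ["a", "x", "a", "a", "b"])
print(pairs, unmatched)
```

With a data frame, the same function could be fed the morphology column of each subset, assuming the row order is what the indices should refer to.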

Cingulate fucked around with this message at 14:10 on Feb 7, 2016

vikingstrike
Sep 23, 2007

whats happening, captain
Is morphology unique to each word (this may be stupid if you know the problem)? If so, couldn't you just merge the two frames on morphology and then keep the intersection?

pandas.merge(df1, df2, on="morphology", how="inner")

Plasmafountain
Jun 17, 2008

Trying to carry on with the command line thing I mentioned earlier.

I've got a part of a script where I have:

code:
import subprocess

subprocess.Popen(['D:\\File\\post_convert.exe','convert.in'])
subprocess.Popen(['C:\\Windows\\notepad.exe' , 'D:\\File\\convert.in'])

But for some reason, while the notepad line works fine and I get to check the format passed to post_convert, it doesn't actually execute post_convert.

If I use the windows command line:

code:
D:\File\>post_convert.exe convert.in
Then it runs quite happily. Changing post_convert to postconvert doesn't make a difference.

Does anyone have any idea what's up?

Lysidas
Jul 26, 2002

John Diefenbaker is a madman who thinks he's John Diefenbaker.
Pillbug
How long does the post_convert program take to execute? You're not waiting for it to finish after running it, so you're probably invoking Notepad before the convert.in file is processed.

The object returned by the Popen call has a wait method that will suspend execution of your Python code until the subprocess has exited, so I'd try adjusting your code as follows:

Python code:
import subprocess

subprocess.Popen(['D:\\File\\post_convert.exe','convert.in']).wait()
subprocess.Popen(['C:\\Windows\\notepad.exe', 'D:\\File\\convert.in'])
You could also add .wait() after running Notepad, if desired.
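
One more thing that might be worth ruling out, though it is only a guess from the two snippets: the command-line run starts in D:\File\, while the script passes the relative name 'convert.in' from whatever directory Python was started in. Popen accepts a cwd argument for that. A sketch, using the Python interpreter as a stand-in program so that it runs anywhere (run_in_directory is a made-up name):

```python
import os
import subprocess
import sys
import tempfile


def run_in_directory(command, directory):
    """Run command with directory as its working directory and wait for it."""
    proc = subprocess.Popen(
        command,
        cwd=directory,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    output, _ = proc.communicate()  # blocks until the program exits
    return proc.returncode, output


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        show_cwd = [sys.executable, "-c", "import os; print(os.getcwd())"]
        print(run_in_directory(show_cwd, folder))
```

For the case above that would mean something like cwd='D:\\File' on the post_convert call, so that relative file names resolve the same way as in the command prompt.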

huhu
Feb 24, 2006
code:
def funct(word):
    print(word)
funct(hello world)
Is there something I can add to this so that I don't have to put quotes around my string? Or is that just dumb coding practice?

I'm writing a class with dictionary.openD(filename.txt) and I'm lazy and don't want to write dictionary.openD("filename.txt")

Nippashish
Nov 2, 2005

Let me see you dance!

huhu posted:

Is there something I can add to this so that I don't have to put quotes around my string?

No.

baka kaba
Jul 19, 2003

PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

The quotes are what make it a string

Hammerite
Mar 9, 2007

And you don't remember what I said here, either, but it was pompous and stupid.
Jade Ear Joe

huhu posted:

code:
def funct(word):
    print(word)
funct(hello world)
Is there something I can add to this so that I don't have to put quotes around my string? Or is that just dumb coding practice?

I'm writing a class with dictionary.openD(filename.txt) and I'm lazy and don't want to write dictionary.openD("filename.txt")

"filename.txt" without quotes would mean "the txt attribute of the filename object". "hello world" without quotes is incorrect syntax.

Why do you feel like you need to do this? If it's just opening files in one or two places you should probably just suck it up and accept that you want a string literal, and this is achieved using quotes. If you need to open lots of files then there might be better ways than having them all hardcoded as literals.
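
If you want to see the difference for yourself, the standard library's ast module can show how Python reads each spelling. This is just an illustrative sketch; is_valid_python is a made-up helper name.

```python
import ast


def is_valid_python(source):
    """Return True if source parses as Python code."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(is_valid_python("funct(hello world)"))    # two bare names in a row
print(is_valid_python('funct("hello world")'))  # a string literal

# Without quotes, filename.txt parses as attribute access on a name.
call = ast.parse("dictionary.openD(filename.txt)", mode="eval").body
argument = call.args[0]
print(type(argument).__name__, argument.value.id, argument.attr)
```

So the unquoted version is legal syntax, but at run time it would look up a variable called filename and then its txt attribute.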
