Hughmoris
Apr 21, 2007
Let's go to the abyss!
*Disregard, found answer in OP.

Hughmoris fucked around with this message at 00:44 on Apr 1, 2017

Hughmoris
Apr 21, 2007
Let's go to the abyss!
For those that work with Excel, what Python library do you use?

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Is there a way to scan audio/video file data with Python to pick up on certain sound bits? I'm watching Arrested Development and I'm curious when and how often certain jingles are played, and figured Python might have tools for that.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
I've created a simple Python script that checks an RSS movie feed and performs an IMDb lookup if it finds new entries. I'd like to run this script every 15 minutes. Is it better practice to keep the script running in a loop and have it sleep for 15 minutes, or to use Windows Task Scheduler to launch it every 15 minutes? Or does it not really matter?

Hughmoris
Apr 21, 2007
Let's go to the abyss!

breaks posted:

Use the task scheduler unless you have a good reason not to. Running it in a loop will work until your computer reboots, the script throws an exception, or some other problem comes up. By the time you find and fix all of those, all you get for the extra work and inconvenience is probably a worse task scheduler.

Ok, I'll give task scheduler a shot. Thanks.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Are there any recommended articles/tutorials/blogs on working with sqlite in Python? I've just started learning a little bit about SQL and I'm trying to find best practices when incorporating it into a script.

I'd like to use it in a small script that parses an RSS feed and, if it's a new entry, inserts it into the DB.
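
A minimal sketch of the insert-if-new step, using only the standard library's sqlite3 module. The table name `entries`, its columns, and the choice of the entry's link as the unique key are assumptions for illustration, not something from the thread. The idea is to let a uniqueness constraint plus `INSERT OR IGNORE` decide whether an entry is new, instead of running a separate lookup first.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS entries (
            link  TEXT PRIMARY KEY,   -- hypothetical unique key per feed entry
            title TEXT NOT NULL
        )
        """
    )
    return conn


def insert_if_new(conn, link, title):
    """Insert the entry; return True if it was new, False if already stored."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO entries (link, title) VALUES (?, ?)",
            (link, title),
        )
    # rowcount is 1 when a row was inserted and 0 when it was ignored
    return cur.rowcount == 1


if __name__ == "__main__":
    db = open_db()
    print(insert_if_new(db, "http://example.com/a", "First entry"))
    print(insert_if_new(db, "http://example.com/a", "First entry"))
```

Passing values through `?` placeholders rather than string formatting keeps feed text with quotes in it from breaking the query. Use a file path instead of `:memory:` to keep entries between runs.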

Hughmoris
Apr 21, 2007
Let's go to the abyss!

accipter posted:

Do you want to work with sqlite directly? Or indirectly? If you want to work with it indirectly, look at Object Relational Mappers such as peewee or SQLAlchemy. Peewee is simpler, while SQLAlchemy is the standard (?) ORM for Python.

I can't say I know enough to know which way I want to go. I'll do some reading on ORM, thanks.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Speaking of Pandas, I run into trouble when I need to create additional columns that are filled based on other column criteria. For example, if I have a CSV of:
code:
name,party_size,ticket_price
john,3,$14
sarah,1,$20
phil,6,$11
After I read that into Pandas, I then want to add two more columns. First column "More_Than_One" is Y/N based on party size being greater than 1. Next column is "Total_Cost" which is party_size * ticket_price.

How would I do something like that?

Hughmoris
Apr 21, 2007
Let's go to the abyss!

vikingstrike posted:

code:

import pandas as pd

frame = pd.read_csv('my_data.csv')

frame = frame.assign(More_Than_One=(frame.party_size > 1))
frame = frame.assign(Total_Cost=frame.party_size * frame.ticket_price)


Jose Cuervo posted:

Or even simpler:

code:

import pandas as pd

df = pd.read_csv('my_data.csv')

df['More_Than_One'] = df['party_size'] > 1
df['Total_Cost'] = df['party_size'] * df['ticket_price']


Thanks for these.
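
One caveat with the sample data as posted: the ticket_price values carry a dollar sign, so pandas reads that column as text and the multiplication will not give a numeric total until it is converted. A minimal sketch under that assumption, which also maps the comparison to the Y/N labels the question asked for (the CSV is inlined here only to keep the example self-contained):

```python
import io

import pandas as pd

# The sample CSV from the question, inlined so the sketch runs on its own.
CSV_TEXT = """name,party_size,ticket_price
john,3,$14
sarah,1,$20
phil,6,$11
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# "$14" is read as a string, so strip the dollar sign and convert to a number
# before doing arithmetic with it.
df["ticket_price"] = (
    df["ticket_price"].str.replace("$", "", regex=False).astype(float)
)

# Boolean comparison mapped to Y/N labels.
df["More_Than_One"] = (df["party_size"] > 1).map({True: "Y", False: "N"})
df["Total_Cost"] = df["party_size"] * df["ticket_price"]

print(df)
```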

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Here is a great article that explains routine usage of matplotlib + pandas.
http://pbpython.com/effective-matplotlib.html

Hughmoris
Apr 21, 2007
Let's go to the abyss!
I want to create a simple auto-extractor for torrents. I'm on Windows 10 and have WinRAR.

What is the best practice to call WinRAR (or processes in general) from a Python script? Is it using subprocess.call?

Hughmoris fucked around with this message at 23:11 on Sep 29, 2017

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Data Graham posted:

Why not use a native python rar library?

I tried rarfile but I was having issues with it finding UnRAR, even when I provided it the full path. I've got subprocess working now though.
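
For reference, a minimal sketch of calling an external program with subprocess.run (Python 3.7+ for `capture_output`). The helper name `run_tool` is made up for this example, and the demo launches the Python interpreter itself because that is the one executable guaranteed to exist here. The commented WinRAR-style call is an assumption: the install path and command should be checked against the tool's own documentation.

```python
import subprocess
import sys


def run_tool(args):
    """Run an external program and return its stdout as text.

    check=True raises CalledProcessError on a non-zero exit code, so a
    failed extraction is not silently ignored.
    """
    result = subprocess.run(args, check=True, capture_output=True, text=True)
    return result.stdout


# Demo with the Python interpreter itself as the child process.
print(run_tool([sys.executable, "-c", "print('hello from a child process')"]))

# Hypothetical extractor call; path and arguments are assumptions:
# run_tool([r"C:\Program Files\WinRAR\UnRAR.exe", "x", "archive.rar"])
```

Passing the command as a list avoids shell quoting problems with paths that contain spaces.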

Hughmoris
Apr 21, 2007
Let's go to the abyss!

CarForumPoster posted:

This is good to know.

Also, hi thread. I am learning python to do some web scraping and data manipulation and eventually machine learning stuff.

Holy crap is it easy. I got all the data from a webpage into a csv with like 4 lines of code (pandas) and there's 19 bajillion examples of how to do this online.

I'm working my way through Automate The Boring Stuff and am on the Web Scraping section. Just curious, with your pandas example, are you scraping full tables or are you using selectors to nab individual items and then building a dataframe?

Hughmoris fucked around with this message at 16:48 on Oct 6, 2017

Hughmoris
Apr 21, 2007
Let's go to the abyss!
What Python books (if any) do you all have? I'm thinking about picking up Fluent Python and Effective Python.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

huhu posted:

Are you just starting out? Automate the Boring stuff is a great starting point.

Nah, I've been poking around Python for some time but I haven't really progressed from beginner -> intermediate. Automate The Boring Stuff is a great book.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Does anyone have any good examples/blogs/libraries using functional programming in Python?

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Thermopyle posted:

Functional programming in python can be done effectively and sometimes it can be done appropriately, but generally you shouldn't go "ok I'm going to write this program functionally".

Use it when it makes sense.

Guido doesn't optimize style for writing functionally and it shows.

https://stackoverflow.com/questions/1017621/why-isnt-python-very-good-for-functional-programming

That list isn't exactly right, but it gets the idea across.

Thanks for the links.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Is there a good way to step through Python code? I found a Python library to parse torrent names (https://github.com/divijbindlish/parse-torrent-name/blob/master/PTN/parse.py) and I can't quite figure out how it works.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

a witch posted:

Pycharm. Add the library to your project, set a breakpoint in it and run the debugger.

Thermopyle posted:

PyCharm is great, but if you don't want to use it, you can use pdb or ipdb.

Foxfire_ posted:

pudb's my favorite if you're on unix

Thanks for the ideas.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
I'm new to PyCharm and utilizing virtual environments, and I'm running Windows 10.

When creating a new project in PyCharm, I can't find a module that I want to install (https://github.com/divijbindlish/parse-torrent-name). Is my next best option to open up a console window, activate the virtual environment and install the module? Or is there a way to help PyCharm find the module for installation?

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Thermopyle posted:

I usually click the Terminal button in PyCharm and install packages that way. It automatically activates the virtualenv or conda env for the project.

That worked, thanks. PyCharm is a bit overwhelming coming from Vim or Atom.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Tigren posted:

Phone posting, but you should be able to open the project interpreter settings and install packages there.

https://www.jetbrains.com/help/pycharm/installing-uninstalling-and-upgrading-packages.html

Thanks. That was the initial route I pursued but the package I needed wasn't in the available list.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Tigren posted:

Weird, works for me.



What is listed when you click on that "Manage Repositories" button?

Mine has https://pypi.python.org/simple listed.

My "Manage Repositories" list was initially empty. I added the one that you listed and refreshed available packages and no change, still can't find it. It looks like it might only be displaying Conda packages? A quick Google search suggests this isn't an uncommon issue, but I haven't found a solution.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Thermopyle posted:

Umm, I'm not at my pc but there's a button on the right side at the bottom that switches between virtual environments and conda.

That was it. I'm able to find the package. Of course, when I go to install it, it errors out. :suicide:

I get the same error when attempting to install it from CMD but I am able to manually install it with setup.py.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
How is Python + selenium for filling out lots of repetitive forms? I noticed that some people on my project team are manually entering the data for 2000+ users into a web portal. They've asked for help but my eyes will fall out of my head if I have to manually type in crap.

I have all of the user data in a clean csv file. The steps that are needed are basically:
  • I log in to the web portal (just once)
  • Click on search field and enter user name
  • Click on said user
  • Fill in a couple of text boxes, check a couple of boxes, select values from a drop down list
  • save form
  • GOTO search for a user

I used AutoIT for a similar job a few years ago but I figured I'd give Python a try for this (plus I forgot AutoIT).
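
The loop in the steps above can be sketched without committing to a browser tool yet. This is a rough outline, not a finished solution: `fill_form` is a hypothetical callback that would hold the per-user browser steps (search, open the user, fill the fields, save), and the column names in the sample data are made up. Failures are collected instead of stopping the run, so one bad record out of 2000 does not mean starting over.

```python
import csv
import io


def process_users(csv_file, fill_form):
    """Call fill_form(row) once per CSV row and collect the rows that fail.

    fill_form is a hypothetical callback holding the browser steps.
    """
    failures = []
    for row in csv.DictReader(csv_file):
        try:
            fill_form(row)
        except Exception as exc:  # keep going; one bad record should not stop the run
            failures.append((row, str(exc)))
    return failures


# Stand-in data and callback so the sketch runs without a browser.
sample = io.StringIO("username,role\nalice,admin\nbob,\n")


def fake_fill_form(row):
    if not row["role"]:
        raise ValueError("missing role")
    print("would submit form for", row["username"])


failed = process_users(sample, fake_fill_form)
print(len(failed), "row(s) need manual follow-up")
```

With a real file, `open("users.csv", newline="")` would replace the `io.StringIO` stand-in.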

Hughmoris
Apr 21, 2007
Let's go to the abyss!

baka kaba posted:

It is basically automating someone sitting at the computer and doing all that stuff though, probably take a while. Is it possible to use something like Requests and just POST the form data that's being sent, without having to load their web pages?

Hmmmmm...

My web knowledge is pretty sparse. To see if this is feasible, should I try to record the network traffic while I submit a form and examine the parameters of the POST?

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Sockser posted:

Hi there, I'm an automation engineer. This is entirely my bread and butter.

The solution to this depends what you're trying to solve.
Is this a testing thing? Or is it literally just people doing manual data entry?

If testing, are you testing some service / webpage that takes data and crunches it? If that's the case, as people before have said, you're better off just issuing POST requests to the page, or if it's a service, interacting directly with the service. If you're testing some UI components, then selenium is the jam.
If it's data entry, then yeah, see above minus the selenium suggestion.

Automation engineer sounds pretty cool.

No testing involved in this, just plain ol' data entry. With my skill set, using pure POST requests seems pretty risky. I'll likely play it safe and use selenium to navigate the page while I surf the web.

Thanks for the ideas.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Thanks for the advice. After attempting to (badly) analyze the network traffic for direct POST requests, I ended up going the selenium route. It was a nice learning exercise, and I've discovered quite a few little tricks that I think will make the next time easier.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
For the Pandas users out there, what type of things (if any) do you bounce back to Excel for?

Hughmoris
Apr 21, 2007
Let's go to the abyss!

duck monster posted:

Assuming you don't have access to the form's source code, consider using something like TamperData to capture the form data, then build up a script using Python requests or a similar library (use requests, it's fantastic) and just pump them in that way. requests should be able to handle any cookies you'll need.

Selenium can be kinda flaky at times, due to externalities like page load times and the like. Just do POSTs using the requests library. It's super easy. Unless it's some sort of hosed up ASP thing with all sorts of nasty javascript and hosed up non-restful state. Those things can be bitches to scrape and post to.

edit: I must be the 5th or 6th person to suggest this. My bad.

Thanks for the advice. I've been poking around this some more to see if I can figure out how to do direct POST actions.

I'm seeing a lot of jQuery and ajax stuff occurring when I'm looking in developer tools. I believe I've found where the form post occurs but it appears to have some keys/tokens associated with it so I'm not sure how to approach it in Python.

My web knowledge is weak so I might not be using the right terms.
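
One common pattern, offered as an assumption about how this portal might work rather than a known fact, is that such tokens are hidden `<input>` fields in the form's HTML. If so, a script can fetch the page, read those fields, and send them back along with the form data in the same session. A minimal standard-library sketch of the reading step follows; the markup and the field name `csrf_token` are made up, and a token generated by JavaScript would not show up this way.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs from <input type="hidden"> tags."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            name = attrs.get("name")
            if name:
                self.fields[name] = attrs.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in the HTML text."""
    parser = HiddenInputParser()
    parser.feed(html)
    return parser.fields


# Made-up form markup; a real portal's field names will differ.
page = (
    '<form><input type="hidden" name="csrf_token" value="abc123">'
    '<input name="user"></form>'
)
print(hidden_fields(page))
```

The resulting dict could then be merged with the visible form values before posting them.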

Hughmoris
Apr 21, 2007
Let's go to the abyss!

duck monster posted:

What's the web thing based on?

I'm not positive what your question is. It's an administrative web portal where a user can assign roles and permissions to other users. I'm not sure what stack it's running but the interface looks pretty old.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
I've never written a test for any of my projects, and I'd like to change that. I know very little about the subject in general. As a novice, what testing package/methodology should I commit to learning? I'll be using PyCharm, if it makes any difference.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Seventh Arrow posted:

I have a comma-separated spreadsheet with a bunch of information about condos in my city, most importantly it has the latitude and longitude of these places. I want to be able to output these coordinates onto google maps but I'm not sure how to go about doing this. I looked at this link but none of the APIs seem to quite provide what I'm looking for (at least, not with python). Any suggestions?

Maybe something like this? https://github.com/vgm64/gmplot

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Anyone here use Pandas to generate reports for end-users? If so, what does your workflow look like? I'm stuck a bit in the middle where my current process is to use Python to do data cleanup but then I load the data in an Excel workbook for charts and pivot tables to share with users.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

vikingstrike posted:

What type of reports are you thinking? Out of what you describe, the logical addition would be matplotlib/seaborn to plot figures in python.

I work in healthcare and my current report goes to department managers and shows staff compliance for documentation of a certain procedure. The vast majority of managers are not technical but they are comfortable enough to open up the Excel workbook I email them and at least look at the first chart that shows how their department is doing against the hospital.

If there is a way to paste an image inline in Outlook 2013, I've thought about removing the workbook entirely and generating an email for each department, with the charts and table pasted inside the email body. Basically trying to spoon feed the end-user as much as possible to make their life easier.
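
On the inline image idea: one standard approach is an HTML email whose `<img>` tag points at an attached image by Content-ID. Below is a minimal sketch with the standard library's email package. The function name, subject line, and addresses are placeholders, the chart bytes would come from a saved plot, and how a given Outlook version renders the result is something to test rather than assume.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_report_email(sender, recipient, png_bytes):
    """Build an HTML email that shows a PNG chart inline in the body."""
    msg = EmailMessage()
    msg["Subject"] = "Department compliance report"  # placeholder subject
    msg["From"] = sender
    msg["To"] = recipient
    # Plain-text fallback for clients that do not render HTML.
    msg.set_content("This report contains a chart that needs an HTML mail client.")

    # The Content-ID ties the <img> tag to the attached image.
    cid = make_msgid(domain="example.com")
    html = '<html><body><p>Compliance chart:</p><img src="cid:{}"></body></html>'
    msg.add_alternative(html.format(cid[1:-1]), subtype="html")

    # Attach the image to the HTML part (index 1; index 0 is the plain text).
    msg.get_payload()[1].add_related(png_bytes, "image", "png", cid=cid)
    return msg


if __name__ == "__main__":
    demo = build_report_email("me@example.com", "manager@example.com", b"not a real png")
    print(demo["Subject"])
```

Sending the message (for example with smtplib) is left out, since it depends on the mail server available.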

Hughmoris
Apr 21, 2007
Let's go to the abyss!
I'm trying to figure out how this torrent name parsing library works, and I'm stumped when looking at the code. I get the gist of it but I don't fully understand how it goes about it.

https://github.com/divijbindlish/parse-torrent-name/blob/master/PTN/parse.py

What's a good method of attack when trying to figure out something like this? Toss in print statements everywhere? Does PyCharm have functionality specifically for this that I should explore?

Hughmoris
Apr 21, 2007
Let's go to the abyss!
Also on Reddit is a clean list of PyCon 2018 talks.

Hughmoris
Apr 21, 2007
Let's go to the abyss!
I was reading this blog talking about a simple Perl and Perl 6 benchmark, and I was curious how Python stacks up: http://brrt-to-the-future.blogspot.com/2018/08/a-curious-benchmark.html

Using a Raspberry Pi 3, the C code runs in 1.9 seconds. The Perl code runs in 44 seconds.

My straightforward Python 3.4 code runs in 64 seconds. Are there any easy wins to speed this up?
Python code:
x = 0

for i in range(1, 50000001):
    x = x + (1 / i)

print(x) # 18.304749238293297

Hughmoris fucked around with this message at 17:41 on Sep 9, 2018

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Spime Wrangler posted:

code:
import numpy as np

def version_c():
    """
    Numpy to generate reciprocals and to sum
    """
    # max is the upper bound, assumed to be defined earlier in the quoted post
    xs = np.arange(1, max)
    rs = 1/xs
    x = np.cumsum(rs)[-1]
    print(x)

Thanks for this. On the Raspberry Pi 3, version_c tosses me a MemoryError but it helps to see how I could use Numpy for something like this.

It chewed through 40,000,000 in 9 seconds.

Hughmoris
Apr 21, 2007
Let's go to the abyss!

Dr Subterfuge posted:

Use np.sum instead of np.cumsum. It should just return a scalar in this case, which is all you want anyway.

Awesome. This change allowed the RP3 script to complete the original 50,000,000.

Original pure python solution: 64 seconds
Numpy solution: 3 seconds.
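
A note on the earlier MemoryError: `np.cumsum` builds a third full-length array on top of the integers and their reciprocals, which is a lot for a Raspberry Pi 3. If memory is still tight, one option is to sum in chunks so only a slice is in memory at a time. This is a sketch; the function name and chunk size are arbitrary choices, not something from the thread.

```python
import numpy as np


def harmonic_sum(n, chunk=1_000_000):
    """Sum 1/i for i in 1..n, processing at most `chunk` terms at a time."""
    total = 0.0
    for start in range(1, n + 1, chunk):
        stop = min(start + chunk, n + 1)
        # Only this slice of integers and reciprocals is in memory at once.
        total += np.sum(1.0 / np.arange(start, stop))
    return float(total)


print(harmonic_sum(1000))
```

Calling `harmonic_sum(50000000)` would cover the original range; the last digits of the result may differ slightly from the pure Python loop because the floating point additions happen in a different order.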
