Python

The Something Awful Forums > Discussion > Serious Hardware/Software Crap > The Cavern of COBOL > Python

Nippashish: Nov 2, 2005; Let me see you dance!

The arange and the reciprocal aren't helping with memory use either.

# ? Sep 9, 2018 19:32

Hughmoris: Apr 21, 2007; Let's go to the abyss!

Dr Subterfuge posted:

Use np.sum instead of np.cumsum. It should just return a scalar in this case, which is all you want anyway.

Awesome. This change allowed the RP3 script to complete the original 50,000,000.

Original pure python solution: 64 seconds
Numpy solution: 3 seconds.
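
For anyone wondering why that one-word change matters, here is a minimal sketch, assuming NumPy is installed and using a much smaller n than the 50,000,000 in this thread (the variable names are only for illustration). np.cumsum builds every partial sum into a second full-size array, while np.sum reduces straight to a single scalar.

```python
import numpy as np

n = 1_000_000  # illustrative size, far smaller than the thread's 50,000,000
rs = 1 / np.arange(1, n + 1)  # reciprocals 1/1, 1/2, ..., 1/n as float64

running = np.cumsum(rs)  # one partial sum per element: a second n-element array
total = np.sum(rs)       # a single floating-point scalar

print(running.shape, running.nbytes)  # shape is (n,), 8 bytes per float64
print(total)
```

The last entry of the cumulative array is the same total (up to rounding), so nothing is lost by asking only for the sum.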

# ? Sep 9, 2018 19:45

QuarkJets: Sep 8, 2008

Dr Subterfuge posted:

I had to resist just giving a trite answer of "use numpy." Is there any other python way that comes close to the same speed?

Numba would likely be just as fast if not faster (it takes a little time to compile the function but then you no longer need to create huge temporary arrays), and the implementation would be adding 1 decorator to version A

# ? Sep 9, 2018 20:26

Spime Wrangler: Feb 23, 2003; Because we can.

Dr Subterfuge posted:

Use np.sum instead of np.cumsum. It should just return a scalar in this case, which is all you want anyway.

Yeah of course do this lol

# ? Sep 9, 2018 20:58

Hughmoris: Apr 21, 2007; Let's go to the abyss!

QuarkJets posted:

Numba would likely be just as fast if not faster (it takes a little time to compile the function but then you no longer need to create huge temporary arrays), and the implementation would be adding 1 decorator to version A

Holy moly, Numba flies. I hopped over to my main desktop to test (since it appears numba can't run on RP3)

Python code:

from timeit import timeit

import numpy as np
from numba import jit

max_n = 50000001


def version_numpy():
    """
    Numpy to generate reciprocals and to sum
    """
    xs = np.arange(1, max_n)
    rs = 1/xs
    x = np.sum(rs)
    print(x)

@jit
def version_numba():
    """
    Numba to generate reciprocals and to sum
    """
    x = 0
    for i in range(1, max_n):
        x = x + (1 / i)
    print(x)

n = 20

total_numpy = timeit(version_numpy, number=n)
total_numba = timeit(version_numba, number=n)

print('avg time for numpy = {0:0.3f} seconds'.format(total_numpy/n))
print('avg time for numba = {0:0.3f} seconds'.format(total_numba/n))

code:

avg time for numpy = 0.285 seconds
avg time for numba = 0.052 seconds

# ? Sep 9, 2018 21:04

Munkeymon: Aug 14, 2003; Motherfucker's got an
armor-piercing crowbar! Rigoddamndicu𝜆ous.

I think division might happen in the OS instead of being an instruction on ARM, so your desktop would have wildly different performance for that specific operation. At least for integers - dunno about floats.

# ? Sep 10, 2018 15:08

SurgicalOntologist: Jun 17, 2004

This is a very specific question but maybe someone has been in this situation before.

I have a SQL database I'm managing with SQLAlchemy. Lots of tables, lots of code, using lots of different SQLAlchemy features. I'm now going to build a web frontend for the whole thing with Flask. So I'm wondering if there is anything to look out for porting from Sqlalchemy to Flask-Sqlalchemy. The base classes are different so there's at least a chance for some shenanigans. Luckily I have extensive unit tests so I should at least become aware of any issues but I thought I'd ask around before I start the migration.

# ? Sep 11, 2018 23:03

PBS: Sep 21, 2015

SurgicalOntologist posted:

This is a very specific question but maybe someone has been in this situation before.

I have a SQL database I'm managing with SQLAlchemy. Lots of tables, lots of code, using lots of different SQLAlchemy features. I'm now going to build a web frontend for the whole thing with Flask. So I'm wondering if there is anything to look out for porting from Sqlalchemy to Flask-Sqlalchemy. The base classes are different so there's at least a chance for some shenanigans. Luckily I have extensive unit tests so I should at least become aware of any issues but I thought I'd ask around before I start the migration.

I couldn't figure out how to get cursors working with it, but I'm new to it and python so it's probably just me.

# ? Sep 12, 2018 01:51

bob dobbs is dead: Oct 8, 2017; I love peeps; Nap Ghost

SurgicalOntologist posted:

This is a very specific question but maybe someone has been in this situation before.

I have a SQL database I'm managing with SQLAlchemy. Lots of tables, lots of code, using lots of different SQLAlchemy features. I'm now going to build a web frontend for the whole thing with Flask. So I'm wondering if there is anything to look out for porting from Sqlalchemy to Flask-Sqlalchemy. The base classes are different so there's at least a chance for some shenanigans. Luckily I have extensive unit tests so I should at least become aware of any issues but I thought I'd ask around before I start the migration.

flask-sqlalchemy is almost completely useless, just use normal sqlalchemy and like 5-15 lines of wrapper code

flask is good as hell, flask-hyphen-stuff sucks

# ? Sep 12, 2018 02:32

SurgicalOntologist: Jun 17, 2004

haha ok, easy enough

# ? Sep 12, 2018 02:48

SnatchRabbit: Feb 23, 2006; by sebmojo

I'm writing a function that checks over some db snapshots. First, it is going to check the timestamp on the snapshot. If older than 60 days it will check for some tags. If it finds the tag I want, it will delete the snapshot. If not I want it to give me a skipping message. The function seems to evaluate all my snapshots more or less correctly. The issue I am having is that the "snapshotid does not have Weekly tag, skipping" message only shows up when there is a list of other tags present, but not if the tags are completely empty. Is there a relatively simple way to adapt my if statement to accommodate an empty tag list?

code:

  for snapshot in dbsnapshots:
    dbsnapshotid = snapshot['DBSnapshotIdentifier']
    dbsnapshotcreatetime = snapshot['SnapshotCreateTime']
    dbsnapshotarn = snapshot['DBSnapshotArn']
    timedifference = currentdate - dbsnapshotcreatetime
    if timedifference.days>60:
      print (dbsnapshotid + " is older than 60 days. Checking tags")
      tags = client.list_tags_for_resource(ResourceName=dbsnapshotarn)
      tags = tags['TagList']
      for tag in tags:
        if tag["Key"] == 'DBBackupFrequency' and tag["Value"] == 'Weekly':
          print ('Weekly tag found, Deleting snapshot')
        else:
          print (dbsnapshotid + ' does not have Weekly tag, skipping')
    else:
      print (dbsnapshotid + " is not older than 60 days, skipping")

# ? Sep 12, 2018 17:44

M. Night Skymall: Mar 22, 2012

SnatchRabbit posted:

I'm writing a function that checks over some db snapshots. First, it is going to check the timestamp on the snapshot. If older than 60 days it will check for some tags. If it finds the tag I want, it will delete the snapshot. If not I want it to give me a skipping message. The function seems to evaluate all my snapshots more or less correctly. The issue I am having is that the "snapshotid does not have Weekly tag, skipping" message only shows up when there is a list of other tags present, but not if the tags are completely empty. Is there a relatively simple way to adapt my if statement to accommodate an empty tag list?

I'd just add a check for an empty list like so:

code:

  for snapshot in dbsnapshots:
    dbsnapshotid = snapshot['DBSnapshotIdentifier']
    dbsnapshotcreatetime = snapshot['SnapshotCreateTime']
    dbsnapshotarn = snapshot['DBSnapshotArn']
    timedifference = currentdate - dbsnapshotcreatetime
    if timedifference.days>60:
      print (dbsnapshotid + " is older than 60 days. Checking tags")
      tags = client.list_tags_for_resource(ResourceName=dbsnapshotarn)
      tags = tags['TagList']
      for tag in tags:
        if tag["Key"] == 'DBBackupFrequency' and tag["Value"] == 'Weekly':
          print ('Weekly tag found, Deleting snapshot')
        else:
          print (dbsnapshotid + ' does not have Weekly tag, skipping')
      if not tags:
        print (dbsnapshotid + ' does not have Weekly tag, skipping')
    else:
      print (dbsnapshotid + " is not older than 60 days, skipping")

# ? Sep 12, 2018 18:03

SurgicalOntologist: Jun 17, 2004

That would work, but to better reflect the logic of the operation (you want to know if the tag is there or not) I would do it something like this:

Python code:

if  any(tag['Key'] == 'DBBackupFrequency' and tag['Value'] == 'Weekly' for tag in tags):
    print(f'Weekly tag found, deleting snapshot {dbsnapshotid}')
else:
    print(f'Snapshot {dbsnapshotid} does not have weekly tag, skipping')

any() will return False if the list is empty--it's the function you're looking for here.
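
A quick sketch of that empty-list behaviour. The helper name has_weekly_tag and the 'Owner' tag are made up for illustration, they are not from the original code.

```python
def has_weekly_tag(tags):
    """Return True if any tag marks the snapshot as a weekly backup."""
    return any(
        tag['Key'] == 'DBBackupFrequency' and tag['Value'] == 'Weekly'
        for tag in tags
    )


print(has_weekly_tag([]))  # False
print(has_weekly_tag([{'Key': 'Owner', 'Value': 'ops'}]))  # False
print(has_weekly_tag([{'Key': 'DBBackupFrequency', 'Value': 'Weekly'}]))  # True
```

An empty tag list and a list with only unrelated tags both fall through to the same "skipping" branch, which is the behaviour asked for.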

Edit: Also, I would recommend using the continue statement to skip an element in a loop. More readable than if/else spaghetti--the actual delete code will be only one level nested (the for loop) rather than under a couple of ifs.

Python code:

for snapshot in dbsnapshots:
    dbsnapshotid = snapshot['DBSnapshotIdentifier']
    dbsnapshotcreatetime = snapshot['SnapshotCreateTime']
    dbsnapshotarn = snapshot['DBSnapshotArn']

    timedifference = currentdate - dbsnapshotcreatetime
    if timedifference.days <= 60:
        print (f'{dbsnapshotid} is not older than 60 days, skipping')
        continue

    tags = client.list_tags_for_resource(ResourceName=dbsnapshotarn)['TagList']
    if not any(tag['Key'] == 'DBBackupFrequency' and tag['Value'] == 'Weekly' for tag in tags):
        print(f'Snapshot {dbsnapshotid} does not have weekly tag, skipping')
        continue

    print(f'Weekly tag found, deleting snapshot {dbsnapshotid}')
    # Do the delete here.
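
To fill in the "do the delete here" part, a rough sketch follows. It assumes client is a boto3 RDS client (the thread never says so outright), whose delete_db_snapshot call takes a DBSnapshotIdentifier. The function name delete_old_weekly_snapshots and the FakeClient class are hypothetical, added only so the sketch runs without touching AWS.

```python
from datetime import datetime, timedelta, timezone


def delete_old_weekly_snapshots(client, dbsnapshots, currentdate, max_age_days=60):
    """Delete snapshots older than max_age_days that carry the Weekly tag."""
    deleted = []
    for snapshot in dbsnapshots:
        dbsnapshotid = snapshot['DBSnapshotIdentifier']
        if (currentdate - snapshot['SnapshotCreateTime']).days <= max_age_days:
            continue  # too recent, skip

        tags = client.list_tags_for_resource(
            ResourceName=snapshot['DBSnapshotArn'])['TagList']
        if not any(tag['Key'] == 'DBBackupFrequency' and tag['Value'] == 'Weekly'
                   for tag in tags):
            continue  # no Weekly tag (or no tags at all), skip

        # Assumed boto3 RDS call; check it against your client before relying on it.
        client.delete_db_snapshot(DBSnapshotIdentifier=dbsnapshotid)
        deleted.append(dbsnapshotid)
    return deleted


class FakeClient:
    """Hypothetical stand-in for the real client so the sketch runs offline."""

    def __init__(self, tags_by_arn):
        self.tags_by_arn = tags_by_arn
        self.deleted = []

    def list_tags_for_resource(self, ResourceName):
        return {'TagList': self.tags_by_arn.get(ResourceName, [])}

    def delete_db_snapshot(self, DBSnapshotIdentifier):
        self.deleted.append(DBSnapshotIdentifier)


now = datetime(2018, 9, 12, tzinfo=timezone.utc)
snapshots = [
    {'DBSnapshotIdentifier': 'old-weekly', 'DBSnapshotArn': 'arn:old-weekly',
     'SnapshotCreateTime': now - timedelta(days=90)},
    {'DBSnapshotIdentifier': 'old-untagged', 'DBSnapshotArn': 'arn:old-untagged',
     'SnapshotCreateTime': now - timedelta(days=90)},
    {'DBSnapshotIdentifier': 'recent-weekly', 'DBSnapshotArn': 'arn:recent-weekly',
     'SnapshotCreateTime': now - timedelta(days=5)},
]
weekly = [{'Key': 'DBBackupFrequency', 'Value': 'Weekly'}]
fake = FakeClient({'arn:old-weekly': weekly, 'arn:recent-weekly': weekly})

print(delete_old_weekly_snapshots(fake, snapshots, now))  # ['old-weekly']
```

Passing the client in as a parameter is a design choice that makes a dry run against a fake like this possible before anything is pointed at real snapshots.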

SurgicalOntologist fucked around with this message at 18:16 on Sep 12, 2018

# ? Sep 12, 2018 18:09

SnatchRabbit: Feb 23, 2006; by sebmojo

Thanks!

# ? Sep 12, 2018 18:55

Cirofren: Jun 13, 2005; Pillbug

SurgicalOntologist posted:

Edit: Also, I would recommend using the continue statement to skip an element in a loop.

This looks great and is rarely something that comes to mind. Thanks for this.

# ? Sep 12, 2018 22:12

mr_package: Jun 13, 2000

When working with pathlib do you write your functions so that they assume they will receive Path objects, or do you still call Path(input_value) on input in case something is a string? Similarly, when the paths come into your program as strings (user input, databases, etc.) do you convert them to Path right away or do you save them as strings and then convert to Path when needed?

# ? Sep 14, 2018 21:26

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. -Bertrand Russell

The caller of my functions already has to conform to the signature of my function, so I don't get wishy-washy with the types of my parameters.

If I'm working with pathlib, I take Path objects not strings.

# ? Sep 14, 2018 21:36

Data Graham: Dec 28, 2009; 📈📊🍪😋

In general I like to write my methods to take richer objects rather than sparser. It's easier to expand them later that way (example: a method that takes an array of things to operate on rather than a single thing)

# ? Sep 15, 2018 14:46

velvet milkman: Feb 13, 2012; by R. Guyovich

I pretty much exclusively work with Path objects now, including type annotations for methods, and just cast them to strings when totally necessary.
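
Roughly what that looks like in practice, as a small sketch. The function backup_path and the file name are invented for illustration.

```python
from pathlib import Path


def backup_path(path: Path) -> Path:
    """Return the sibling path with '.bak' appended to the file name."""
    return path.with_name(path.name + '.bak')


raw = 'reports/summary.csv'  # hypothetical string from user input or a database
path = Path(raw)             # convert once, at the edge of the program

print(backup_path(path))       # still a Path object
print(str(backup_path(path)))  # cast to str only where an API insists on one
```

Converting at the boundary means everything inside the program can rely on the annotation instead of re-wrapping its inputs.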

# ? Sep 15, 2018 15:00

Tortilla Maker: Dec 13, 2005; Un Desmadre A Toda Madre

Brief: How can I iterate through a list when I need to specify the index number?

New to programming languages so please excuse bad phrasing/terminology.

I need to work with an API that limits query returns to just 50 rows of data per call. To allow for pagination, I can specify the starting row I'd like to query for:
Call 1: Start at row 0
Call 2: Start at row 50
Call 3: etc.

As I need far more than 50 rows, I need to query across many, many calls/paginations. I wrote code to loop through a few instances and it successfully calls the API the designated number of times. However, since I'm iterating with a loop, I can't append to a dictionary so I'm having to use a list.
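
For reference, the paging loop described above can be sketched like this. The real API is not shown in the post, so fetch_page, its start and rows parameters, and the fake data are all assumptions standing in for the actual request.

```python
PAGE_SIZE = 50  # the API's per-call row limit mentioned above


def fetch_all(fetch_page, page_size=PAGE_SIZE):
    """Call fetch_page at row 0, 50, 100, ... until a short page comes back."""
    responses = []
    start = 0
    while True:
        page = fetch_page(start=start, rows=page_size)
        responses.append(page)
        if len(page['listings']) < page_size:
            break  # fewer rows than requested: this was the last page
        start += page_size
    return responses


# Hypothetical stand-in for the real API call, serving 120 fake rows.
FAKE_ROWS = [{'vin': n} for n in range(120)]


def fake_fetch_page(start, rows):
    return {'listings': FAKE_ROWS[start:start + rows]}


json_responses = fetch_all(fake_fetch_page)
print([len(resp['listings']) for resp in json_responses])  # [50, 50, 20]
```

The result is a list with one parsed response per call, which is the shape the rest of this post works with.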

I want to send the JSON return into a dataframe but I'm having trouble figuring out how to do this loop. Since it's a list, I'm having to designate the index number but I'm not sure how I can iterate through the full sequence automatically.

This is an example of how I would call each index manually from the list:

code:

vins = []

for item in json_response[0]['listings']:
    vins.append({'vin':item['vin']})

for item in json_response[1]['listings']:
    vins.append({'vin':item['vin']})
    
for item in json_response[2]['listings']:
    vins.append({'vin':item['vin']})

This is a failed attempt to loop through the index:

code:

vins = []
for item in range(len(json_response['listings'])):
    vins.append({'vin':item['vin']})

Any thoughts?

# ? Sep 16, 2018 05:27

bob dobbs is dead: Oct 8, 2017; I love peeps; Nap Ghost

enumerate is your friend

also, you don't need it

also, i would've named the toplevel var `json_responses`

code:

vins = []
for json_response in json_responses:
    vins.extend(list(map(lambda x: {'vin': x['vin']}, json_response["listings"])))

or,

code:

def process_json_response(resp):
    return list(map(lambda x: {'vin': x['vin']}, resp["listings"]))

# sum() needs an empty list as the start value to concatenate lists
vins = sum(map(process_json_response, json_responses), [])

not very iteratorish, i must admit
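
A more iterator-flavoured take on the same flattening, as a sketch using itertools.chain from the standard library. The sample data and the 'price' field are made up for illustration.

```python
from itertools import chain

json_responses = [
    {'listings': [{'vin': 1234, 'price': 100}, {'vin': 9999, 'price': 200}]},
    {'listings': [{'vin': 3456, 'price': 300}]},
]

# Lazily walk every listing in every response, one after another.
all_listings = chain.from_iterable(resp['listings'] for resp in json_responses)
vins = [{'vin': listing['vin']} for listing in all_listings]

print(vins)  # [{'vin': 1234}, {'vin': 9999}, {'vin': 3456}]
```

Nothing is copied until the final list comprehension, which can matter once there are many pages.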

bob dobbs is dead fucked around with this message at 05:49 on Sep 16, 2018

# ? Sep 16, 2018 05:36

baka kaba: Jul 19, 2003; PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

In case that's not clear (since you said you're new to this)

Python code:

json_response = [
    {'listings': [
            {'vin': 1234},
            {'vin': 9999}
        ]
    },
    {'listings': [
            {'vin': 3456}
        ]
    }
]

vins = []
for item in json_response:
    for listing in item['listings']:
        vins.append({'vin': listing['vin']})

print(vins)

The first loop is grabbing each item in the response, and the second loop pulls out the 'listings' object from that item dictionary, and loops over each listing. And inside that loop, you're pulling the vin from the listing, and popping it a new dictionary to add to the list you're building

So you don't actually need to mess with indices, you take a thing and say "for thing in collection_of_things" and write your code to mess with each 'thing' in the loop body. Sometimes (like here) you need to do that again inside the loop, because 'thing' has another bunch of stuff you want to iterate over, and so on

There are way cleverer and more concise ways of doing this (like bob's examples) but I just wanted to lay out the basics just in case

# ? Sep 16, 2018 22:50

Tortilla Maker: Dec 13, 2005; Un Desmadre A Toda Madre

Thank you both for your responses. This was really helpful and definitely put me on the right path!

# ? Sep 17, 2018 01:12

bob dobbs is dead: Oct 8, 2017; I love peeps; Nap Ghost

Tortilla Maker posted:

Thank you both for your responses. This was really helpful and definitely put me on the right path!

this sort of manipulation is always a lot easier with a lil copied concrete example in comments somewhere or a type annotation

# ? Sep 17, 2018 06:54

Dominoes: Sep 20, 2007

Now that Python 3.7's been out for a few months, what do you think of dataclasses?

I've been making liberal use of them; they remind me of structs in C or Rust, yet include features you'd expect in a high-level language, like default behavior for printing and equality. For many cases, I see them as a nicer way to bundle data than [named]tuples.
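
A small sketch of the printing and equality behaviour mentioned above. The Reading class and its fields are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    sensor: str
    value: float
    unit: str = 'C'  # default value, as with namedtuple defaults


a = Reading('probe-1', 21.5)
b = Reading('probe-1', 21.5)

print(a)        # Reading(sensor='probe-1', value=21.5, unit='C')
print(a == b)   # True: equality compares field by field
a.value = 22.0  # fields are mutable unless frozen=True is passed
print(a == b)   # False
```

Mutability is the main practical difference from a namedtuple here; passing frozen=True to the decorator gets the immutable behaviour back.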

# ? Sep 18, 2018 01:15

cinci zoo sniper: Mar 15, 2013

Same opinion as yours, neat replacement for namedtuples.

# ? Sep 18, 2018 10:21

Slimchandi: May 13, 2005; That finger on your temple is the barrel of my raygun

I'm trying to understand more about environments and how they will fit in to my use case, but I'm struggling so any help appreciated.

I've been developing apps with a GUI built in ipywidgets, so they are intended to run entirely within a notebook. I have created separate packages and host them as a zip on a local file server.

I would also like to have separate environments for each app, to control external dependencies and versions etc. Ideally I would include the env with the python files in the package.

I'm not sure whether I can either:
1) create separate envs and then choose the correct env inside jupyter notebook (preferable)

Or

2) create separate envs and then install notebook into each environment, so you have to preselect the correct env before launching notebook and each app.

Or do I completely misunderstand all of this?

# ? Sep 24, 2018 17:18

QuarkJets: Sep 8, 2008

The second one. An environment describes what modules (and executables) will be available to the notebook

There may be a way to do the first but I can't imagine it so it's probably convoluted and difficult

# ? Sep 25, 2018 05:30

Jose Cuervo: Aug 25, 2004

I taught a class for a colleague last week that used Jupyter notebooks. Each student was told to download the notebook from the class website, and then we went through the notebook in class and the students had cells where they had to type their own code.

One of the issues I encountered when plotting using seaborn was that not everyone had the exact same plot - i.e., the scatter plot matrix looked slightly different between students (the underlying shape etc was correct, but the presentation was different). I think this came down to the fact that not everyone was using Python 3 like I was, and not everyone had the same version of seaborn installed.

Another issue was pandas.cut() worked slightly differently for everyone because of changes between versions.

Question: Is there a standard / simple way in the Notebook to ensure that everyone uses the same version of Python, and the same version of the packages being imported?
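
One low-tech option, offered as a sketch and not as a substitute for a shared environment: put a cell at the top of the notebook that reports version drift. This only detects mismatches, it does not fix them. The pinned versions below are placeholders, find_mismatches is a hypothetical helper, and importlib.metadata needs Python 3.8 or newer.

```python
import sys
from importlib import metadata  # standard library from Python 3.8 onward


def find_mismatches(required, get_version=metadata.version):
    """Return (package, wanted, found) for every package that is off."""
    problems = []
    for name, wanted in required.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            problems.append((name, wanted, found))
    return problems


# Hypothetical pins for the class; replace with the versions you teach with.
REQUIRED = {'seaborn': '0.9.0', 'pandas': '0.23.4'}

print(sys.version_info[:2])       # the running Python's (major, minor)
print(find_mismatches(REQUIRED))  # an empty list means everything matches
```

Students with a non-empty list know before the first plot that their output may differ from the instructor's.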

# ? Sep 26, 2018 21:01

Symbolic Butt: Mar 22, 2009; (_!_); Buglord

Installing Anaconda seems to be the best solution for this in my experience.

# ? Sep 26, 2018 21:39

Loel: Jun 4, 2012; "For the Emperor."

There was a terrible noise.
There was a terrible silence.

M. Night Skymall posted:

The other problem with that code is that the exception you're using the try/except block to catch happens when you convert the input from a string into an integer, you need to move that part inside the try.

Aha, thank you

My next project, I'm trying to get the top player each morning from a fantasy football site.

code:

import bs4, requests

resRB = requests.get('http://www03.myfantasyleague.com/2018/adp?COUNT=200&POS=RB&ROOKIES=0&INJURED=0&CUTOFF=5&FRANCHISES=12&IS_PPR=1&IS_KEEPER=0&IS_MOCK=0&TIME=')
resRB.raise_for_status()
Round1 = bs4.BeautifulSoup(resRB.text, 'html.parser')  # this is getting the website info for RBs

I've gotten this far to pull the data for the site, but I'm not clear on how to format 'grab the top guy's name'.

# ? Sep 26, 2018 23:05

OnceIWasAnOstrich: Jul 22, 2006

Jose Cuervo posted:

Question: Is there a standard / simple way in the Notebook to ensure that everyone uses the same version of Python, and the same version of the packages being imported?

Run a JupyterHub server and set everyone to use one consistent environment?

# ? Sep 27, 2018 00:13

baka kaba: Jul 19, 2003; PLEASE ASK ME, THE SELF-PROFESSED NO #1 PAUL CATTERMOLE FAN IN THE SOMETHING AWFUL S-CLUB 7 MEGATHREAD, TO NAME A SINGLE SONG BY HIS EXCELLENT NU-METAL SIDE PROJECT, SKUA, AND IF I CAN'T PLEASE TELL ME TO
EAT SHIT

How would you explain it to someone? How do they find the top guy on that page?

Sometimes you can look at the CSS and find a handy identifier for the thing you're looking for, other times you need to start thinking about how the document is structured. Find thing, get other thing within that, etc

# ? Sep 27, 2018 00:15

bob dobbs is dead: Oct 8, 2017; I love peeps; Nap Ghost

Symbolic Butt posted:

Installing Anaconda seems to be the best solution for this in my experience.

anaconda is de facto within continuum analytics a two-man dealie

one of those two men is quitting

take as you will

# ? Sep 27, 2018 05:35

cinci zoo sniper: Mar 15, 2013

bob dobbs is dead posted:

anaconda is de facto within continuum analytics a two-man dealie

one of those two men is quitting

take as you will

Oh. Oh.

# ? Sep 27, 2018 06:27

bob dobbs is dead: Oct 8, 2017; I love peeps; Nap Ghost

less bad than it sounds because they have a lotta developers who are just ok familiar w/ it but lol

# ? Sep 27, 2018 06:33

pmchem: Jan 22, 2010

bob dobbs is dead posted:

anaconda is de facto within continuum analytics a two-man dealie

one of those two men is quitting

take as you will

...but isn't that their primary product to the world? What's the story?

# ? Sep 27, 2018 13:08

Jose Cuervo: Aug 25, 2004

Symbolic Butt posted:

Installing Anaconda seems to be the best solution for this in my experience.

What does this process look like? Get everyone in the class to download Anaconda on day one of the class, then...?

OnceIWasAnOstrich posted:

Run a JupyterHub server and set everyone to use one consistent environment?

The documentation for this looks like it is tailor made for this, thanks.

# ? Sep 27, 2018 16:51

SurgicalOntologist: Jun 17, 2004

Jose Cuervo posted:

What does this process look like? Get everyone in the class to download Anaconda on day one of the class, then...?

You can have them copy-paste a line that installs the right versions of everything. Or give them an environment.yml file that specifies everything (although they will still have to copy paste the line that installs it).

JupyterHub is ideal though if you have access to a server and a bit of patience to set it up. I did a data science class in JupyterHub, it was a bit of a pain to set up authentication and everything but was smooth after that. That was a couple years ago, it's probably easier now.

# ? Sep 27, 2018 17:44

Thermopyle: Jul 1, 2003; ...the stupid are cocksure while the intelligent are full of doubt. -Bertrand Russell

pmchem posted:

...but isn't that their primary product to the world? What's the story?

I'm interested in this as well. I thought anaconda was the reason they existed.

I think we used to have a continuum guy who posts here sometimes...

# ? Sep 27, 2018 20:06
