Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

What's the right tool to use nowadays to install tools and scripts that are pip-installable without cluttering up the system python? To be clear, I'm not talking about things I want to use for development or to develop on. I'm talking about things that should actually be a standalone download (for Windows) or an apt-get install away, but the author doesn't package them that way. Think about tools like youtube-dl.

I seem to recall something that makes it easy to use virtualenvs in a seamless way with such things...

QuarkJets
Sep 8, 2008

Thermopyle posted:

What's the right tool to use nowadays to install tools and scripts that are pip-installable without cluttering up the system python? To be clear, I'm not talking about things I want to use for development or to develop on. I'm talking about things that should actually be a standalone download (for Windows) or an apt-get install away, but the author doesn't package them that way. Think about tools like youtube-dl.

I seem to recall something that makes it easy to use virtualenvs in a seamless way with such things...

Are you thinking of conda environments?

Boris Galerkin
Dec 17, 2011

I don't understand why I can't harass people online. Seriously, somebody please explain why I shouldn't be allowed to stalk others on social media!

Thermopyle posted:

What's the right tool to use nowadays to install tools and scripts that are pip-installable without cluttering up the system python? To be clear, I'm not talking about things I want to use for development or to develop on. I'm talking about things that should actually be a standalone download (for Windows) or an apt-get install away, but the author doesn't package them that way. Think about tools like youtube-dl.

I seem to recall something that makes it easy to use virtualenvs in a seamless way with such things...

I asked last page for I think the same thing that you described and someone suggested py2exe or pex. I haven’t looked into them too much but at a casual glance I think pex is what I was looking for.

Thanks to goon Malcolm XML, who recommended those, by the way.

e:

Malcolm XML posted:

Yeah py2exe or pex

Boris Galerkin fucked around with this message at 08:49 on Apr 6, 2018

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Boris Galerkin posted:

I asked last page for I think the same thing that you described and someone suggested py2exe or pex. I haven’t looked into them too much but at a casual glance I think pex is what I was looking for.

Thanks goon who recommended those by the way.

Yes, pex is what I was thinking of, thanks!

Dr Subterfuge
Aug 31, 2005

TIME TO ROC N' ROLL
I've been trying to figure out decorators, but one thing that I've always had trouble with is composition of functions. Basically, I realized that I was creating a bunch of methods that were each supposed to modify a different class attribute. I was getting the attribute from self, doing whatever I needed to change the value, and then setting it back. This seems like an ideal use case for a decorator, but I'm struggling to write one that does this.

It seems like I should be able to change this:
code:
class A(object):
    def __init__(self, data):
        self.a = [1]
        self.data = data

    def update_a(self):
        a = self.a
        a.append(self.data)
        self.a = a
to something like this:
code:
class A(object):
    def __init__(self, data):
        self.a = [1]
        self.data = data

    @modify_self('a')
    def update_a(self, a=None):
        a.append(self.data)
        return a
where
code:
>> obj = A(2)
>> obj.update_a()
>> obj.a
[1, 2]
I know I can use inspect to find the class that called update_a (this seems promising), and then use getattr(cls, 'a') to get the value to inject. I think my questions are
1. how to get arbitrary arguments out of a decorator call and
2. how to use functools.update_wrapper to inject the new value. I think I actually need functools.partialmethod?

On the other hand, if there is a better pattern than this class structure to go about updating a bunch of fields (in different ways) from a given input, that would be good to know, too. I'm well past the point where my Python abilities have outstripped my design abilities.

E: Well, getting the class is proving to be difficult. The object the decorator says it's wrapping is the function A.update_a, which breaks the linked class sniffer because the module __main__ doesn't have an attribute A

E2: I think this works. Classes to the rescue! Modified code from here.

Python code:
class modify_self(object):
    """
    Injects the supplied class attributes into the decorated method and updates those attributes
    with the method's output
    
    usage:
    @modify_self('a')
    def update_a(self, a=None):
        #do stuff
        return a,
        
    is equivalent to
    def update_a(self):
        a = self.a
        #do stuff
        self.a = a
    """
    def __init__ (self, *args):
        # store arguments passed to the decorator
        self.args = args

    def __call__(self, func):
        def newf(*args):
            #the 'self' for a method function is passed as args[0]
            slf = args[0]

            # get the passed attributes from the containing class
            kwargs = {attr: getattr(slf, attr) for attr in self.args}

            # call the method
            result = func(slf, **kwargs)

            # put things back
            for field, value in zip(self.args, result):
                setattr(slf, field, value)

        newf.__doc__ = func.__doc__
        return newf

class A(object):
    def __init__(self, data):
        self.a = [1]
        self.b = [2]
        self.c = [3]
        self.data = data

    @modify_self('a', 'b')
    def update_a_and_b(self, a=None, b=None):
        a.append(self.data)
        b.append(self.data)
        return a, b

    @modify_self('c')
    def update_c(self, c=None):
        c.append(self.data)
        return c,


>> test = A('test')
>> test.update_a_and_b()
>> print(test.a)
[1, 'test']
>> print(test.b)
[2, 'test']
>> test.update_c()
>> print(test.c)
[3, 'test']
Works as long as the return value of the modified function is always a tuple eg return c, instead of return c.

E3: using functools.wraps(func) as a decorator on newf is probably superior to just making the __doc__ attributes equal?

E4: If someone could explain how the args in the __init__ and __call__ methods are magically different (and why args in __init__ doesn't consume func) I'd be really interested to know. Because this still feels like witchcraft to me.

Dr Subterfuge fucked around with this message at 23:57 on Apr 7, 2018

Wallet
Jun 19, 2006

unpacked robinhood posted:

Is there a simple common method to correct typos ?
I have a list of maybe mistyped words, and a csv with the reference spelling. The CSV has 36000 entries.

I can bruteforce it using Levenshtein distance between each word of each list but it's massively inefficient.

e: I found this

I don't know if there's a common method, but pyenchant implements Enchant in python and works pretty well (although the author of pyenchant has very recently stopped actively maintaining the project).

breaks
May 12, 2001

Dr Subterfuge posted:

E3: using functools.wraps(func) as a decorator on newf is probably superior to just making the __doc__ attributes equal?

E4: If someone could explain how the args in the __init__ and __call__ methods are magically different (and why args in __init__ doesn't consume func) I'd be really interested to know. Because this still feels like witchcraft to me.

I don't comprehend at all what you're trying to accomplish. What you've got there at the moment seems to be one hell of a way to write self.some_list.append(some_stuff). On the other hand I've probably never managed to successfully understand a post in this thread when reading it at 2AM (or maybe ever) so I'll just answer these two specific questions:

Yes, if you're wrapping a function and want to make your wrapper appear to be what it's wrapping, use wraps.

The decorator syntax is only a shortcut for something that's otherwise ugly:

code:
def f(*args):
    print(*args)

f = decorator(f)  # If there was no @decorator syntax you would just write this
There is nothing magical about @decorator(a, b). Writing it out by hand results in a very literal translation:

code:
f = decorator(a, b)(f)
# or to really spell it out
actual_decorator = decorator_factory(a, b)
f = actual_decorator(f)


Disregarding multiple decorators and within the rather restrictive limits of what Python's grammar allows after the @, @X before f is the same as f = X(f) after it.

Does this help you understand what is going on? A decorator always gets called with one argument, which is the thing it's decorating. A "decorator with arguments" is a callable that returns a decorator. I think it's more clear to think of that as a decorator factory, but for whatever reasons the common terminology is just to lump it all together as "decorator". In what you wrote, __init__ is called as part of the process of instantiating the class. That's the factory part. Once the instance is created it's then called, which is possible since you defined __call__, and that's the decorator part.
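
For comparison, here is a minimal sketch of the same factory/decorator split written with nested functions instead of a class, using functools.wraps as suggested above. It is only an illustration: the inner names (decorator, wrapper) are made up here, and update_a returns a new list instead of appending in place.

```python
import functools


def modify_self(*attr_names):
    """Decorator factory: called first, with the arguments written after the @."""
    def decorator(func):
        """The actual decorator: called once, with the function being decorated."""
        @functools.wraps(func)  # copies __name__, __doc__, etc. onto the wrapper
        def wrapper(self):
            # Pull the named attributes off the instance and pass them in as keywords.
            kwargs = {name: getattr(self, name) for name in attr_names}
            result = func(self, **kwargs)
            # The method must return one value per attribute name, as a tuple.
            for name, value in zip(attr_names, result):
                setattr(self, name, value)
        return wrapper
    return decorator


class A:
    def __init__(self, data):
        self.a = [1]
        self.data = data

    @modify_self('a')
    def update_a(self, a=None):
        """Append self.data to a."""
        return (a + [self.data],)


obj = A(2)
obj.update_a()
print(obj.a)                # [1, 2]
print(A.update_a.__name__)  # update_a
```

modify_self('a') runs first and returns decorator; decorator(update_a) runs next and returns wrapper, which is what ends up bound to the name update_a on the class.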

breaks fucked around with this message at 08:42 on Apr 8, 2018

unpacked robinhood
Feb 18, 2013

by Fluffdaddy

Wallet posted:

I don't know if there's a common method, but pyenchant implements Enchant in python and works pretty well (although the author of pyenchant has very recently stopped actively maintaining the project).

I'm giving it a try but I'm missing something. Is this a correct usage to spellcheck against a word list:

Python code:
import enchant
b=enchant.Broker()
d=b.request_pwl_dict('data/villes_fr.txt')
b.request('PARIS')
?

villes_fr.txt is a simple flat file like this:
pre:
QUINTAL
AMBILLY
TALLOIRES
PARIS
ESLETTES
The above code runs fine but returns False.

Wallet
Jun 19, 2006

unpacked robinhood posted:

I'm giving it a try but I'm missing something. Is this a correct usage to spellcheck against a word list:

Python code:
import enchant
b=enchant.Broker()
d=b.request_pwl_dict('data/villes_fr.txt')
b.request('PARIS')

You probably want something like this:
Python code:
import enchant
word = "PORIS"
d = enchant.request_pwl_dict('data/villes_fr.txt')
if not d.check(word):
    suggestions = d.suggest(word)
    print(suggestions)
pyenchant handles dealing with Brokers for you if you don't specify one, so you (probably) don't need to mess with them.
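
If installing Enchant is not an option, a rough standard-library-only sketch of the same idea is below. This is not pyenchant: it swaps in difflib.get_close_matches, which still compares the word against every reference entry, so it only avoids the brute-force cost for words that are already spelled correctly. The function name check_words and the sample inputs are made up for illustration.

```python
import difflib


def check_words(words, reference):
    """Map each word missing from the reference list to its closest reference spellings."""
    known = set(reference)  # exact-match lookups are cheap
    corrections = {}
    for word in words:
        if word in known:
            continue
        # Fuzzy search only for the words that failed the exact lookup.
        corrections[word] = difflib.get_close_matches(word, reference, n=3, cutoff=0.6)
    return corrections


cities = ["QUINTAL", "AMBILLY", "TALLOIRES", "PARIS", "ESLETTES"]
result = check_words(["PARIS", "PORIS", "AMBILY"], cities)
print(result)
```

The cutoff of 0.6 is difflib's default similarity threshold; raising it gives fewer, stricter suggestions.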

Wallet fucked around with this message at 14:18 on Apr 8, 2018

Dr Subterfuge
Aug 31, 2005

TIME TO ROC N' ROLL

breaks posted:

I don't comprehend at all what you're trying to accomplish. What you've got there at the moment seems to be one hell of a way to write self.some_list.append(some_stuff). On the other hand I've probably never managed to successfully understand a post in this thread when reading it at 2AM (or maybe ever) so I'll just answer these two specific questions:

Yes, if you're wrapping a function and want to make your wrapper appear to be what it's wrapping, use wraps.

The decorator syntax is only a shortcut for something that's otherwise ugly:

code:
def f(*args):
    print(*args)

f = decorator(f)  # If there was no @decorator syntax you would just write this
There is nothing magical about @decorator(a, b). Writing it out by hand results in a very literal translation:

code:
f = decorator(a, b)(f)
# or to really spell it out
actual_decorator = decorator_factory(a, b)
f = actual_decorator(f)


Disregarding multiple decorators and within the rather restrictive limits of what Python's grammar allows after the @, @X before f is the same as f = X(f) after it.

Does this help you understand what is going on? A decorator always gets called with one argument, which is the thing it's decorating. A "decorator with arguments" is a callable that returns a decorator. I think it's more clear to think of that as a decorator factory, but for whatever reasons the common terminology is just to lump it all together as "decorator". In what you wrote, __init__ is called as part of the process of instantiating the class. That's the factory part. Once the instance is created it's then called, which is possible since you defined __call__, and that's the decorator part.

Basically, I have some huge scraper functions that I'm trying to refactor into smaller components (with maybe the eventual goal of messing around with asyncio), and the structure I came up with was to turn each scraper into a class (or rather a subclass of a base scraper) and use attribute access in class methods. So I'm not really just appending things. It was mostly to illustrate that I want to be able to modify an existing value. I'm aware there are whole scraping packages like scrapy that have already solved most of these problems. I'm mostly just messing around with rolling something myself to see if I can learn anything in the process. (But one of those is getting better at designing things, so if there's a better way to go about updating a bunch of different fields in a data structure that would be cool to know.)

Looking at my code again and seeing your examples, I can see now how the two sets of args are different. The first args come from the arguments of the class, and the second args come from the fact that newf replaces the decorated function and gets called on the decorated function's arguments. The big thing I was missing was how the call that modifies the decorated function works.

unpacked robinhood
Feb 18, 2013

by Fluffdaddy

Wallet posted:

You probably want something like this:

Thanks ! At a glance it seems way faster too.

Wallet
Jun 19, 2006

unpacked robinhood posted:

Thanks ! At a glance it seems way faster too.

It's fairly performant from my experience; I've been using it in a recent project to deal with checking fairly large texts against a 130,000 entry pwl dictionary and it generally takes longer to load the word list than it does to check the words.

breaks
May 12, 2001

Dr Subterfuge posted:

Basically, I have some huge scraper functions that I'm trying to refactor into smaller components (with maybe the eventual goal of messing around with asyncio), and the structure I came up with was to turn each scraper into a class (or rather a subclass of a base scraper) and use attribute access in class methods. So I'm not really just appending things. It was mostly to illustrate that I want to be able to modify an existing value. I'm aware there are whole scraping packages like scrapy that have already solved most of these problems. I'm mostly just messing around with rolling something myself to see if I can learn anything in the process. (But one of those is getting better at designing things, so if there's a better way to go about updating a bunch of different fields in a data structure that would be cool to know.)

It sounds like the scrapers are mutating some shared data structure(s), but is that really necessary? Why not have each scraper do its scraping then return whatever data it gathered, which other parts of the program can then store or pass to other scrapers or whatever as needed? If you can reorganize things so that each scraper sticks to its scraping task and doesn't care about what happens to the scraped data, that will probably simplify the design of each component.
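
A minimal sketch of that shape, with made-up scraper functions and a made-up page structure (nothing here comes from the actual scraper code): each scraper returns what it found, and the caller decides how to combine and store it.

```python
def scrape_title(page):
    """Return only what this scraper found; it does not touch shared state."""
    return {"title": page["raw_title"].strip()}


def scrape_links(page):
    """Return the de-duplicated, sorted links found on the page."""
    return {"links": sorted(set(page["raw_links"]))}


def run_scrapers(page, scrapers):
    """The caller, not the scrapers, decides how results are merged."""
    record = {}
    for scraper in scrapers:
        record.update(scraper(page))
    return record


page = {"raw_title": "  Example  ", "raw_links": ["b", "a", "b"]}
record = run_scrapers(page, [scrape_title, scrape_links])
print(record)  # {'title': 'Example', 'links': ['a', 'b']}
```

Because each scraper is a plain function of its input, it can be tested on its own without building the surrounding class.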

SurgicalOntologist
Jun 17, 2004

I've got a SQL database on a NAS that I'm trying to query through SQLAlchemy. A NAS is obviously not ideal for running the queries but I'm just trying to do something simple. Unfortunately the process runs out of memory (1GB plus 4GB swap) just running this:
Python code:
user = session.query(User).one()
I just want to grab one user instance for experimentation but this takes a couple hours then crashes. Meanwhile I can do simple groupby-count queries no problem (although they are also slow). Any ideas for how to make this work?

Space Kablooey
May 6, 2009


SQLAlchemy .one() returns exactly one result or raises an error

.one is a bit of a minefield on SQLAlchemy to say the least. IME it's more intended to be the execution of a string of filters, and then to validate if there's exactly one of a record with that criteria, and, lastly to return it. It will raise errors if there's no record matching that criteria or if there's more than one record matching that criteria.

If you just want one object and don't care which record it is, you should use .first instead. Be aware that this will be the first record found by the database, which may or may not be ordered according to the primary key(s).

SurgicalOntologist
Jun 17, 2004

:doh:

Thanks!

Space Kablooey
May 6, 2009


No problems!

I've worked with SQLA for a long time and I still make that mistake when I'm writing one-off scripts far more often than I'd like.

I have no clue why it takes hours for you, though. Hopefully .first will be much, much faster.

SnatchRabbit
Feb 23, 2006

by sebmojo
Is there a simple way to beautify JSON strings in python, specifically with the json library? Essentially, I'm pulling json data, using response = json.dumps(response, indent=4, sort_keys=True, default=str) and I am sending it in an email, which all works fine. The only issue is the json is a giant block of text where I would like it to be a string (html formatted if possible?) so I can dump it into the body of the email and have it be somewhat comprehensible to a human being.

Cingulate
Oct 23, 2012

by Fluffdaddy
edit: I am incredibly stupid

Cingulate fucked around with this message at 07:00 on Apr 12, 2018

necrotic
Aug 2, 2005
I owe my brother big time for this!
He's doing that.

Just send the email as plain text so the newlines work. Otherwise you'll need something like pygments to generate stylized HTML. Bare minimum would be replacing newlines with <br> tags.

Data Graham
Dec 28, 2009

📈📊🍪😋



Or pprint?


That’s more for native objects though, not sure if it handles json or anything.

necrotic
Aug 2, 2005
I owe my brother big time for this!
If you parse it then pprint you get the native representation. But that wouldn't help with the case of injecting that output into an email any more than the json.dumps(indent=4) approach would.

Wallet
Jun 19, 2006

Yeah, I'm pretty sure you're going to need to convert at least linebreaks for formatting. json2html might do what you need fairly easily. I'm not sure if you actually need it to be functional JSON or if you're just trying to make it readable.

Wallet fucked around with this message at 12:54 on Apr 12, 2018

Data Graham
Dec 28, 2009

📈📊🍪😋



Could you not just put the output in a <pre> tag?

SnatchRabbit
Feb 23, 2006

by sebmojo

Wallet posted:

Yeah, I'm pretty sure you're going to need to convert at least linebreaks for formatting. json2html might do what you need fairly easily. I'm not sure if you actually need it to be functional JSON or if you're just trying to make it readable.

It's just a readable copy for the staff at this point. Essentially, we need to validate that the environment we built is what we say it is.

Data Graham posted:

Could you not just put the output in a <pre> tag?

This might be fairly easy to inject. What exactly would it do?

edit: it worked! sweet! Thanks!

SnatchRabbit fucked around with this message at 16:02 on Apr 12, 2018

Wallet
Jun 19, 2006

<pre> wraps around preformatted text, which is displayed in fixed-width and preserves consecutive spaces and linebreaks. It's probably a good solution.
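
A minimal sketch of the <pre> approach using only the standard library. The function name build_report_email, the subject line, and the sample data are made up for illustration, and actually sending the message (e.g. via smtplib) is left out.

```python
import html
import json
from email.message import EmailMessage


def build_report_email(data, subject="Environment report"):
    """Build an email with a plain-text body and an HTML alternative wrapped in <pre>."""
    pretty = json.dumps(data, indent=4, sort_keys=True, default=str)
    msg = EmailMessage()
    msg["Subject"] = subject
    msg.set_content(pretty)  # plain-text part keeps the newlines as-is
    # Escape the JSON so characters like < and & cannot break the HTML.
    body = "<html><body><pre>{}</pre></body></html>".format(html.escape(pretty))
    msg.add_alternative(body, subtype="html")
    return msg


msg = build_report_email({"region": "us-east-1", "instances": ["a", "b"]})
print(msg.get_body(preferencelist=("html",)).get_content())
```

Escaping matters if any value in the JSON can contain markup; without it, a stray < in the data would be interpreted as a tag.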

SnatchRabbit
Feb 23, 2006

by sebmojo
Another newbie question. I hope this makes sense. I have a series of aws python calls I want to run, and for each command I want to have it dump the response into a file. I'm thinking I should use a for loop with each method I want to execute, but I'm not sure of the exact syntax to build the command inside the loop.

The general structure of each command will look like this:
code:
response = client.do_the_thing()
I have a list of strings that looks like this for the method part:
code:
commandlist = ['do_thing_one', 'do_thing_two', 'do_thing_three']
And my loop looks like this:
code:
   for x in commandlist:
      response = client.[commandlist]()
      response = json.dumps(response, indent=4, sort_keys=True, default=str)
      NetworkDevicesFile = open('/tmp/NetworkDevicesFile.txt','w')
      NetworkDevicesFile.write(response)
      NetworkDevicesFile.close()
How exactly would I write the variable part where i am iterating through the list above?

Space Kablooey
May 6, 2009


Python code:
for method in commandlist:
    response = getattr(client, method)()
Also you should use context managers for dealing with files:
Python code:
with open("path/to/file", 'w') as file:
    file.write(content)
In that way, when you leave the with block, the file will be automatically closed.
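
Putting the two pieces together, a sketch might look like the following. FakeClient and its methods are stand-ins for the real AWS client, and writing one file per command (named after the method) is an assumption; the loop in the question reopened the same file in 'w' mode on every pass, which would keep only the last response.

```python
import json
import os
import tempfile


class FakeClient:
    """Stand-in for the real client; the method names are made up."""

    def do_thing_one(self):
        return {"thing": 1}

    def do_thing_two(self):
        return {"thing": 2}


def dump_responses(client, method_names, out_dir):
    """Call each named method on client and write its response to its own file."""
    paths = []
    for name in method_names:
        response = getattr(client, name)()  # same as writing client.<name>()
        text = json.dumps(response, indent=4, sort_keys=True, default=str)
        path = os.path.join(out_dir, name + ".json")
        with open(path, "w") as f:  # closed automatically when the block ends
            f.write(text)
        paths.append(path)
    return paths


out_dir = tempfile.mkdtemp()
paths = dump_responses(FakeClient(), ["do_thing_one", "do_thing_two"], out_dir)
print(paths)
```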

SnatchRabbit
Feb 23, 2006

by sebmojo
Thanks!

Space Kablooey
May 6, 2009


Which testing framework is the new hotness nowadays? I've been using nose for the longest time and I want to get on with the times.

Master_Odin
Apr 15, 2010

My spear never misses its mark...

ladies

HardDiskD posted:

Which testing framework is the new hotness nowadays? I've been using nose for the longest time and I want to get on with the times.
Assuming a modern version of Python 3, either pytest or unittest. I prefer the latter when possible as it's part of the standard library, but the former is more powerful. nose hasn't seen work in a couple of years, and nose2 sort of fell flat and has this on their GitHub page: "However, given the current climate, with much more interest accruing around pytest, nose2 is prioritizing bugfixes and maintenance ahead of new feature development."

SurgicalOntologist
Jun 17, 2004

If you really want something different, check out hypothesis. I use it whenever I can (with pytest as a test runner). Given a domain of inputs, it tries to find values that break your tests. The stateful testing module is especially neat; when I used it, it was able to find some super obscure bugs in my code that only occurred after a very specific sequence of events.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Yes, hypothesis is good.

Space Kablooey
May 6, 2009


Hypothesis looks pretty cool, thanks.

Boris Galerkin
Dec 17, 2011

I don't understand why I can't harass people online. Seriously, somebody please explain why I shouldn't be allowed to stalk others on social media!
I'm trying to compress ASCII files (~20-30 MB) using the bz2 module and I'm just wondering why there is a difference in timing between these two snippets:

code:
%%time

with open('test.txt', 'rb') as f:
    original_data = f.read()
    compressed_data = bz2.compress(original_data)

with open('test.txt.bz2', 'wb') as f:
    f.write(compressed_data)

> CPU times: user 2.02 s, sys: 11.9 ms, total: 2.03 s
> Wall time: 2.04 s
code:
%%time

with open('test.txt', 'rb') as f:
    original_data = f.read()
    compressed_data = bz2.compress(original_data)

with bz2.open('test.txt.bz2', 'wb') as f:
    f.write(compressed_data)

> CPU times: user 2.69 s, sys: 16.8 ms, total: 2.7 s
> Wall time: 2.71 s
The second method (using bz2.open) is consistently slower than the first. I'm not really concerned about it being slower by less than 1 s but I'm just curious where this extra overhead is coming from?

'ST
Jul 24, 2003

"The world has nothing to fear from military ambition in our Government."
It looks like you should be writing the uncompressed data to the bz2.open io object. I think you are double-compressing.

Python code:
with open('test.txt', 'rb') as f:
    original_data = f.read()

with bz2.open('test.txt.bz2', 'wb') as f:
    f.write(original_data)
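
A small self-contained check of the double-compression point; the sample data and the temporary path are made up for illustration.

```python
import bz2
import os
import tempfile

original = b"some ASCII text\n" * 1000
path = os.path.join(tempfile.mkdtemp(), "test.txt.bz2")

# Double compression: the bytes are compressed, then bz2.open compresses them again.
with bz2.open(path, "wb") as f:
    f.write(bz2.compress(original))
with bz2.open(path, "rb") as f:
    once_decompressed = f.read()
print(once_decompressed == original)                  # False
print(bz2.decompress(once_decompressed) == original)  # True

# Single compression: hand bz2.open the raw bytes and let it do the work.
with bz2.open(path, "wb") as f:
    f.write(original)
with bz2.open(path, "rb") as f:
    round_tripped = f.read()
print(round_tripped == original)                      # True
```

Reading the double-compressed file back once yields bz2 data rather than the original text, which is why the second snippet in the question does extra work.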

Boris Galerkin
Dec 17, 2011

I don't understand why I can't harass people online. Seriously, somebody please explain why I shouldn't be allowed to stalk others on social media!

'ST posted:

It looks like you should be writing the uncompressed data to the bz2.open io object. I think you are double-compressing.

Python code:
with open('test.txt', 'rb') as f:
    original_data = f.read()

with bz2.open('test.txt.bz2', 'wb') as f:
    f.write(original_data)

Oh I didn't know that it automatically compressed it. Thanks!

I didn't bother running the timing tests again but that's good to know. Also I switched to using lzma because the decompression speed was much faster than bz2 although the compression was much slower, but I only need to compress them once before throwing them into the git repo so that's fine.


PyCharm question: can anyone explain to me in simple dumb terms what marking directories as "sources," "resources," and "templates" actually does? The official website says this about it:

quote:

Source roots
These roots contain the actual source files and resources. PyCharm uses the source roots as the starting point for resolving imports.

The files under the source roots are interpreted according to their type. PyCharm can parse, inspect, index, and compile the contents of these roots.

Resource roots
These roots are intended for resource files in your application (images, Style Sheets, etc.) By assigning a folder to this category, you tell PyCharm that files in it and in its subfolders can be referenced relative to this folder instead of specifying full paths to them.

But I just can not comprehend what those words mean.

In terms of my python projects, how does it affect me? Examples would be nice because I can't find examples of this anywhere.

QuarkJets
Sep 8, 2008

Boris Galerkin posted:

PyCharm question: can anyone explain to me in simple dumb terms what marking directories as "sources," "resources," and "templates" actually does? The official website says this about it:


But I just can not comprehend what those words mean.

In terms of my python projects, how does it affect me? Examples would be nice because I can't find examples of this anywhere.

Think of sources as .py files. Simple enough. Sources folders are added to the PYTHONPATH so that you can import their contents. If you want to import python files from other python files, and some of those files are in directories, then you'll either need to modify your PYTHONPATH yourself or establish those directories as source folders.

Think of resources as any data that you may want to use in your project: CSS files, images, etc. Then if you try to load a file (say "data.txt" which lives deep in some subdirectory) you can use a shorter root path ("/data.txt") instead of having to specify a relative path from the root directory of your project (e.g. "full/path/to/some/bullshit/directory/data.txt") or a full path.

Template folders contain template files, which is only really relevant if you use a template language such as Django's.

Nigel Tufnel
Jan 4, 2005
You can't really dust for vomit.
I need some advice on the ‘unknown unknowns’ of my Python knowledge.

There seems to be an uncrossable gulf for me between ‘can write and understand a small program that uses logic, loops, different data structures etc’ and ‘can write something useful that sits on a server and does something’.

Therefore I’d like to get something into production by writing my own todo list web app (not for general use, just for me). My HTML/CSS is pretty good so not so worried about that element.

Top level features would be:
- secure login
- markup based list with adding, deleting and marking as done
- persistence when logged off

Basically I don’t know what I don’t know and I’m not sure how I can turn a decentish knowledge of python’s standard library into something functioning that I can use. Any help would be much appreciated.

huhu
Feb 24, 2006
Todo apps are like the most popular way of learning backend. Google "flask/Django todo app tutorial".
