Lonely Wolf
Jan 20, 2003

Will hawk false idols for heaps and heaps of dough.

Janin posted:

There's no way to separate the definition and registration of a plugin. This makes testing difficult -- do you want all your test classes showing up as plugins?

There's no way to know what plugins are available without invoking their definitions, which may depend on libraries that aren't installed. Rather than displaying a helpful error message ("FooPlugin requires libfoo >= 1.7"), plugin scanning will barf with an ImportError.

Both of these seem like implementation details that the metaclass could handle, either by letting you set a sentinel to mark a test case or by catching errors and passing them to the application to handle.
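A rough sketch of the sentinel idea, written in Python 3 syntax (Python 2 would spell it with `__metaclass__`). Every name here (`PluginMeta`, `Plugin`, `REGISTRY`, the `abstract` flag) is made up for illustration and not taken from any real library:

```python
# Hypothetical names throughout: PluginMeta, Plugin, REGISTRY and the
# `abstract` sentinel are illustrative only.
REGISTRY = {}


class PluginMeta(type):
    """Metaclass that records every concrete subclass as it is defined."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # Look only at the class's own namespace so the sentinel is not
        # inherited: each class has to opt out explicitly.
        if namespace.get("abstract", False):
            return
        REGISTRY[name] = cls


class Plugin(metaclass=PluginMeta):
    abstract = True  # the base class itself is not a plugin


class RealPlugin(Plugin):
    def run(self):
        return "real"


class FakePluginForTests(Plugin):
    abstract = True  # sentinel: keep this test double out of the registry

    def run(self):
        return "fake"


print(sorted(REGISTRY))  # ['RealPlugin']
```

The test class still inherits from `Plugin` and can be instantiated directly; it just never shows up when the application enumerates plugins.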

quote:

Violates "explicit is better than implicit" -- registration is hidden in a magic class, and is difficult to discover except through grepping through the source for "__metaclass__".

This is definitely a valid concern. I see it as a tradeoff. You're providing a little magic to make it easier for someone to worry about writing a plugin for your app without having to do a lot of boilerplate config files.

quote:

Violates the principles of duck-typing -- merely implementing a prescribed interface is no longer sufficient, as only classes inheriting from the registration class will be found.

This is the worst part of the metaclass scheme, but the simplicity is worth it in my opinion. Maybe I've spent too much time playing in metaprogramming land to have a sane perspective though.

quote:

Multiple applications can't share plugins without a separate, shared registration library.

I don't see how that isn't true for any plugin system.


TOO SCSI FOR MY CAT
Oct 12, 2008

this is what happens when you take UI design away from engineers and give it to a bunch of hipster art student "designers"

Lonely Wolf posted:

I don't see how that isn't true for any plugin system.
If the plugin metadata is stored in a documented format, it can be used to load a class that implements the documented interface, which is then used as a plugin. No dependence on an external registration library is needed.

This is an issue I ran into when trying to write a cleaned-up version of Screenlets. Each plugin loaded the library directly, and subclassed the main Screenlet class to auto-register. Thus, every plugin required the same version of the library to be active or it would fail to load. If they had used a separate metadata file, I could have parsed that file to determine which version of the libraries to provide.

Boblonious
Jan 14, 2007

I HATE PYTHON IT'S THE SHITTIEST LANGUAGE EVER PHP IS WAY BETTER BECAUSE IT IS MADE FOR THE WEB AND IT USES MYSQL WHICH IS ALSO USED BY GOOGLE AND THATS WHY ITS SO MUCH BETTER THAN PYTHON EVEN HTML IS A BETTER PROGRAMMING LANGUAGE THAN PYTHON
Thanks for the feedback everyone. I don't think I want to go with an external library, I don't want my code to depend on any other projects except the standard libraries.

I'm thinking that, while the metaclass property may auto register the class when it's imported, I'd still have control over which modules are imported. I plan to have a list of modules stored in a config file, which the code will loop through and import.
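Something like the following is what I have in mind, as a minimal sketch using only the standard library. `PLUGIN_MODULES` is a hypothetical stand-in for the list read from the config file, and stdlib module names are used only so the example runs anywhere:

```python
import importlib

# Hypothetical stand-in for the module list read from a config file.
PLUGIN_MODULES = ["json", "csv", "no_such_plugin_module"]


def import_plugin_modules(module_names):
    """Import each named module; return (loaded, failed) for the caller."""
    loaded, failed = {}, {}
    for name in module_names:
        try:
            loaded[name] = importlib.import_module(name)
        except ImportError as exc:
            # Keep going so one broken plugin does not block the rest.
            failed[name] = exc
    return loaded, failed


loaded, failed = import_plugin_modules(PLUGIN_MODULES)
print(sorted(loaded))  # ['csv', 'json']
print(sorted(failed))  # ['no_such_plugin_module']
```

Importing a module is what would trigger the metaclass registration, so the config list decides which plugins exist at all.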

Lonely Wolf
Jan 20, 2003

Will hawk false idols for heaps and heaps of dough.

Janin posted:

If the plugin metadata is stored in a documented format, it can be used to load a class that implements the documented interface, which is then used as a plugin. No dependence on an external registration library is needed.

This is an issue I ran into when trying to write a cleaned-up version of Screenlets. Each plugin loaded the library directly, and subclassed the main Screenlet class to auto-register. Thus, every plugin required the same version of the library to be active or it would fail to load. If they had used a separate metadata file, I could have parsed that file to determine which version of the libraries to provide.

Okay, I thought you meant sharing plugins between different applications, not different versions or instances of the same application, which would be troublesome with the metaclass approach.

TOO SCSI FOR MY CAT
Oct 12, 2008

this is what happens when you take UI design away from engineers and give it to a bunch of hipster art student "designers"

Boblonious posted:

Thanks for the feedback everyone. I don't think I want to go with an external library, I don't want my code to depend on any other projects except the standard libraries.
Is this for a school homework assignment? If not, I highly advise that you use 3rd-party libraries as much as possible. They've been more thoroughly studied and tested than whatever system you'll write.

Boblonious posted:

I'm thinking that, while the metaclass property may auto register the class when it's imported, I'd still have control over which modules are imported. I plan to have a list of modules stored in a config file, which the code will loop through and import.
If you're going to have a manually updated list *anyway*, just make it a list of plugins, not of modules.

code:
# config file
PLUGIN_NAMES = ["myapp.plugins.pmod:PluginA",
                "myapp.plugins.pmod:PluginB",
                "myapp.plugins.otherplugin:OtherPlugin",
                # etc
                ]

# plugin loader
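def load(plugin_path):
    # "package.module:ClassName" -> the named class
    module_name, plugin_name = plugin_path.split(":")
    # A non-empty fromlist makes __import__ return the submodule itself
    mod = __import__(module_name, {}, {}, [plugin_name])
    return getattr(mod, plugin_name)

plugins = map(load, PLUGIN_NAMES)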

Lonely Wolf posted:

Okay, I thought you meant sharing plugins between different applications, not different versions or instances of the same application, which would be troublesome with the metaclass approach.
There's not much difference, technically speaking, between different applications and very different versions of the same application.

Lonely Wolf
Jan 20, 2003

Will hawk false idols for heaps and heaps of dough.

Janin posted:

There's not much difference, technically speaking, between different applications and very different versions of the same application.

True enough, but you could also add a class variable requires_version="1.0.1" to check in the metaclass. Of course add enough of those and at some point you do have a config file, just in a class instead of a file. . . .

Boblonious, I'd go with the code that Janin just provided. Metaclasses are fun and all but if you don't feel comfortable with them you shouldn't be using them yet. It would be easier for you to add error-checking to Janin's code than to the metaclass code.
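For instance, error-checking on a "module:Name" loader could look roughly like this. It is a sketch in Python 3, not Janin's code: `PluginLoadError` and `load_plugin` are hypothetical names, and a standard library class stands in for a real plugin:

```python
import importlib


class PluginLoadError(Exception):
    """Hypothetical error type raised when a plugin path cannot be loaded."""


def load_plugin(plugin_path):
    """Load 'package.module:ClassName' and return the named attribute."""
    module_name, sep, attr_name = plugin_path.partition(":")
    if not sep or not module_name or not attr_name:
        raise PluginLoadError("expected 'module:Name', got %r" % plugin_path)
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        raise PluginLoadError("cannot import %s: %s" % (module_name, exc)) from exc
    try:
        return getattr(module, attr_name)
    except AttributeError as exc:
        raise PluginLoadError(
            "%s has no attribute %s" % (module_name, attr_name)
        ) from exc


# A standard library class is used here as a stand-in for a real plugin.
print(load_plugin("collections:OrderedDict").__name__)  # OrderedDict
```

The application can then catch one exception type and show a helpful message instead of a raw ImportError.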

You should look into metaclasses though just because they're so much fun and will give you a better idea of how Python works.

Boblonious
Jan 14, 2007

I HATE PYTHON IT'S THE SHITTIEST LANGUAGE EVER PHP IS WAY BETTER BECAUSE IT IS MADE FOR THE WEB AND IT USES MYSQL WHICH IS ALSO USED BY GOOGLE AND THATS WHY ITS SO MUCH BETTER THAN PYTHON EVEN HTML IS A BETTER PROGRAMMING LANGUAGE THAN PYTHON
Thinking more about this, I'm less and less keen on doing something magical to load and register plugins. You're right Janin, if I'm going to have a list of modules, I can have a list of classes instead.

I'll take a look at some of the existing implementations linked earlier to gather more ideas.

And by the way, this isn't for school. But regardless, I still don't want to have this project depend on an external library. I feel that this problem is small enough to write myself and not pull another dependency into it. I just wanted to make sure I was doing it right and not doing something that would end up in the coding horrors thread.

tbradshaw
Jan 15, 2008

First one must nail at least two overdrive phrases and activate the tilt sensor to ROCK OUT!

Boblonious posted:

And by the way, this isn't for school. But regardless, I still don't want to have this project depend on an external library. I feel that this problem is small enough to write myself and not pull another dependency into it. I just wanted to make sure I was doing it right and not doing something that would end up in the coding horrors thread.

Reinventing the wheel should always be a coding horror. :(

Regardless of people's alleged problems with easy_install, that really has no bearing on the fitness of the plugin services that setuptools provides. Notably, if you're using setuptools already for installation ease (which many, many applications do) then it's not an additional dependency at all.

I've always found the Python community's love/hate relationship with setuptools to be strange. Of course it has issues, but it is still the best we've got. The fact that setuptools is the reference implementation for moving the "good parts" into the standard library seems to be a huge admission of that fact. But package management is always this religious issue for developers and systems administrators. And for some reason, unlike every other library and application we use day to day, some people are adamant that people should avoid setuptools wholesale instead of leveraging what's good and minimizing what's bad.

I know I'm excited for what we're going to be using after setuptools. But until it's here, setuptools is still a very nice, standardized, way to handle things.

TOO SCSI FOR MY CAT
Oct 12, 2008

this is what happens when you take UI design away from engineers and give it to a bunch of hipster art student "designers"

tbradshaw posted:

I've always found the Python community's love/hate relationship with setuptools to be strange. Of course it has issues, but it is still the best we've got. The fact that setuptools is the reference implementation for moving the "good parts" into the standard library seems to be a huge admission of that fact. But package management is always this religious issue for developers and systems administrators. And for some reason, unlike every other library and application we use day to day, some people are adamant that people should avoid setuptools wholesale instead of leveraging what's good and minimizing what's bad.

I know I'm excited for what we're going to be using after setuptools. But until it's here, setuptools is still a very nice, standardized, way to handle things.
For many people, their only interaction with anything setuptools-related is when easy_install decides to zip up their module/poo poo .pth everywhere/break tracebacks/break __file__/<insert easy_install behavior here>. And encouraging developers to package ez_setup.py is awful, because "doot de doot gonna install this library -- wait, why is it downloading internet things???"

Also, because setuptools/pkg_resources/easy_install are all developed by the same organization and distributed together, it's easy to become confused and think they're a monolithic system. If it'd been split into 4-5 separate packages on PyPI I think resistance to migration into the stdlib would've been greatly lessened.

m0nk3yz
Mar 13, 2002

Behold the power of cheese!

Janin posted:

For many people, their only interaction with anything setuptools-related is when easy_install decides to zip up their module/poo poo .pth everywhere/break tracebacks/break __file__/<insert easy_install behavior here>. And encouraging developers to package ez_setup.py is awful, because "doot de doot gonna install this library -- wait, why is it downloading internet things???"

Also, because setuptools/pkg_resources/easy_install are all developed by the same organization and distributed together, it's easy to become confused and think they're a monolithic system. If it'd been split into 4-5 separate packages on PyPI I think resistance to migration into the stdlib would've been greatly lessened.

If it was broken up, better maintained, less confusing, better engineered, yes - most people's criticisms would have been lessened. As it is, it's none of those things. Tarek deserves a medal for picking up the packaging flag and pep-ifying all of the stuff he wants to bring into distutils, but while setuptools may be the reference implementation, none of setuptools is being brought in wholesale, for a reason. I sincerely doubt any code will be copy and pasted into the stdlib from setuptools, at all. Having dug into the code for various reasons, I would never, ever accept it as-is. Modularizing it would not solve the fundamental problems within the code.

And it's not "one organization" - it's one guy. It *is* a monolith. Sure, you don't need to use all of it every time (that's what's great about frameworks in general), but it encourages you to use it for everything packaging-wise. Yes, it has some good. Yes, it's in widespread use. No, I wouldn't bring in setuptools just for a plugin framework, or just to distribute a package (distutils is good enough for me, and will be better when Tarek is done).

I'm really looking forward to Tarek distilling the useful parts of setuptools (and others) and bringing them into core. At the language summit, this was a pretty sensitive topic, and yeah - there's a lot of different opinions. Some people love it, a lot of people dislike it, and some outright hate it.

awesomepanda
Dec 26, 2005

The good life, as i concieve it, is a happy life.
is it possible to call methods in a chain?

like

list.sort().reverse()?

hlfrk414
Dec 31, 2008
Why not just ask sort to reverse it? list.sort(reverse=True)

Lonely Wolf
Jan 20, 2003

Will hawk false idols for heaps and heaps of dough.
Short answer: no. Long answer: yes, but only if the methods in the chain return self, which list's in-place methods don't (see Short Answer).

bitprophet
Jul 22, 2004
Taco Defender
You guys are focusing on the literal example he gave; if he's asking about the general syntactical legality of such a thing, the answer is yes, absolutely. Python (or any decent language, really) lets you treat most expressions as building blocks that can be composed ad infinitum.

So object.function().anotherfunction().yetanotherfunction() is legal, as is function().attribute.attribute.function().attribute and so on and so forth.

The main deal, as the previous posters were getting at, is that you need to be relatively certain as to what the individual expressions in your "chain" result in, namely, that they have to return some kind of object which can be operated on by the next function. For example:
code:
class Bar(object):
    def get_string(self):
        return "a_string"

class Foo(object):
    def get_bar(self):
        b = Bar()
        return b

>>> f = Foo()
>>> f.get_bar().get_string().replace('_', ' ')
'a string'
In that last call, we have a Foo object, its get_bar method is called, returning a Bar object, whose get_string method is called, returning a string, whose replace method is called (which returns a modified copy of the string) and the final result is "a string".

Your exact example won't work because list_object.sort() sorts in-place and doesn't return anything (or rather it returns None, Python's "null" value). This is inconsistent with the behavior of string.replace, and that's a bit of a wart IMO (not sure if it was changed in Python 3.)
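To make that concrete, here is a small sketch (Python 3 print syntax) of what `sort()` gives back and two ways to get the "sort then reverse" result without chaining on None:

```python
numbers = [3, 1, 2]

result = numbers.sort()  # sorts in place
print(result)            # None
print(numbers)           # [1, 2, 3]

# numbers.sort().reverse() would raise AttributeError here, because the
# None returned by sort() has no reverse() method.

# Either ask sort() for descending order directly...
numbers.sort(reverse=True)
print(numbers)           # [3, 2, 1]

# ...or use sorted(), which returns a new list that can be chained on.
print(sorted([3, 1, 2], reverse=True))      # [3, 2, 1]
print(list(reversed(sorted([3, 1, 2]))))    # [3, 2, 1]
```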


For a real-world example of something that leverages call chaining, see Django's query syntax.

bitprophet fucked around with this message at 14:59 on Jun 7, 2009

king_kilr
May 25, 2007

bitprophet posted:

You guys are focusing on the literal example he gave; if he's asking about the general syntactical legality of such a thing, the answer is yes, absolutely. Python (or any decent language, really) lets you treat most expressions as building blocks that can be composed ad infinitum.

So object.function().anotherfunction().yetanotherfunction() is legal, as is function().attribute.attribute.function().attribute and so on and so forth.

The main deal, as the previous posters were getting at, is that you need to be relatively certain as to what the individual expressions in your "chain" result in, namely, that they have to return some kind of object which can be operated on by the next function. For example:
code:
class Bar(object):
    def get_string(self):
        return "a_string"

class Foo(object):
    def get_bar(self):
        b = Bar()
        return b

>>> f = Foo()
>>> f.get_bar().get_string().replace('_', ' ')
'a string'
In that last call, we have a Foo object, its get_bar method is called, returning a Bar object, whose get_string method is called, returning a string, whose replace method is called (which returns a modified copy of the string) and the final result is "a string".

Your exact example won't work because list_object.sort() sorts in-place and doesn't return anything (or rather it returns None, Python's "null" value). This is inconsistent with the behavior of string.replace, and that's a bit of a wart IMO (not sure if it was changed in Python 3.)


For a real-world example of something that leverages call chaining, see Django's query syntax.

I beg to differ. It's not a wart. It's simply the by-product of some datastructures being mutable and others not. Once one understands that lists, dicts, and sets are mutable, while strings, ints, bools, and tuples aren't, everything falls neatly into place. For this reason *no* list method that changes the list returns the list itself, and *every* string method that "changes" a string returns a new string. You can also see this in play when people ask if Python is pass-by-value or pass-by-ref. It's exclusively pass-by-ref, but some people don't see it this way due to the immutable objects.
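A quick sketch of why people get confused about argument passing (the function names are made up for the example). The Python docs describe this as arguments being passed by assignment: the function receives a reference to the same object, but rebinding the local name never affects the caller.

```python
def append_item(items):
    items.append(4)      # mutates the object the caller also holds


def rebind(items):
    items = items + [4]  # builds a new list; the caller's name is untouched


def shout(text):
    text = text.upper()  # str is immutable, so this can only rebind
    return text


a = [1, 2, 3]
append_item(a)
print(a)  # [1, 2, 3, 4]

b = [1, 2, 3]
rebind(b)
print(b)  # [1, 2, 3]

s = "abc"
shout(s)
print(s)  # abc
```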

Also, for chaining in practice see: http://simonwillison.net/2008/May/1/orm/

bitprophet
Jul 22, 2004
Taco Defender

king_kilr posted:

It's simply the by-product of some datastructures being mutable and others not.

This is true, but I'm still not sure whether it really justifies the resulting schizophrenic API. Obviously immutable objects can't be operated on in-place, and must return a modified copy, but I don't quite see how it follows that mutable objects must return None instead of returning, say, self (i.e. modify self, then return self).

If there's a design doc that explains this I'd be interested in seeing the reasoning :) As with most things in Python I'm sure it is well thought out, but the "because it's mutable" explanation seems too simplistic and doesn't answer how that approach is worth the downside of having to remember "right, this is a list, so I have to use sorted() not .sort() if I want a non None return value".

And no, I'm not arguing against the mutability split in general, and I realize good Python coders have to keep it in the back of their mind often, e.g. when setting default kwarg values. I just don't see why returning None is in any way preferable to returning a real value, especially when returning 'self' would (AFAICT) avoid any duplication of values in memory (which WOULD be a reason to avoid returning a copy as is done with immutable objects.)
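For anyone who hasn't been bitten by the default kwarg thing yet, a minimal sketch (hypothetical function names): the default list is created once, when the function is defined, and then shared between calls.

```python
def add_tag_shared(tag, tags=[]):
    # The same list object is reused on every call that omits `tags`.
    tags.append(tag)
    return tags


def add_tag_fresh(tag, tags=None):
    # Usual idiom: use None as the default and build a new list per call.
    if tags is None:
        tags = []
    tags.append(tag)
    return tags


print(add_tag_shared("a"))  # ['a']
print(add_tag_shared("b"))  # ['a', 'b']
print(add_tag_fresh("a"))   # ['a']
print(add_tag_fresh("b"))   # ['b']
```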

tbradshaw
Jan 15, 2008

First one must nail at least two overdrive phrases and activate the tilt sensor to ROCK OUT!

bitprophet posted:

This is true, but I'm still not sure whether it really justifies the resulting schizophrenic API. Obviously immutable objects can't be operated on in-place, and must return a modified copy, but I don't quite see how it follows that mutable objects must return None instead of returning, say, self (i.e. modify self, then return self).

There is no reason to obfuscate what data structures are immutable and mutable. This isn't a schizophrenic API. Mutable data structures and immutable data structures behave differently. This is a good thing! They shouldn't be treated the same by developers, either.

Edit:

Maybe it would make more sense if one considers that the traditional/conventional definitions of these operations have exactly this behavior. Operations like sort on mutable data types change values in place. When extending those concepts to immutable data types, operations are similar but more expensive and now return new objects.

It might seem odd when coming from the "top down" and looking at two data types and just focusing on their similarities in syntax, but any moderately serious study in data types shows that this is exactly the behavior expected. In fact, on mutable types "None" is only returned as a convenience and isn't necessary from a data structure design standpoint.

tbradshaw fucked around with this message at 18:31 on Jun 7, 2009

bitprophet
Jul 22, 2004
Taco Defender

tbradshaw posted:

Mutable data structures and immutable data structures behave differently. They shouldn't be treated the same by developers, either.

They're treated the same when used as e.g. iterables, though, and I guess I see this as sort of the same thing (as you said, a "top down" view.)

quote:

[..] the traditional/conventional definitions of these operations have exactly this behavior.

A language like Python has to balance the underlying realities of programming, with the desire to simplify and save time. The opinion that mutable objects should always behave differently falls on the former side, and the opinion that it would be nicer for "collections" to have sort() return something useful, falls on the latter side.

This can be taken to extremes in either direction, of course -- on one side, we end up back at C because traditionally, pointers are pointers and memory allocation SHOULD require some work because it matters, etc; or you end up with e.g. Ruby where the time-saving stuff is at war with readability and clarity.

I am not arguing that we should change this about Python, mind you, simply that I'm not convinced by the "it's mutable == it behaves differently" argument and wonder if that was really the only reason for the design decision. I feel that I should only have to do extra thinking in cases where mutability actually matters (e.g. argument passing) and returning None instead of self to "remind me" that I'm dealing with a mutable object doesn't strike me as mattering enough to get in my way.

Sorry for the :words: and the arguing-on-the-Internet :downs:

quote:

In fact, on mutable types "None" is only returned as a convenience and isn't necessary from a data structure design standpoint.

But doesn't every function have a return value? I thought that None was the implicit return value for functions not explicitly returning something else; is that inaccurate? (edit: if you're talking very generally and not just Python syntax, I definitely understand your point, sorry.)

bitprophet fucked around with this message at 19:47 on Jun 7, 2009

TOO SCSI FOR MY CAT
Oct 12, 2008

this is what happens when you take UI design away from engineers and give it to a bunch of hipster art student "designers"
list.sort() is a method because it only works on lists. Traditionally in Python, algorithms that operate on a generic interface such as "all iterables" or "all mappings" are functions.

Thus the `reversed()` and `sorted()` functions, which can work on a list, string, tuple, whatever (`sorted()` takes any iterable, generators included; `reversed()` needs a sequence).
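A few illustrative calls (Python 3 print syntax). One caveat worth knowing: `sorted()` accepts any iterable, while `reversed()` needs a sequence or an object with `__reversed__`, so a generator has to go through `sorted()` or `list()` first.

```python
print(sorted("banana"))                   # ['a', 'a', 'a', 'b', 'n', 'n']
print(sorted((3, 1, 2)))                  # [1, 2, 3]
print(sorted(n * n for n in (3, 1, 2)))   # [1, 4, 9]
print(sorted({"b": 1, "a": 2}))           # ['a', 'b']

# reversed() returns an iterator over a sequence, back to front.
print(list(reversed("abc")))              # ['c', 'b', 'a']
print("".join(reversed("abc")))           # cba
```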

This is really the same discussion as "len should be a method", and the same arguments apply.

king_kilr
May 25, 2007

bitprophet posted:

But doesn't every function have a return value? I thought that None was the implicit return value for functions not explicitly returning something else; is that inaccurate? (edit: if you're talking very generally and not just Python syntax, I definitely understand your point, sorry.)

I would interpret the None return value in the:

code:
def f():
    do_stuff()

f() is None
Rather than the

code:
def f():
    return None

f() is None
sense.

bitprophet
Jul 22, 2004
Taco Defender

Janin posted:

list.sort() is a method because it only works on lists. Traditionally in Python, algorithms that operate on a generic interface such as "all iterables" or "all mappings" are functions.

Thus the `reversed()` and `sorted()` functions, which can work on a list, string, generator, whatever.

This is really the same discussion as "len should be a method", and the same arguments apply.
No, I understand and agree with the method vs function argument, that's not really what I'm driving at. I was more comparing list.sort() to string.replace(), which are both methods, but one returns None and the other returns a string.


king_kilr posted:

I would interpret the None return value in the:

[implicit]

Rather than the

[explicit]

sense.
Which is why I explicitly (:haw:) stated it as "the implicit return value for functions not explicitly returning something else", whereas tbradshaw seemed (at first, before I reread the 2nd half of his sentence) to be stating that it was somehow possible to return an empty, not-None value.

tbradshaw
Jan 15, 2008

First one must nail at least two overdrive phrases and activate the tilt sensor to ROCK OUT!
edit: removed extra exposition that really didn't help the conversation go anywhere

bitprophet posted:

But doesn't every function have a return value? I thought that None was the implicit return value for functions not explicitly returning something else; is that inaccurate? (edit: if you're talking very generally and not just Python syntax, I definitely understand your point, sorry.)

I was talking very generally, I didn't mean to raise that as a point of contention or anything. Yes, I'm pretty sure that every callable in Python has a return value.

edit:

Also, the choice of making strings immutable is a performance one. Strings as a mutable data type makes sense too, it just isn't implemented that way in Python.

Additionally, I see where you're coming from. Your suggestions make sense, they just come from a different perspective. These are the sorts of opinions on language design that lead developers to prefer one language over the other. Neither option is "wrong", just different. I was just trying to show a bit more fundamental/academic look at the data structures that reinforces the choice. Janin has a great "from definition" rationale for why it's the way it is. I imagine if you want more/better explanations, the python-dev mailing list is where you'd need to go. I would imagine it's one of the older threads in the archive.

tbradshaw fucked around with this message at 21:12 on Jun 7, 2009

TOO SCSI FOR MY CAT
Oct 12, 2008

this is what happens when you take UI design away from engineers and give it to a bunch of hipster art student "designers"

bitprophet posted:

No, I understand and agree with the method vs function argument, that's not really what I'm driving at. I was more comparing list.sort() to string.replace(), which are both methods, but one returns None and the other returns a string.

There are only two options if you have list.sort() return a list, neither of which is very useful:

1. If mutating methods returned themselves, then you can end up with very weird/misleading behavior:
code:
a = [5, 4, 3, 2, 1]
b = a.sort()
a == b
2a. If list.sort() doesn't mutate the underlying list, you'd need a separate method to perform an in-place sort:
code:
a = [5, 4, 3, 2, 1]
b = a.sort()
a != b
a.sort_in_place()
a == b
2b. And you'll have duplication of functionality between list.sort() and sorted(), which goes against "one obvious way to do it".

bitprophet
Jul 22, 2004
Taco Defender

tbradshaw posted:

Neither option is "wrong", just different.
Totally, I wasn't meaning to imply that the official POV on this topic was wrong -- simply bringing a possible rebuttal and curious whether anyone else had raised the same argument in the past. I'm sure there's some saying along the lines of "when all the big things are done right, the only things left to bitch about are small ones" and that's certainly the case with Python.

quote:

I imagine if you want more/better explanations, the python-dev mailing list is where you'd need to go.

If it still bugs me I'll go check out the archives, yea. It's actually not something that troubles me often, but since it came up it made me think "why are things the way they are?".

Janin posted:

1. If mutating methods returned themselves, then you can end up with very weird/misleading behavior:
code:
a = [5, 4, 3, 2, 1]
b = a.sort()
a == b

At first I was thinking this was the whole point (that a would "obviously" equal b in this scenario) but now that I see it printed out, you're right that it makes the code harder to read in a different way.

quote:

2a. If list.sort() doesn't mutate the underlying list, you'd need a separate method to perform an in-place sort

Do note that I was never arguing for this, though I understand you're probably putting it out there to complement the other options in your example :)


Anyway, thanks for the discussion, hope this wraps up the derail somewhat.

TOO SCSI FOR MY CAT
Oct 12, 2008

this is what happens when you take UI design away from engineers and give it to a bunch of hipster art student "designers"

bitprophet posted:

Do note that I was never arguing for this, though I understand you're probably putting it out there to complement the other options in your example :)
Also because it's the way used in Ruby -- obj.sort returns the sorted list, and obj.sort! sorts in-place.

king_kilr
May 25, 2007

Janin posted:

Also because it's the way used in Ruby -- obj.sort returns the sorted list, and obj.sort! sorts in-place.

bitprophet does code in ruby by day, maybe it's getting to his brain!

bitprophet
Jul 22, 2004
Taco Defender

king_kilr posted:

bitprophet does code in ruby by day, maybe it's getting to his brain!

Eh, not all that much. Yes, I do get wires crossed sometimes, but I still spend at least 50% of my day job doing non coding sysadmin stuff, or working with Fabric and/or Django.

Janitor Prime
Jan 22, 2004

PC LOAD LETTER

What da fuck does that mean

Fun Shoe

Janin posted:

There are only two options if you have list.sort() return a list, neither of which is very useful:

1. If mutating methods returned themselves, then you can end up with very weird/misleading behavior:
code:
a = [5, 4, 3, 2, 1]
b = a.sort()
a == b
What's misleading about this? If I understand correctly b and a are the same list, it's what I would expect.

hlfrk414
Dec 31, 2008

MEAT TREAT posted:

What's misleading about this? If I understand correctly b and a are the same list, it's what I would expect.

Because it confuses new python programmers. When sort returns None, if you try to use its return you learn it doesn't work, and learn to live with it and use sorted when you want a copy. Or you write something like this:
code:
a = [5, 4, 3, 2, 1]
sorted_a = a.sort()
for stuff in sorted_a: do_stuff(stuff)
Find that it works, but much later problems arise from doing this:
code:
a = [5, 4, 3, 2, 1]
mutate_and_do_stuff(a.sort())
do_other_stuff(a.sort())
The latter problem takes much longer to debug for someone and they end up having to write code that doesn't use the return value of a.sort and they make copies anyways! Why not just make it obvious what it does and prevent hard to detect errors, with some mild inconvenience because sort doesn't return the list sorted?
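Here is a sketch of that failure mode. `ChainList` is a hypothetical list subclass whose sort() returns self, which is NOT how the real list behaves, and `take_smallest` is a made-up helper that mutates its argument:

```python
class ChainList(list):
    """Hypothetical list whose sort() returns self instead of None."""

    def sort(self, **kwargs):
        super().sort(**kwargs)
        return self


def take_smallest(items):
    # Hypothetical helper that mutates the list it is given.
    return items.pop(0)


a = ChainList([5, 4, 3, 2, 1])
sorted_a = a.sort()
print(sorted_a is a)  # True: no copy was made

smallest = take_smallest(a.sort())
print(smallest)       # 1
print(a)              # [2, 3, 4, 5]: the original lost an element

# With the real API the copy is explicit, so the original is safe.
b = [5, 4, 3, 2, 1]
print(take_smallest(sorted(b)))  # 1
print(b)                         # [5, 4, 3, 2, 1]
```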

Avenging Dentist
Oct 1, 2005

oh my god is that a circular saw that does not go in my mouth aaaaagh
Do you want chaining for arbitrary types? Here (not complete but you get the idea):

code:
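class chain(object):
    """Wrap an object so its methods return the wrapper, which allows chaining."""

    class closure(object):
        def __init__(self, ref, fcn):
            self.ref = ref  # the chain wrapper to hand back
            self.fcn = fcn  # the bound method being wrapped

        def __call__(self, *args, **kwargs):
            self.fcn(*args, **kwargs)  # the method's own return value is discarded
            return self.ref

    def __init__(self, obj):
        self.obj = obj

    def __getattr__(self, key):
        # Only reached for names not defined on the wrapper itself.
        # Hand the closure the wrapper (not the raw object) so calls keep chaining.
        return self.closure(self, getattr(self.obj, key))

    def __str__(self):
        return str(self.obj)
code:
>>> from chain import chain
>>> lst = chain([9, 8, 7, 6, 5])
>>> print(lst.append(1).sort())
[1, 5, 6, 7, 8, 9]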

Scaevolus
Apr 16, 2007

Janin posted:

1. If mutating methods returned themselves, then you can end up with very weird/misleading behavior:
code:
a = [5, 4, 3, 2, 1]
b = a.sort()
a == b
This also has to do with the fact that it's simpler if 1) methods don't have side-effects on themselves, but return modified copies or 2) methods have side-effects, but don't return self. If you combine both, it's easy to get confused.

Django's query system is a special case-- the magic is worth it.

tl;dr: list.sort works that way for the same reason that you can't do foo = bar(baz = buttz())

TOO SCSI FOR MY CAT
Oct 12, 2008

this is what happens when you take UI design away from engineers and give it to a bunch of hipster art student "designers"

Scaevolus posted:

Django's query system is a special case-- the magic is worth it.
Queryset methods return copies, they're not in-place
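
Roughly what "return copies" buys you, sketched in plain Python rather than Django. The Query class and its filter/order_by/values methods here are hypothetical stand-ins, not Django's implementation. Each method returns a new object, so chaining never modifies the thing you started from.

```python
class Query(object):
    """Hypothetical stand-in for a queryset: every method returns a new Query."""

    def __init__(self, items):
        self._items = tuple(items)  # immutable snapshot of the data

    def filter(self, predicate):
        # Build a new Query instead of modifying self.
        return Query(x for x in self._items if predicate(x))

    def order_by(self, key=None):
        return Query(sorted(self._items, key=key))

    def values(self):
        return list(self._items)


base = Query([5, 4, 3, 2, 1])
odds = base.filter(lambda x: x % 2 == 1).order_by()

assert odds.values() == [1, 3, 5]
assert base.values() == [5, 4, 3, 2, 1]  # base is unchanged
```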

Habnabit
Dec 30, 2007

lift your skinny fists like
antennas in germany.
I know it's from the last page, but this bugs me every drat time.

checkeredshawn posted:

I'm trying to parse the results from onelook.com for words related to a given word, and then store them in a list, but I can't figure out what the regexp should be to match the lines to grab the words.
This is unfortunately impossible, since regular expressions can only parse regular languages and HTML is not a regular language! Fortunately, there's BeautifulSoup for actually really parsing HTML. The API is dead simple; you'd search for A tags where the 'href' attribute matches a particular pattern.

dorkanoid posted:

I think you want p.findall(source) :)
Why do you have to enable bad behavior. :(

Seriously, guys! Even (actual, real) perl programmers will tell you to use an HTML parser for parsing HTML.
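
A minimal sketch of the parser approach, assuming Python 3 and using the standard library's html.parser in place of BeautifulSoup so it runs without extra installs. The LinkCollector class, the sample markup, and the href pattern are all made up for illustration; they are not onelook.com's real page structure.

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the text of <a> tags whose href matches a pattern."""

    def __init__(self, href_pattern):
        super().__init__()
        self.href_pattern = re.compile(href_pattern)
        self.words = []
        self._in_match = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            self._in_match = bool(self.href_pattern.search(href))

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_match = False

    def handle_data(self, data):
        if self._in_match and data.strip():
            self.words.append(data.strip())


# Hypothetical markup, sloppy on purpose (unclosed <li> tags).
page = """
<ul>
  <li><a href="/?w=happy">happy</a>
  <li><a href="/?w=glad">glad</a>
  <li><a href="/about">About</a>
</ul>
"""

collector = LinkCollector(r"\?w=")
collector.feed(page)
collector.close()
assert collector.words == ["happy", "glad"]
```

The pattern is only applied to the href value after the parser has found the tag, so the regex never has to understand the markup itself.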

dorkanoid
Dec 21, 2004

Habnabit posted:

I know it's from the last page, but this bugs me every drat time.

This is unfortunately impossible, since regular expressions can only parse regular languages and HTML is not a regular language! Fortunately, there's BeautifulSoup for actually really parsing HTML. The API is dead simple; you'd search for A tags where the 'href' attribute matches a particular pattern.

Why do you have to enable bad behavior. :(

Seriously, guys! Even (actual, real) perl programmers will tell you to use an HTML parser for parsing HTML.

:woop: REGULAR EXPRESSIONS! :woop:

And yes, BeautifulSoup is nice when it works, but the latest version is annoying (breaks on invalid/strange HTML).

I still use it when I can, but regular expressions typically work (until the page changes slightly ;))

tef
May 30, 2004

-> some l-system crap ->

Habnabit posted:

Seriously, guys! Even (actual, real) perl programmers will tell you to use an HTML parser for parsing HTML.

Which is why you should use lxml, and not beautiful soap.

supster
Sep 26, 2003

I'M TOO FUCKING STUPID
TO READ A SIMPLE GRAPH
lxml.html has worked pretty nicely for me even with bad HTML.

Smugdog Millionaire
Sep 14, 2002

8) Blame Icefrog

dorkanoid posted:

And yes, BeautifulSoup is nice when it works, but the latest version is annoying (breaks on invalid/strange HTML).

Isn't the point of Beautiful Soup that it parses lovely tag soup into a semi-coherent document?

tef
May 30, 2004

-> some l-system crap ->

Free Bees posted:

Isn't the point of Beautiful Soup that it parses lovely tag soup into a semi-coherent document?

the little parser that could.

it's not that good, and from experience I have found lxml to be faster and better with all forms of html.

TOO SCSI FOR MY CAT
Oct 12, 2008

this is what happens when you take UI design away from engineers and give it to a bunch of hipster art student "designers"

dorkanoid posted:

And yes, BeautifulSoup is nice when it works, but the latest version is annoying (breaks on invalid/strange HTML).

The parser BeautifulSoup depends on was removed in Python 3; luckily, there's an even better one already available named "html5lib", which works flawlessly with BeautifulSoup on all sorts of invalid pages:
code:
import html5lib
builder = html5lib.treebuilders.getTreeBuilder("beautifulsoup")
parser = html5lib.HTMLParser(tree=builder)
soup = parser.parse("<html>tag soup goes here</html>")

Free Bees posted:

Isn't the point of Beautiful Soup that it parses lovely tag soup into a semi-coherent document?
The "tag soup" parsing was provided by sgmllib; BeautifulSoup provides unicode decoding, tree traversal, and builds a semi-valid HTML tree from whatever input it gets from its parser.

A A 2 3 5 8 K
Nov 24, 2003
Illiteracy... what does that word even mean?

Janin posted:

The parser BeautifulSoup depends on was removed in Python 3; luckily, there's an even better one already available named "html5lib", which works flawlessly with BeautifulSoup on all sorts of invalid pages:

Some sorts of invalid pages. When you have a database of 400,000 links from all different sources you want to parse, you will find much more that can go wrong with HTML than the authors of any of these packages considered.
