QuarkJets
Sep 8, 2008

D34THROW posted:

It's ridiculous how :neckbeard: I get over a simple half-functional login page in Flask that centers the form on the screen and pins the header to the bottom. If I can get this poo poo on a Heroku instance and spread it around the company I'm gonna be quite happy and make a new name for myself other than "the Excel toucher" and "the NAV geek".

Does it help to point out that you definitely have a bunch of other names in the company but just don't know what they are yet?

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug

D34THROW posted:

It's ridiculous how :neckbeard: I get over a simple half-functional login page in Flask that centers the form on the screen and pins the header to the bottom. If I can get this poo poo on a Heroku instance and spread it around the company I'm gonna be quite happy and make a new name for myself other than "the Excel toucher" and "the NAV geek".

This was me earlier today when I managed to automate the manual generation of some reports for work.

Until about 4 hours later when I discovered a minor discrepancy in a 110mb csv and now I'm having to go back to the datasource all over again.

BeastOfExmoor
Aug 19, 2003

I will be gone, but not forever.
I've been working on the code I was asking about a few pages back based on the solid recommendations the thread gave me. Wanted to get some feedback on how I'm approaching some things though.

Here's some very abbreviated code from the main class:

Python code:
class EbirdLocation():
    def __init__(self, ebird_html: Union[bytes, str]) -> None:
        """Class for storing parsed data from an ebird region or hotspot"""
        if not ebird_data_is_valid(ebird_html):
            raise ValueError  # Consider changing this to a custom error type

        soup = BeautifulSoup(ebird_html, 'html.parser')

        self._set_location_code(soup)
        # Wondering about if this is worse than changing _set_location_code to _get_location_code 
        # and having it return the location code.

        if self.location_type == "region":
            self._set_total_ebirders(soup)
            self._set_hotspots_count(soup)
        else:
            self.total_ebirders = None
            self.hotspots = None

    @classmethod
    def get_region(self, region: str, period: str = "all", sightings_type: str = SightingsType.first_seen.value):
        """Retrieve /region html and create a EbirdLocationData object from it"""

        validated_period = Period(period)  # Exception will occur if invalid period is passed
        sightings_type_validated = SightingsType(sightings_type)  # Exception will occur if invalid period is passed

        url = create_ebird_url(region, page_type=PageType.region.value, period=validated_period.value, sightings_type=sightings_type_validated.value)
        logging.debug(f"URL = {url}")
        ebird_html = get_ebird_html(url)
        return EbirdLocation(ebird_html)

    def _set_location_code(self, soup: BeautifulSoup):
        """Set the ebird location code (Ex. US-WA-061)"""
        canonical_location = soup.find("link", rel="canonical").attrs['href']
        last_non_location_code_character = re.search("/(hotspot|region)/", canonical_location).end()
        # Slice the end off the canonical location to get the location code
        self.location_code = str(canonical_location)[last_non_location_code_character:]
Questions:
1. Is there any advantage to having a function which returns data which a variable can be set to vs just having the function set self.location_code based on what it's parsed from the HTML?
2. The classmethod is written to give an easy way to create an object with one line of code, while still allowing the object to be generated from HTML, which gives a lot of flexibility in testing and use. Any issues with how this is written?
3. I omitted the actual function definitions for brevity, but currently this class can generate off either a region page or a hotspot page. Hotspot pages don't contain total_ebirders or hotspot counts, so currently I'm just setting those to None for hotspots, but I'm weighing the advantages of making region and hotspot subclasses. I guess the biggest issue would be that I'd need to do the detection before creating the class rather than allowing it to be done based on what BeautifulSoup finds in the HTML.

D34THROW
Jan 29, 2012

RETAIL RETAIL LISTEN TO ME BITCH ABOUT RETAIL
:rant:

QuarkJets posted:

Does it help to point out that you definitely have a bunch of other names in the company but just don't know what they are yet?

Oh, I'm quite sure. But I work remote so I don't know them :v:


So to make sure I'm understanding this right, Flask acts as the backend to populate a webpage template. I can, say, generate a list of admin functions for a control panel and stick it in a sidebar programmatically, rather than hardcoding it into the page itself. It's similar to PHP but does more of the backend stuff than PHP does? Or am I off the mark? I ask because my next step is the admin panel.

duck monster
Dec 15, 2004

D34THROW posted:

Oh, I'm quite sure. But I work remote so I don't know them :v:


So to make sure I'm understanding this right, Flask acts as the backend to populate a webpage template. I can, say, generate a list of admin functions for a control panel and stick it in a sidebar programmatically, rather than hardcoding it into the page itself. It's similar to PHP but does more of the backend stuff than PHP does? Or am I off the mark? I ask because my next step is the admin panel.

Traditional PHP is an older model that packed logic and presentation in the same box. It made sense in the 1990s because we really didn't know better, but it doesn't make a lot of sense in 2021.

Well other than the fact that this terrible line of thinking is dominant again thanks to React and Vue. Hooray for Web devs.

But back on topic, Flask is 2/3 of a model/view/controller system. The view (the templates) is fed through a controller (the Python logic), along with request parameters and session data, to produce the page that's transmitted to your browser. (By the way, MVC is pretty ill-defined, so other systems have different definitions.) The model (the representation of business objects, traditionally an object-oriented representation of a database; different systems have different norms for how much logic goes in these) is something you have to provide yourself, but there are a couple of good options here: SQLAlchemy, and Peewee, which is a lightweight look-alike of Django's brilliant ORM. SQLAlchemy is the one you'll normally see, but Peewee can be a drat useful library for more lightweight projects due to its simplicity. That simplicity comes at the cost of less flexibility compared to the 700 tonne gorilla that is SQLAlchemy.
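
To make the controller/view split concrete, here is a minimal sketch, assuming Flask is installed. The `/admin` route, the `ADMIN_FUNCTIONS` list, and the inline template are all made up for illustration; a real app would normally keep the template in its own file under `templates/` and pull the list from wherever the model lives.

```python
from flask import Flask, render_template_string

app = Flask(__name__)

# Hypothetical data: in a real app this might come from the database (the "model").
ADMIN_FUNCTIONS = ["Manage users", "View reports", "Edit materials"]

# The "view": a Jinja2 template that only knows how to display what it is given.
SIDEBAR_TEMPLATE = """
<ul class="sidebar">
{% for name in admin_functions %}
  <li>{{ name }}</li>
{% endfor %}
</ul>
"""


@app.route("/admin")
def admin_panel():
    # The "controller": plain Python that decides what data the template receives.
    return render_template_string(SIDEBAR_TEMPLATE, admin_functions=ADMIN_FUNCTIONS)
```

Adding an entry to `ADMIN_FUNCTIONS` changes the sidebar without touching the markup, which is the "not hardcoded into the page" behaviour asked about above.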

In fairness also, most PHP is done using MVC frameworks these days, well that and WordPress (Ugh). None of them are fun to use. Even Laravel, a sort of lobotomized version of Ruby on Rails. Laravel attempts to combine the magicness of Rails with the bureaucracy of Java, in a sort of worst-of-all-worlds chimera that a lot of PHP people seem to think is representative of the current state of the art outside of PHP.

If you're doing anything bigger than a few simple pages, I strongly implore you to check out Django. It can be a little bit of work to set up (though high-end IDEs like PyCharm can autogenerate a basic setup), but it's amazingly powerful. Do the tutorial, however, or it might end up being a bit too much to learn in one sitting.

duck monster fucked around with this message at 03:52 on Jan 21, 2022

CarForumPoster
Jun 26, 2013

⚡POWER⚡
Django+DjangoCMS gets you up and going with a pretty good admin panel, the ability to have non-technical people making content, and the ability to make templates of all kinds pretty quickly.

Phobeste
Apr 9, 2006

never, like, count out Touchdown Tom, man

BeastOfExmoor posted:

I've been working on the code I was asking about a few pages back based on the solid recommendations the thread gave me. Wanted to get some feedback on how I'm approaching some things though.

Here's some very abbreviated code from the main class:

Python code:
class EbirdLocation():
    def __init__(self, ebird_html: Union[bytes, str]) -> None:
        """Class for storing parsed data from an ebird region or hotspot"""
        if not ebird_data_is_valid(ebird_html):
            raise ValueError  # Consider changing this to a custom error type

        soup = BeautifulSoup(ebird_html, 'html.parser')

        self._set_location_code(soup)
        # Wondering about if this is worse than changing _set_location_code to _get_location_code 
        # and having it return the location code.

        if self.location_type == "region":
            self._set_total_ebirders(soup)
            self._set_hotspots_count(soup)
        else:
            self.total_ebirders = None
            self.hotspots = None

    @classmethod
    def get_region(self, region: str, period: str = "all", sightings_type: str = SightingsType.first_seen.value):
        """Retrieve /region html and create a EbirdLocationData object from it"""

        validated_period = Period(period)  # Exception will occur if invalid period is passed
        sightings_type_validated = SightingsType(sightings_type)  # Exception will occur if invalid period is passed

        url = create_ebird_url(region, page_type=PageType.region.value, period=validated_period.value, sightings_type=sightings_type_validated.value)
        logging.debug(f"URL = {url}")
        ebird_html = get_ebird_html(url)
        return EbirdLocation(ebird_html)

    def _set_location_code(self, soup: BeautifulSoup):
        """Set the ebird location code (Ex. US-WA-061)"""
        canonical_location = soup.find("link", rel="canonical").attrs['href']
        last_non_location_code_character = re.search("/(hotspot|region)/", canonical_location).end()
        # Slice the end off the canonical location to get the location code
        self.location_code = str(canonical_location)[last_non_location_code_character:]

quote:

Questions:
1. Is there any advantage to having a function which returns data which a variable can be set to vs just having the function set self.location_code based on what it's parsed from the HTML?

There's two good reasons to have a function calculating the value of a variable. First, it makes it easier to read, and not joking in something like 95% of code use cases the single most important thing to do with the code is make it clear and easy to read. You'll read it more often than you write it. Second, it's more efficient if the value won't change very often to just calculate it once and put it in your pocket.

That said, you typically don't want to have functions that create member variables that you call in initializers. Python doesn't really have separate ahead of time class member definitions - there's not a single unified place to look, required by the language, where you'll see all the member variables of a class. This can be a tremendous pain. The overwhelmingly common idiom is therefore to set all member variables at least once in __init__. You can change them later, but make sure that every member variable shows up in __init__ somehow, even if it's just to set it to None to mark that it's there, so you can always scan through __init__ and see all the member variables. Doing this also makes it more friendly to the type system, which is not good at tracking side effects like "this method happens to set this member variable". If you ran mypy on what you have I think it would complain that location_type was referenced before being created in __init__.

Summing that all up, I'd keep _set_location_code, but make it end with
pre:
return str(canonical_location)[last_non_location_code_character:]
and have __init__ do
pre:
self.location_code = self._set_location_code(soup)
That's both a) easier to read and b) friendlier to mypy
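
A self-contained sketch of that shape, under some assumptions: it takes the canonical URL as a plain string so it runs without BeautifulSoup, and the class name `Location`, the helper name `_parse_location_code`, and the sample URL are illustrative rather than taken from the original code.

```python
import re
from typing import Optional


class Location:
    def __init__(self, canonical_url: str) -> None:
        # Every attribute is assigned here, so __init__ doubles as the list of members.
        self.location_code: str = self._parse_location_code(canonical_url)
        self.total_ebirders: Optional[int] = None  # placeholder, filled in elsewhere

    @staticmethod
    def _parse_location_code(canonical_url: str) -> str:
        """Return the part of the URL after /hotspot/ or /region/."""
        match = re.search(r"/(hotspot|region)/", canonical_url)
        if match is None:
            raise ValueError(f"no location code in {canonical_url!r}")
        return canonical_url[match.end():]
```

Because the helper returns a value instead of setting an attribute as a side effect, a type checker can see the attribute being created in `__init__`, and the helper can be tested on its own.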

quote:

2. The classmethod is written to give an easy way to create an object with one line of code, while still allowing the object to be generated from HTML, which gives a lot of flexibility in testing and use. Any issues with how this is written?
Love it.

quote:

3. I omitted the actual function definitions for brevity, but currently this class can generate off either a region page or a hotspot page. Hotspot pages don't contain total_ebirders or hotspot counts, so currently I'm just setting those to None for hotspots, but I'm weighing the advantages of making region and hotspot subclasses. I guess the biggest issue would be that I'd need to do the detection before creating the class rather than allowing it to be done based on what BeautifulSoup finds in the HTML.

Well, nothing's stopping you from having __init__ take a BeautifulSoup object instead of the raw html; you're doing a lot of work in that get_region method after all.

D34THROW
Jan 29, 2012

RETAIL RETAIL LISTEN TO ME BITCH ABOUT RETAIL
:rant:
I've looked into Django but it's way more involved than what I need it for. The most complicated functionality I need is the ability to draw on a canvas, and that's way down the line. It's a bunch of calculators to make calculating materials 100 times easier, so just simple forms and pertinent reports, I'm shoving it all in a SQLAlchemy DB for later reference - an earlier CLI version used JSON files, but I've since learned Python and DBs.

I'm having a bit of a meltdown with Flask though. I have my FLASK_APP variable set, python-dotenv installed with .flaskenv defining FLASK_APP as well, but when I do flask shell it doesn't pick up on...anything. Specifically, I'm trying to define my personal user account so that I can do the admin poo poo on the actual site, but it keeps telling me (no matter what I do) that "User" is not defined when I try to do >>> u = User(blahblahblah).


EDIT: I missed the whole part about the imports to create the shell context. Carry on. :downs:
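
For anyone hitting the same wall: by default `flask shell` only preloads `app` and `g`, so anything else has to be imported by hand or registered with a shell context processor. A minimal sketch, assuming Flask is installed; the `User` class here is a bare stand-in, not the poster's actual model.

```python
from flask import Flask

app = Flask(__name__)


class User:
    """Stand-in for the real model; a SQLAlchemy model is registered the same way."""

    def __init__(self, username: str) -> None:
        self.username = username


@app.shell_context_processor
def make_shell_context() -> dict:
    # Every key in this dict becomes a name available inside `flask shell`.
    return {"User": User}
```

With that in place, `u = User("someone")` works in `flask shell` without an explicit import.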

D34THROW fucked around with this message at 18:56 on Jan 21, 2022

CarForumPoster
Jun 26, 2013

⚡POWER⚡

D34THROW posted:

I've looked into Django but it's way more involved than what I need it for. The most complicated functionality I need is the ability to draw on a canvas, and that's way down the line. It's a bunch of calculators to make calculating materials 100 times easier, so just simple forms and pertinent reports, I'm shoving it all in a SQLAlchemy DB for later reference - an earlier CLI version used JSON files, but I've since learned Python and DBs.

I'm having a bit of a meltdown with Flask though. I have my FLASK_APP variable set, python-dotenv installed with .flaskenv defining FLASK_APP as well, but when I do flask shell it doesn't pick up on...anything. Specifically, I'm trying to define my personal user account so that I can do the admin poo poo on the actual site, but it keeps telling me (no matter what I do) that "User" is not defined when I try to do >>> u = User(blahblahblah).


EDIT: I missed the whole part about the imports to create the shell context. Carry on. :downs:

Sounds like you want Plotly Dash (which is built on Flask). I have made many calculators with it. It is much easier and faster to build and deploy, particularly because there's no HTML needed. Whole thing in Python.

Can even do "draw on a canvas" https://dash.plotly.com/canvas

punk rebel ecks
Dec 11, 2010

A shitty post? This calls for a dance of deduction.
So, suddenly debugging my Python code in VS Code stopped working.
Whenever I run a script I get this error:

code:
C:\Users\nomar\Desktop\Python\python.exe: can't open file 'C:\Users\nomar\Dropbox\Code\manage.py': [Errno 2] No such file or directory
PS C:\Users\nomar\Dropbox\Code>
I can still run stuff in the "Python" Terminal and Windows "Powershell", but not in "Python Debugger Console":


I tried following the directions in this website, however upon running get this error:

code:
cd : Cannot find path 'C:\Users\nomar\Dropbox\Code\mysite' because it does not exist.
At line:1 char:1
+ cd mysite
+ ~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (C:\Users\nomar\Dropbox\Code\mysite:String) [Set-Location], ItemNotFoundExc  
   eption
    + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.SetLocationCommand
I'm at a bit of a loss as to what I should even do. Any suggestions?

punk rebel ecks fucked around with this message at 06:14 on Jan 23, 2022

QuarkJets
Sep 8, 2008

idk but that kind of looks like a dropbox issue, e.g. when you're in the powershell terminal in VS Code are you actually able to navigate to C:\Users\nomar\Dropbox\Code?

What happens if you try to run the debugger on some code that just lives locally, like in your Documents folder?

QuarkJets
Sep 8, 2008

BeastOfExmoor posted:

I've been working on the code I was asking about a few pages back based on the solid recommendations the thread gave me. Wanted to get some feedback on how I'm approaching some things though.

Here's some very abbreviated code from the main class:

Python code:
class EbirdLocation():
    def __init__(self, ebird_html: Union[bytes, str]) -> None:
        """Class for storing parsed data from an ebird region or hotspot"""
        if not ebird_data_is_valid(ebird_html):
            raise ValueError  # Consider changing this to a custom error type

        soup = BeautifulSoup(ebird_html, 'html.parser')

        self._set_location_code(soup)
        # Wondering about if this is worse than changing _set_location_code to _get_location_code 
        # and having it return the location code.

        if self.location_type == "region":
            self._set_total_ebirders(soup)
            self._set_hotspots_count(soup)
        else:
            self.total_ebirders = None
            self.hotspots = None

    @classmethod
    def get_region(self, region: str, period: str = "all", sightings_type: str = SightingsType.first_seen.value):
        """Retrieve /region html and create a EbirdLocationData object from it"""

        validated_period = Period(period)  # Exception will occur if invalid period is passed
        sightings_type_validated = SightingsType(sightings_type)  # Exception will occur if invalid period is passed

        url = create_ebird_url(region, page_type=PageType.region.value, period=validated_period.value, sightings_type=sightings_type_validated.value)
        logging.debug(f"URL = {url}")
        ebird_html = get_ebird_html(url)
        return EbirdLocation(ebird_html)

    def _set_location_code(self, soup: BeautifulSoup):
        """Set the ebird location code (Ex. US-WA-061)"""
        canonical_location = soup.find("link", rel="canonical").attrs['href']
        last_non_location_code_character = re.search("/(hotspot|region)/", canonical_location).end()
        # Slice the end off the canonical location to get the location code
        self.location_code = str(canonical_location)[last_non_location_code_character:]
Questions:
1. Is there any advantage to having a function which returns data which a variable can be set to vs just having the function set self.location_code based on what it's parsed from the HTML?
2. The classmethod is written to give an easy way to create an object with one line of code, while still allowing the object to be generated from HTML, which gives a lot of flexibility in testing and use. Any issues with how this is written?
3. I omitted the actual function definitions for brevity, but currently this class can generate off either a region page or a hotspot page. Hotspot pages don't contain total_ebirders or hotspot counts, so currently I'm just setting those to None for hotspots, but I'm weighing the advantages of making region and hotspot subclasses. I guess the biggest issue would be that I'd need to do the detection before creating the class rather than allowing it to be done based on what BeautifulSoup finds in the HTML.

1. For readability it's best that a function/method not do unexpected things. So if the method is named "set_pie_threshold" then it'd better be setting something called "pie_threshold" and it doesn't really matter whether or not "pie_threshold" is returned. But if the method is named "return_pie_threshold" then it'd better not be setting an internal state because that would be a hosed up thing for a method with that name to do. Between the two, imo it's better to return content than to set it internally, I think code written this way flows better

2. Usually a classmethod uses "cls" instead of "self" as the name of its first argument, to differentiate what is accessible when it's called (the function has access to the class, not to an instance of the class). Using "self" as the name is confusing for that reason.
Also, that classmethod doesn't actually use the class. It's really a static method; use @staticmethod instead and drop the first argument. A static method is basically just a function that exists in a class namespace, it's a method without the "self" or "cls" argument. This won't change the behavior of your code at all, it just drops the extraneous "self" variable from that method
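
A small sketch of the difference, using made-up names (`Page`, `from_url`, `fetch`) and a placeholder in place of a real HTTP request. The classmethod form earns its keep when it builds the instance through `cls(...)`, because subclasses then get instances of themselves; a method that never touches `cls` is better off as a static method.

```python
class Page:
    def __init__(self, html: str) -> None:
        self.html = html

    @classmethod
    def from_url(cls, url: str) -> "Page":
        # `cls` is the class the method was called on, so subclasses
        # get an instance of themselves rather than of Page.
        return cls(cls.fetch(url))

    @staticmethod
    def fetch(url: str) -> str:
        # No `self` or `cls`: just a function that lives in the class namespace.
        # Placeholder for a real HTTP request.
        return f"<html>{url}</html>"


class RegionPage(Page):
    pass
```

In the posted code, replacing `return EbirdLocation(ebird_html)` with `return cls(ebird_html)` would be the other way to make the `@classmethod` decorator meaningful.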

3. Subclassing may sound tempting but modern development has identified OOP as a whole as kind of a trap. You won't really be improving the functionality, the readability, or the performance of your software by doing this, and in fact it can make any/all of these things worse. This is not the right thread to talk about OOP, though.

On to specific suggestions:

Python code:
class EbirdLocation():
    def __init__(self, ebird_html: Union[bytes, str]) -> None:
        """Class for storing parsed data from an ebird region or hotspot"""
        if not ebird_data_is_valid(ebird_html):
            raise ValueError  # Consider changing this to a custom error type
If you're going to raise a ValueError then you should at least specify what the problem is, e.g. "ValueError('ebird data is invalid')". Hopefully you're using logging to identify why the data is invalid.
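
One way to combine this with the "custom error type" comment in the original code is a small `ValueError` subclass that carries a message. This is only a sketch: the exception name and the validity check are invented, since the real `ebird_data_is_valid()` was not posted.

```python
class InvalidEbirdDataError(ValueError):
    """Raised when downloaded HTML does not look like an eBird page."""


def check_ebird_html(ebird_html: str) -> None:
    # Hypothetical check standing in for the unposted ebird_data_is_valid().
    if "<html" not in ebird_html.lower():
        raise InvalidEbirdDataError(
            f"ebird data is invalid: expected an HTML document, got {ebird_html[:40]!r}"
        )
```

Subclassing `ValueError` means existing `except ValueError` handlers keep working, while new code can catch the narrower type.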

Python code:
        soup = BeautifulSoup(ebird_html, 'html.parser')

        self._set_location_code(soup)
        # Wondering about if this is worse than changing _set_location_code to _get_location_code 
        # and having it return the location code.
imo it is better to assign the location code explicitly here: "self.location_code = self.get_location_code(soup)". Then "get_location_code(self, soup)" just figures out and returns the location code. Note that I dropped the leading underscore because it's not necessary that such a method be private.

Python code:
        if self.location_type == "region":
            self._set_total_ebirders(soup)
            self._set_hotspots_count(soup)
        else:
            self.total_ebirders = None
            self.hotspots = None
First big problem: where was self.location_type created? This is the first that we're seeing of this attribute. It was created inside of _set_location_code presumably, but it would be better if this initialization was more explicit. If you're setting a ton of location-based attributes then define a new dataclass to hold them, and set it explicitly with something like "self.location_data = return_location_data(soup)". If you're just setting a couple of attributes then have return_location_data return those explicitly: "self.location_type, self.location_code = return_location_data(soup)"

You can kind of intuit from this if block that _set_total_ebirders and _set_hotspots_count set the total_ebirders and hotspots variables, but it sure would be better if that was more explicit. You can also do away with the else statement.
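
The dataclass option could look roughly like the sketch below. Big assumption: the thread never shows where location_type comes from, so this version guesses that it can be read from the same /hotspot/ or /region/ path segment as the location code. It also takes the canonical URL as a string rather than a soup object so that it runs on its own; `LocationData` and the sample URL are illustrative.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LocationData:
    location_type: str  # "region" or "hotspot"
    location_code: str  # e.g. "US-WA-061"


def return_location_data(canonical_url: str) -> LocationData:
    """Parse both values from the canonical URL in one place (assumed source)."""
    match = re.search(r"/(hotspot|region)/(.+)$", canonical_url)
    if match is None:
        raise ValueError(f"unrecognised eBird URL: {canonical_url!r}")
    return LocationData(location_type=match.group(1), location_code=match.group(2))
```

`__init__` would then hold a single `self.location_data = return_location_data(...)` line, and readers can see every location attribute by looking at the dataclass.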

Putting these together, here's what I'd want to see:

Python code:
        self.location_type, self.location_code = return_location_data(soup)
        self.total_ebirders = None
        self.hotspots_count = None
        if self.location_type == "region":
            self.total_ebirders = self.return_total_ebirders(soup)
            self.hotspots_count = self.return_hotspots_count(soup)
Python code:
    @classmethod
    def get_region(self, region: str, period: str = "all", sightings_type: str = SightingsType.first_seen.value):
        """Retrieve /region html and create a EbirdLocationData object from it"""

        validated_period = Period(period)  # Exception will occur if invalid period is passed
        sightings_type_validated = SightingsType(sightings_type)  # Exception will occur if invalid period is passed
Classmethods use "cls" as the first variable name, not "self", by convention; see PEP8. But even better, this method should use @staticmethod; you can drop self/cls that way.

This method could be named better. If you're returning an EbirdLocation then that should be alluded to in the name somehow.

The docstring is a lie: this method returns an EbirdLocation object, not an EbirdLocationData object. Docstrings lie all the time due to documentation rot, as a result of code always changing faster than documentation. Remove the object type from the docstring and insert a return type annotation instead; then a type checker such as mypy can catch a mismatch between the annotation and the type that's actually returned.

Is SightingsType an enum? I don't think that you're using enums very effectively here, if that's the case. sightings_type shouldn't be a string, it should be a SightingsType.
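
A sketch of what passing the enum itself might look like. Only `SightingsType.first_seen` appears in the posted code; the second member, the value strings, and the `build_query` helper are invented for illustration.

```python
from enum import Enum


class SightingsType(Enum):
    first_seen = "first_seen"  # value string is a guess
    last_seen = "last_seen"    # hypothetical second member


def build_query(sightings_type: SightingsType = SightingsType.first_seen) -> str:
    # The enum member travels through the code; .value is read only at the edge,
    # where a plain string is finally needed.
    return f"sightings={sightings_type.value}"
```

Callers that only have a string can still convert once at the boundary with `SightingsType("last_seen")`, which raises ValueError for anything that is not a valid member value.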

Python code:
        url = create_ebird_url(region, page_type=PageType.region.value, period=validated_period.value, sightings_type=sightings_type_validated.value)
        logging.debug(f"URL = {url}")
        ebird_html = get_ebird_html(url)
        return EbirdLocation(ebird_html)
"url" should be "ebird_url"

Python code:
    def _set_location_code(self, soup: BeautifulSoup):
        """Set the ebird location code (Ex. US-WA-061)"""
        canonical_location = soup.find("link", rel="canonical").attrs['href']
        last_non_location_code_character = re.search("/(hotspot|region)/", canonical_location).end()
        # Slice the end off the canonical location to get the location code
        self.location_code = str(canonical_location)[last_non_location_code_character:]
Turn this into a "return" method and update the docstring accordingly.

Add an annotation for the returned type (looks like str)

Drop the last comment, it's extraneous. I can tell right away that the end of the canonical location is being sliced off just by reading that line of code; "don't repeat yourself". And if you ever change this code then the comment may start to rot

I thought that location_type would be set here, but clearly that's not the case. Where is location_type being set? Is that just a renamed form of location_code? Have this function return location_code, then set self.location_code in __init__ by calling this method. Since self becomes extraneous, you can turn this into a @staticmethod

former glory
Jul 11, 2011

QuarkJets posted:


3. Subclassing may sound tempting but modern development has identified OOP as a whole as kind of a trap. You won't really be improving the functionality, the readability, or the performance of your software by doing this, and in fact it can make any/all of these things worse. This is not the right thread to talk about OOP, though.


Mind expanding on this or pointing to some supporting sources? This is something I feel is right in my bones, but my formal education was all Java back in the early 2000s and everything back then was subclasses out the rear end, getters and setters, encapsulation above all. I didn’t write much code between 2008-2020 and coming back to it now, it seems like a lot of that was thrown out the window, especially in Python.

punk rebel ecks
Dec 11, 2010

A shitty post? This calls for a dance of deduction.

QuarkJets posted:

idk but that kind of looks like a dropbox issue, e.g. when you're in the powershell terminal in VS Code are you actually able to navigate to C:\Users\nomar\Dropbox\Code?
I guess so since it works. I just run it in powershell/terminal in VSCode.

QuarkJets posted:

What happens if you try to run the debugger on some code that just lives locally, like in your Documents folder?

The same error.

boofhead
Feb 18, 2021

former glory posted:

Mind expanding on this or pointing to some supporting sources? This is something I feel is right in my bones, but my formal education was all Java back in the early 2000s and everything back then was subclasses out the rear end, getters and setters, encapsulation above all. I didn’t write much code between 2008-2020 and coming back to it now, it seems like a lot of that was thrown out the window, especially in Python.

I would also be curious to read some material on this.. is the idea to use mixins rather than subclassing ad nauseam? Personally I don't subclass a huge amount but I feel like I've seen a lot of people aggressively using it

cinci zoo sniper
Mar 15, 2013




boofhead posted:

I would also be curious to read some material on this.. is the idea to use mixins rather than subclassing ad nauseam? Personally I don't subclass a huge amount but I feel like I've seen a lot of people aggressively using it

former glory posted:

Mind expanding on this or pointing to some supporting sources? This is something I feel is right in my bones, but my formal education was all Java back in the early 2000s and everything back then was subclasses out the rear end, getters and setters, encapsulation above all. I didn’t write much code between 2008-2020 and coming back to it now, it seems like a lot of that was thrown out the window, especially in Python.

https://hynek.me/articles/python-subclassing-redux/

former glory
Jul 11, 2011

That link is a pro click.

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug

former glory posted:

That link is a pro click.

I don't have a compsci background but am a dev, and I think that link is pretty great. I've been watching a lot of stuff around the arguments around OOP and this article does a great job of encapsulating some of my confusion around those arguments.

QuarkJets
Sep 8, 2008

former glory posted:

Mind expanding on this or pointing to some supporting sources? This is something I feel is right in my bones, but my formal education was all Java back in the early 2000s and everything back then was subclasses out the rear end, getters and setters, encapsulation above all. I didn’t write much code between 2008-2020 and coming back to it now, it seems like a lot of that was thrown out the window, especially in Python.

There are a lot of blog posts, articles, and conference presentations that go into it, even some textbooks. Plenty of forum discussion too, here and elsewhere. Here's a pretty good one
https://suzdalnitski.medium.com/oop-will-make-you-suffer-846d072b4dce

In my own experience codebases that lean heavily on inheritance tend to be harder to parse than codebases that do not. Even the kindest view of inheritance has to admit to this downside. So as a general rule, you should try to avoid superfluous inheritance; don't make your codebase harder to read without a good reason.
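
Applied to the region/hotspot question from earlier in the thread, one inheritance-free shape is to hold the region-only numbers in a separate optional object. This is a sketch of the general idea, not the poster's code; `RegionStats`, `region_stats`, and `is_region` are made-up names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RegionStats:
    total_ebirders: int
    hotspots_count: int


@dataclass
class EbirdLocation:
    location_code: str
    # None for hotspots, which have no region-level statistics.
    region_stats: Optional[RegionStats] = None

    @property
    def is_region(self) -> bool:
        return self.region_stats is not None
```

There is still one class to read, and detecting the page type can stay where it is now, inside the parsing code, instead of having to happen before a subclass is chosen.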

QuarkJets fucked around with this message at 23:55 on Jan 23, 2022

12 rats tied together
Sep 7, 2006

Lightbulb on moment for me was Sandi Metz' "Nothing is Something" talk which should be on youtube, and is also referenced throughout other links provided so far.

QuarkJets
Sep 8, 2008

Even "Clean Code", which was literally written for Java, talks about how inheritance negatively impacts readability and often winds up being superfluous. Know how to use it, then don't use it unless you really need it.

QuarkJets
Sep 8, 2008

12 rats tied together posted:

Lightbulb on moment for me was Sandi Metz' "Nothing is Something" talk which should be on youtube, and is also referenced throughout other links provided so far.

Link: https://www.youtube.com/watch?v=29MAL8pJImQ

BeastOfExmoor
Aug 19, 2003

I will be gone, but not forever.

Phobeste posted:

There's two good reasons to have a function calculating the value of a variable. First, it makes it easier to read, and not joking in something like 95% of code use cases the single most important thing to do with the code is make it clear and easy to read. You'll read it more often than you write it. Second, it's more efficient if the value won't change very often to just calculate it once and put it in your pocket.

That said, you typically don't want to have functions that create member variables that you call in initializers. Python doesn't really have separate ahead of time class member definitions - there's not a single unified place to look, required by the language, where you'll see all the member variables of a class. This can be a tremendous pain. The overwhelmingly common idiom is therefore to set all member variables at least once in __init__. You can change them later, but make sure that every member variable shows up in __init__ somehow, even if it's just to set it to None to mark that it's there, so you can always scan through __init__ and see all the member variables. Doing this also makes it more friendly to the type system, which is not good at tracking side effects like "this method happens to set this member variable". If you ran mypy on what you have I think it would complain that location_type was referenced before being created in __init__.

Thanks, that makes a lot of sense and I appreciate the clear explanation. I've changed the __init__ so it's now all statements like: self.location_type = self._get_location_type(soup)


QuarkJets posted:

2. Usually a classmethod uses "cls" instead of "self" as the name of its first argument, to differentiate what is accessible when it's called (the function has access to the class, not to an instance of the class). Using "self" as the name is confusing for that reason.
Also, that classmethod doesn't actually use the class. It's really a static method; use @staticmethod instead and drop the first argument. A static method is basically just a function that exists in a class namespace, it's a method without the "self" or "cls" argument. This won't change the behavior of your code at all, it just drops the extraneous "self" variable from that method

I appreciate the explanation of the difference between classmethods and staticmethods in this context. It was a little confusing reading up on them, so I obviously need to do more reading.
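For my own notes, here is a minimal sketch of the difference as I understand it. The Temperature class is a made-up example and has nothing to do with the eBird code; it only shows which first argument each decorator gives you.

```python
class Temperature:
    """Hypothetical example class, not part of the eBird project."""

    def __init__(self, celsius: float) -> None:
        self.celsius = celsius

    @classmethod
    def from_fahrenheit(cls, fahrenheit: float) -> "Temperature":
        # cls is the class itself, so this also builds subclasses correctly.
        return cls(cls.to_celsius(fahrenheit))

    @staticmethod
    def to_celsius(fahrenheit: float) -> float:
        # No self or cls: just a function that lives in the class namespace.
        return (fahrenheit - 32) * 5 / 9


boiling = Temperature.from_fahrenheit(212)
```

The classmethod needs cls because it constructs an instance; the staticmethod never touches the class or an instance, which is the situation QuarkJets described for get_region.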


QuarkJets posted:

Python code:
    @classmethod
    def get_region(self, region: str, period: str = "all", sightings_type: str = SightingsType.first_seen.value):
        """Retrieve /region html and create a EbirdLocationData object from it"""

        validated_period = Period(period)  # Exception will occur if invalid period is passed
        sightings_type_validated = SightingsType(sightings_type)  # Exception will occur if invalid sightings type is passed
Classmethods use "cls" as the first variable name, not "self", by convention; see PEP8. But even better, this method should use @staticmethod; you can drop self/cls that way.

This method could be named better. If you're returning an EbirdLocationData then that should be alluded to in the name somehow.

The docstring is a lie, this method returns an EbirdLocation object, not a EbirdLocationData object. Docstrings lie all the time due to documentation rot, as a result of code always changing faster than documentation. Remove the object type from the docstring, and insert a type annotation for it instead, then a linter can catch if your type annotation matches the type that's returned

Is SightingsType an enum? I don't think that you're using enums very effectively here, if that's the case. sightings_type shouldn't be a string, it should be a SightingsType.

Agreed on all points. And yes, it's an Enum. My Enum stuff is indeed really messy and I can't quite figure out how to be more effective with them while still allowing some flexibility.
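One way to keep the flexibility might be to accept either an Enum member or its raw string and normalize once at the top of the function. This is only a sketch: the member values below are assumptions (the thread only shows the name first_seen), and normalize_sightings_type is a hypothetical helper.

```python
from enum import Enum
from typing import Union


class SightingsType(Enum):
    # Member values are assumptions; only the name first_seen appears in the thread.
    first_seen = "first"
    last_seen = "last"


def normalize_sightings_type(
    sightings_type: Union[SightingsType, str] = SightingsType.first_seen,
) -> SightingsType:
    # Calling the Enum with a member returns that member unchanged; calling it
    # with a raw value looks the member up and raises ValueError if unknown.
    return SightingsType(sightings_type)
```

After that single call, the rest of the code can pass the member around and only reach for .value at the point where the URL is built.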

QuarkJets posted:

Python code:
        url = create_ebird_url(region, page_type=PageType.region.value, period=validated_period.value, sightings_type=sightings_type_validated.value)
        logging.debug(f"URL = {url}")
        ebird_html = get_ebird_html(url)
        return EbirdLocation(ebird_html)
"url" should be "ebird_url"

Python code:
    def _set_location_code(self, soup: BeautifulSoup):
        """Set the ebird location code (Ex. US-WA-061)"""
        canonical_location = soup.find("link", rel="canonical").attrs['href']
        last_non_location_code_character = re.search("/(hotspot|region)/", canonical_location).end()
        # Slice the end off the canonical location to get the location code
        self.location_code = str(canonical_location)[last_non_location_code_character:]
Turn this into a "return" method and update the docstring accordingly.

Add an annotation for the returned type (looks like str)

Drop the last comment, it's extraneous. I can tell right away that the end of the canonical location is being sliced off just by reading that line of code; "don't repeat yourself". And if you ever change this code then the comment may start to rot

I thought that location_type would be set here, but clearly that's not the case. Where is location_type being set? Is that just a renamed form of location_code? Have this function return location_code, then set self.location_code in __init__ by calling this method. Since self becomes extraneous, you can turn this into a @staticmethod

As mentioned above, all _set methods are now _get methods which return a value. I omitted a bunch of the _set (now _get) methods from what I posted because they were the first code I ever wrote on this project and extremely ugly and in need of refactoring (done now).

Regarding the comment, I've cut down my comments quite a bit for the reasons you stated, but I still like to comment on things like this when I feel like the syntax might not be clear to me or others in the future. I know the slice syntax is pretty standard for a lot of you, but it's really confusing for me since I haven't dealt with it much.
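For anyone else who finds the slice hard to read, here is a rough standalone sketch of the return-style version. The function name and the example URL are assumptions for illustration; the real method takes the soup and pulls the canonical link out of it first.

```python
import re


def get_location_code(canonical_location: str) -> str:
    """Return the eBird location code (Ex. US-WA-061) from a canonical URL."""
    match = re.search(r"/(hotspot|region)/", canonical_location)
    if match is None:
        raise ValueError(f"No location code found in {canonical_location!r}")
    # match.end() is the index just past "/region/"; text[n:] keeps
    # everything from index n to the end of the string.
    return canonical_location[match.end():]


# Assumed URL shape, for illustration only.
code = get_location_code("https://ebird.org/region/US-WA-061")
```

Because it no longer touches self, it could sit in the class as a @staticmethod or live outside the class entirely.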

I like leaving the _get methods with the underscore, if only because I can't think of any reason a user of this class would ever want to invoke these methods, and it helps focus attention on the class properties that should be used by future code.

The class definition is now:
Python code:
class EbirdLocation():
    def __init__(self, ebird_html: Union[bytes, str]) -> None:
        """Class for storing parsed data from an ebird region or hotspot"""
        if not ebird_data_is_valid(ebird_html):
            raise ValueError("Invalid eBird HTML input detected.")  # Consider changing this to a custom error type

        soup = BeautifulSoup(ebird_html, 'html.parser')

        self.location_type = self._get_location_type(soup)
        self.location_code = self._get_location_code(soup)
        self.location_string = self._get_location_string(soup)
        self.total_checklists = self._get_total_checklists(soup)
        self.species_details = self._get_species_details(soup)
        self.species_names = self._get_species_names(soup)
        self.species_count = len(self.species_names)
        self.nonspecies_names = self._get_nonspecies_names(soup)
        self.total_ebirders = None
        self.hotspots = None
        if self.location_type.lower() == "region":
            self.total_ebirders = self._get_total_ebirders(soup)
            self.hotspots = self._get_hotspots_count(soup)
With the applicable _get methods retrieving data below.

QuarkJets
Sep 8, 2008

BeastOfExmoor posted:

I like leaving the _get methods with the underscore, if only because I can't think of any reason a user of this class would ever want to invoke these methods, and it helps focus attention on the class properties that should be used by future code.

This convention is for marking methods private, as in "please do not call this method, it may break stuff". That usually implies that the methods are modifying state in ways that may not be obvious. If someone else comes along and sees your code, that's what they're going to think: that these methods modify internal state; but getter methods should not be doing that, so there's no reason for getter methods to ever be private. That's a good reason to not use a leading underscore for those method names

quote:

The class definition is now:
Python code:
class EbirdLocation():
    def __init__(self, ebird_html: Union[bytes, str]) -> None:
        """Class for storing parsed data from an ebird region or hotspot"""
        if not ebird_data_is_valid(ebird_html):
            raise ValueError("Invalid eBird HTML input detected.")  # Consider changing this to a custom error type

        soup = BeautifulSoup(ebird_html, 'html.parser')

        self.location_type = self._get_location_type(soup)
        self.location_code = self._get_location_code(soup)
        self.location_string = self._get_location_string(soup)
        self.total_checklists = self._get_total_checklists(soup)
        self.species_details = self._get_species_details(soup)
        self.species_names = self._get_species_names(soup)
        self.species_count = len(self.species_names)
        self.nonspecies_names = self._get_nonspecies_names(soup)
        self.total_ebirders = None
        self.hotspots = None
        if self.location_type.lower() == "region":
            self.total_ebirders = self._get_total_ebirders(soup)
            self.hotspots = self._get_hotspots_count(soup)
With the applicable _get methods retrieving data below.

species_count is superfluous; if someone needs the length of a list they can just call len() on it themselves

hotspots kind of sounds like an iterable, like a list or tuple. Is that the case? If so, it may be better to use an empty iterable instead of None as the default

Likewise, what is total_ebirders? An integer? Might be better off as 0 for the default if it's an integer

D34THROW
Jan 29, 2012

RETAIL RETAIL LISTEN TO ME BITCH ABOUT RETAIL
:rant:

QuarkJets posted:

In my own experience codebases that lean heavily on inheritance tend to be harder to parse than codebases that do not. Even the kindest view of inheritance has to admit to this downside. So as a general rule, you should try to avoid superfluous inheritance; don't make your codebase harder to read without a good reason.

For me, I look at inheritance as a means to an end, with that end being "fewer keystrokes".

Part of my current project is calculators for fasteners, caulk, and buck cutlists for windows and doors. All of these are going to have the properties width, height, screws, caulk, and buck_list. Sliding glass doors are going to have mud, so I would subclass the Opening object and add the mud property, for example.

Or a roofover - they're all gonna have width, projection, gutter, fascia, and caulk. A composite roof will also have poly_lags and poly_tape for instance, while a pan roof will have foam and covers. So I'd have a dedicated Roof object that I subclass into CompositeRoof(Roof) and PanRoof(Roof).

I suppose for me it comes down to DRY coding? If I see the same routine used more than once, I'll try to stuff it in a function so I can simplify things, and if I see the same properties popping up in multiple classes, that tells me that I could probably use a superclass and inherit from that.
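As a rough sketch of what I mean, using dataclasses to keep the keystrokes down. The field names are the ones I listed above; the types, defaults, and the area() helper are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Roof:
    # Field names come from the post; types and defaults are assumptions.
    width: float
    projection: float
    gutter: float = 0.0
    fascia: float = 0.0
    caulk: float = 0.0

    def area(self) -> float:
        # Hypothetical shared routine that every roof type reuses.
        return self.width * self.projection


@dataclass
class CompositeRoof(Roof):
    poly_lags: int = 0
    poly_tape: float = 0.0


@dataclass
class PanRoof(Roof):
    foam: float = 0.0
    covers: int = 0


pan = PanRoof(width=10.0, projection=12.0, covers=4)
```

Both subclasses pick up the common fields and area() without retyping them, which is the whole appeal for me.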


On the topic of dash-canvas, it looks like that's intended more for graphs? What I'm aiming more for is elevation- or floor-plan type stuff - continuing with the example of a composite roof, something that could draw, to rough scale, the panel layout given the dimensions.

D34THROW fucked around with this message at 16:46 on Jan 24, 2022

ExcessBLarg!
Sep 1, 2001

former glory posted:

Mind expanding on this or pointing to some supporting sources? This is something I feel is right in my bones, but my formal education was all Java back in the early 2000s and everything back then was subclasses out the rear end, getters and setters, encapsulation above all. I didn’t write much code between 2008-2020 and coming back to it now, it seems like a lot of that was thrown out the window, especially in Python.
Without getting into the weeds on this, I'd make the general observation that inheritance hierarchies often make a lot of sense for the kind of systems-level code that makes up API toolkits (for which OO design was initially applied) and make less sense (especially vs. composition) in application-level code. After all, the purpose of APIs/standard libraries/etc. is to provide you with a set of widgets to use in a generalized manner, whereas application-level code tends to involve a lot of highly-specific logic that doesn't generalize well outside the immediate intended use of the class.

So if you're writing a library to expose some awesome new widget to the world, you may well end up using a lot of inheritance/mixins/whatever. If you're writing an application that wires up a bunch of widgets in a highly-specific way, many of your classes are probably "final" with a relatively small visible API.

The problem is when green coders inspired by Java Collections (or whatever the gold standard is today) try to write an application with highly-specific business logic using a class model based on that.

D34THROW
Jan 29, 2012

RETAIL RETAIL LISTEN TO ME BITCH ABOUT RETAIL
:rant:
Someone wanna explain to me why the gently caress WTForms is trying to validate a SelectField() even though the form with a SelectField() isn't loving referenced anywhere in either /login.html or /index.html? Won't let me past the login page without a TypeError: Choices cannot be None bullshit that I traced back to validation in the SelectField() stuff in choices.py :psypop:


EDIT: gently caress it. Here's the stack trace.
code:
127.0.0.1 - - [25/Jan/2022 15:07:20] "GET /login HTTP/1.1" 200 -
127.0.0.1 - - [25/Jan/2022 15:07:21] "GET /static/css/styles.css HTTP/1.1" 304 -
127.0.0.1 - - [25/Jan/2022 15:07:26] "POST /login HTTP/1.1" 500 -
Traceback (most recent call last):
  File "C:\Users\D34THROW\Documents\pyproj\flask\RedactedAppName\env\Lib\site-packages\flask\app.py", line 2091, in __call__
    return self.wsgi_app(environ, start_response)
  File "C:\Users\D34THROW\Documents\pyproj\flask\RedactedAppName\env\Lib\site-packages\flask\app.py", line 2076, in wsgi_app
    response = self.handle_exception(e)
  File "C:\Users\D34THROW\Documents\pyproj\flask\RedactedAppName\env\Lib\site-packages\flask\app.py", line 2073, in wsgi_app
    response = self.full_dispatch_request()
  File "C:\Users\D34THROW\Documents\pyproj\flask\RedactedAppName\env\Lib\site-packages\flask\app.py", line 1518, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "C:\Users\D34THROW\Documents\pyproj\flask\RedactedAppName\env\Lib\site-packages\flask\app.py", line 1516, in full_dispatch_request
    rv = self.dispatch_request()
  File "C:\Users\D34THROW\Documents\pyproj\flask\RedactedAppName\env\Lib\site-packages\flask\app.py", line 1502, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
  File "C:\Users\D34THROW\Documents\pyproj\flask\RedactedAppName\app\routes.py", line 17, in login
    if form.validate_on_submit():
  File "C:\Users\D34THROW\Documents\pyproj\flask\RedactedAppName\env\Lib\site-packages\flask_wtf\form.py", line 86, in validate_on_submit
    return self.is_submitted() and self.validate()
  File "C:\Users\D34THROW\Documents\pyproj\flask\RedactedAppName\env\Lib\site-packages\wtforms\form.py", line 329, in validate
    return super().validate(extra)
  File "C:\Users\D34THROW\Documents\pyproj\flask\RedactedAppName\env\Lib\site-packages\wtforms\form.py", line 146, in validate
    if not field.validate(self, extra):
  File "C:\Users\D34THROW\Documents\pyproj\flask\RedactedAppName\env\Lib\site-packages\wtforms\fields\core.py", line 231, in validate
    self.pre_validate(form)
  File "C:\Users\D34THROW\Documents\pyproj\flask\RedactedAppName\env\Lib\site-packages\wtforms\fields\choices.py", line 136, in pre_validate
    raise TypeError(self.gettext("Choices cannot be None."))
TypeError: Choices cannot be None.
The relevant section of routes.py. The error gets thrown on clicking the Submit button for the field, both with valid and invalid username/password combos - in the case of an invalid username/password what it should do at this point is kick the user back to the login page:
Python code:
@app.route('/login', methods=['GET','POST'])
def login():
    title = f'{title_base} Log In'
    form = LoginForm()
    if current_user.is_authenticated:
        return redirect(url_for('index'))
    if form.validate_on_submit():
        app.logger.info(f'{form.username.data} logged in, remember_me=' +
            f'{form.remember_me.data}')
        user = User.query.filter_by(username=form.username.data).first()
        if user is None or not user.check_password(form.password.data):
            flash('Invalid username or password.')
            return redirect(url_for('login'))
        session['USERNAME'] = user.username
        session['EMAIL'] = user.email
        session['FIRST_NAME'] = user.first_name
        session['LAST_NAME'] = user.last_name
        session['FULL_NAME'] = ' '.join((user.first_name, user.last_name))  # str.join takes a single iterable
        login_user(user, remember=form.remember_me.data)
        next_page = request.args.get('next')
        if not next_page or url_parse(next_page).netloc != '':
            next_page = url_for('index')
        return redirect(next_page)
    return render_template('login.html', title=title, form=form)

@app.route('/')
@app.route('/index')
@login_required
def index():
    title = f'{title_base} Home'
    return render_template('index.html', title=title, 
        admin=current_user.is_admin)
And the only form that references a SelectField().
Python code:
class RegistrationForm(FlaskForm):
    first_name = StringField("First Name ", validators=[DataRequired()]) 
    last_name = StringField("Last Name ", validators=[DataRequired()])
    email = StringField("Company Email ", validators=[DataRequired(), Email()])
    username = StringField("Username ", validators=[DataRequired()])
    password = StringField("Password ", validators=[DataRequired()])
    repeat_pwd = StringField("Repeat Password ", validators=[DataRequired(),
        EqualTo('password')])
    branch = SelectField("Branch ", coerce=int, validators=[InputRequired()],
        choices=[''])

    def validate_username(self, username: str):
        user = User.query.filter_by(username=username.data).first()
        if user is not None:
            raise ValidationError('This username is already in use.')

    def validate_email(self, email: str):
        user = User.query.filter_by(email=email.data).first()
        if user is not None:
            raise ValidationError('This email is already in use.')
The only thing that references RegistrationForm() is a page, register.html that's not even in use or linked yet.


EDIT 2: It's loving happening whenever I do it now. I commented out the register routing and RegistrationForm() for the time being and now I am going to scream. Took this thing from working to nonfunctional with a couple dozen lines of code that I can't seem to backtrack :bang:


EDIT 3: I found the issue. I stuck a branch = SelectField('Branch', coerce=int) into LoginForm() by mistake somehow, and because the routing for login.html didn't populate the field and the field wasn't even on the loving HTML it was passing as None. Please proceed to have a complete giggle at my expense. :downs:

D34THROW fucked around with this message at 21:09 on Jan 25, 2022

Data Graham
Dec 28, 2009

📈📊🍪😋



https://www.youtube.com/watch?v=rX7wtNOkuHo

No seriously I thought "WTForms" was a joke to begin with :v:

D34THROW
Jan 29, 2012

RETAIL RETAIL LISTEN TO ME BITCH ABOUT RETAIL
:rant:
I love the feeling of realizing there's an easier solution to a problem.

My original plan was to have a class Report() tied to a table, which had fields that would be common among all reports - report number, user, timestamp, report type, etc. I was going to subclass this with joined table inheritance for individual report types, but then I just realized that I'll be returning JSON data from the actual workhorse classes that do the math behind the reports.

I think my new plan is to have id, user, timestamp, report_type, and common fields in table report, then a report_data field that holds the dumped JSON dictionary data. The flask routing will read report_type and select a template based on that.

This seems a lot more feasible than report subclassed to report_panroof, report_polyroof, etc.
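Roughly what I have in mind, sketched with the standard-library sqlite3 module instead of the app's actual database layer so it runs on its own. The template file names and the save_report/load_report helpers are hypothetical.

```python
import json
import sqlite3

# Hypothetical mapping from report_type to a template file name.
TEMPLATES = {
    "panroof": "report_panroof.html",
    "polyroof": "report_polyroof.html",
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE report ("
    "id INTEGER PRIMARY KEY, user TEXT, timestamp TEXT, "
    "report_type TEXT, report_data TEXT)"
)


def save_report(user: str, timestamp: str, report_type: str, data: dict) -> int:
    # The calculator output is dumped to a JSON string and stored in one column.
    cur = conn.execute(
        "INSERT INTO report (user, timestamp, report_type, report_data) "
        "VALUES (?, ?, ?, ?)",
        (user, timestamp, report_type, json.dumps(data)),
    )
    conn.commit()
    return cur.lastrowid


def load_report(report_id: int):
    # Returns the template chosen by report_type plus the decoded JSON data.
    report_type, raw = conn.execute(
        "SELECT report_type, report_data FROM report WHERE id = ?",
        (report_id,),
    ).fetchone()
    return TEMPLATES[report_type], json.loads(raw)
```

The route would then hand the template name and the decoded dict to render_template, so adding a new report type means one new template and one new dictionary entry rather than a new table.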

Zoracle Zed
Jul 10, 2001
Probably overkill for you because you're using bare dicts, but the generic support in mypy is pretty cool for composition like that

code:
from typing import TypeVar, Generic

class ReportType:  ...
class Foo(ReportType):  ...
class Bar(ReportType): ...
    
RT = TypeVar('RT', bound=ReportType)

class Report(Generic[RT]):
    def __init__(self, report_type: RT):
        self.report_type = report_type
then you can annotate report variable types as Report, Report[Foo], or Report[Bar], etc. PyCharm's type inference will autosuggest correct methods for report.report_type

12 rats tied together
Sep 7, 2006

That's probably the point where I would start breaking out Roof into its own class and compose a BuildingReport from valid instances of Roof, Wall, Window, etc. PolyRoof and PanRoof are both reasonable to intuit as "specializations of Roof". Tracking what kind of roof is on the building in the name of the building report's class feels like it will result in subclass explosion almost immediately.

D34THROW
Jan 29, 2012

RETAIL RETAIL LISTEN TO ME BITCH ABOUT RETAIL
:rant:
Following and adapting Miguel Grinberg's Flask mega-tutorial and there's a post on refactoring into a better structure. I've never done a major refactor like this before and I don't want to break everything but it's gonna make it sooooo much prettier :ohdear:

12 rats tied together posted:

That's probably the point where I would start breaking out Roof into its own class and compose a BuildingReport from valid instances of Roof, Wall, Window, etc. PolyRoof and PanRoof are both reasonable to intuit as "specializations of Roof". Tracking what kind of roof is on the building in the name of the building report's class feels like it will result in subclass explosion almost immediately.

Oh, poo poo, you misunderstand. No, we don't build whole buildings, we remodel. Poly/pan roofs go over porches and poo poo, windows and doors go into existing holes that we tear the old windows out of, retrofit shutters on, etc. I'm not tracking a whole fuckin' building like that :v:

75%-80% of use cases will involve one and only one calculator being used per job. I'm just tired personally of being tied down to Excel for reports that nobody really gives a poo poo about the formatting of, and I have more of a clue about how to do certain things in Python than Excel. Also a bit of a hobby project too since I've been spending 90% of my workday on here and want to contribute something small to the company. (Ha. Right.)

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug
I just discovered https://www.attrs.org/, has anyone used this in a shared environment? I'm torn between 'wow this looks amazing' and 'huh, I wonder if this is going to make readability weird for folks that aren't familiar with attrs'.

QuarkJets
Sep 8, 2008

Falcon2001 posted:

I just discovered https://www.attrs.org/, has anyone used this in a shared environment? I'm torn between 'wow this looks amazing' and 'huh, I wonder if this is going to make readability weird for folks that aren't familiar with attrs'.

What kind of functionality are you looking for specifically? I'm looking at https://www.attrs.org/en/stable/examples.html and https://www.attrs.org/en/stable/why.html

It sounds like attrs was extremely useful before dataclass was defined, but now it's sort of extraneous unless you need very specific extension features that dataclass doesn't provide. That's edge-case territory, so you may need attrs for a specific project but most of the time you'd be better off just using dataclass.

e: Obligatory link to dataclass in Python docs.
attrs would be very useful if you were forced to use older versions of Python. Please do not use older versions of Python
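For reference, a minimal sketch of what the standard-library dataclass gives you for free. The Sighting class is a made-up example, not something from anyone's project.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sighting:
    # Hypothetical example class.
    species: str
    count: int = 1
    # default_factory gives every instance its own fresh list.
    notes: List[str] = field(default_factory=list)


a = Sighting("Varied Thrush", 2)
b = Sighting("Varied Thrush", 2)
```

The decorator generates __init__, __repr__, and __eq__ from the annotations, so a and b compare equal without any hand-written boilerplate.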

QuarkJets fucked around with this message at 23:17 on Jan 30, 2022

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug

QuarkJets posted:

What kind of functionality are you looking for specifically? I'm looking at https://www.attrs.org/en/stable/examples.html and https://www.attrs.org/en/stable/why.html

It sounds like attrs was extremely useful before dataclass was defined, but now it's sort of extraneous unless you need very specific extension features that dataclass doesn't provide. That's edge-case territory, so you may need attrs for a specific project but most of the time you'd be better off just using dataclass.

e: Obligatory link to dataclass in Python docs.
attrs would be very useful if you were forced to use older versions of Python. Please do not use older versions of Python

To start, nope, not using older versions of python, we're in late 3.X territory (3.7 I think?) and security won't let us fall behind into the terrible world.

https://hynek.me/articles/import-attrs/ this goes into some of the reason why attrs still exists in a world with dataclasses, which apparently were built from attrs in the first place, but I suppose the other point is that in most cases dataclasses probably does most of what I'd use attrs for and is standard library; I've even used dataclasses on a project recently.

QuarkJets
Sep 8, 2008

Falcon2001 posted:

To start, nope, not using older versions of python, we're in late 3.X territory (3.7 I think?) and security won't let us fall behind into the terrible world.

https://hynek.me/articles/import-attrs/ this goes into some of the reason why attrs still exists in a world with dataclasses, which apparently were built from attrs in the first place, but I suppose the other point is that in most cases dataclasses probably does most of what I'd use attrs for and is standard library; I've even used dataclasses on a project recently.

It's a fine post, but I think it reaffirms what I said: use dataclass unless you need one of the various additional features that attrs defines. If you're at that point then don't worry about what your team has experience with; every project is going to have new conventions/tools that some people have to learn about, and it may require sitting down people to talk to them about it. This is what pull requests are for

The March Hare
Oct 15, 2006

Je rêve d'un
Wayne's World 3
Buglord
Does anyone have a good solution for packaging internal/local libraries for deployment to something like AWS Lambda?

I've got some internal libraries that we do not have on pypi or anything, and I'd like to have something like `poetry build` include them in the zipped up code from the local build path.

So far I've found this blog post https://chariotsolutions.com/blog/post/building-lambdas-with-poetry/ which details the problem well, and the solution would probably work, but it is a little bit cumbersome still.

Is anyone aware of a flag for poetry or a 3rd party tool that would smooth this out for me?

necrotic
Aug 2, 2005
I owe my brother big time for this!

The March Hare posted:

Does anyone have a good solution for packaging internal/local libraries for deployment to something like AWS Lambda?

I've got some internal libraries that we do not have on pypi or anything, and I'd like to have something like `poetry build` include them in the zipped up code from the local build path.

So far I've found this blog post https://chariotsolutions.com/blog/post/building-lambdas-with-poetry/ which details the problem well, and the solution would probably work, but it is a little bit cumbersome still.

Is anyone aware of a flag for poetry or a 3rd party tool that would smooth this out for me?

The easy method is local path references, as stated in that link. If local path references can’t work then you need that custom artifact repository it can pull the dep from (after you publish to it).

What would the ideal behavior (flag or not) be for you?

The March Hare
Oct 15, 2006

Je rêve d'un
Wayne's World 3
Buglord

necrotic posted:

The easy method is local path references, as stated in that link. If local path references can’t work then you need that custom artifact repository it can pull the dep from (after you publish to it).

What would the ideal behavior (flag or not) be for you?

Yeah, I think the difficulty is maybe with Poetry. Doing "poetry add ../libs/package" adds a dep to the pyproject.toml, but building that with poetry results in a setup.py with something like "package @ ../libs/package". You can't pip install the package like that, and it isn't clear to me how to get poetry build to result in a setup.py that actually works without further modification.

Ideally I'm running "poetry build && poetry run pip install path to wheel" and that works, rather than not working.

I've got everything in docker, so I can happily use absolute paths if there's some way to take advantage of that which I'm not seeing.

The March Hare fucked around with this message at 15:45 on Jan 31, 2022

CarForumPoster
Jun 26, 2013

⚡POWER⚡

The March Hare posted:

Does anyone have a good solution for packaging internal/local libraries for deployment to something like AWS Lambda?

I've got some internal libraries that we do not have on pypi or anything, and I'd like to have something like `poetry build` include them in the zipped up code from the local build path.

So far I've found this blog post https://chariotsolutions.com/blog/post/building-lambdas-with-poetry/ which details the problem well, and the solution would probably work, but it is a little bit cumbersome still.

Is anyone aware of a flag for poetry or a 3rd party tool that would smooth this out for me?


The March Hare posted:

Yeah, I think the difficulty is maybe with Poetry. Doing "poetry add ../libs/package" adds a dep to the pyproject.toml, but building that with poetry results in a setup.py with something like "package @ ../libs/package". You can't pip install the package like that, and it isn't clear to me how to get poetry build to result in a setup.py that actually works without further modification.

Ideally I'm running "poetry build && poetry run pip install path to wheel" and that works, rather than not working.

I've got everything in docker, so I can happily use absolute paths if there's some way to take advantage of that which I'm not seeing.

I deploy python to lambda using docker container images via the SAM CLI. I highly recommend it. SAM CLI makes it easy to build and test locally, and if poo poo gets fucky I just add a time.sleep(900) and then use the docker CLI to SSH in and see what's going on. Can even run the REPL to test code. This greatly sped up my dev cycle with AWS Lambda.

Do you actually need to pip install or do you just need the directory in the right place? i.e. can you just move your package in to the base directory with your Dockerfile?

Alternatively, you could put that file in the /opt/ folder that lambda layers get installed to, using it as a layer.

I haven't used poetry, but it seems like you could also run the command you mentioned in the Dockerfile after moving the package wherever needed, if you want... though I'm not sure if permissioning in the lambda allows it. If you find yourself fighting permissions, consider some of the solutions above.
