Foxfire_
Nov 8, 2010

Falcon2001 posted:

Python code:
class ProjectFactory:
    @staticmethod
    def get_ext_dependency() -> ExtDependency:
        return ExtDependency()

    @staticmethod
    def get_diff_dependency(dep_override: DependencyThisThingNeeds = None) -> DiffDependency:
        if dep_override is None:
            dep_override = default_setting_for_this_option
        return DiffDependency(dep_override)

# Where you need to use it
import ProjectFactory

ext = ProjectFactory.get_ext_dependency()
diff_dep = ProjectFactory.get_diff_dependency()
When testing, it's easy enough to import and overwrite the ProjectFactory get* functions to return whatever mock for external dependencies you need, and your code doesn't need to have a ton of signatures for passing everything in one at a time.

Is this just a matter of 'when reading the code, you won't easily be able to see that a downstream function of this function uses this external dependency?' like an implicit vs explicit problem? Or are there other downsides of this pattern I'm not seeing due to lack of experience?
You're still using global state to choose the dependencies, just with friendlier names that are more explicit about it.

One of the big downsides is that you can't run tests in parallel (e.g. Test A wants ProjectFactory.get_ext_dependency() to return MockThingA and Test B wants it to return MockThingB), but Python test frameworks can't easily do that anyway. You'll also want some machinery that makes it hard to accidentally forget to clean up after yourself.
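
For the cleanup machinery, one option is the standard library's unittest.mock.patch.object used as a context manager, which puts the original attribute back on exit even if the test body raises. A minimal sketch, assuming ProjectFactory exposes its getters as static methods; ExtDependency.fetch() and code_under_test() are made-up stand-ins, not anything from the original code:

```python
from unittest import mock


class ExtDependency:
    """Hypothetical stand-in for the real external dependency."""

    def fetch(self) -> str:
        return "real"


class ProjectFactory:
    @staticmethod
    def get_ext_dependency() -> ExtDependency:
        return ExtDependency()


def code_under_test() -> str:
    # Reaches for the dependency through the global factory.
    return ProjectFactory.get_ext_dependency().fetch()


def test_uses_mock_then_restores() -> None:
    fake = mock.Mock()
    fake.fetch.return_value = "fake"
    # The patch is only active inside the with block.
    with mock.patch.object(ProjectFactory, "get_ext_dependency", return_value=fake):
        assert code_under_test() == "fake"
    # On exit the original getter is back, even if the body had raised.
    assert code_under_test() == "real"
```

This only handles the cleanup part. The patch is still process-wide while it is active, so it does nothing for the parallel-test problem.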

(In a different language, it would also be forcing calls to the dependency to be indirect through a vtable, which occasionally matters for performance/is ugly if testing is the only thing forcing virtual calls. In Python all calls are string lookups anyway so that doesn't matter)

A way to do more-or-less the same thing without the global state would be to pass an instance of a factory/container class in the constructor, then use that instance to create/access the other dependencies.


Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug

Foxfire_ posted:

A way to do more-or-less the same thing without the global state would be to pass an instance of a factory/container class in the constructor, then use that instance to create/access the other dependencies.

Could you provide an example of how this would look? Would it be something like this?

Python code:
def function_that_needs_dep(factory: ProjectFactory):
    ext = factory.get_ext_dependency()

samcarsten
Sep 13, 2022

by vyelkin

12 rats tied together posted:

Worth noting that Python has a pretty good interactive experience, right out of the box. If I run your code:
Python code:
>>> data = r.json()
Traceback (most recent call last):
  File "/home/rob/.pyenv/versions/3.10.4/lib/python3.10/site-packages/requests/models.py", line 910, in json
    return complexjson.loads(self.text, **kwargs)
  File "/home/rob/.pyenv/versions/3.10.4/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/home/rob/.pyenv/versions/3.10.4/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/rob/.pyenv/versions/3.10.4/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

# what is r?
>>> type(r)
<class 'requests.models.Response'>

# how can we talk to it?
>>> dir(r)
['__attrs__', '__bool__', '__class__', [...], 'content',  [...]  'status_code', 'text', 'url']

# r.content sounds promising
>>> r.content
b'<!DOCTYPE html>\n<!--[if IEMobile 7 ]> <html lang="en-US" class="no-js iem7"> <![endif]-->\n<!--[if lt IE 7]> <html class="ie6 lt-ie10 lt-ie9 lt-ie8 lt-ie7 no-js" lang="en-US"> <![endif]-->\n<!--[if IE 7]> # snip
This looks like HTML to me, so it's pretty reasonable that it would fail to parse as JSON.

right, but this assignment is assuming a json output. How do I get a json output?

saintonan
Dec 7, 2009

Fields of glory shine eternal

samcarsten posted:

right, but this assignment is assuming a json output. How do I get a json output?

The format is a part of the request string. Look at my example a page ago for how that looks on the actual request line.

samcarsten
Sep 13, 2022

by vyelkin
ok, got it to output json. Now I need to parse it to turn "related topics" into an array or list I can search.
edit: did that. now to write a pytest test

samcarsten fucked around with this message at 12:30 on Nov 3, 2022

Foxfire_
Nov 8, 2010

Falcon2001 posted:

Could you provide an example of how this would look? Would it be something like this?

Python code:
def function_that_needs_dep(factory: ProjectFactory):
    ext = factory.get_ext_dependency()

Basically yeah. If you have an instance that needs to either accept a bunch of other instances to work with or know how to make new instances of some other type and you're grouping them up to reduce the number of parameters, group them in a variable that's passed in instead of a global location it always accesses.

Making up a concrete-er example, suppose we're making a PrinterConnector class to talk to a printer that could be attached via USB, TCP, or RS-232. It has a function that takes a string describing the connection, parses it to figure out the transport, tries to connect, does some configuring, then returns an AbstractPrinterConnection instance (which is actually some concrete transport-specific class that PrinterConnector needs to make a new instance of). We want to mock out the transport-specific classes during test so when it tries to make one, it actually gets a mock that we can check things about/make simulate errors.

Python code:
class PrinterConnector:
    def connect_to_printer(self, address: str) -> AbstractPrinterConnection:
        """Connect to address, configure the printer, and return the connection.  Or throw something on error"""
        ...
The most straightforward way to do it is to tell each instance the specific types it should create when it's constructed:
Python code:
class PrinterConnector:
    def __init__(self, usb_connection_type: type, tcp_connection_type: type, rs232_connection_type: type):
        ...
but if that's too wordy, invent a factory class and pass an instance of that instead:
Python code:
class PrinterConnectionFactory:
    def __init__(self):
        """ Doesn't have any instance data """
        pass

    def make_usb_connection(self, whatever_params) -> UsbPrinterConnection:
        ...

    def make_tcp_connection(self, whatever_params) -> TcpPrinterConnection:
        ...

    def make_rs232_connection(self, whatever_params) -> Rs232PrinterConnection:
        ...

class PrinterConnector:
    def __init__(self, connection_factory: PrinterConnectionFactory):
        ...
Then in the tests, pass a MockPrinterConnectionFactory that makes MockUsbPrinterConnection/MockTcpPrinterConnection/MockRs232PrinterConnection instead.

Because the mocking isn't global anymore, you can do things like have Test #1 using a PrinterConnector instance with MockTcpPrinterConnectionThatReportsConnectionTimeout and Test #2 using another instance with MockTcpPrinterConnectionThatDropsTheConnectionAfter20Bytes executing simultaneously without stomping on each other


(I personally don't typically bother with a factory and would just pass the types in the PrinterConnector constructor)

e:

One thing that is nicer with the factory is accessing/controlling the mock instances that are used. Like if connect_to_printer() was supposed to auto-retry a few times with a fresh *Connection instance each time and you wanted to test an error->error->success sequence, it's easier to do that by making the factory instance's 3rd make_tcp_connection() call do something different than it is to make the 3rd MockTcpPrinterConnection object constructed do something different.
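
A rough sketch of that error->error->success test. Everything beyond the class and method names already used above is made up for illustration: ConnectionTimeout, the max_attempts parameter, the retry loop, and the shortcut of always going over TCP instead of parsing the address.

```python
class ConnectionTimeout(Exception):
    """Made-up error type for this sketch."""


class MockTcpPrinterConnection:
    def __init__(self, address: str) -> None:
        self.address = address


class FlakyMockPrinterConnectionFactory:
    """Mock factory whose first `failures` TCP attempts raise a timeout."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.tcp_calls = 0

    def make_tcp_connection(self, address: str) -> MockTcpPrinterConnection:
        self.tcp_calls += 1
        if self.tcp_calls <= self.failures:
            raise ConnectionTimeout(address)
        return MockTcpPrinterConnection(address)


class PrinterConnector:
    def __init__(self, connection_factory, max_attempts: int = 3) -> None:
        self.connection_factory = connection_factory
        self.max_attempts = max_attempts

    def connect_to_printer(self, address: str):
        # Simplified: always TCP, asking the factory for a fresh connection per attempt.
        last_error = None
        for _ in range(self.max_attempts):
            try:
                return self.connection_factory.make_tcp_connection(address)
            except ConnectionTimeout as exc:
                last_error = exc
        raise last_error


def test_succeeds_on_third_attempt() -> None:
    # Each test owns its factory, so nothing here is shared with other tests.
    factory = FlakyMockPrinterConnectionFactory(failures=2)
    connector = PrinterConnector(factory)
    connection = connector.connect_to_printer("tcp://printer.example:9100")
    assert isinstance(connection, MockTcpPrinterConnection)
    assert factory.tcp_calls == 3
```

The sequencing lives in one counter on the factory instance. Doing the same with bare mock connection types would need some shared state to know which construction is the third one.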

Foxfire_ fucked around with this message at 03:51 on Nov 5, 2022

Seventh Arrow
Jan 26, 2005

I previously mentioned a csv analysis/cleaning thing I was working on. The idea was that there would be a webpage frontend where the client could click on a "browse" button and submit their csv. Then on the backend, pandas would do the analysis and display the results on the webpage, as well as dump the results into a spreadsheet. I'm not really good with Flask/Django so I had someone help out there. Unfortunately he did the backend with a module called 'tablib' instead of pandas. Maybe this wouldn't be so bad, but he calls the csv columns by position instead of name, and the columns can change position depending on the client.

You can get an idea of what it looks like here: https://csvchecker786.pythonanywhere.com/

So I want to try to convert the backend to pandas, not only because of the column positioning thing but I have to add more stuff and I'm way more familiar with pandas than I am with tablib.

It seems to me that the first step should be to read the file and convert it into a dataframe, which would be something like "df = pd.read_csv(f'{filename}')" - the filename cannot be hard-coded and needs to come from whatever file they browse for and submit.

I rummaged around the existing code, and I think this is where it grabs the file:

code:
def post(self, request, *args, **kwargs):
        type = request.POST["type"]
        file = request.FILES["file"].read().decode("utf-8")
        name = request.POST["name"]
        dataset = Dataset().load(file, format="csv")

        if type == "accounts":
            errors = accounts_checker(dataset)
        elif type == "contacts":
            errors = contacts_checker(dataset)
        elif type == "membership":
            errors = member_ship_checker(dataset)
        context = {
            "results": errors,
            "name": name,
        }
        return render(request, "myapp/results.html", context=context)
I tried having it read the filename by replacing the file line with: file = request.FILES['file'].name
and then turning it into a dataframe by changing dataset to dataset = pd.read_csv(f'{file}', encoding = "ISO-8859-1"), but this does not work.

Given the way that he uses "dataset" in the functions, I'm not even sure that I can swap that out with a dataframe, depending on what "Dataset()" is supposed to do. Here's an example of one of the functions he wrote:

code:
def contacts_checker(dataset):
    errors = []
    contacts_lagacy_num = []
    for data in dataset:
        if data[0]:
            contacts_lagacy_num.append(data[1])
    for i in dataset:
        if i[1]:
            count = 0
            for num in contacts_lagacy_num:
                if num == i[1]:
                    count = count + 1
            if count > 1:
                errors.append("Legacy Contact Number must be unique")
        else:
            errors.append("Legacy Contact Number is required.")
        if not i[2]:
            errors.append("LASTNAME is required.")
        if not i[7] in contacts_mailing_state_code:
            errors.append(f"{i[7]} is not a correct format for MailingStateCode")
        if not i[9] in contacts_mailing_country_code:
            errors.append(f"{i[9]} is not a correct format for MailingCountryCode")
        if not i[20].upper() in contacts_email_type:
            errors.append(f"{i[20]} is not a valid PREFERRED_EMAIL_TYPE")
        if not i[21].upper() in contacts_phone_type:
            errors.append(f"{i[21]} is not a valid PREFERRED_PHONE_TYPE")
        if not i[28] in contacts_salutation:
            errors.append(f"{i[28]} is not a valid Salutation")
    return errors
I already have most of the pandas analysis written out, if I can just find a way to convert the submitted csv into a dataframe I think I might be able to handle the rest.

Any suggestions would be appreciated. If anyone is willing to pore over the code, please PM me and I will do what I can to compensate. With $$$, I mean.

Zugzwang
Jan 2, 2005

You have a kind of sick desperation in your laugh.


Ramrod XTreme
The file variable is calling read() and decode() methods. I’m not familiar with tablib either, but it sure looks like that line is converting the file into text and then using that to construct a Dataset.

If the method you tried didn’t work to construct the DataFrame, you should still inspect the contents of the file variable, since I’m not sure where else the data would be coming from. pandas’s read_csv accepts a path or buffer with a read() method, so have you tried just chopping off read() and decode() from file and passing that into read_csv?

Zugzwang fucked around with this message at 03:43 on Nov 8, 2022

Seventh Arrow
Jan 26, 2005

Thanks for the input. I forgot to mention that when I look at the traceback, it does seem to read the filename properly but it doesn't seem to see the file as being 'there' so to speak.

quote:

FileNotFoundError at /
[Errno 2] No such file or directory: 'Test_Data_Account.csv'
Request Method: POST
Request URL: http://127.0.0.1:8000/
Django Version: 4.1.3
Exception Type: FileNotFoundError
Exception Value:
[Errno 2] No such file or directory: 'Test_Data_Account.csv'
Exception Location: /usr/lib/python3/dist-packages/pandas/io/common.py, line 702, in get_handle
Raised during: myapp.views.HomeView
Python Executable: /usr/bin/python3
Python Version: 3.10.6
Python Path:
['/home/vs/Documents/python/csv_analyzer/production',
'/usr/lib/python310.zip',
'/usr/lib/python3.10',
'/usr/lib/python3.10/lib-dynload',
'/home/vs/.local/lib/python3.10/site-packages',
'/usr/local/lib/python3.10/dist-packages',
'/usr/lib/python3/dist-packages',
'/usr/lib/python3.10/dist-packages']
Server time: Tue, 08 Nov 2022 02:47:26 +0000

So it knows that the name of the file is 'Test_Data_Account.csv' but then just kind of nopes out.

Edward IV
Jan 15, 2006

I'm not terribly well versed with Django but I think this is applicable with your problem.
https://stackoverflow.com/a/53116876

Python code:
pd.read_csv(request.FILES['file'])
What that POST function is doing with the file request is uploading the csv file to the server, where it resides in memory as the "file" object. There is no actual file unless you call a function to write it to disk. That "file" object can be read by pandas, which is what that line of code should do.
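
To make the "no actual file" point concrete, here is a standard-library-only sketch. The io.BytesIO object is just a stand-in I'm assuming for request.FILES['file'] (bytes in memory plus a .name), and rows_from_upload is a made-up helper. pd.read_csv works on the upload for the same reason: it accepts any object with a read() method, not only a path.

```python
import csv
import io


def rows_from_upload(uploaded):
    """Parse CSV rows straight from a binary file-like object."""
    # Wrap the byte stream so csv sees text; nothing is read from disk.
    text = io.TextIOWrapper(uploaded, encoding="utf-8", newline="")
    try:
        return list(csv.DictReader(text))
    finally:
        # Hand the byte stream back so the caller's object is not closed.
        text.detach()


# Stand-in for request.FILES["file"]: the data lives in memory, and .name
# is only a label, not a path that exists on the server.
upload = io.BytesIO(b"Account Name,Legacy Account Number\nAcme,1001\n")
upload.name = "Test_Data_Account.csv"

rows = rows_from_upload(upload)
```

Passing upload.name to something that expects a path is what produces the FileNotFoundError: the name is known, but no file with that name was ever written.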

(Apologies if I'm using incorrect terminology. I may have just gotten a MSCS but I'm still kind of new at this whole programming thing (I started out as a ME) and we weren't explicitly taught Python.)

Also that contacts_checker code is really something to behold. I don't want to sound all high and mighty but it looks so amateurish. I suppose it "works" but I can't imagine it'll be very fast especially with large datasets. Having spent time converting preliminary pandas code from using for loops and iterators to apply functions, I couldn't help but gawk.

Seventh Arrow
Jan 26, 2005

Thanks, I'll give that a try!

And yes I'm not crazy about his code, but I question whether mine is much better. I took one of his functions and updated it with a pandas way of doing it and this is what I came up with:

code:
def accounts_checker(dataset):
    errors = []
    
    for i in dataset:
        if i:
            ## col 0
            if not i[0]:
                errors.append("Account Name is required")
            ## col 1
            duplicate_lan = dataset.duplicated(['Legacy Account Number']).any()
            if duplicate_lan:
                errors.append("Legacy Account Number must be unique")
            lan_null = dataset['Legacy Account Number'].isnull().values.any()
            if lan_null:
                errors.append("Legacy Account Number is required.")
            #if not i[5] in account_billing_state_code:
            for cc in dataset['BillingCOUNTRYCODE']:
                if cc not in account_billing_country_code:
                    errors.append(f"{cc} is not a correct format for BillingCOUNTRYCODE")
            ## col 6
            # if not i[6] in account_billing_country_code:
            for sc in dataset['BillingSTATECODE']:
                if sc not in account_billing_state_code:
                    errors.append(f"{sc} is not a correct format for BillingSTATECODE")
    return errors
I do like the django functionality though, it's pretty much how I wanted it. Although I feel that it doesn't even need the "Accounts/Contacts/Membership" dropdown box, it should really just be able to read the filename and send it to the correct function.
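
One way the filename-based routing could look, as a sketch only. It assumes every upload has exactly one of the keywords "account", "contact" or "member" somewhere in its name (like Test_Data_Account.csv); that naming convention and the pick_checker helper are my assumptions, and the three checkers are stubbed out here.

```python
def accounts_checker(dataset):
    return []  # stub standing in for the real checker


def contacts_checker(dataset):
    return []  # stub


def member_ship_checker(dataset):
    return []  # stub


# Keyword in the filename -> checker to run (assumed naming convention).
CHECKERS = {
    "account": accounts_checker,
    "contact": contacts_checker,
    "member": member_ship_checker,
}


def pick_checker(filename):
    lowered = filename.lower()
    matches = [checker for keyword, checker in CHECKERS.items() if keyword in lowered]
    # Refuse to guess when the name matches nothing or more than one keyword.
    if len(matches) != 1:
        raise ValueError(f"Cannot tell which checker to use for {filename!r}")
    return matches[0]
```

In the view this would replace the if/elif chain with something like pick_checker(request.FILES["file"].name)(dataset). Keeping the dropdown as a fallback is probably still worth it for files that don't follow the convention.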

Seventh Arrow
Jan 26, 2005

Hey that worked, thanks again! I tested it and the pandas stuff works as intended :stoked:

QuarkJets
Sep 8, 2008

Seventh Arrow posted:

Thanks, I'll give that a try!

And yes I'm not crazy about his code, but I question whether mine is much better. I took one of his functions and updated it with a pandas way of doing it and this is what I came up with:

code:
def accounts_checker(dataset):
    errors = []
    
    for i in dataset:
        if i:
            ## col 0
            if not i[0]:
                errors.append("Account Name is required")
            ## col 1
            duplicate_lan = dataset.duplicated(['Legacy Account Number']).any()
            if duplicate_lan:
                errors.append("Legacy Account Number must be unique")
            lan_null = dataset['Legacy Account Number'].isnull().values.any()
            if lan_null:
                errors.append("Legacy Account Number is required.")
            #if not i[5] in account_billing_state_code:
            for cc in dataset['BillingCOUNTRYCODE']:
                if cc not in account_billing_country_code:
                    errors.append(f"{cc} is not a correct format for BillingCOUNTRYCODE")
            ## col 6
            # if not i[6] in account_billing_country_code:
            for sc in dataset['BillingSTATECODE']:
                if sc not in account_billing_state_code:
                    errors.append(f"{sc} is not a correct format for BillingSTATECODE")
    return errors
I do like the django functionality though, it's pretty much how I wanted it. Although I feel that it doesn't even need the "Accounts/Contacts/Membership" dropdown box, it should really just be able to read the filename and send it to the correct function.

Some thoughts:

Python code:
            duplicate_lan = dataset.duplicated(['Legacy Account Number']).any()
            if duplicate_lan:
                errors.append("Legacy Account Number must be unique")
Why is this in the for loop? It doesn't depend on the iteration at all. Same with lan_null.

Python code:
            for cc in dataset['BillingCOUNTRYCODE']:
                if cc not in account_billing_country_code:
                    errors.append(f"{cc} is not a correct format for BillingCOUNTRYCODE")
Same issue here; this is iterating over the entire dataset inside of a for loop that's iterating over the entire dataset. You only need one of these loops.

Where does account_billing_country_code come from?

There doesn't seem to be any identifying information in the errors, so it'd be impossible to go back and figure out where a problem is in a dataset.
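
Putting those points together, a sketch of what the function could look like with each check run once and the row included in every message. It assumes dataset is a pandas DataFrame with the column names from your post; the short country list is a stand-in for the imported account_billing_country_code, and the "Row N" numbers are DataFrame index labels (0-based by default), not spreadsheet row numbers.

```python
import pandas as pd

# Stand-in for the imported lookup list; the real one is much longer.
account_billing_country_code = ["US", "CA", "GB"]


def accounts_checker(df: pd.DataFrame) -> list:
    errors = []
    lan = "Legacy Account Number"

    # Each check looks at the whole column once; there is no outer row loop.
    for row in df.index[df[lan].isna()]:
        errors.append(f"Row {row}: {lan} is required.")

    # keep=False flags every member of a duplicate group, not just the repeats.
    present = df[lan].dropna()
    for row, value in present[present.duplicated(keep=False)].items():
        errors.append(f"Row {row}: {lan} {value} must be unique")

    codes = df["BillingCOUNTRYCODE"]
    for row, value in codes[~codes.isin(account_billing_country_code)].items():
        errors.append(f"Row {row}: {value} is not a correct format for BillingCOUNTRYCODE")

    return errors
```

The state code check would follow the same pattern as the country code one.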

Seventh Arrow
Jan 26, 2005

QuarkJets posted:

Some thoughts:

Why is this in the for loop? It doesn't depend on the iteration at all. Same with lan_null.

Same issue here; this is iterating over the entire dataset inside of a for loop that's iterating over the entire dataset. You only need one of these loops.

I guess mainly because I didn't really understand the code that much. Now that you mention it though, this does make sense - I only want to iterate over the column, so I guess everything after ## col 1 could be moved out of the for loop.

quote:

Where does account_billing_country_code come from?

It's on a separate page and then that page has an import statement at the top of this script. I actually like this better than my original solution, which was to put everything in a big honkin' list like so:

code:
country_code = ["AF", "AX", "AL", "DZ", "AD", "AO", "AI", "AQ", "AG", "AR", "AM", "AW", "AU", "AT", "AZ", "BS", "BH", "BD", "BB", "BY", "BE", "BZ", "BJ", "BM", "BT", "BO", "BQ", "BA", "BW", "BV", "BR", "IO", "BN", "BG", "BF", "BI", "KH", "CM", "CA", "CV", "KY", "CF", "TD", "CL", "CN", "CX", "CC", "CO", "KM", "CG", "CD", "CK", "CR", "CI", "HR", "CU", "CW", "CY", "CZ", "DK", "DJ", "DM", "DO", "EC", "EG", "SV", "GQ", "ER", "EE", "ET", "FK", "FO", "FJ", "FI", "FR", "GF", "PF", "TF", "GA", "GM", "GE", "DE", "GH", "GI", "GR", "GL", "GD", "GP", "GT", "GG", "GN", "GW", "GY", "HT", "HM", "VA", "HN", "HU", "IS", "IN", "ID", "IR", "IQ", "IE", "IM", "IL", "IT", "JM", "JP", "JE", "JO", "KZ", "KE", "KI", "KP", "KR", "KW", "KG", "LA", "LV", "LB", "LS", "LR", "LY", "LI", "LT", "LU", "MO", "MK", "MG", "MW", "MY", "MV", "ML", "MT", "MQ", "MR", "MU", "YT", "MX", "MD", "MC", "MN", "ME", "MS", "MA", "MZ", "MM", "NA", "NR", "NP", "NL", "NC", "NZ", "NI", "NE", "NG", "NU", "NF", "NO", "OM", "PK", "PS", "PA", "PG", "PY", "PE", "PH", "PN", "PL", "PT", "QA", "RE", "RO", "RU", "RW", "BL", "SH", "KN", "LC", "MF", "PM", "VC", "WS", "SM", "ST", "SA", "SN", "RS", "SC", "SL", "SG", "SX", "SK", "SI", "SB", "SO", "ZA", "GS", "SS", "ES", "LK", "SD", "SR", "SJ", "SZ", "SE", "CH", "SY", "TW", "TJ", "TZ", "TH", "TL", "TG", "TK", "TO", "TT", "TN", "TR", "TM", "TC", "TV", "UG", "UA", "AE", "GB", "US", "UY", "UZ", "VU", "VE", "VN", "VG", "WF", "EH", "YE", "ZM", "ZW"]
        for cc in df['BillingCOUNTRYCODE']:
            if cc not in country_code:
                print(f"Country code {cc} not valid.")
I highly suspect that there is a module somewhere out there that handles country and/or state codes, but these ones have to conform to whatever SalesForce uses.

quote:

There doesn't seem to be any identifying information in the errors, so it'd be impossible to go back and figure out where a problem is in a dataset.

There is for the state and country codes. The legacy account number stuff could be more specific though, I will need to look into that. I may also search and see if there's a way to include the row/column number in the error statement. Thanks for the help!

Ihmemies
Oct 6, 2012

Annoyingly our school won't teach testing anytime soon for first year students.

Anyways, I'm trying to do AoC and I'm at 2021/05. It would be nice to test my code with the simple tests the page gives.

I just can't figure out how to compare program's printed out data with data in a text file with pytest.

Like, my .py script prints out some garbage, and I want to test that it looks the same as what's in sample.txt or something. At least all of our school's testers seem to work like this, so it should be a basic use case.

Are there any decent guides for this? I don't want to do anything fancy. Just compare that the program prints with print() what it's supposed to print.

Ihmemies fucked around with this message at 18:10 on Nov 8, 2022

QuarkJets
Sep 8, 2008

pytest has built-in fixtures for capturing stdout and stderr, so you can examine the output of your print statements. This is very useful for logging statements too: loggers are singletons, so it doesn't matter that a logger has no stdout handler, you can add your own!

I believe that the fixture you want is capsys
https://docs.pytest.org/en/7.1.x/how-to/capture-stdout-stderr.html
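
A minimal sketch of what that could look like. capsys and tmp_path are real pytest fixtures; solve() is a made-up stand-in for your puzzle script, and sample.txt is written inside the test only so the example is self-contained. In practice you'd read the sample file that sits next to your tests.

```python
from pathlib import Path


def solve(lines):
    """Stand-in for the real puzzle code: prints one result line."""
    print(len(lines))


def test_solve_matches_sample(capsys, tmp_path: Path):
    # Normally sample.txt already exists; it is created here for the sketch.
    sample = tmp_path / "sample.txt"
    sample.write_text("3\n")

    solve(["a", "b", "c"])

    # readouterr() returns everything printed so far as .out and .err.
    captured = capsys.readouterr()
    assert captured.out == sample.read_text()
```

Save it as something like test_day05.py and run pytest in that directory. Note that print() adds a trailing newline, so the expected file has to end with one too.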

icantfindaname
Jul 1, 2008


Is there any way to make a terminal application in python that's crossplatform? Ncurses seems to be mac/linux only, there are some versions of it for windows but they are old and unsupported. I want to recreate the Sokoban game from nethack and want it to have the terminal feel for aesthetics, and then expand it with more levels. Should I just make it in C instead?

Raygereio
Nov 12, 2012

icantfindaname posted:

Is there any way to make a terminal application in python that's crossplatform? Ncurses seems to be mac/linux only, there are some versions of it for windows but they are old and unsupported. I want to recreate the Sokoban game from nethack and want it to have the terminal feel for aesthetics, and then expand it with more levels. Should I just make it in C instead?
Maybe check out Textual/Rich?

icantfindaname
Jul 1, 2008


Raygereio posted:

Maybe check out Textual/Rich?

That looks like it should work, thanks

QuarkJets
Sep 8, 2008

icantfindaname posted:

Is there any way to make a terminal application in python that's crossplatform? Ncurses seems to be mac/linux only, there are some versions of it for windows but they are old and unsupported. I want to recreate the Sokoban game from nethack and want it to have the terminal feel for aesthetics, and then expand it with more levels. Should I just make it in C instead?

Have you tried installing this package? https://pypi.org/project/windows-curses/

icantfindaname
Jul 1, 2008


QuarkJets posted:

Have you tried installing this package? https://pypi.org/project/windows-curses/

Oh. That works even better. I had found this version

https://pypi.org/project/UniCurses/

From this docs file, couldn’t get it to work

https://docs.python.org/3/howto/curses.html

necrotic
Aug 2, 2005
I owe my brother big time for this!
There’s also python prompt toolkit.

duck monster
Dec 15, 2004

Zed Shaw is a weird dude. Saw him ranting at someone on twitter the other day about why we shouldn't use django for webapps, when C++ is available and much faster.

Yeah dude we went through that phase in the 1990s, it didn't work out well. I don't get why this guy is considered a high expert without understanding for most businesses the cost of hosting is a tiny fraction of the costs of development. Django might be a little crusty in its old age, but it's a hella productive environment, especially now the continuous cavalcade of suffering that python 2's unicode handling brought is largely a thing of the past.

KICK BAMA KICK
Mar 2, 2009

duck monster posted:

Zed Shaw is a weird dude. Saw him ranting at someone on twitter the other day about why we shouldn't use django for webapps, when C++ is available and much faster.

Yeah dude we went through that phase in the 1990s, it didn't work out well. I don't get why this guy is considered a high expert without understanding for most businesses the cost of hosting is a tiny fraction of the costs of development. Django might be a little crusty in its old age, but it's a hella productive environment, especially now the continuous cavalcade of suffering that python 2's unicode handling brought is largely a thing of the past.
Also, just as an untrained hobbyist: am I wrong that even if you're stipulating speed as your goal, the network or the end user's browser are bigger contributors than the server code for a lot of use cases?

Seventh Arrow
Jan 26, 2005

I remember seeing a python learning site once that had a page that was something like "Why We Don't Recommend Zed Shaw/Learn Python The Hard Way", or something like that. I thought it was Reddit, but apparently not. Their points, if true, seemed to be pretty good reasons not to use his book(s).

Hed
Mar 31, 2004

Fun Shoe
Pretty sure there was a good post about it in the thread… but might have been from like 2012

QuarkJets
Sep 8, 2008

ITT I saw a good video posted about the pitfalls of OOP, it was a talk by this woman who I think was a Ruby guru? Does anyone else remember this? I feel like I'd like to watch it again, it was very good

Zugzwang
Jan 2, 2005

You have a kind of sick desperation in your laugh.


Ramrod XTreme

duck monster posted:

Zed Shaw is a weird dude. Saw him ranting at someone on twitter the other day about why we shouldn't use django for webapps, when C++ is available and much faster.

Yeah dude we went through that phase in the 1990s, it didn't work out well. I don't get why this guy is considered a high expert without understanding for most businesses the cost of hosting is a tiny fraction of the costs of development. Django might be a little crusty in its old age, but it's a hella productive environment, especially now the continuous cavalcade of suffering that python 2's unicode handling brought is largely a thing of the past.
Maybe his idea of learning Python "the hard way" is to first learn C++, try to use it for a web app, then realize that your substantial time investment would've gone better if you'd just gone with Python in the first place?

ExcessBLarg!
Sep 1, 2001

duck monster posted:

Zed Shaw is a weird dude. Saw him ranting at someone on twitter the other day about why we shouldn't use django for webapps, when C++ is available and much faster.
That's a name I haven't seen in a long time.

He's not entirely wrong. Once you reach the size of Twitter (circa 2011) it might make sense to rewrite your Ruby backend in something more performant. But that's not to say that Ruby was the wrong choice for Twitter in 2006. IIRC Shaw wrote the Ruby-based web server that Twitter originally used so I don't know if he's salty about this.

duck monster posted:

I don't get why this guy is considered a high expert without understanding for most businesses the cost of hosting is a tiny fraction of the costs of development. Django might be a little crusty in its old age, but it's a hella productive environment, especially now the continuous cavalcade of suffering that python 2's unicode handling brought is largely a thing of the past.
There's a popular misconception that tech startups either grow to astronomical heights or crash and burn spectacularly. The reality is that there's lots of tech companies that operate in niche markets and have been running as effective small businesses for a decade or two now. As you say, hosting costs could be reduced but that's not usually your greatest efficiency gain and may even have negative effects elsewhere.

Josh Lyman
May 24, 2009


Why would I get a Pandas KeyError during a long-running script, but when I restart it at the point where it stopped, it's fine?

edit: Added code snippet. mydf_in has multiple rows and 3 columns. The 1st and 2nd columns just contain floats, but the 3rd column contains a list of lists of varying length. The function expand_list is meant to expand out the list of lists and turn them into additional rows. I typed this since the code is on a different computer so please excuse any typos.
Python code:
def expand_list(df, list_column, new_column):
	lens_of_lists = df[list_column].apply(len)
	origin_rows = range(df.shape[0])
	destination_rows = np.repeat(origin_rows, lens_of_lists)
	non_list_cols = (
		[idx for idx, col in enumerate(df.columns)
		if col != list_column]
	)
	expanded_df = df.iloc[destination_rows, non_list_cols].copy()
	expanded_df[new_column] = (
		[item for items in df[list_column] for item in items]
		)
	expanded_df.reset_index(inplace=True, drop=True)
	return expanded_df

mydf_out = expand_list(mydf_in,2, "ColName")
When the KeyError occurs, the traceback points to the expand_list call and then to the line starting with lens_of_lists.

Josh Lyman fucked around with this message at 18:54 on Nov 14, 2022

samcarsten
Sep 13, 2022

by vyelkin
hey, I'm trying to use Selenium to input text into a search box and I keep getting an error and googling hasn't helped. The error is "AttributeError: 'list' object has no attribute 'sendkeys'" and my code is:

code:
import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("http://en.wikipedia.org/wiki/Main_Page")
search = driver.find_elements(By.NAME, 'title')
search.sendkeys('wake tech')
button = driver.find_elements(By.CLASS_NAME, 'searchButton mw-fallbackSearchButton')
button.click()


def wake_test():
    assert driver.findelements(By.Class, 'mw-page-title-main').getText() == "Wake Technical Community College"

boofhead
Feb 18, 2021

That error means you're trying to treat a list like an object

Look for sendkeys, you can see it in this line - that's what's triggering the error

code:
search.sendkeys('wake tech')
So follow that to where search is defined

code:
search = driver.find_elements(By.NAME, 'title')
Well from the name alone you can guess it'll return a list or some type of iterable, you should be either looping through the results or calling a different function that returns only one object rather than a list

Also, I didn't look too closely at the rest but there's likely the same error as well as a typo in your test wake_test() -- findelements not find_elements, and again it treats the list like an object

e: I haven't used selenium in years but in your case I'd check if there's find_element rather than find_elements since you seem relatively certain there's only going to be one result
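Here's a rough sketch of the list-vs-object thing that runs without a browser. FakeElement is a made-up stand-in for whatever a single element would be (it is not a Selenium class), and the list plays the role of what a find_elements-style call hands back:

```python
class FakeElement:
    """Hypothetical stand-in for a single page element (not a Selenium class)."""

    def __init__(self):
        self.typed = []

    def send_keys(self, text):
        self.typed.append(text)


# A find_elements-style call hands back a list, even when only one thing matches.
results = [FakeElement()]

try:
    results.send_keys("wake tech")  # calling an element method on the list itself
except AttributeError as exc:
    error_message = str(exc)

# Either index into the list or loop over it to reach the actual element.
results[0].send_keys("wake tech")
for element in results:
    element.send_keys("again")
```

The same reasoning applies to the click() call and the assert in wake_test(): anything that returns a list needs indexing or a loop before you call element methods on it.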

boofhead fucked around with this message at 18:32 on Nov 14, 2022

samcarsten
Sep 13, 2022

by vyelkin
changing it to find_element() did nothing, even when I cycled through all the relevant html objects. I keep getting errors.

QuarkJets
Sep 8, 2008

Josh Lyman posted:

Why would I get a Pandas KeyError during a long-running script, but when I restart it at the point where it stopped, it's fine?

edit: Added code snippet. mydf_in has multiple rows and 3 columns. The 1st and 2nd columns just contain floats, but the 3rd column contains a list of lists of varying length. The function expand_list is meant to expand out the list of lists and turn them into additional rows. I typed this since the code is on a different computer so please excuse any typos.
Python code:
def expand_list(df, list_column, new_column):
	lens_of_lists = df[list_column].apply(len)
	origin_rows = range(df.shape[0])
	destination_rows = np.repeat(origin_rows, lens_of_lists)
	non_list_cols = (
		[idx for idx, col in enumerate(df.columns)
		if col != list_column]
	)
	expanded_df = df.iloc[destination_rows, non_list_cols].copy()
	expanded_df[new_column] = (
		[item for items in df[list_column] for item in items]
		)
	expanded_df.reset_index(inplace=True, drop=True)
	return expanded_df

mydf_out = expand_list(mydf_in,2, "ColName")
When the KeyError occurs, the traceback points to the expand_list call and then to the line starting with lens_of_lists

Your function call received a list_column key that wasn't already in the table :shrug:

Josh Lyman
May 24, 2009


QuarkJets posted:

Your function call received a list_column key that wasn't already in the table :shrug:
Yeah that's what I initially thought, but mydf_in always has 3 columns.

If that was the only issue, I should be getting the KeyError again when I restart the script from the same point, i.e. same mydf_in, but that hasn't happened yet. It executes fine and then moves onto the next mydf_in. I might get a KeyError later but it'll be on a different mydf_in.

edit: It happened just now. I have multiple Jupyter notebooks going, and a few of them had a KeyError with the same system timestamp (they were processing different mydf_in's).

Josh Lyman fucked around with this message at 19:06 on Nov 14, 2022

Edward IV
Jan 15, 2006

samcarsten posted:

changing it to find_element() did nothing, even when I cycled through all the relevant html objects. I keep getting errors.

I think the function name is send_keys() for Python instead of sendKeys()

Python code:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
driver = webdriver.Firefox()

# Navigate to url
driver.get("http://www.google.com")

# Enter "webdriver" text and perform "ENTER" keyboard action
driver.find_element(By.NAME, "q").send_keys("webdriver" + Keys.ENTER)
https://www.selenium.dev/documentation/webdriver/elements/interactions/#send-keys

QuarkJets
Sep 8, 2008

Josh Lyman posted:

Yeah that's what I initially thought, but mydf_in always has 3 columns.

Always 3 columns but with different names, right? Hook this code up to a debugger so that you can check out the data yourself when the issue occurs. That error is occurring because you're accessing a column name that's not in the table; I don't know what your data source is but there's some disconnect between the actual data and your code's model of the data, or there's a simple typo (run your code through flake8 to help rule this out)

12 rats tied together
Sep 7, 2006

QuarkJets posted:

ITT I saw a good video posted about the pitfalls of OOP, it was a talk by this woman who I think was a Ruby guru? Does anyone else remember this? I feel like I'd like to watch it again, it was very good

Ruby guru + woman sounds like Sandi Metz, who I have posted about ITT, but she is pro-OOP in the form of Alan Kay's "OOP is mostly message passing", and against the improper use of inheritance.

The video I always link is from a talk she does called "nothing is something", and it is very good. The talk is basically an example of inheritance gone wrong, how to recognize it, and how to fix it.

https://www.youtube.com/watch?v=OMPfEXIlTVE

Apologies if this is not what you were thinking of. :)

Josh Lyman
May 24, 2009


QuarkJets posted:

Always 3 columns but with different names, right? Hook this code up to a debugger so that you can check out the data yourself when the issue occurs. That error is occurring because you're accessing a column name that's not in the table; I don't know what your data source is but there's some disconnect between the actual data and your code's model of the data, or there's a simple typo (run your code through flake8 to help rule this out)
The columns always have the same names because they're converted from a numpy array in a previous statement with mydf_in = pd.DataFrame(myNumpyArray). When I print mydf_in, the columns are just "named" 0 1 2.

Doesn't the randomness of the KeyError preclude a typo?
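If a debugger is awkward in a long-running notebook, one thing that might narrow it down is wrapping the column lookup so the KeyError reports what the frame actually looked like at that moment. This is only a sketch: lookup_or_report is a made-up helper name, and the two small frames are invented examples built the same way as mydf_in (pd.DataFrame on a numpy array, so the columns are the default integer labels).

```python
import numpy as np
import pandas as pd


def lookup_or_report(df, key):
    """Return df[key]; on KeyError, re-raise with the frame's real columns and shape."""
    try:
        return df[key]
    except KeyError as exc:
        raise KeyError(
            f"{key!r} not in columns {list(df.columns)!r} (shape {df.shape})"
        ) from exc


# Invented example frames: one with three columns, one that came out short.
good = pd.DataFrame(np.array([[1.0, 2.0, 3.0]]))
short = pd.DataFrame(np.array([[1.0, 2.0]]))

third_column = lookup_or_report(good, 2)  # fine: default columns are 0, 1, 2

try:
    lookup_or_report(short, 2)  # column 2 does not exist in this frame
except KeyError as exc:
    report = str(exc)
```

Swapping df[list_column] for a call like this inside expand_list would at least show whether the failing mydf_in really had three columns when the error fired.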

Josh Lyman fucked around with this message at 19:32 on Nov 14, 2022

samcarsten
Sep 13, 2022

by vyelkin
Ok, got that working, new error when I try to click the button.

Message: stale element reference: element is not attached to the page document

Tried google, didn't help.
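For what it's worth, that message generally means the element handle was looked up before the page changed, so the old reference no longer points at anything. A common workaround is to look the element up again right before clicking. The sketch below is not Selenium-specific: click_with_relocate and its parameters are made-up names, and with Selenium you would pass something like a lambda around driver.find_element(...) as locate and selenium.common.exceptions.StaleElementReferenceException as stale_error.

```python
def click_with_relocate(locate, stale_error, attempts=3):
    """Call locate() fresh before every click; retry if the reference went stale.

    locate      -- hypothetical zero-argument callable returning an object with .click()
    stale_error -- exception class that signals a stale reference
    """
    for attempt in range(attempts):
        element = locate()  # look the element up again instead of reusing an old handle
        try:
            element.click()
            return attempt + 1  # number of tries it took
        except stale_error:
            if attempt == attempts - 1:
                raise  # out of retries, let the caller see the error
```

Whether a retry is enough depends on why the page changed; if the search box submission already navigated away, the button may simply not exist on the new page.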
