Necrobama
Aug 4, 2006

by the sex ghost
I think I've come to a bit of a decision point on a project I've been working on to make loading CSV files into a database not a lovely, manual process.

My code needs to be able to do 3 things:

1 - scan the data directory for all .csv files and generate pre-set file log parameters based on user input (database state name, client name, and data source)
2 - generate CREATE TABLE statements based on 1 and the names of the columns in the csv file (additionally, strip out any characters from column names that would force SSMS to use brackets around the names)
3 - take the data IN the csv files and INSERT into the created table

1 and 2 are working swimmingly. pandas is, as you'd expect, doing most of the heavy lifting - reading the csv files and spitting out valid SQL statements for the file log proc and my CREATE TABLE statements. But as I sit and plan out the implementation of item 3, I'm wondering if there's a better tool for the job than python for taking anywhere from a couple thousand to half a million or more lines. What concerns me (and this is very much a 'learn as I go' type project) is that the column names in my create statements have to be sanitized, so they're very often different from how they'll initially be read into a dataframe if I continue with python - "Organization?" becomes "Organization", "Date of Birth" becomes "DateOfBirth", etc. So I've got to keep some running way of matching the sanitized column names up with their source column names, and my first thought would be to use a dictionary, something like

code:
cols = {'Organization?': 'Organization', 'Date of Birth': 'DateOfBirth'}


where as I iterate through the column names to sanitize them for the INSERT statement, I append the old and new values to the dictionary with each iteration.
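
Concretely, something like this is the shape I have in mind (rough sketch - the sanitizing rules here are just a stand-in for whatever actually forces brackets in SSMS):

Python code:
import re

import pandas as pd


def sanitize(col: str) -> str:
    # title-case, drop spaces, then strip anything that isn't a plain identifier character
    return re.sub(r'[^A-Za-z0-9_]', '', col.title().replace(' ', ''))


df = pd.read_csv('some_file.csv')
# original header -> sanitized column name
cols = {col: sanitize(col) for col in df.columns}
# e.g. {'Organization?': 'Organization', 'Date of Birth': 'DateOfBirth'}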

My current impulse: once the CSV has been loaded into a DF, build the first part of the INSERT statement from each DF colname and its matching value in the appropriate dictionary, then populate the VALUES (, , , ,...) portion of the query from the result set of the entire data frame. But I'm not sure that's the best way to get the data from the CSV files into the tables I create.
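
In other words, roughly this (sketch - placeholder table and DSN names, reusing the cols mapping from the sketch above, and using executemany with parameterized inserts rather than string-building the VALUES, since that's safer):

Python code:
import pyodbc  # assumption: pyodbc for the SQL Server connection

conn = pyodbc.connect('DSN=placeholder')
insert_cols = ', '.join(cols[c] for c in df.columns)
placeholders = ', '.join('?' for _ in df.columns)
sql = f'INSERT INTO my_table ({insert_cols}) VALUES ({placeholders})'
# one parameterized statement, executed for every row in the frame
conn.cursor().executemany(sql, df.itertuples(index=False, name=None))
conn.commit()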

Does this sound nuts and there's a better way, or am I on the right path here?

Part of me is screaming at me "JUST LEARN POWERSHELL!"? Is that part right?

Necrobama fucked around with this message at 15:07 on Mar 30, 2023


The Fool
Oct 16, 2003


Necrobama posted:

Part of me is screaming at me "JUST LEARN POWERSHELL!"? Is that part right?

No.

I am normally a powershell evangelist, but python/pandas is the right tool for the job here.

The Fool
Oct 16, 2003


When I was doing something similar, I don't remember having to do anything with actual SQL statements, pandas did it all for me.

Are you just loading the CSVs into the tables as-is?

df.to_sql should create and insert for you

The only real issue I ran into was managing types, and the solution I went with was to have a dict with a list of column names and types.

Would require knowing that info ahead of time.
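
Something in this direction, if memory serves (sketch - the connection string is a made-up placeholder):

Python code:
import pandas as pd
from sqlalchemy import create_engine

# placeholder connection string - swap in your real server/driver
engine = create_engine(
    'mssql+pyodbc://user:pass@server/db?driver=ODBC+Driver+17+for+SQL+Server'
)

df = pd.read_csv('some_file.csv')
# creates the table if it doesn't already exist, then inserts every row
df.to_sql('my_table', engine, if_exists='append', index=False)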

Necrobama
Aug 4, 2006

by the sex ghost

The Fool posted:

When I was doing something similar, I don't remember having to do anything with actual SQL statements, pandas did it all for me.

Are you just loading the CSVs into the tables as-is?

df.to_sql should create and insert for you

The only real issue I ran into was managing types, and the solution I went with was to have a dict with a list of column names and types.

Would require knowing that info ahead of time.

Yeah, the column typing is 100% why I went with this method - the SQL statements I'm putting out create NVARCHAR(MAX) fields, and then there's an internal proc we call that evaluates the data in the fields and runs the appropriate sp_rename statement to change them to whatever they should be. I wanted that level of manual control over the field generation, so I went with building the create and/or insert statements myself and passing them along to the query engine.

I'll dig a bit more into the documentation on to_sql though. I recall looking briefly at it as an option at one point, but my desire for absolute control over the table creation won out, I think; thanks for the suggestion!

The Fool
Oct 16, 2003


Necrobama posted:


I'll dig a bit more into the documentation on to_sql though. I recall looking briefly at it as an option at one point, but my desire for absolute control over the table creation won out, I think; thanks for the suggestion!

the dtype argument lets you specify column types and takes a dict
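
e.g. something like this (sketch - column names invented, and iirc a length-less NVARCHAR comes out as NVARCHAR(max) on mssql):

Python code:
from sqlalchemy.types import NVARCHAR, Date, Integer

df.to_sql(
    'my_table',
    engine,
    if_exists='append',
    index=False,
    # keys are dataframe column names, values are SQLAlchemy types
    dtype={
        'Organization': NVARCHAR(),
        'DateOfBirth': Date(),
        'ClientId': Integer(),
    },
)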

Necrobama
Aug 4, 2006

by the sex ghost

The Fool posted:

the dtype argument lets you specify column types and takes a dict

Hey, this was exactly the nudge in the right direction I needed. I had to fight with the connection parameters a bit, but I've got everything (in at least one file!) going in without having to touch the loving import data wizard -- thanks a bunch!!

Seventh Arrow
Jan 26, 2005

You can also predefine a schema with pandas...I can't find the original article I used as a guide, but the information is out there.
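
It's roughly in this direction (sketch from memory, names invented) - you hand read_csv the types up front instead of letting pandas guess per file:

Python code:
import pandas as pd

# predefine the schema so every file comes in with the same types
schema = {'Organization': 'string', 'ClientId': 'Int64'}
df = pd.read_csv('some_file.csv', dtype=schema, parse_dates=['DateOfBirth'])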

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug
Speaking of crimes against coding, I think I'm building something insane.

I'm working at some big tech company and we have a bunch of internal clients we use for various services. I'm working on setting up testing for these, so I'm wondering what the normal/best way to handle large mock clients is.

For example, one client has a series of functions we can call which in turn make API calls; there's a pytest-httpserver package that would let me mock the actual API calls, but that doesn't work in our environment thanks to the complex authentication stuff expected by the client.

So instead I'm mocking the client itself and then just creating new mocks for each function called; the returned objects are complex, but they're all generally serializable, so it's nothing too fancy and there are no functions/etc to worry about - just lots of stuff like

Python code:
api_response_obj.id: int
api_response_obj.name: str
api_response_obj.children_ids: List[int]
This is all fine and dandy; I wrote some code where I store valid full API responses in YAML and then return them, which works great for any client function with only *one* argument, because I can store them as multilookups:

YAML code:
func_name:
  key1: response1
  key2: response2

# Once mocked, client.func_name(key1) == response1, and this works for complex nested objects
The problem is when you have multiple arguments, because tuples don't really serialize/deserialize well, and it's hard to use them as a key in YAML (or JSON, or any other serializable format). My solution is literally to store the whole signature as a string and use that as the key, but... that's where I stopped and went 'wait, am I insane?'

YAML code:
func_name:
  "arg1=key1,arg2=key2": response1
  "arg1=key1,arg2=key3": response2
I've tried googling this concept and haven't really run into anything, and looking around internally, everyone else is just... writing a bunch of one-off mocks that are hard to update later or reuse in a new test suite. My goal was to have a basic 'standard setup' for this API so you can get off to the races on a new test case reasonably fast, and individual tests can modify the mocked responses independently.

The other part I forgot about is that... sometimes an argument doesn't matter, and having to hardcode it makes things a little wonky to test later (like a description field, or something else).

I honestly wonder if just making a fake version of the client is the right answer, since this is starting to get a little insane.

Falcon2001 fucked around with this message at 22:07 on Mar 30, 2023

CompeAnansi
Feb 1, 2011

I respectfully decline
the invitation to join
your hallucination

Necrobama posted:

3 - take the data IN the csv files and INSERT into the created table

...

but as I sit and plan out the implementation of item 3, I'm wondering if there's a better tool for the job than python for taking anywhere from a couple thousand to half a million or more lines.

Since you're already loading the CSVs into memory via a pandas dataframe to create your DDL, you can use that same dataframe to load the data - loading from a dataframe is normally pretty straightforward. I have some high-performance code snippets for this that I can share if you're using postgres (the catch is that each db has its own high-performance loading method).
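
For flavor, the postgres trick is basically COPY from an in-memory buffer - a rough sketch of the shape, not my exact snippet:

Python code:
import io

import psycopg2


def copy_df(df, table, conn):
    """Stream a dataframe into postgres via COPY - far faster than row-by-row INSERTs."""
    buf = io.StringIO()
    df.to_csv(buf, index=False, header=False)
    buf.seek(0)
    with conn.cursor() as cur:
        cur.copy_expert(f'COPY {table} FROM STDIN WITH (FORMAT csv)', buf)
    conn.commit()


# usage (connection string is a placeholder):
# copy_df(df, 'my_table', psycopg2.connect('dbname=mydb'))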

nullfunction
Jan 24, 2005

Nap Ghost

Falcon2001 posted:

I honestly wonder if just making a fake version of the client is the right answer, since this is starting to get a little insane.

Would something like responses help? Check into the registry / dynamic responses stuff.

I've found it to be extremely needs-suiting for mocking APIs in test clients.
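
The basic shape, from memory (sketch - URL and payload invented):

Python code:
import requests
import responses


@responses.activate
def test_get_user():
    # register a canned response for the URL the code under test will hit
    responses.add(
        responses.GET,
        'https://api.example.com/users/17',
        json={'id': 17, 'name': 'example'},
        status=200,
    )
    resp = requests.get('https://api.example.com/users/17')
    assert resp.json()['name'] == 'example'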

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug

nullfunction posted:

Would something like responses help? Check into the registry / dynamic responses stuff.

I've found it to be extremely needs-suiting for mocking APIs in test clients.

I'll look into it; the biggest issue would be that there's a client (that isn't our code and shouldn't be tested) between us and the API, and sometimes that client makes multiple calls/etc so it might be a little complex to mock there.

On the plus side, it'd be simpler to handle as raw JSON since that's what the client/API communication is done in.

StumblyWumbly
Sep 12, 2007

Batmanticore!
I'm making some code where I want to put in a target sensor and setting, and get out the channel ID and the parameters it should be tested against. So I put in {"Microphone": {"sample_rate": "20000", "gain": 1.4}}, and I get 17: {"noise": 0.07, "rate_tolerance": 0.001}, where 17 is the channel ID. Some sensors can have multiple channel IDs associated with them.

I have the system working using just dicts and functions that map from one format to the other. The dicts work great as long as you remember all the keys; I'm trying to move them to objects to structure the data more.
Moving this into objects works great when one target sensor becomes one set of test parameters, but I feel like I'm missing an easy way to handle one target sensor becoming multiple sets of parameters. What I have now is:
code:
param_dict = {}
for name, values in sensor_list.items():
    test_param = TestObject(name, values)
    param_dict[test_param.id] = test_param
But this falls apart if a sensor can generate multiple test params - a special case, but one that happens, so I need to handle it.

I feel like there's a Pythonic way to do this that I'm missing. Maybe TestObject should be an object generator function that always returns a list of the test objects (which is normally going to be length 1), and I always iterate through that list and add the elements to the param_dict? Maybe I should ditch the dict and have an object that holds all the test parameters so I can manage access?

Am I missing something clever here?

Zoracle Zed
Jul 10, 2001
My first guess is something like

Python code:
from dataclasses import dataclass
from typing import Iterator, Tuple


@dataclass
class TestParams:
    noise: float
    rate_tolerance: float


def query(sensor) -> Iterator[Tuple[int, TestParams]]:  # (argument list elided in the original)
    """yields (channel_id, TestParams) tuples"""
    yield 17, TestParams(noise=0.07, rate_tolerance=0.001)
    yield 18, TestParams(noise=0.08, rate_tolerance=0.002)
but tbh I don't really understand your data model.

StumblyWumbly
Sep 12, 2007

Batmanticore!
Thanks, I did not remember dataclasses exist, so that's useful.
I guess the no-background question is: I'd like to be able to parse through a dict, and generate a list of objects, all the same type, for each item. These lists will all get merged into one master dict of the resulting objects.
Should I just use a function to generate that list of objects and merge them into a single dict, or could I do something with the object design itself so it is a bit more self contained?

The more I think about it, the more I think the generator function that puts out a list of objects is the way to go.

QuarkJets
Sep 8, 2008

StumblyWumbly posted:

Thanks, I did not remember dataclasses exist, so that's useful.
I guess the no-background question is: I'd like to be able to parse through a dict, and generate a list of objects, all the same type, for each item. These lists will all get merged into one master dict of the resulting objects.
Should I just use a function to generate that list of objects and merge them into a single dict, or could I do something with the object design itself so it is a bit more self contained?

The more I think about it, the more I think the generator function that puts out a list of objects is the way to go.

What does it look like when one sensor becomes multiple sets of parameters? Do the parameters all have their own channel IDs?

Python code:
def parse_output(output):
    for channel_id, values in output.items():
        yield channel_id, TestParams(**values)
This would yield things like (17, TestParams(noise=0.07, rate_tolerance=0.001)).

Or you could pass in the name of the sensor, and yield (sensor_name, TestParams(channel_id, **values)). Then you could work with sensor names instead of channel IDs.

If you know the channel IDs beforehand, you could also map them to an Enum.

StumblyWumbly
Sep 12, 2007

Batmanticore!
Neat, Enum looks very similar to the mapping system I have, but I'm also not sure why I would prefer Enum over the current option. What I have is pretty much:
code:
sensor_to_target_map = {
  "MIC_1" : { "chan_id": 10, "noise": 0.008, "freq": 20000},
  "MULTI_MIC" : { "chan_id": [7,8,9], "noise": [0.008, 0.008. 0.1], "freq": 44000}
}
There are other settings associated with the sensors that get fed in, with the net result being that when someone runs a test with the MULTI_MIC sensor, they get 3 objects with the channel IDs 7, 8, 9, and the test targets are calculated based on the noise and frequency provided by the user and the values in this target map.

I think I could replace that system with an Enum, and I don't think I'd need to change anything that much, but I'm also not sure what that would improve? Is it just a matter of the map being more rigidly structured so I don't need to remember what strings to use, or is there something else it opens up?

Chin Strap
Nov 24, 2002

I failed my TFLC Toxx, but I no longer need a double chin strap :buddy:
Pillbug
So I'm an experienced python Data Science person but I've relied on all my company's tooling in order to have a good environment before. Let's say I just have a Chromebook, a windows desktop I only interact with remotely with Chrome Remote Desktop, and no current python environment installed anywhere.

I'm probably going to be using JupyterHub or Colab for doing actual interactive work, but I would like to set up PyCharm to actually develop my code.

Would it make more sense to:

A) set it up on my windows desktop

Or

B) get some sort of remote hosting linux solution I can remotely log in to and develop on?

I would like it to be as turnkey as possible because my experience with setting this stuff up is non-existent. I just know the ins and outs of using it in the context of work.

Precambrian Video Games
Aug 19, 2002



PyCharm can manage a Python environment on Windows just fine, though I'd certainly rather do it on Linux. WSL is also an option (that I've never tried), or some other VM (VMWare Workstation has been fine for me).

12 rats tied together
Sep 7, 2006

WSL is pretty good, but most of the "pretty good" is how easy it is to hook into VS Code, so you just open Code and your Linux VM is sitting there in the integrated terminal waiting for you. It's probably also fine with PyCharm, but I've never used that, so I can't vouch.

The Fool
Oct 16, 2003


VS Code has a whole remote workspace thing where you can point it at any VM or container over SSH

QuarkJets
Sep 8, 2008

StumblyWumbly posted:

Neat, Enum looks very similar to the mapping system I have, but I'm also not sure why I would prefer Enum over the current option. What I have is pretty much:
code:
sensor_to_target_map = {
  "MIC_1" : { "chan_id": 10, "noise": 0.008, "freq": 20000},
  "MULTI_MIC" : { "chan_id": [7,8,9], "noise": [0.008, 0.008. 0.1], "freq": 44000}
}
There are other settings associated with the sensors that get fed in, with the net result being that when someone runs a test with the MULTI_MIC sensor, they get 3 objects with the channel IDs 7, 8, 9, and the test targets are calculated based on the noise and frequency provided by the user and the values in this target map.

I think I could replace that system with an Enum, and I don't think I'd need to change anything that much, but I'm also not sure what that would improve? Is it just a matter of the map being more rigidly structured so I don't need to remember what strings to use, or is there something else it opens up?

Enums are full of excellent features. If you're going to define some kind of fixed mapping (such as between sensor names and unique channel ID numbers), then an enum is a great choice of data structure. It gives you explicit labeling of what would otherwise be a bunch of magic numbers you'd have to label yourself anyway, so it's easier to write good code with enums than without.

Every entry in an enum type is also a singleton class instance of that type, so you can do all kinds of fun things with them

Sometimes you don't need any of those advantages, but it's also just nice knowing that your editor will immediately tell you if you fat-fingered MUTLI_MIC instead of MULTI_MIC somewhere.
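
e.g., with your names (sketch):

Python code:
from enum import Enum


class Sensor(Enum):
    MIC_1 = 10
    MULTI_MIC = (7, 8, 9)  # values can be any object, so a tuple covers the multi-channel case


# a typo like Sensor.MUTLI_MIC raises AttributeError immediately,
# instead of silently missing a dict key
print(Sensor.MIC_1.value)  # 10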

QuarkJets
Sep 8, 2008

Chin Strap posted:

So I'm an experienced python Data Science person but I've relied on all my company's tooling in order to have a good environment before. Let's say I just have a Chromebook, a windows desktop I only interact with remotely with Chrome Remote Desktop, and no current python environment installed anywhere.

I'm probably going to be using JupyterHub or Colab for doing actual interactive work, but I would like to set up PyCharm to actually develop my code.

Would it make more sense to:

A) set it up on my windows desktop

Or

B) get some sort of remote hosting linux solution I can remotely log in to and develop on?

I would like it to be as turnkey as possible because my experience with setting this stuff up is non-existent. I just know the ins and outs of using it in the context of work.

The only reason I'd use a Windows system is if it was just running an IDE connected to a beefy Linux system over SSH, but if you're going to get a remote Linux solution, then I think it'd be better to just install the IDE directly on your Chromebook? Maybe I'm misunderstanding; that just sounds more efficient than using a remote desktop client. Either way, I'd rather use option B unless the remote desktop instance was insanely seamless.

Personally, I've switched from PyCharm to VS Code. They've got basically all of the same features, but the PyCharm Pro features that I'd normally have to pay for are all covered by free VS Code plugins, including development over SSH.

Seventh Arrow
Jan 26, 2005

Would Google Colab be a possible solution?

Chin Strap
Nov 24, 2002

I failed my TFLC Toxx, but I no longer need a double chin strap :buddy:
Pillbug

Seventh Arrow posted:

Would Google Colab be a possible solution?

Jupyter/Colab notebooks suck for actual development of libraries, though (I use them daily for interactive and plotting stuff). They encourage way too much bad code practice and aren't an IDE of any kind.

Guess since I use VSCode anyway, might as well do that personally too. I'll probably just install Linux on my desktop. Is there any reason to do dual boot over a VM?

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug
Nthing Remote SSH for VSCode; I'm actually evangelizing it at work where our standard setup is that code and dev environments live on a cloud VM and we have our individual laptops/etc.

Most of my team uses a remote desktop solution and installs PyCharm on their cloud desktop, which is perfectly workable if input lag doesn't feel like torture to you. It does to me, so I almost immediately moved to VSCode and Remote SSH; you have a local client and it syncs everything seamlessly over SSH, and all build/test/etc commands run on the remote instance.

However, unless someone else is paying for that remote development environment, I wouldn't recommend it; dedicated VMs are not cheap. In that case just use WSL and VSCode (or honestly I'm sure PyCharm works fine too; in fact, with WSL2 on Win11 you can get X11 forwarding, so you can run the whole program from your WSL install).

QuarkJets
Sep 8, 2008

Chin Strap posted:

Jupyter/Colab notebooks suck for actual development of libraries, though (I use them daily for interactive and plotting stuff). They encourage way too much bad code practice and aren't an IDE of any kind.

Guess since I use VSCode anyway, might as well do that personally too. I'll probably just install Linux on my desktop. Is there any reason to do dual boot over a VM?

Probably not, maybe if you just don't feel like dealing with virtualization. Most hypervisors also don't support GPU passthrough, so if you need CUDA or OpenCL then that's a problem (e: not a problem for docker containers, although iirc GPU passthrough from Windows to Linux and vice versa might still not work?)

QuarkJets fucked around with this message at 02:47 on Apr 2, 2023

SurgicalOntologist
Jun 17, 2004

Is your Chromebook at all powerful? Because it can most likely run Linux just fine. I develop on a Chromebook and use VSCode server (code-server) on localhost, although I've also used it remotely from time to time, as well as the regular Linux VSCode. All three options work pretty well for me, with nearly identical usability.

Son of Thunderbeast
Sep 21, 2002
I had to share this somewhere:

code:
        self.physics_engine.add_sprite_list(
            ...
            moment_of_inertia=0,
        )
code:
TypeError: PymunkPhysicsEngine.add_sprite_list() got an unexpected keyword argument 'moment_of_inertia'
Meanwhile:
code:
        self.physics_engine.add_sprite_list(
            ...
            moment_of_intertia=0,
        )
runs fine :lol:

Data Graham
Dec 28, 2009

📈📊🍪😋



Gotta love it. I maintain a codebase where major components are full of fundamental misspellings (like "deadent" instead of "deadnet") and there is no fixing it because it appears in like 5000 places


Also if I ran black on it I would be surprised if a single solitary line went unchanged

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug

Data Graham posted:

Gotta love it. I maintain a codebase where major components are full of fundamental misspellings (like "deadent" instead of "deadnet") and there is no fixing it because it appears in like 5000 places


Also if I ran black on it I would be surprised if a single solitary line went unchanged

I feel pretty good because just today I did a wide (if minor) refactor across the program we're working on, because the guy setting it up implemented a double negative as a command-line arg (think "--dont-do-x" instead of "--do-y") AND implemented the default wrong to boot, so everyone was using this shared flag incorrectly. It's a good program framework; I think he just had a moment or something.

We had been putting it off because 'oh so many people have active PRs' but we're starting to write documentation for use and I'd rather eat a burrito with a fork than write docs I know are wrong, so I got uppity in slack until I got enough agreement and was able to explain it to my lead (which was a pain...since it's a grammatical error basically) and then fixed it across the entire codebase, PR reviewed, and merged same day. I felt so much better afterward.

I feel a little bad for the folks who are going to wake up on Monday and have to rebase, but that's a sacrifice I'm willing to make.

Ninja edit: Oh yeah, another thing: we had a property named 'y_fix_required' where, if it returned True... it indicated that the fix in question was not required, for some bizarre reason. Fixed that too, after having to help two or three other folks decipher what the hell it meant.

Falcon2001 fucked around with this message at 07:43 on Apr 8, 2023

ziasquinn
Jan 1, 2006

Fallen Rib
I've decided to try and make another "run" (used loosely here) at learning Python. Should I just go through Think Python and the Python docs, or are there more recent books/guides/docs for starting off? I know about "Automate the Boring Stuff" and kin, for example.

I always get really bored starting off cause the basics aren't that interesting (but I know this gets better as it builds on itself). For example, I know that tinkering with existing code is fun, but I eventually just hit a wall where I have a hard time determining the kinds of projects or goals to work on as a super novice that won't KILL me. So I lose steam and momentum without having like, a class, forcing me to do it?

I am sure I'm not alone in this?

StumblyWumbly
Sep 12, 2007

Batmanticore!

ziasquinn posted:

I've decided to try and make another "run" (used loosely here) at learning Python. Should I just go through Think Python and the Python docs, or are there more recent books/guides/docs for starting off? I know about "Automate the Boring Stuff" and kin, for example.

I always get really bored starting off cause the basics aren't that interesting (but I know this gets better as it builds on itself). For example, I know that tinkering with existing code is fun, but I eventually just hit a wall where I have a hard time determining the kinds of projects or goals to work on as a super novice that won't KILL me. So I lose steam and momentum without having like, a class, forcing me to do it?

I am sure I'm not alone in this?

Do you have a project you'd like to try doing? Like automating some spreadsheet work, or renaming files, or doing math?
And what's your programming background? There's no one-size-fits-all.

notwithoutmyanus
Mar 17, 2009
I actually wanted to query a bunch of APIs periodically and then plot the results, eventually turning it into a dashboard for folks to access via a login. I was debating whether I should do it solely with Python, or Rust plus Plotly, vs leveraging something like Grafana to display the output. Not sure what's practical. Never truly got this off the ground by myself.

The Fool
Oct 16, 2003


that's something I end up doing a few times a year at work

use python to collect and sanitize your data, then deliver in whatever form your audience is more likely to use

sadly for me that is often power bi
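
the collection half is usually just a polling loop, something like (sketch - endpoint and field names invented):

Python code:
import time

import requests

while True:
    # poll the API and append a datapoint somewhere the dashboard can read
    data = requests.get('https://api.example.com/metrics', timeout=10).json()
    with open('metrics.csv', 'a') as f:
        f.write(f"{time.time()},{data['value']}\n")
    time.sleep(60)  # once a minute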

Ben Nerevarine
Apr 14, 2006

ziasquinn posted:

I've decided to try and make another "run" (used loosely here) at learning Python. Should I just go through Think Python and the Python docs, or are there more recent books/guides/docs for starting off? I know about "Automate the Boring Stuff" and kin, for example.

I always get really bored starting off cause the basics aren't that interesting (but I know this gets better as it builds on itself). For example, I know that tinkering with existing code is fun, but I eventually just hit a wall where I have a hard time determining the kinds of projects or goals to work on as a super novice that won't KILL me. So I lose steam and momentum without having like, a class, forcing me to do it?

I am sure I'm not alone in this?

I usually do recommend the Automate the Boring Stuff book that you mentioned, but not necessarily reading it front to back. I think you’d have more success if you knew what you wanted out of a small project rather than try and learn a bunch of boring stuff THEN come up with a project idea. So what I recommend is skimming the book (and I’m talking, like, read the table of contents and peek at some chapters that stick out as interesting or useful) to at least get a sense of the kinds of things it touches on, things everyone does every day, like file manipulation for example. Moving files around, creating directories, appending to files, etc. are operations you’re likely to need across many of the projects you work on. Get a sense of the building blocks that are there, then think about something you would like to do (again, sticking to small projects at this stage), then think about how you’d put them together with those building blocks. That way you have reasons for reading particular chapters in depth rather than slogging through the whole thing at a go, which will likely only serve to bore and demotivate you

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug

Ben Nerevarine posted:

I usually do recommend the Automate the Boring Stuff book that you mentioned, but not necessarily reading it front to back. I think you’d have more success if you knew what you wanted out of a small project rather than try and learn a bunch of boring stuff THEN come up with a project idea. So what I recommend is skimming the book (and I’m talking, like, read the table of contents and peek at some chapters that stick out as interesting or useful) to at least get a sense of the kinds of things it touches on, things everyone does every day, like file manipulation for example. Moving files around, creating directories, appending to files, etc. are operations you’re likely to need across many of the projects you work on. Get a sense of the building blocks that are there, then think about something you would like to do (again, sticking to small projects at this stage), then think about how you’d put them together with those building blocks. That way you have reasons for reading particular chapters in depth rather than slogging through the whole thing at a go, which will likely only serve to bore and demotivate you

Yeah; honestly it sounds like ziasquinn and I (two years ago) are probably in similar-ish boats. I tried learning development like...probably 3-4 times and https://automatetheboringstuff.com/#toc is what really finally made it all click for me, although it probably helped that I'd already gone through a lot of 101 level 'this is a variable' stuff over and over.

The thing I like about ATBS is that it has a lot of small, useful examples that feel applicable in a way a lot of algorithm-based books just don't. Most people in the workforce who want to mess around with programming have most likely already been working with Excel, so the chapter on working with openpyxl is *really useful* and immediately applicable.

I never finished the book and never read it cover to cover, but it really was the final piece that made it all click for me. From there, the next 'trick' was code katas, like Codewars or Leetcode - at least for me, learning algorithms through a problem-based workflow was much more interesting once I had the ability to google stuff. I just sorted by 'most solvable' and started from there, even the super easy stuff, because it was good practice, and I'd do like one a day.

Edit:

ziasquinn posted:

I always get really bored starting off cause the basics aren't that interesting (but I know this gets better as it builds on itself). For example, I know that tinkering with existing code is fun, but I eventually just hit a wall where I have a hard time determining the kinds of projects or goals to work on as a super novice that won't KILL me. So I lose steam and momentum without having like, a class, forcing me to do it?

IMO as someone who was really terrible in school I found Code Katas like I mentioned above to be a very useful setup for this - you just have to solve one discrete problem to get the little 'I did a thing' feeling, not build a whole program and slog through all the stuff surrounding it. And they start off really easy, like 'reverse a list' easy.

Protip: Try and solve the problem, and if you get stuck, just google the answer. You're not taking a test, you're learning, and half of development is googling stuff or looking up the docs anyway.

Falcon2001 fucked around with this message at 06:05 on Apr 10, 2023

FISHMANPET
Mar 3, 2007

Sweet 'N Sour
Can't
Melt
Steel Beams
I'm writing a Python module, and I've gotten myself stuck in some circular dependencies. I've found some "basic" stuff about fixing that, but I seem to have gotten myself into a deeper pickle. Basically, I have a User class with methods that return a Group, and a Group class with methods that return a User. And I'm trying to define these in separate files, otherwise this is gonna be some massive 2000-line single-file module. I've finally just "thrown in the towel", so to speak, and stopped importing at the top of the file, instead importing only when a function is called, but that doesn't feel great either.

So any tips or links on how to "design" my way out of this?

Armitag3
Mar 15, 2020

Forget it Jake, it's cybertown.


FISHMANPET posted:

I'm writing a Python module, and I've gotten myself stuck in some circular dependencies. I've found some "basic" stuff about fixing that, but I seem to have gotten myself into a deeper pickle. Basically, I have a User class with methods that return a Group, and a Group class with methods that return a User. And I'm trying to define these in separate files, otherwise this is gonna be some massive 2000-line single-file module. I've finally just "thrown in the towel", so to speak, and stopped importing at the top of the file, instead importing only when a function is called, but that doesn't feel great either.

So any tips or links on how to "design" my way out of this?

Look up lazy loading or lazy importing. Basically, in this kind of circular dependency you want to import the symbols only when they're actually needed, i.e. when you call the methods.
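
i.e. something like this (sketch, module names made up):

Python code:
# group.py -- note: no top-level "from user import User"
class Group:
    def __init__(self, member_names):
        self.member_names = member_names

    def get_members(self):
        from user import User  # deferred import, so nothing circular happens at load time
        return [User(name) for name in self.member_names]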

haruspicy
Feb 10, 2023
Whoever in here said that they didn't really get Python until reading Fluent Python - goddamn is that ever true. Just picked it up.


QuarkJets
Sep 8, 2008

FISHMANPET posted:

I'm writing a Python module, and I've gotten myself stuck in some circular dependencies. I've found some "basic" stuff about fixing that, but I seem to have gotten myself into a deeper pickle. Basically, I have a User class with methods that return a Group, and a Group class with methods that return a User. And I'm trying to define these in separate files, otherwise this is gonna be some massive 2000-line single-file module. I've finally just "thrown in the towel", so to speak, and stopped importing at the top of the file, instead importing only when a function is called, but that doesn't feel great either.

So any tips or links on how to "design" my way out of this?

You need to clean up this data model. If a Group contains Users, then a User does not need to call Group methods. Code that needs to interact with Group and User simultaneously (for instance, "given a User belonging to some Group, call a Group method") goes into a third module that imports from users and groups.
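
If you still want the cross-file type hints, the usual trick is TYPE_CHECKING, which imports only for the type checker and never at runtime (sketch - module and attribute names made up, shown as two files):

Python code:
# user.py
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # skipped at runtime, so no circular import
    from group import Group


class User:
    def __init__(self, name: str):
        self.name = name
        self.groups: list[Group] = []


# membership.py -- the only module that imports both sides at runtime
from group import Group
from user import User


def add_user_to_group(user: User, group: Group) -> None:
    group.users.append(user)
    user.groups.append(group)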
