ArcticZombie
Sep 15, 2010

Twerk from Home posted:

It looks like projects using pyproject.toml will specify that you should try to install wheels in the [build-system] block: https://pip.pypa.io/en/stable/reference/build-system/pyproject-toml/#fallback-behaviour

If I have a package that's using the legacy setup.py, is there a way to make it use wheels when installing? When I pip install my package, I'm seeing a whole lot of:
"Using legacy 'setup.py install' for $DEPENDENCY, since package 'wheel' is not installed."

I know that I could hack around this by installing wheel manually first in an environment before running pip install, but there must be a way to tell everyone's pip to try to use wheels by default because I've seen other packages do it.

Your problem isn’t “how do I tell pip to use wheels by default” (it already is, or trying to at least), it’s “how do I make sure the wheel package is installed before attempting to install my package”. A requirements.txt is one solution here, but there are many possible ways to set up the environment your package needs before attempting to install it, and they all very much depend on where/how your package is being installed.

Edward IV
Jan 15, 2006

I have a python script that uses pandas and pandas-related packages to convert some medium-sized datasets from one format to another. I want to deploy it for the engineers to use on their local machines as it's a script that's tacked on another program that ultimately calls it for it to use; the engineers don't need to call it directly.

Now I'm the only engineer that has Python installed on their computer and pretty much the only person there that knows how to use it. While I'm sure some of the engineers will understand what to do if I document the setup process, there are some that aren't super tech savvy and I want to try to avoid needing to manage each engineer's system.

I was thinking of deploying it as a Windows executable to try to avoid that issue. I've successfully used pyinstaller in the past but begrudgingly so given the large package size and initial slow startup. I've wanted to try to use py2exe but apparently I need to install and configure the official python installer instead of the Windows store version because I can't even get the simple "Hello World" demo to work.

In the interim, I tried to use pyinstaller but ran into some issues because of the aforementioned pandas-related package; specifically, dfply. The package allows for R's dplyr-style data manipulation on pandas dataframes. I learned R and dplyr during graduate school and it definitely helps make the code look a lot cleaner. However, this is the first time I tried to use it with pyinstaller. Everything builds correctly but upon execution I get missing file errors for a bunch of dfply's files and folders. In fact, looking at a single folder build, I don't see a dfply folder that the program was trying to access. It appears that pyinstaller completely misses it during the build process.

Now I probably could go back and just use vanilla pandas but that's going to suck because I used dfply pretty much everywhere I could. And until I can get py2exe to work on my system, I can't be certain it'll work either. I'm not all that up to speed on proper terminology so I haven't figured out the correct way to Google it but is there a way to coerce pyinstaller to include dfply and why it's not being included?

Preferably, I would rather keep the code as Python so I can push code updates much more sleekly but I don't know what is the best way of managing everyone's system to make sure they have the correct packages and versions installed. I'm only just starting to familiarize myself with virtual environments but I think I could actually make this work. We have a data management system called Solidworks PDM that does version controls and managed files appear on the computer as a local directory. I could probably set up a virtual environment in a PDM directory that is set up for the python script to run in and it looks like there are batch scripts that I can have the main program call to activate and deactivate the environment prior to and after the python script gets called.

While it would be nice to know what's going on with pyinstaller, I just want to check if my understanding and use of virtual environments is correct and the right way to do this.

CarForumPoster
Jun 26, 2013

⚡POWER⚡
Are there actual requirements that preclude pyinstaller, eg this will waste 100s of man hours per year with the current pyinstaller exe?

Like…do you care for no reason?

QuarkJets
Sep 8, 2008

Edward IV posted:

I have a python script that uses pandas and pandas-related packages to convert some medium-sized datasets from one format to another. I want to deploy it for the engineers to use on their local machines as it's a script that's tacked on another program that ultimately calls it for it to use; the engineers don't need to call it directly.

Now I'm the only engineer that has Python installed on their computer and pretty much the only person there that knows how to use it. While I'm sure some of the engineers will understand what to do if I document the setup process, there are some that aren't super tech savvy and I want to try to avoid needing to manage each engineer's system.

I was thinking of deploying it as a Windows executable to try to avoid that issue. I've successfully used pyinstaller in the past but begrudgingly so given the large package size and initial slow startup. I've wanted to try to use py2exe but apparently I need to install and configure the official python installer instead of the Windows store version because I can't even get the simple "Hello World" demo to work.

In the interim, I tried to use pyinstaller but ran into some issues because of the aforementioned pandas-related package; specifically, dfply. The package allows for R's dplyr-style data manipulation on pandas dataframes. I learned R and dplyr during graduate school and it definitely helps make the code look a lot cleaner. However, this is the first time I tried to use it with pyinstaller. Everything builds correctly but upon execution I get missing file errors for a bunch of dfply's files and folders. In fact, looking at a single folder build, I don't see a dfply folder that the program was trying to access. It appears that pyinstaller completely misses it during the build process.

Now I probably could go back and just use vanilla pandas but that's going to suck because I used dfply pretty much everywhere I could. And until I can get py2exe to work on my system, I can't be certain it'll work either. I'm not all that up to speed on proper terminology so I haven't figured out the correct way to Google it but is there a way to coerce pyinstaller to include dfply and why it's not being included?

Preferably, I would rather keep the code as Python so I can push code updates much more sleekly but I don't know what is the best way of managing everyone's system to make sure they have the correct packages and versions installed. I'm only just starting to familiarize myself with virtual environments but I think I could actually make this work. We have a data management system called Solidworks PDM that does version controls and managed files appear on the computer as a local directory. I could probably set up a virtual environment in a PDM directory that is set up for the python script to run in and it looks like there are batch scripts that I can have the main program call to activate and deactivate the environment prior to and after the python script gets called.

While it would be nice to know what's going on with pyinstaller, I just want to check if my understanding and use of virtual environments is correct and the right way to do this.

I've never used pyinstaller, but here are the docs:
https://pyinstaller.org/en/stable/operating-mode.html#analysis-finding-the-files-your-program-needs

It sounds like pyinstaller greps for "import" statements in your code and then recursively looks for more import statements in the imported modules, then everything is supposed to get packaged together. It also understands the egg format, and it ships customized recipes (hooks) for a number of additional packages (so stuff like numpy and pyqt that also need to package compiled libraries), but anything it can't discover that way has to be listed by hand.

It seems like dfply is set up to use the egg format (which is standard), but you say that it's missing a dfply folder? What's in that folder, more imports or other stuff? You might look at this part of the documentation:
https://pyinstaller.org/en/stable/when-things-go-wrong.html#listing-hidden-imports

DoctorTristan
Mar 11, 2006

I would look up into your lifeless eyes and wave, like this. Can you and your associates arrange that for me, Mr. Morden?
I don’t have any specific advice on pyinstaller, but I will comment that ‘a bunch of people running a .exe I pass around’ is a solution that may be okay in the short term but absolutely will come back to bite you sooner or later (exactly how quickly depends a bit on how big the organisation is and how quickly requirements change).

Hard to give detailed advice on what to do instead without more details on what you’re doing, but it does sound like what you *really* need is a database and some proper ETL tools.

Bad Munki
Nov 4, 2008

We're all mad here.


I'm always looking to bolster my team's skills (and resumes, in case they decide to bail out), and for the most part they're all fundamentally competent python developers, although early in their careers, and I'm looking to have them go further without having to reinvent a lot of wheels. Anyone have recommendations for more advanced, formalized training? Some are going to re:Invent and they'll likely pick up some relevant sessions there, although that's going to be a fairly narrow focus. I guess I don't really have a solid idea of what I'm after, but that also means I'm open to a broad range of suggestions.

Maybe something along the lines of, "I wish my company would pay me to do <TRAINING>", anyone got any of those?

Jose Cuervo
Aug 25, 2004
I am looking through free text strings (short hospitalization reasons) for the word 'parathyroidectomy'. I am able to do simple string matching (e.g., looking for 'para' in example_string), but this type of searching assumes that the word has been spelled correctly and will not catch paarthyroidectomy, even though that would be a relevant result. Is there any library which would help me search these strings for misspelled matches?

DoctorTristan
Mar 11, 2006

I would look up into your lifeless eyes and wave, like this. Can you and your associates arrange that for me, Mr. Morden?

Jose Cuervo posted:

I am looking through free text strings (short hospitalization reasons) for the word 'parathyroidectomy'. I am able to do simple string matching (e.g., looking for 'para' in example_string), but this type of searching assumes that the word has been spelled correctly and will not catch paarthyroidectomy, even though that would be a relevant result. Is there any library which would help me search these strings for misspelled matches?

Fuzzywuzzy
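
For a sense of what fuzzy matching does, here is a minimal sketch that uses difflib from the standard library rather than Fuzzywuzzy itself. The function name contains_fuzzy and the 0.9 threshold are made up for illustration, and the threshold would need tuning against real hospitalization strings.

```python
from difflib import SequenceMatcher


def contains_fuzzy(text, target="parathyroidectomy", threshold=0.9):
    """Return True if any word in text is a close match for target.

    Hypothetical helper: similarity is difflib's ratio (0.0 to 1.0),
    not the score Fuzzywuzzy would report.
    """
    for word in text.lower().split():
        word = word.strip(".,;:()")  # drop punctuation stuck to the word
        if SequenceMatcher(None, word, target).ratio() >= threshold:
            return True
    return False


reasons = [
    "Admitted for paarthyroidectomy, elective",  # misspelled, should match
    "Post-op appendectomy follow up",            # unrelated, should not match
]
matches = [reason for reason in reasons if contains_fuzzy(reason)]
```

Comparing word by word keeps long strings from diluting the score; a dedicated library would be the better choice for anything beyond a quick filter.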

CarForumPoster
Jun 26, 2013

⚡POWER⚡

+1

Edward IV
Jan 15, 2006

CarForumPoster posted:

Are there actual requirements that preclude pyinstaller, eg this will waste 100s of man hours per year with the current pyinstaller exe?

Like…do you care for no reason?

No actual requirement. It just doesn't feel optimal to me as opposed to having the appropriate Python interpreter set up or converting them into proper Windows binaries. I've only experimented with pyinstaller a few times but this will be my first time where I actually do need to deploy code to other employees.

DoctorTristan posted:

I don’t have any specific advice on pyinstaller, but I will comment that ‘a bunch of people running a .exe I pass around’ is a solution that may be okay in the short term but absolutely will come back to bite you sooner or later (exactly how quickly depends a bit on how big the organisation is and how quickly requirements change).

Hard to give detailed advice on what to do instead without more details on what you’re doing, but it does sound like what you *really* need is a database and some proper ETL tools.

Yeah I get that this is far from ideal. The biggest roadblock to using a more modern setup is that our purchasing and quoting system is this really old software that's not even Y2K compliant. (It stores years as two digits.) It's basically used to make quotes and manage pricing for line items. It's not even a database since there is no central repository that stores all orders that have been made; it just makes order forms in the form of proprietary files that are stored in a local network share. Far from ideal but we're stuck with it for now. The developer did state that they are making a more modern version using SQL but that was almost two years ago and we haven't heard of any updates since. I really should get in touch with them to see where they are with that.

What my program does is convert a formatted text dump of those quote files into a form that another program can read, which handles automated model generation in Solidworks. That Solidworks automation program does have features to utilize databases so we're not held back on that end. For the time being, this conversion program was requested because having the engineers read and interpret the quotes and manually configure the Solidworks automation program to build the models according to the quote was creating errors between what the quote says and what gets generated, especially for really mundane things that no one notices before things go too far to become a problem. So this is something that needed to be done but I am aware that it is merely a bandaid on a large wound.

The company isn't large at all as there are only about half a dozen engineers that would be using this program so scaling for this project isn't a concern right now.


QuarkJets posted:

I've never used pyinstaller, but here are the docs:
https://pyinstaller.org/en/stable/operating-mode.html#analysis-finding-the-files-your-program-needs

It sounds like pyinstaller greps for "import" statements in your code and then recursively looks for more import statements in the imported modules, then everything is supposed to get packaged together. It also understands the egg format, and it ships customized recipes (hooks) for a number of additional packages (so stuff like numpy and pyqt that also need to package compiled libraries), but anything it can't discover that way has to be listed by hand.

It seems like dfply is set up to use the egg format (which is standard), but you say that it's missing a dfply folder? What's in that folder, more imports or other stuff? You might look at this part of the documentation:
https://pyinstaller.org/en/stable/when-things-go-wrong.html#listing-hidden-imports

OK thanks. I'll look into it eventually.

That said, I've abandoned using pyinstaller for this project as I was able to set up a virtual environment on PDM that the other engineers should be able to access. If I understand how virtual environments work, everything needed for the Python interpreter to run is in the virtual environment folder. If so does that mean that, as long as I only need to call the Python executable in the venv folder, there is no need for anyone else to install Python on their system?

Another question is how problematic would it be for the files and folders in a virtual environment to not have write access? PDM basically works on the premise that any attempts to write to files requires setting the files to a state that will permit it to be written to, checking out the files to actually give write access, checking in the updated files to commit them to the server, and setting the state back to normal. Sometimes the state changes aren't required but that usually means that only admins can check in and out files which in this case is preferable anyhow.

However, I don't know if there are any config, temporary or other files that Python needs to be able to write to while running or else things will break horribly or break in unusual ways or silently.

QuarkJets
Sep 8, 2008

You could set that entire directory to read-only and should be fine, this is what basically all linux distros do with python. You only need write access to update the environment, not to use it (a user trying to use pip to install something should have pip complaining that it can't write to the destination)

Foxfire_
Nov 8, 2010

It is possible (but annoying to do) to set up an entirely self contained directory of a python install + packages where you can just copy it from machine to machine. It's different from a venv in that a venv is mixing a shared python installation with a venv-specific site-packages. The 'embeddable python' packages on python.org are binaries for the base of that. Then you (manually) add whatever else you need inside its site-packages.
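
One way to see that dependency: a venv folder carries a pyvenv.cfg file whose "home" key points back at the Python installation it was created from. A minimal sketch, where the helper name base_python_home and the sample path are hypothetical:

```python
import sys


def in_virtualenv():
    """True when the running interpreter was started from a venv."""
    return sys.prefix != sys.base_prefix


def base_python_home(cfg_text):
    """Return the 'home' entry from the text of a pyvenv.cfg, or None.

    Hypothetical helper: 'home' is the directory of the base Python
    installation that the venv relies on.
    """
    for line in cfg_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "home":
            return value.strip()
    return None


# Sample contents (assumed path) of <venv>/pyvenv.cfg
sample_cfg = r"""home = C:\Python311
include-system-site-packages = false
version = 3.11.0
"""
base_dir = base_python_home(sample_cfg)
```

If the directory named by "home" does not exist on another engineer's machine, the venv's interpreter generally has nothing to fall back on.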

QuarkJets
Sep 8, 2008

Foxfire_ posted:

It is possible (but annoying to do) to set up an entirely self contained directory of a python install + packages where you can just copy it from machine to machine. It's different from a venv in that a venv is mixing a shared python installation with a venv-specific site-packages. The 'embeddable python' packages on python.org are binaries for the base of that. Then you (manually) add whatever else you need inside its site-packages.

One trick is that you can reliably copy around anaconda (miniconda, miniforge, etc.) distributions between machines so long as the destination path is the same. So if you're user jdoe you can rsync your miniconda directory from /home/jdoe/miniconda to othermachine:/home/jdoe/miniconda and everything will Just Work on the destination machine. This has some uses but in my head it goes right up to the edge of me thinking that I should just build a docker container and move that instead

Seventh Arrow
Jan 26, 2005

Seventh Arrow posted:

So at my new job, they have data analysts who are manually cleaning CSV files in Excel whenever they arrive. Obviously this is gross and it feels like they want me to automate the process.

Now I'm pretty familiar with cleaning data in pandas, but I also want to have some sort of interface - like maybe a webpage or UI with a "browse" button that they can click on and upload their filthy CSV. They could click a "start" button and then out the other end pops a minty-fresh CSV for their consumption.

But I'm wondering about this part of it - would it be possible to have a webpage in flask or django with the aforementioned "browse" button, or will this require a tkinter-type interface?

Ok so my pandas csv-cleaning script is coming along, and I've had help with the flask side of things. But I'm eventually going to have to deploy it, and I'm not sure what the standard approach is. I also get the impression at work that they're not really familiar with python either.

Let's assume that they have a webserver available - is it sufficient to just post the .py and .html files (after installing python 3, of course) and have the clients access the webpage? What are the available options? I want to remain open to cloud options, in case they want to go in that direction, and this might work.

I know this is more of a DevOps question, but I've never had to deploy a python script before, much less a flask app.

a foolish pianist
May 6, 2007

(bi)cyclic mutation

It’s pretty easy to turn a flask app into a docker container, and you can deploy those almost anywhere (or have people run them locally).

Seventh Arrow
Jan 26, 2005

Thank you, I will check into that.

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug
Been working on a commandline script at work and man, https://inquirerpy.readthedocs.io/en/latest/ is really nice! It handles validation and other commandline prettification and seems to work just fine over ssh too.

Tacos Al Pastor
Jun 20, 2003

Having a bit of an argument with a friend regarding private functions in Python. He believes they can't be done. I indicated that a class that contains a private copy of a method technically is a private function, like the example in section 9.6 here: https://docs.python.org/3/tutorial/classes.html#tut-private.

Please convince me I'm either right or wrong :D

PS this all goes back to an ADT argument where ADTs are not implemented the same on Python as say, C.

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug

Tacos Al Pastor posted:

Having a bit of an argument with a friend regarding private functions in Python. He believes they can't be done. I indicated that a class that contains a private copy of a method technically is a private function, like the example in section 9.6 here: https://docs.python.org/3/tutorial/classes.html#tut-private.

Please convince me I'm either right or wrong :D

PS this all goes back to an ADT argument where ADTs are not implemented the same on Python as say, C.

At least in my knowledge, I don't think that the Python runtime allows enforcement of private functions/attributes in the way that exist in other languages, so while the example you point to technically sounds like private, it's more like a weird interaction, because you could still instantiate an instance of Mapping and access both update functions. Java and C# and others have much stricter setups where you will error on compilation.

So I think the answer is for all practical purposes: No, python doesn't allow for Private variables.

Edit: The flip side though is that there is a convention for private variables/functions , by starting your name with _, and in general it's a signal that you shouldn't be using that function if you don't know what you're doing.

necrotic
Aug 2, 2005
I owe my brother big time for this!
Yeah it’s not private in that you can’t access it, there is only convention to indicate “this is a private API” with the underscores.

All that example does is immediately make a “private” (only through underscore indication) copy.

You could still overwrite and use that copy to your heart's content.

The top of the section you linked even explicitly says actual private variables do not exist!
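
To make the underscore point concrete, here is a small sketch. The Account class and its attributes are invented for illustration; the behaviour shown (single underscore is convention only, double underscore is name mangling) is standard Python.

```python
class Account:
    def __init__(self, balance):
        self._balance = balance  # single underscore: private by convention only
        self.__pin = "0000"      # double underscore: stored as _Account__pin


acct = Account(100)

# Nothing stops outside code from touching the "private" attribute.
acct._balance = -1

# Name mangling only renames the attribute; it is still reachable.
mangled_pin = acct._Account__pin

# The unmangled name simply does not exist on the instance.
has_plain_name = hasattr(acct, "__pin")
```

Mangling exists to avoid accidental name clashes in subclasses, not to enforce access control.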

necrotic fucked around with this message at 19:17 on Oct 21, 2022

DELETE CASCADE
Oct 25, 2017

i haven't washed my penis since i jerked it to a phtotograph of george w. bush in 2003
you could use a closure to hide private variables. don't, though
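
A minimal sketch of the closure trick, with a made-up make_counter function. Even here the hidden variable is not truly sealed off, since the closure cell can be inspected:

```python
def make_counter():
    count = 0  # lives only in the enclosing scope, not on any object attribute

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()
counter()
counter()

# There is no counter.count to poke at...
has_attribute = hasattr(counter, "count")

# ...but the value can still be read through the closure cell.
peeked = counter.__closure__[0].cell_contents
```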

SurgicalOntologist
Jun 17, 2004

That reminds me, I have an ongoing argument with a colleague about private (by convention) methods and attributes in Python. He wants everything to be private unless there is a strong reason to make it public. His classes typically have only one or two public methods. I prefer everything to be public unless there is a good reason to make it private. I typically only make private methods for little helper methods that I refactor out of other methods.

Is the almost-everything-private convention common in Python? In my experience no, but when we debate he can find examples in relatively well-established open-source projects. We've pretty much agreed to disagree for years but the style is noticeably different depending on which of us started the package (we were two of the first developers at the company and now mostly work on separate projects).

His background is computer vision, not Java if that's what you're thinking.

StumblyWumbly
Sep 12, 2007

Batmanticore!

SurgicalOntologist posted:

That reminds me, I have an ongoing argument with a colleague about private (by convention) methods and attributes in Python. He wants everything to be private unless there is a strong reason to make it public. His classes typically have only one or two public methods. I prefer everything to be public unless there is a good reason to make it private. I typically only make private methods for little helper methods that I refactor out of other methods.

Is the almost-everything-private convention common in Python? In my experience no, but when we debate he can find examples in relatively well-established open-source projects. We've pretty much agreed to disagree for years but the style is noticeably different depending on which of us started the package (we were two of the first developers at the company and now mostly work on separate projects).

His background is computer vision, not Java if that's what you're thinking.

I see this kind of thing playing out in Rust (where, eg, variables are assumed to be constant unless declared differently) vs C (where do what you will is the whole of the law). IMO it comes down to "would you rather your code be safe or flexible?" I'm not hugely involved in the Python world outside my job, but I think most folks agree with your style, but if "I don't want to change" was taken out of the math we'd probably all follow your colleague.

QuarkJets
Sep 8, 2008

SurgicalOntologist posted:

That reminds me, I have an ongoing argument with a colleague about private (by convention) methods and attributes in Python. He wants everything to be private unless there is a strong reason to make it public. His classes typically have only one or two public methods. I prefer everything to be public unless there is a good reason to make it private. I typically only make private methods for little helper methods that I refactor out of other methods.

Is the almost-everything-private convention common in Python? In my experience no, but when we debate he can find examples in relatively well-established open-source projects. We've pretty much agreed to disagree for years but the style is noticeably different depending on which of us started the package (we were two of the first developers at the company and now mostly work on separate projects).

His background is computer vision, not Java if that's what you're thinking.

In my experience it's common to have a mix of private and public methods and attributes, and there tend to be more public ones than private ones. That's the whole point of properties, they let you present public interfaces that can call private methods. I think by simple inertia most projects aren't going to mark things private

My preference is to use mostly public stuff, only marking something private if its use would be detrimental to the user somehow. That's pretty uncommon. I also try to avoid defining a class unless I absolutely need one

OpenCV, PIL, and basically any other computer vision library I can recall have tons of public methods and attributes so it's not like that would be giving him this viewpoint, either.
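
As a sketch of the property point above (the Frame class and its names are hypothetical), a public read-only attribute can sit in front of a helper that is private by convention:

```python
class Frame:
    def __init__(self, width, height):
        self.width = width    # public: safe for callers to read and set
        self.height = height

    def _compute_area(self):
        # Helper marked private by convention; callers go through .area
        return self.width * self.height

    @property
    def area(self):
        """Public, read-only view backed by the private helper."""
        return self._compute_area()


frame = Frame(4, 3)
area = frame.area  # accessed like an attribute, no parentheses
```

Because area has no setter, assigning to frame.area raises AttributeError, while width and height stay freely editable.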

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug
Another example of private functions is stuff like 'When doing X, I split out part of that into Y to make the code cleaner/more readable, but it's not so generic I can shove it off to a helper functions module', where it's less necessarily 'private' and more 'you don't...need to use this.'

QuarkJets
Sep 8, 2008

I just created a numpy ndarray instance because I assumed that it'd have at least a couple of private attributes or methods, but apparently it doesn't have any. So there you go, one of the most widely-used classes in python is entirely public

Tacos Al Pastor
Jun 20, 2003

Thanks for the feedback. I think I have a little better idea what is going on.

CarForumPoster
Jun 26, 2013

⚡POWER⚡
Entry-level Python Jerb

I'm hiring a remote, part time, Python intern or SW engineer 1 for our 12 person, legal tech, Y Combinator backed, startup. Hoping to start someone within 4 weeks. The role has no end date, prefer candidates who are looking for 6+ months.

Depending on your length of stay, you'll develop, maintain and deploy web scrapers, web apps and an ML model in Python. It's a full stack role, you'd be deploying code mostly to AWS Lambda w/docker.

Resume fodder wise, it's top notch. All my interns have gone on to 6 figure remote jobs or internships with FAANGs. This position will develop things that directly move revenue and you'll get to touch every part of the process using a modern tech stack. I can offer resume help for this too (see my post history in the The Resume and Interview ULTRATHREAD). Expectations are high, but it's a friendly atmosphere. Most of our prior engineers text me from time to time and like our company's crappy LinkedIn posts.

Pay is $25/hr and flexible 20-40 hours/wk depending on your availability. Interview process is phone interview, at home coding challenge to build and deploy a simple web app (can be done in 2 hrs), video interview.

The right candidate should:
- be authorized to work in the US
- have a GitHub with at least one developed and working python web app, web scraper, or other project that show where you're at coding skill wise.
- be junior or later if in school. Graduated non-STEM majors or career changes are okay too if you have some coding experience!
- have worked in a professional office for at least 1 year

If interested, PM me your resume.

Rocko Bonaparte
Mar 12, 2002

Every day is Friday!
Is there a project that made a best-effort at parsing a date and time in various formats? I know datetime has functions for this but you need to define a specific formatting string for it. I was hoping at-best I could just tell it to look for US-style dates and then be able to parse:

10/29/22 11:15
10-29-22 11:15
10/29/2022 11:15:30
Oct-29-2022 11:15
10/29/22 11:15 (insert time zone bullshit here)

... You get the drift.

My own scheme is to just split date, time, and timezones subsequences and just gun it but I'd rather not write and maintain it myself.

lazerwolf
Dec 22, 2009

Orange and Black

Rocko Bonaparte posted:

Is there a project that made a best-effort at parsing a date and time in various formats? I know datetime has functions for this but you need to define a specific formatting string for it. I was hoping at-best I could just tell it to look for US-style dates and then be able to parse:

10/29/22 11:15
10-29-22 11:15
10/29/2022 11:15:30
Oct-29-2022 11:15
10/29/22 11:15 (insert time zone bullshit here)

... You get the drift.

My own scheme is to just split date, time, and timezones subsequences and just gun it but I'd rather not write and maintain it myself.

Have you tried https://dateutil.readthedocs.io/en/stable ?

Rocko Bonaparte
Mar 12, 2002

Every day is Friday!

That parse function it has looks about right. I'll give that a go.
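
For comparison, this is roughly what the hand-rolled standard-library route looks like. It is a sketch only, not what dateutil does internally: the function name parse_us_datetime and the format list are assumptions, and time zones are not handled.

```python
from datetime import datetime

# Candidate US-style formats, tried in order (assumed list, extend as needed).
US_FORMATS = [
    "%m/%d/%y %H:%M",
    "%m-%d-%y %H:%M",
    "%m/%d/%Y %H:%M:%S",
    "%b-%d-%Y %H:%M",  # month abbreviation depends on the current locale
]


def parse_us_datetime(text):
    """Return a datetime for the first format that fits, else raise ValueError."""
    for fmt in US_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt)
        except ValueError:
            continue  # this format did not fit, try the next one
    raise ValueError(f"unrecognized date/time: {text!r}")


parsed = parse_us_datetime("10/29/2022 11:15:30")
```

Every new input style means another entry in the list, which is the maintenance burden a best-effort parser takes off your hands.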

Falcon2001
Oct 10, 2004

Eat your hamburgers, Apollo.
Pillbug
I'm so glad I pushed my team to store all datetime strings in ISO 8601 format, full stop.
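
The payoff is that the round trip needs no format string at all. A short sketch with an arbitrary example timestamp:

```python
from datetime import datetime, timezone

stamp = datetime(2022, 10, 29, 11, 15, 30, tzinfo=timezone.utc)

text = stamp.isoformat()                 # serialize to an ISO 8601 string
restored = datetime.fromisoformat(text)  # parse it back, offset included
```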

Seventh Arrow
Jan 26, 2005

I have a project where there are three CSV files being imported into pandas for cleaning. The trick is that the client might only import one file at a time. So I don't want the script to try cleaning the columns for contact.csv if there isn't even a dataframe for it.

Having an "if" statement for every command would be tedious, I'm wondering if it might be better to set up a separate Class for each file? Maybe the Class could then depend on the condition - something like IF the filename contains the word "contact", then the contact Class applies. I'm not fully learned on classes so I'm not sure if this would work.

QuarkJets
Sep 8, 2008

Three functions would be better, then just use dictionary dispatch to select the right function based on the type of file being cleaned. If you don't need persistent state then you don't need a class.

Think of these three functions as workflows. Workflow functions should not actually do any work themselves, they should call other functions. So if some specific input needs the operations "replace commas with dots", "round all values to the nearest integer", and "replace the number 2 with the number 4", you would separately define functions that do each of these things and then write a workflow function calling each of those steps.

When a file is given to the program, it should inspect the file and choose an appropriate workflow for it. This selection logic should also exist in a separate function.

Basically try to have each function you write just do 1 thing really well, that's a good goal
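
A minimal sketch of that layout, using plain lists of strings instead of dataframes so it runs on its own. Only contact.csv comes from the question; the "orders" keyword, the function names, and the sample values are hypothetical.

```python
from pathlib import Path


# Step functions: each does exactly one thing.
def replace_commas_with_dots(values):
    return [v.replace(",", ".") for v in values]


def round_to_nearest_integer(values):
    return [round(float(v)) for v in values]


# Workflow functions: no real work, just a sequence of steps.
def clean_contact(values):
    values = replace_commas_with_dots(values)
    return round_to_nearest_integer(values)


def clean_orders(values):
    return replace_commas_with_dots(values)


# Dictionary dispatch: keyword in the filename -> workflow function.
WORKFLOWS = {"contact": clean_contact, "orders": clean_orders}


def select_workflow(filename):
    """Return the workflow whose keyword appears in the filename, or None."""
    stem = Path(filename).stem.lower()
    for keyword, workflow in WORKFLOWS.items():
        if keyword in stem:
            return workflow
    return None


# Only files that were actually uploaded get cleaned.
uploaded = {"contact.csv": ["3,7", "1,2"]}
cleaned = {}
for name, values in uploaded.items():
    workflow = select_workflow(name)
    if workflow is not None:
        cleaned[name] = workflow(values)
```

Adding a third file type means writing one more workflow function and one more dictionary entry, with no extra "if" checks scattered through the cleaning code.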

QuarkJets fucked around with this message at 23:25 on Oct 29, 2022

John DiFool
Aug 28, 2013

CarForumPoster posted:

Entry-level Python Jerb
Pay is $25/hr and flexible 20-40 hours/wk depending on your availability. Interview process is phone interview, at home coding challenge to build and deploy a simple web app (can be done in 2 hrs), video interview.

How do you post this with a straight face? This is less than entry level programming jobs offered 15 years ago.

CarForumPoster
Jun 26, 2013

⚡POWER⚡

John DiFool posted:

How do you post this with a straight face? This is less than entry level programming jobs offered 15 years ago.

It's internship wages. Have gotten plenty of applicants from several Ivy or big name schools. Pays more than Lockheed does locally.

Bad Munki
Nov 4, 2008

We're all mad here.


Rocko Bonaparte posted:

That parse function it has looks about right. I'll give that a go.

Be aware that in some cases, if the string fails to parse, it’s not a fast failure, it might work at it for a bit.

But yeah, we love this for parsing trash dates. Everything from nicely formatted strings to “3 weeks ago”

John DiFool
Aug 28, 2013

CarForumPoster posted:

It's internship wages. Have gotten plenty of applicants from several Ivy or big name schools. Pays more than Lockheed does locally.

In 2006 I made ~30k on a 6-month *internship*. So ~60k a year. ~85k in 2022 dollars. You are absolutely shafting your interns.

Seventh Arrow
Jan 26, 2005

QuarkJets posted:

Three functions would be better, then just use dictionary dispatch to select the right function based on the type of file being cleaned. If you don't need persistent state then you don't need a class.

Think of these three functions as workflows. Workflow functions should not actually do any work themselves, they should call other functions. So if some specific input needs the operations "replace commas with dots", "round all values to the nearest integer", and "replace the number 2 with the number 4", you would separately define functions that do each of these things and then write a workflow function calling each of those steps.

When a file is given to the program, it should inspect the file and choose an appropriate workflow for it. This selection logic should also exist in a separate function.

Basically try to have each function you write just do 1 thing really well, that's a good goal

That makes sense, thanks!

CarForumPoster
Jun 26, 2013

⚡POWER⚡

John DiFool posted:

In 2006 I made ~30k on a 6-month *internship*. So ~60k a year. ~85k in 2022 dollars. You are absolutely shafting your interns.

I don’t care what you experienced in the area you live. It’s above market here.
