StumblyWumbly
Sep 12, 2007

Batmanticore!

SurgicalOntologist posted:

That reminds me, I have an ongoing argument with a colleague about private (by convention) methods and attributes in Python. He wants everything to be private unless there is a strong reason to make it public. His classes typically have only one or two public methods. I prefer everything to be public unless there is a good reason to make it private. I typically only make private methods for little helper methods that I refactor out of other methods.

Is the almost-everything-private convention common in Python? In my experience no, but when we debate he can find examples in relatively well-established open-source projects. We've pretty much agreed to disagree for years but the style is noticeably different depending on which of us started the package (we were two of the first developers at the company and now mostly work on separate projects).

His background is computer vision, not Java if that's what you're thinking.

I see this kind of thing playing out in Rust (where, eg, variables are assumed to be constant unless declared differently) vs C (where "do what you will" is the whole of the law). IMO it comes down to "would you rather your code be safe or flexible?" I'm not hugely involved in the Python world outside my job, but I think most folks agree with your style; if "I don't want to change" were taken out of the math, we'd probably all follow your colleague.


StumblyWumbly
Sep 12, 2007

Batmanticore!
I have a Python memory management and speed question: I have a Python program where I'm going to be reading in a stream of data and displaying the data received over the last X seconds. Data outside that window can be junked, I'd like to optimize for speed first, and size second. I know how I'd do this in C, but I'm not sure about Python memory management.
The data are going to be in a live updating graph, so accessing it contiguously would be good.

I have my own ideas about circular buffers or offset pre-allocated buffers, but I have the feeling Python has something off the shelf that will handle this well. Does anything like that exist?
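For what it's worth, `collections.deque` with a `maxlen` behaves exactly like the off-the-shelf circular buffer you're describing (a sketch; the window length in samples stands in for your "last X seconds"):

```python
from collections import deque

# A deque with maxlen acts as a ring buffer: appending past capacity
# silently drops the oldest element, each append in O(1).
WINDOW = 5  # stand-in for "samples covering the last X seconds"
buf = deque(maxlen=WINDOW)

for sample in range(8):
    buf.append(sample)

print(list(buf))  # only the 5 most recent samples survive
```

One caveat for the "access it contiguously" part: a deque is not contiguous memory, so for graphing you'd copy it out each frame (`list(buf)` or `np.fromiter(buf, float)`); if that copy ever dominates, a preallocated numpy array used as a circular buffer is the next step up.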

StumblyWumbly
Sep 12, 2007

Batmanticore!
That can give a divide-by-zero answer, but otherwise is better. The question is easy enough, but the ChatGPT answers are pretty cryptic.

Thanks everyone for the Python tips, I'll definitely play around with deque.

StumblyWumbly
Sep 12, 2007

Batmanticore!

Soylent Majority posted:

That's the basic poo poo I need - now I have something far less dumb:

Lists and iterators are absolutely massive and central to Python. When you're trying to do something in Python, first say "how can I use lists here", and you'll probably come up with a better solution.

StumblyWumbly
Sep 12, 2007

Batmanticore!
I'd like to read streaming data in with Python and have it live update in a line graph. Does anyone have experience with this? Plotly and Matplotlib have animate functions, but I worry that my use case is a little outside what they normally handle. The Plotly examples I see use static buffers, and I need to handle streaming data.
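The usual Matplotlib shape for this is `FuncAnimation` polling the stream each frame. A sketch, assuming a hypothetical `read_latest()` that returns whatever samples arrived since the last frame; the buffering half is kept separate from the plotting half so it stands alone:

```python
from collections import deque

WINDOW = 100  # samples kept on screen
ys = deque(maxlen=WINDOW)

def ingest(samples):
    """Append new samples; the deque drops anything older than the window."""
    ys.extend(samples)
    return list(ys)

def main():
    # Plotting side, sketched; read_latest() is a made-up stand-in for
    # however you pull new samples off your stream.
    import matplotlib.pyplot as plt
    from matplotlib.animation import FuncAnimation

    fig, ax = plt.subplots()
    (line,) = ax.plot([], [])

    def update(_frame):
        data = ingest(read_latest())  # hypothetical stream hook
        line.set_data(range(len(data)), data)
        ax.relim()
        ax.autoscale_view()
        return (line,)

    anim = FuncAnimation(fig, update, interval=50, cache_frame_data=False)
    plt.show()

if __name__ == "__main__":
    main()
```

The deque-based `ingest` means the "junk data outside the window" problem disappears; the animation just redraws whatever the window currently holds.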

StumblyWumbly
Sep 12, 2007

Batmanticore!
Does anyone know any examples of well put together GUIs in Python? I'm mostly a Python hack, and I've got an older wxPython program, and it feels like the presenter (wx.App object) is doing too much work. Like I'll have a separate object that keeps a list of books, and then get_scifi_books() will be defined in the Presenter object. Maybe it doesn't matter because Presenter is the only one using the function, but it feels like I could add in more testing and stuff if get_scifi_books() was defined in a separate object that keeps the books. But then that leaves Presenter mostly empty.

How far does that rabbit hole go? If I want to have 3 buttons for get sci-fi, get history, get educational books, should I just have the book object return a list of button names and the associated functions, and the Presenter just blindly puts the list of buttons together?
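FWIW the "book object returns button names and callbacks, Presenter wires them up blindly" idea can be sketched framework-free (class and method names here are invented for illustration):

```python
class BookShelf:
    """Model: owns the books and decides which buttons make sense for it."""
    def __init__(self, books):
        self._books = books  # list of (title, genre) pairs

    def by_genre(self, genre):
        return [title for title, g in self._books if g == genre]

    def button_spec(self):
        # (label, zero-arg callback) pairs; the presenter never needs to
        # know genres exist. g=g pins the loop variable into each lambda.
        return [(f"Get {g}", lambda g=g: self.by_genre(g))
                for g in ("sci-fi", "history", "educational")]

class Presenter:
    """Presenter: blindly turns the spec into 'buttons' (a dict here)."""
    def __init__(self, model):
        self.buttons = {label: cb for label, cb in model.button_spec()}

shelf = BookShelf([("Dune", "sci-fi"), ("SPQR", "history")])
ui = Presenter(shelf)
print(ui.buttons["Get sci-fi"]())  # ['Dune']
```

The upside is the model is now testable without wx; the downside, as you say, is the Presenter shrinks to glue, which is arguably what a presenter should be.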

StumblyWumbly
Sep 12, 2007

Batmanticore!
I've heard good things about Kivy but haven't tried it. PySimpleGUI sounds right up my alley for a lot of the basic, internal stuff I'm doing. This is all for running local programs so Flask may not be the best.
I'm a little worried that wxPython may fall seriously behind. It didn't move to Python 3 until 3.5 or something, it sounds like not many folks use it, and it just looks like Windows NT.
If I want to do fancy GUIs, is Qt more the way to go or should I just not use Python?

StumblyWumbly
Sep 12, 2007

Batmanticore!

Twerk from Home posted:

How much Python 2 are y'all living with? What's a sane expectation for carrying along Python 2 applications that don't have any change planned but are expected to keep working forever?

If we try and mandate Python 3 by decree but without getting everyone's hearts and minds onboard, there's going to be grumbling for all eternity. My own thoughts on it are that all code does end up being modified at some point, and the less often you encounter Python 2, the more likely you are to make mistakes in it. I'd like to port everything that runs in 2023 to Python 3, full stop.

I'm just tired. Tired of having to remember 2/3 differences, tired of collaboration failures where the two parties are on different Python versions and don't voice this at the start of a conversation, tired of ancient packages.
Haha, my company moved to Python 3 like a year after support stopped for 2 and I thought we were way behind.

You are going to run into trouble hiring, and searching for answers will become harder and harder. You need to update. Interns should be able to do a big chunk of it, unless maybe you're doing something extra strange with low level objects or something.

StumblyWumbly
Sep 12, 2007

Batmanticore!

Zugzwang posted:

How reliable are modern Python-to-exe packages? I recently learned that a research tool I’ve been working on for my job might eventually need to be deployed to outside users, probably closed source. And due to IP reasons on their end, it can’t be a web app hosted on AWS or whatever because their data/results can’t be on a system outside of their company at any time.
I've used pyInstaller, there are times when it is a massive pain but once you get it working it is fine. Things to be aware of:
- Including non-Python files takes some effort, like if you need to load in some settings from a .json that you expect to be included in the EXE
- Create the exe from a venv that only has what you need, otherwise it may be huge
- It is fairly trivial to reverse "compile": the scripts are just zipped up with a Python executable, so don't expect to have any secrets there
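On the non-Python-files point, the usual pattern is `--add-data` at build time plus `sys._MEIPASS` at run time. A sketch (`settings.json` is a made-up example file):

```python
import json
import sys
from pathlib import Path

def resource_path(name: str) -> Path:
    """Resolve a bundled file both frozen and unfrozen.

    PyInstaller one-file builds unpack to a temp dir exposed as
    sys._MEIPASS; outside a bundle we fall back to this script's folder.
    """
    base = Path(getattr(sys, "_MEIPASS", Path(__file__).resolve().parent))
    return base / name

# Build-time counterpart (shell):
#   pyinstaller --onefile --add-data "settings.json:." app.py
# (use ";" instead of ":" as the separator on Windows)

if __name__ == "__main__":
    settings = json.loads(resource_path("settings.json").read_text())
```

The `resource_path` helper is the part people forget: a plain `open("settings.json")` works in dev and then breaks inside the EXE.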


E: ^^^ I'm curious, do you work in academia? What place has people that complain about needing to use Python 3?

StumblyWumbly fucked around with this message at 00:17 on Mar 17, 2023

StumblyWumbly
Sep 12, 2007

Batmanticore!
I'm making some code where I want to put in a target sensor and setting, and get out the channel ID and parameters it should be tested against. So I put in {"Microphone": {"sample_rate": "20000", "gain": 1.4}}, and I get 17: {"noise": 0.07, "rate_tolerance": 0.001}, where 17 is the channel ID. I have some sensors that can have multiple channel IDs associated with them.

I have the system working using just dicts and functions that go through and map from one format to the other. The dicts work great as long as you remember all the keys, I'm trying to move them to objects to help structure the data more.
Moving this into objects works great when one target sensor becomes one set of test parameters, but I feel like I'm missing an easy way to handle one target sensor becoming multiple sets of parameters. What I have now is:
code:
param_dict = {}
for name, values in sensor_list.items():
    test_param = TestObject(name, values)
    param_dict[test_param.id] = test_param
But this falls apart if a sensor can generate multiple test params, which is a special case but it happens so I need to handle it.

I feel like there's a Pythonic way to do this that I'm missing. Maybe TestObject should be an object generator function that always returns a list of the test objects (which is normally going to be length 1), and I always iterate through that list and add the elements to the param_dict? Maybe I should ditch the dict and have an object that holds all the test parameters so I can manage access?

Am I missing something clever here?
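The "factory always returns a list" idea could look like this (TestObject internals and field names are invented stand-ins for yours):

```python
class TestObject:
    def __init__(self, chan_id, name, values):
        self.id = chan_id
        self.name = name
        self.values = values

def make_test_objects(name, values):
    """Always return a list, even in the common single-channel case."""
    chan_ids = values.get("chan_id")
    if not isinstance(chan_ids, list):
        chan_ids = [chan_ids]  # normalize the single-channel case
    return [TestObject(cid, name, values) for cid in chan_ids]

sensor_list = {
    "Microphone": {"chan_id": 17, "noise": 0.07},
    "MultiMic": {"chan_id": [7, 8, 9], "noise": 0.008},
}

param_dict = {}
for name, values in sensor_list.items():
    for obj in make_test_objects(name, values):
        param_dict[obj.id] = obj

print(sorted(param_dict))  # [7, 8, 9, 17]
```

The caller loop never changes, and the special case lives in exactly one place, which is about as Pythonic as this gets.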

StumblyWumbly
Sep 12, 2007

Batmanticore!
Thanks, I did not remember dataclasses exist, so that's useful.
I guess the no-background question is: I'd like to be able to parse through a dict, and generate a list of objects, all the same type, for each item. These lists will all get merged into one master dict of the resulting objects.
Should I just use a function to generate that list of objects and merge them into a single dict, or could I do something with the object design itself so it is a bit more self contained?

The more I think about it, the more I think the generator function that puts out a list of objects is the way to go.

StumblyWumbly
Sep 12, 2007

Batmanticore!
Neat, Enum looks very similar to the mapping system I have, but I'm also not sure why I would prefer Enum over the current option. What I have is pretty much:
code:
sensor_to_target_map = {
  "MIC_1" : { "chan_id": 10, "noise": 0.008, "freq": 20000},
  "MULTI_MIC" : { "chan_id": [7, 8, 9], "noise": [0.008, 0.008, 0.1], "freq": 44000}
}
There are other settings associated with the sensors that get fed in, with the net result being that when someone runs a test with the MULTI_MIC sensor, they get 3 objects with the channel IDs 7, 8, 9, and the test targets are calculated based on the noise and frequency provided by the user and the values in this target map.

I think I could replace that system with an Enum, and I don't think I'd need to change anything that much but I'm also not sure what that would improve? Is it just a matter if the map being more rigidly structured so I don't need to remember what strings to use, or is there something else it opens up?
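The main thing an Enum buys you is that a typo fails loudly at attribute-lookup time instead of as a KeyError mid-test. A sketch reusing your map as member values:

```python
from enum import Enum

class Sensor(Enum):
    # The member value is the settings dict the string key used to point at.
    MIC_1 = {"chan_id": 10, "noise": 0.008, "freq": 20000}
    MULTI_MIC = {"chan_id": [7, 8, 9], "noise": [0.008, 0.008, 0.1], "freq": 44000}

# Attribute access: autocomplete works, and Sensor.MIC_2 raises
# AttributeError immediately instead of a KeyError deep in a test run.
print(Sensor.MIC_1.value["chan_id"])      # 10
print(Sensor["MULTI_MIC"].value["freq"])  # string lookup still works
```

So it's mostly the rigidity you guessed: a closed, spell-checked set of names you can iterate over (`for s in Sensor:`), rather than new functionality.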

StumblyWumbly
Sep 12, 2007

Batmanticore!

ziasquinn posted:

I've decided to try and make another "run" (used loosely here) at learning Python, should I just go through Think Python and Python Docs or are there more recent kind of books/guides/docs for starting off? I know about "Automate the Easy Stuff" and kin, for example.

I always get really bored starting off cause the basics aren't that interesting (but I know this gets better as it builds on itself). For example, I know that tinkering with existing code is fun, but I eventually just hit a wall where I have a hard time determining the kinds of projects or goals to work on as a super novice that won't KILL me. So I lose steam and momentum without having like, a class, forcing me to do it?

I am sure I'm not alone in this?

Do you have a project you'd like to try doing? Like automating some spreadsheet work or renaming files or doing math?
And what's your programming background? There's no one size fits all.

StumblyWumbly
Sep 12, 2007

Batmanticore!
Using ChatGPT for regex is great, AIs are very good at that kind of translation.
But, if you can't figure out the rules you need for reformatting the data, you are pretty much doomed. Since you've been cleaning the data, you probably know the rules, you just haven't formalized them. Hopefully you can get a program to do most of the work, and then manually, eg, delete garbage data at the start of the recording.
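As a concrete example of "formalize the rule you already apply by hand", a cleanup pass is often just a few `re.sub` calls (the patterns here are invented stand-ins for whatever your data needs):

```python
import re

def clean_line(line: str) -> str:
    """Apply the rules we'd otherwise do by hand in a spreadsheet."""
    line = line.strip()
    line = re.sub(r"\s{2,}", " ", line)              # collapse repeated spaces
    line = re.sub(r"^#.*$", "", line)                # drop instrument comment lines
    line = re.sub(r"(\d),(\d{3})\b", r"\1\2", line)  # de-comma 1,234 -> 1234
    return line

print(clean_line("# header junk"))  # ""
print(clean_line("1,234   volts"))  # "1234 volts"
```

Each sub is one rule you can name, test, and argue about, which beats re-deriving them in your head every time a file comes in.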

Out of curiosity how are you controlling these instruments now? VISA bus? Or truly custom software?

StumblyWumbly
Sep 12, 2007

Batmanticore!

Falcon2001 posted:

All programming is an exercise in 'time saved vs time spent' - https://xkcd.com/1205/ is the classic example, but I'd say that sort of work would bother me personally.
I'd also add that time spent learning Python vs cleaning CSVs will pay future dividends, as will time spent thinking through the cleanup process enough to get someone else (or some _thing_ else) to do the work for you.

StumblyWumbly
Sep 12, 2007

Batmanticore!
This might be similar? Our board assembly house tests the hardware with some code we provide, and getting them updates has been a pain for multiple reasons, so I set up some stuff so they can pull the latest version of whatever they need from GitHub.

The steps are: install Git, run a .bat that sets up the username and password (a fine-grained PAT with access just to that repo), and clones the repo with the tool and any other scripts they may need (eg to pull the latest repo). Should work well, but the meeting to get them running is tomorrow.

If you definitely need a web based answer, I think the computer would need something special installed so the web app can interface with the serial or USB or w/e

StumblyWumbly
Sep 12, 2007

Batmanticore!
My company sells sensors, and we have some open source Python libraries for configuring and interfacing with them, but we'd like to make some advanced wireless features that we can sell as subscription software, because that's how the world works now. It seems like the way to paywall a Python library for Windows would be to have a separate server running out of an install/EXE, which does the real work including handling the licensing and communicating with the device, and the Python library would just be an API interface with the local server. Qoitech (https://www.qoitech.com/) does this for automating an interface to their OTII power supplies, and it seems to work ok with some issues (RasPi support seems janky, folks keep forgetting to log out and lose their license, etc).

The issue with just doing it all in Python, of course, is our secrets are completely exposed.

Has anyone seen a better way to do this?

StumblyWumbly
Sep 12, 2007

Batmanticore!

wolrah posted:

So you're saying you want to charge a subscription for a local device talking to a local network service that costs your company nothing ongoing?

If you're not incurring recurring costs but you want to charge them to your customers, that's bullshit.

It's fine to charge a subscription for a cloud service, for technical support, for ongoing software updates, etc. Things that actually have ongoing costs because they involve resources you're paying for, which you can then easily control access to.

Unless I'm reading you wrong what you want to do is not OK and should not be encouraged or assisted. Yes a bunch of lovely vendors already do it, but we don't need more of it in the world.
I don't want to get into the details, but this isn't "we'll sell you a computer but you pay each month if you want to use wi-fi". This would replace our cloud system in some places, give folks local control, and add new features to existing devices. Maybe we'll do a one time license, who knows. Plus we already offer ongoing support and development which is free right now.

There's a lot of details I'm not providing because this is just a noxious conversation to have. I hate capitalism too.

QuarkJets posted:

You said that the libraries are open source, so what secrets are you trying to protect? If you mean "secrets" as in authentication details then those shouldn't be in your codebase anyway.
That's true, we could just grab a key and use that key in the communication with the device. That's a much simpler idea. It will expose a lot of our packet and data format more clearly, but that should be fine.

StumblyWumbly
Sep 12, 2007

Batmanticore!
The main threat would be folks trying to access features on the device that they don't have a license for, so we can do the actual check on the device, which is very secure.

I'm not super strong on cryptography, it feels like we might be doing enough out in the open that the customer can figure out our magic crypto numbers. We'll look into that, but I'm not super worried about it since our customers are mostly businesses.

StumblyWumbly
Sep 12, 2007

Batmanticore!
After a lifetime of stashing stuff in dicts that are painful when I look at them 2 weeks later, I'm thinking of just going whole hog with dataclasses. I get into a lot of situations where I have 20 pieces of hardware, each with a sample rate, range, and enable/disable, or a block of data with a start time, end time, format, 10k samples, and a few other features.

Is there any drawback to using dataclasses more? It seems too good to be true.

Anything to consider in going with Pydantic vs dataclass?
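Roughly: dataclasses are stdlib and near zero cost; Pydantic additionally validates and coerces at construction time, for a bit of runtime overhead and a dependency. A plain-dataclass sketch for the hardware case (field names invented):

```python
from dataclasses import dataclass, field

@dataclass
class ChannelConfig:
    sample_rate: int
    range_g: float = 16.0
    enabled: bool = True

@dataclass
class DataBlock:
    start_time: float
    end_time: float
    fmt: str
    samples: list = field(default_factory=list)  # never a bare mutable default

    def duration(self) -> float:
        return self.end_time - self.start_time

cfg = ChannelConfig(sample_rate=20000)
blk = DataBlock(start_time=0.0, end_time=2.5, fmt="f32")
print(cfg.enabled, blk.duration())  # True 2.5
```

The main gotcha is the `default_factory` requirement for mutable defaults; beyond that there's really no downside versus a dict, and your editor suddenly knows what fields exist. Note that dataclass type hints are not enforced at runtime, which is exactly the gap Pydantic fills.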

StumblyWumbly
Sep 12, 2007

Batmanticore!
Exactly what I was hoping for, thanks!

StumblyWumbly
Sep 12, 2007

Batmanticore!
I've used multiple iterations of PyCharm, and they've all just worked for me. One drawback to PyCharm is that there are so many releases that advice doesn't always work for your particular version.
Is something weird about this computer? Did you install python 2.7 or muck around with the path? Can you run scripts and get them to just print stuff? Is there weird security stuff or some kind of Python framework thing?
Are you sure you are pushing the right buttons and setting break points in the right files?
Hopefully those questions spark something because without sitting at the computer it'll be hard to debug.

StumblyWumbly
Sep 12, 2007

Batmanticore!
Just today, I put in the most basic rear end updates into some code I wrote like a year ago. The file already had the most basic testing possible in it, and I saw it and said "sure, I guess I'll check my work" and found like 3 errors.
Because I wrote those tests a year ago, I was able to finish half a bottle of wine before my long weekend started.
So proud of who I was, that man is my hero

StumblyWumbly
Sep 12, 2007

Batmanticore!

huhu posted:

I have two classes Plotter and Layer. A Plotter consists of an array of layers. Each layer is an array of instructions.

code:
class Plotter:
    def add_layer(self, name: str):
        self._layers[name] = Layer(self)

    def update_layer(self, name: str) -> Layer:
        return self._layers[name]
code:
class Layer:
    def add_line(self, x1, y1, x2, y2):
        self.add_pen_down_point(x1, y1)
        self.add_point(x2, y2)
I've removed a bunch of code for brevity.

If I want to update a layer, I currently do the following:

code:
plotter.update_layer(RED_LAYER).add_line(0, 0, 50, -50)
This feels bad, is there a more pythonic way?


A pattern I like (could be bad, actually, and I'm interested in other opinions on it) is to have add_line return itself, so if you want to add multiple lines you could do:
code:
plotter.update_layer(RED_LAYER).add_line(0, 0, 50, -50) \
       .add_line(1, 2, 3, -50) \
       .add_line(10, 20, 30, -50)
That can be combined with the other ideas folks have mentioned.
Depending on how the code grows, this can make life better or worse.
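The chaining trick is just each mutator ending in `return self` (a minimal sketch, stripped of the plotter details):

```python
class Layer:
    def __init__(self):
        self.points = []

    def add_line(self, x1, y1, x2, y2):
        self.points.append(((x1, y1), (x2, y2)))
        return self  # returning self is what makes the chain work

layer = Layer().add_line(0, 0, 50, -50).add_line(1, 2, 3, -50)
print(len(layer.points))  # 2
```

The usual objection to this "fluent" style is that a method returning `self` looks like it returns a new object, so mutation can surprise readers; worth deciding as a team before it spreads.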

StumblyWumbly
Sep 12, 2007

Batmanticore!
Gantt chart? Or you want something that covers stored values that should not overlap?

StumblyWumbly
Sep 12, 2007

Batmanticore!

BUUNNI posted:

Thanks for the suggestions! Right now I'm looking for help dealing with creating scatterplots using PyPlot (for instance, creating a plot that shows 'UV Index vs. Population Density in World's 20 Largest Cities' with the dots proportionate to city population), and handling dataframes using Pandas, including creating data columns, creating numeric variable from column data, etc...

I've found that Plotly makes better graphs more easily compared to Matplotlib.

StumblyWumbly
Sep 12, 2007

Batmanticore!

Falcon2001 posted:

Speaking of plots and datavis/etc, I'm curious how people would approach this problem:

I've been playing around with Advent of Code stuff, which has a lot of problems involving 2d maps, often on cartesian planes (so X/Y can be negative) - I've struggled quite a bit with storing/handling these. Here's one example (note that probably storing the actual full grid is not the right solution for this puzzle, but it's indicative of the style of problem): https://adventofcode.com/2022/day/15 - I'm trying to get some stuff in order for this year's advent.

I've used numpy ndarray, forcing the indexes to work by offsetting things, but that was super weird to work with and my code became very hard to read. I've recently moved over to using a pandas dataframe with a custom index ( as seen here: https://stackoverflow.com/questions/53494616/how-to-create-a-matrix-with-negative-index-position) but that has it's own weirdnesses.

You might say 'well just store the points in a list!' but there's a fair number of these problems where you're doing neighbor lookups regularly, and having the full grid populated is actually very useful; not to mention debugging/checking your work becomes a lot easier when you have a fully populated grid. I actually threw together my own ndarray -> image function using PIL to turn an ndarray into a graphical representation based on the integer value, but I'm increasingly thinking I'm doing it wrong.

So the question: What's the correct (or at least painless) way to store (and ideally visualize) a 2d cartesian grid where each coordinate can be an arbitrary data type? I'm not looking for code golf style solutions, as I'm using these problems as ways to explore concepts/etc, and so it's more useful to have something that I can dig through and visualize/debug/interact with than a perfect impenetrable one-line solution.

Have you tried using DataFrame.iloc calls with Pandas? That can let you have weirdo indices and also check neighbors

StumblyWumbly
Sep 12, 2007

Batmanticore!
Two tips for writing list comprehensions: include a comment about what you're trying to do, and use descriptive variable names, even if it triples the number of characters.

List comprehensions are often obvious only while you are writing them.

Also it seems like the kind of thing chatgpt would be good at, but I have not tested that
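For example, the same comprehension with and without those two tips (the data is invented):

```python
rows = [("MIC_1", 0.008), ("MULTI_MIC", 0.1), ("ACCEL", 0.05)]

# Write-only version: fine today, cryptic in two weeks.
x = [r[0] for r in rows if r[1] < 0.05]

# Keep only sensors whose noise floor is under the threshold.
NOISE_LIMIT = 0.05
quiet_sensors = [name for name, noise in rows if noise < NOISE_LIMIT]

print(quiet_sensors)  # ['MIC_1']
```

Same bytecode-level work, but the second one reads back as a sentence; tuple unpacking in the `for` clause does a lot of the lifting.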

StumblyWumbly
Sep 12, 2007

Batmanticore!
It's frustrating because it almost fits in a comprehension, but I think the end condition means it won't.

StumblyWumbly
Sep 12, 2007

Batmanticore!
I'm writing a bunch of PyTest code to run some integration tests. We have a top level test dispatcher that sends out some JSON parameters to a system that runs the code, then we use PyTest to check that everything ran correctly. We need the PyTest code to have access to the parameters, so the parameters go into the execution environment, and in conftest.py we grab the test parameters and parameterize them to make it accessible to the individual tests.

This feels like an inelegant solution to a standard situation, so I wonder if there's a better solution. Specifically:
- Am I over using conftest.py? Is it really the best solution to manage input parameters or is there something I'm missing?
- As multiple people add tests, we have a bit of disagreement between philosophies of "Pass an object containing all parameters to each test, let the test grab what it wants" vs "Separate each test parameter to its own pytest parameter in conftest" Are there standards I should pay attention to here?
- Should I just not use PyTest at all? Our test case is more "Check that multiple files meet the parameters provided", not "check one thing against multiple parameters". I like PyTest in general but it definitely feels like an awkward fit here.

StumblyWumbly
Sep 12, 2007

Batmanticore!

monochromagic posted:

This does seem inelegant, and I'll freely admit I'm not quite sure what you're trying to solve here. I'm assuming that "parameters go into the execution environment" means something like passing them as environment variables. I'd use fixtures picking them up instead, rather than trying to use parametrize because that functionality is more for unit tests. With respect to separation, I prefer to separate/isolate tests as much as possible so I wouldn't pass an object around with all parameters in any case.

In general pytest is a super powerful testing suite, and probably has the best ergonomics for basically any language, and much of that power comes from its fixture concept. One thing I would add is to be careful when implementing fixtures as they can contain a lot of "magic behaviour" so isolation and good documentation is essential.

Right, the problem we're solving is testing out sensor hardware. The input parameters say, essentially, "take measurements on channels <A, B, C> at rate <X> for <Y> seconds, <Z> times". Other code runs those parameters on the sensors, this will give us Z files, we want to make sure channels A, B, C are present, with the specified rate and duration. In reality channels A, B, C will each have their own test criteria, like max and min values. We group all those channel specific values into the test parameter object, and in conftest.py we have code like:
Python code:
def pytest_generate_tests(metafunc):
    if "file_name" in metafunc.fixturenames and "channel" in metafunc.fixturenames:
        test_chans = get_channels_from_input_parameters()
        test_order = []
        for file in get_files_from_input_parameters():
            for chan in test_chans:
                test_order.append((file, chan))
        metafunc.parametrize(["file_name", "channel"], test_order, scope="session")
    # Separately handle situations where a test wants just the file name or just the channel
    if "test_parameters" in metafunc.fixturenames:
        test_parameter_object = get_test_param_from_whatever()
        metafunc.parametrize("test_parameters", [test_parameter_object])
This does a few weird things:
- Set the test order so we open the file, test all channels on it, then close the file, to minimize file overhead (not sure if this is a real problem)
- Each test gets the file, channel id, and the test parameters. The test will use the channel ID to pull its data out of the file, and get its parameters from the test parameter object. Parameterizing it at a higher level seems like it would increase overhead a lot
- I _think_ we parameterize the test parameters here because we may change them with command line arguments, or maybe we wanted to just generate the parameter object once?

Since these tests are linked to hardware, they all run on Raspis, so compute power is not huge, but I am going through some steps that reduce isolation, so maybe I should re-evaluate that choice.
In general, I might be better off if I just always write the test parameters to a file, and use a standard fixture to pull it out, instead of trying to handle command line vs environment.
I'm not sure on the best way to handle splitting up the file data or input parameter data without making more of a mess.
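The "always write params to a file, pull them with one standard fixture" version could be as small as this (file name and schema invented; the conftest side is shown as a comment so the loader stays stdlib-only):

```python
import json
import os
from pathlib import Path

DEFAULT_PARAMS_FILE = "test_params.json"  # made-up convention

def load_test_parameters(path=None):
    """Single source of truth: the dispatcher writes the file, tests read it."""
    path = Path(path or os.environ.get("TEST_PARAMS_FILE", DEFAULT_PARAMS_FILE))
    return json.loads(path.read_text())

# In conftest.py this collapses to one session-scoped fixture:
#
#   @pytest.fixture(scope="session")
#   def test_parameters():
#       return load_test_parameters()

if __name__ == "__main__":
    import tempfile
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
        json.dump({"rate": 20000, "channels": [7, 8, 9]}, f)
    params = load_test_parameters(f.name)
    print(params["channels"])  # [7, 8, 9]
```

That sidesteps the command line vs environment split entirely: both paths end in the same file, and the fixture has no magic in it.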

StumblyWumbly
Sep 12, 2007

Batmanticore!
Ok, thanks, sounds like things are a little ugly but I'm not missing anything big. Changing the file format is a neat idea but not possible because the format is part of the test, and pre-processing it would be a big change.
We are using a dataclass for the test parameters. I may move it into Pydantic for a little extra checking on input parameters.

StumblyWumbly
Sep 12, 2007

Batmanticore!
I don't want to sound old, but what's up with kids today and their async statements?

My real question is I have some code running in a gui, my model may be modified by mouse events and I'm pretty sure the gui backend will make sure mouse events and display won't overlap, but I'd like to be sure so I wanted to add a mutex. Being lazy, I thought I'd just add a threading mutex, since they should all be pretty much the same, right? That's when I found asyncio.Lock. The documentation says it is "Not thread-safe", but I think it must mean you may need to wait while accessing it, as opposed to ROM which is perfectly thread safe?

I feel like I'm clear on the differences between threads and async. async seems neat and different, but a bit jankier than threads. Is there any real difference or preference between the threading.Lock vs asyncio.Lock? At a low level, they must be doing the same thing, but I could imagine differences in what happens when you need to wait for the lock.
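The practical difference in miniature: `asyncio.Lock` is awaited and only coordinates tasks on one event loop, while `threading.Lock` blocks the whole OS thread (which, inside a coroutine, would stall every other task too). A pure-stdlib sketch of the asyncio side:

```python
import asyncio

counter = 0

async def bump(lock: asyncio.Lock):
    global counter
    async with lock:            # suspends this *task*, never blocks the thread
        current = counter
        await asyncio.sleep(0)  # yield point: unlocked, this would interleave
        counter = current + 1

async def main():
    lock = asyncio.Lock()
    await asyncio.gather(*(bump(lock) for _ in range(100)))
    return counter

print(asyncio.run(main()))  # 100
```

Delete the `async with lock:` and the read-sleep-write pattern loses increments, which is the whole "not thread-safe, but task-safe" point: the docs' warning means you can't share an asyncio.Lock across threads, not that it's somehow unsafe within one loop.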

StumblyWumbly
Sep 12, 2007

Batmanticore!
Thanks for the help, that clarifies things. I think I expected async to be more just because it's new.

Is it true that if I have thread lock protecting a resource that ends up getting used in async, things are protected? Or is there a potential issue because an async lock would automatically yield or await but the thread lock will not? This question is purely academic

StumblyWumbly
Sep 12, 2007

Batmanticore!

Vulture Culture posted:

You need to be really careful addressing synchronization in programs that are simultaneously using asyncio and worker threads. Trying to acquire a locked lock will block the thread, right? And as a general rule, you don't want to put blocking operations into the event loop. If you spin up worker threads for every synchronous task that could possibly hit a blocking operation, including working with thread-locked resources, you're fine.

So consider what Lock and RLock each do. Lock will block the thread until another thread releases the lock, but all the other tasks on your event loop are running in the same thread, so now as soon as one of them tries to wait on the lock, no other task will ever run again and you're deadlocked. RLock is re-entrant, meaning the same thread can call it multiple times and safely re-acquire the lock, which puts you in the opposite situation: attempts to acquire the lock from other coroutines in the loop, which all run in the same thread as the one you locked from, will always succeed and your lock will never, ever wait.

Ok, thanks, I see what you're getting at. Thread locks will bounce you to another thread, but because of their tie to the thread it wouldn't make sense to send you to another asyncio. I guess the other part is that since async won't yield unless explicitly told to, it doesn't make sense to have an async lock unless you do an async yield. My background is in FPGAs and microcontrollers, where things are just different.

It seems like asyncio is mainly for servicing high latency calls like network queries. Maybe they could be used to help with multiprocessing (which I haven't used) or maybe CUDA, but I assume those systems already have features so async isn't necessary. Is there a use case for this that I'm missing?

StumblyWumbly
Sep 12, 2007

Batmanticore!
One thing to be slightly careful of is that Pydantic released v2 in 2023, so some internet info is outdated. Most significantly, ChatGPT is pretty much unaware of Pydantic v2.
It's a pretty straightforward tool tho, like dataclasses but safer.

StumblyWumbly
Sep 12, 2007

Batmanticore!
Is using docker to do builds with pyinstaller a dumb idea or a great idea?
It seems like a venv could work just as well but there's always something weird going on

StumblyWumbly
Sep 12, 2007

Batmanticore!
What I mean is, we use pyinstaller to make executables for our stuff. It always seems like this is a very delicate process because we have dlls and special files to include, so I've been encouraging folks to set up a docker that can make the build.
I feel like a venv should fix these problems but it also sounds like it hasn't in the past, and I'm not sure if I'm remembering the ancient past or if there are things that just work better outside a virtual environment

StumblyWumbly
Sep 12, 2007

Batmanticore!

Seventh Arrow posted:

Actually, I think I can probably even do everything in a single line:

code:
def front_times(str, n):
    return str[:3] * n if n > 0 else "Please enter a non-negative number"


Your code is correct and generally better, but simple is not always shorter. Using that type of if statement is generally just unnecessarily dense. In some places like list comprehensions I think it can make things more efficient, but if you can use a standard if statement instead you probably should.


StumblyWumbly
Sep 12, 2007

Batmanticore!
I'm playing around with Polars and this seems to work
code:
import polars as pl

file_path = 'my_file.csv'

# Read the CSV into a Polars DataFrame
df = pl.read_csv(file_path)

# Create a new column with only the names (non-numeric values)
numbers = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '0']
df = df.with_columns(new_col=pl.when(~pl.col("first_column").str.contains_any(numbers)).then(pl.col("first_column")))

# Fill the new column down so missing values in 'new_col' get the most recent word
df = df.with_columns(pl.col('new_col').fill_null(strategy="forward"))

# Filter out the rows where there's a name (non-numeric value) in the first column
df = df.filter(pl.col("first_column").str.contains_any(numbers))
print(df)
All you have to do is change your framework and your problems will disappear also change!
Worth noting that polars does away with an explicit index column, but has some fast and powerful filtering.

E: Also worth noting ChatGPT is poo poo at helping with Polars because it is too new and re-uses some common terms.
