|
Hey bros. I'm trying to partially implement PEP 582. I'd like to direct python to make virtualenvs etc, but it's murky how I'd do that. I'm going to assume there's an alias available, but how do I check what it is? Ie it could be `python`, `python3`, `python3.7` etc. One check will be forcing the user to specify a range of versions in a `TOML` file, and checking that `python[3] --version` matches it. edit: Further RFI: I've got the script mostly working, in that it automates operating from a venv in a way that's smoother than Pipenv and Poetry (Major caveat: No locking/dependency deconfliction), but it uses its packages in an in-project-folder venv, ie the usual `venv/a few directories/site-packages` instead of a top-level `__pypackages__`. Do y'all know of any ways around this, so the venv will point to a diff directory? Also, note that I'm making this as a binary file (coded in Rust), to avoid the headache of ensuring the right python install (2? 3? 3.7? user? sudo?) is executing the script, which can be an issue on Linux, esp for new users, and bit me recently with pipenv. Dominoes fucked around with this message at 13:59 on Jul 15, 2019 |
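A rough sketch of the version check described above, in Python rather than the eventual Rust binary, and assuming the required range has been read from the TOML file elsewhere. Note that very old interpreters print `--version` to stderr, so both streams are checked:

```python
import re
import subprocess

def interpreter_version(alias):
    """Return (major, minor, micro) for the given interpreter alias, or None.

    Old Pythons printed `--version` output to stderr, so check both streams.
    """
    try:
        result = subprocess.run(
            [alias, "--version"], capture_output=True, text=True, timeout=5
        )
    except (FileNotFoundError, subprocess.TimeoutExpired):
        return None
    output = result.stdout + result.stderr
    match = re.search(r"Python (\d+)\.(\d+)\.(\d+)", output)
    return tuple(int(part) for part in match.groups()) if match else None

def version_in_range(version, minimum, maximum):
    """Inclusive check of a (major, minor, micro) tuple against a range,
    e.g. one parsed from the project's TOML config."""
    return minimum <= version <= maximum
```

Tuples compare lexicographically, so the range check falls out of plain comparison operators.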
# ? Jul 15, 2019 07:49 |
|
|
|
Dominoes posted:Hey bros. I'm trying to partially implement PEP 582. I'd like to direct python to make virtualenvs etc, but it's murky how I'd do that. I'm going to assume there's an alias available, but how do I check what it is? Ie it could be `python`, `python3`, `python3.7` etc. Since people can have multiple interpreters, you've got to have it so that you either ask the user or have a (required?) argument to see which python interpreter they want to use. Avoid using the alias because that could be anything. You have to use an absolute path to the interpreter. IIRC, some of these types of tools have a ton of checks looking for various places people could have interpreters installed. That is a pretty hairy situation though since sometimes you might have the same python version installed in more than one place. I think the best practice would be to just ask the user to provide the path to the interpreter.
|
# ? Jul 15, 2019 16:54 |
|
Thermopyle posted:IIRC, some of these types of tools have a ton of checks looking for various places people could have interpreters installed. That is a pretty hairy situation though since sometimes you might have the same python version installed in more than one place.
|
# ? Jul 15, 2019 17:04 |
|
You can pretty easily just iterate through possible python versions available on the path? In bash for Linux with "future-proofing": code:
I believe pipenv does something similar when you specify --three or whatever when creating an environment. Could also test for tools like pyenv if you wanted to get really fancy for people who install versions of python but do not add them to their path, but trying to deal with users who have python installed but not on the path is a pointlessly hard endeavor. For the second point, you could potentially use the code:
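The bash snippet above didn't survive the archive; a Python equivalent of the same idea uses `shutil.which` to walk a list of candidate aliases (the exact candidate list here is my own guess at the "future-proofing"):

```python
import shutil

# Candidate names, most specific first; the higher minors are the
# "future-proofing" part.
CANDIDATES = [f"python3.{minor}" for minor in range(9, 4, -1)] + [
    "python3",
    "python",
]

def find_interpreters():
    """Return {alias: absolute path} for every candidate found on PATH."""
    found = {}
    for name in CANDIDATES:
        path = shutil.which(name)
        if path is not None:
            found[name] = path
    return found
```

This sidesteps the alias problem entirely: you end up with concrete paths you can hand to `subprocess`, which is also what the "use an absolute path to the interpreter" advice above boils down to.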
Master_Odin fucked around with this message at 17:17 on Jul 15, 2019 |
# ? Jul 15, 2019 17:13 |
|
Master_Odin posted:For the second point, you could potentially use the
|
# ? Jul 16, 2019 13:58 |
|
Are those statements equivalent? Python code:
e: vvv Thanks ! unpacked robinhood fucked around with this message at 21:02 on Jul 18, 2019 |
# ? Jul 17, 2019 13:40 |
unpacked robinhood posted:Are those statements equivalent: I’m not quite certain, but I wouldn’t be surprised to learn that given matching but differently ordered indices in df1 and df2 you will get different insert orders into df1.extra_column.
|
|
# ? Jul 17, 2019 13:51 |
|
Yeah those could be different depending on the indices. With tolist you'll lose that information.
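A small illustration of that difference, assuming the original (lost) snippet assigned one frame's column into another; these toy frames are my own, not the original code:

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({"b": [10, 20, 30]}, index=[2, 1, 0])

# Assigning a Series aligns on index labels: row with label 0 in df1
# receives df2.loc[0, "b"], regardless of physical order.
df1["aligned"] = df2["b"]

# tolist() throws the index away, so assignment is purely positional.
df1["positional"] = df2["b"].tolist()
```

With matching, identically ordered indices the two are the same; with reordered indices (as here) they diverge, which is exactly the hazard described above.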
|
# ? Jul 17, 2019 14:55 |
|
I'm merging a couple csv's with pandas and am having issues with numbers getting rounded up when using pandas.read_csv. For example: 1904.9999 becomes 1905 Any suggestions? I spent a couple hours on google trying to figure this out and nothing I've tried has worked.
|
# ? Jul 18, 2019 20:55 |
The Fool posted:I'm merging a couple csv's with pandas and am having issues with numbers getting rounded up when using pandas.read_csv. `low_memory=False`, `float_precision="high"`; and fill out the `dtype` argument.
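A sketch of how those three arguments fit together, on toy data rather than the actual files (whether the defaults mangle a given file depends on how messy its columns are; being explicit just removes the guesswork):

```python
import io

import pandas as pd

csv_text = "id,value\n1,1904.9999\n2,0.1\n"

# Explicit dtypes stop pandas from re-inferring column types chunk by
# chunk, and float_precision picks the more careful float parser.
df = pd.read_csv(
    io.StringIO(csv_text),
    low_memory=False,
    float_precision="high",
    dtype={"id": "int64", "value": "float64"},
)
```

`float_precision` also accepts `"round_trip"`, which guarantees the parsed float reproduces the original text when formatted back out.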
|
|
# ? Jul 18, 2019 21:27 |
|
float_precision on its own wasn't working, but adding the other two arguments did. Thanks.
|
# ? Jul 18, 2019 21:50 |
|
I guess Kenneth Reitz is getting out of the game? https://github.com/not-kennethreitz/team/issues/21 I sincerely hope that requests falls into the PSF organization and that they take over governance. It'll also be interesting to see if we do end up with another event-stream event.
|
# ? Jul 18, 2019 22:59 |
|
Master_Odin posted:I guess Kenneth Reitz is getting out of the game? https://github.com/not-kennethreitz/team/issues/21 Oh poo poo, what's the drama this time?
|
# ? Jul 18, 2019 23:03 |
|
Just noticed you can add arbitrary variables to a dataclass, e.g. you can just do my_dataclass.asdf = "asdf" and then my_dataclass.asdf returns "asdf" should you ever need it. Printing my_dataclass doesn't, because the field isn't added to __repr__ but you can print(my_dataclass.asdf). Is this a side effect of dataclass design or is it a feature we are intended to use? edit: this might be normal; I think I've seen it before with other class types, but never used it..?
|
# ? Jul 18, 2019 23:05 |
|
You can do that on an instance of any class. https://docs.python.org/3/tutorial/classes.html#odds-and-ends
|
# ? Jul 18, 2019 23:11 |
|
Think this is a design question with some Python specific aspects: I've got some code that periodically queries an API, processes any new data it finds, and sends that off to another API. This all works fine as a console app I just run in a screen on a $5 Nanode. What I want to do is build a web interface for controlling it (start/stop the polling, check the status if it's currently processing some input, which can take a few minutes) and viewing the incoming data and processed results. It'd be just for my benefit -- no one else would ever interact with this -- but it's an excuse to learn web stuff and refactor the spaghetti I currently have. So I started learning Django; worked out the models and views, parsed my logs to pull in all of the work that's already been done into a database, so far so good. What I'm stuck on is, where does the code that does the work go and how does my Django code interact with it? I'm not even sure what to ask more specifically -- like when I navigate to the page that has the "start the thing" button, what should the corresponding View do when I click it? Am I spawning a thread, a new process? Where does the reference to whatever you'd need to stop the polling (or just check that it's still alive) live? How should the polling/processing code communicate that it's got input (and create an IncomingThing in the database) or that it's finished processing (and create a corresponding ResultThing)? I get that this is broad and I'm dumb so if you just want to point me in the direction of some concepts (or some existing code that does this kind of thing) even that would help, I was at a loss for what to Google. Thanks!
|
# ? Jul 18, 2019 23:18 |
|
KICK BAMA KICK posted:Think this is a design question with some Python specific aspects: I've got some code that periodically queries an API, processes any new data it finds, and sends that off to another API. This all works fine as a console app I just run in a screen on a $5 Nanode. What I want to do is build a web interface for controlling it (start/stop the polling, check the status if it's currently processing some input, which can take a few minutes) and viewing the incoming data and processed results. It'd be just for my benefit -- no one else would ever interact with this -- but it's an excuse to learn web stuff and refactor the spaghetti I currently have. What you're looking for is called a task queue. The canonical solution is called Celery. However, it's very configuration-heavy because it's Enterprise Grade. 95% of the time I prefer python-rq. It's simple with fine docs. Behind the scenes the basic idea works like this: You have a server like Redis running. Your webserver python code puts a message into Redis. Your task queue python code sees that message and runs the tasks you've configured to run when such a message appears.
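The moving parts can be sketched with a stdlib stand-in: a toy in-process queue plus worker thread, not real Redis/python-rq, but the same shape — one side enqueues messages, the other side sees them and runs the configured work:

```python
import queue
import threading

tasks = queue.Queue()
results = []

def worker():
    """Stand-in for the rq worker process: pull jobs and run them."""
    while True:
        job = tasks.get()
        if job is None:          # sentinel: shut the worker down
            break
        func, args = job
        results.append(func(*args))
        tasks.task_done()

def enqueue(func, *args):
    """Stand-in for the webserver side: drop a message and return at once."""
    tasks.put((func, args))

worker_thread = threading.Thread(target=worker, daemon=True)
worker_thread.start()

enqueue(pow, 2, 10)   # the "web request" returns immediately
tasks.put(None)       # tell the worker to exit once the queue drains
worker_thread.join()
```

In the real setup the queue lives in Redis so the web process and the worker process can be separate programs on separate machines; the control flow is otherwise the same.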
|
# ? Jul 18, 2019 23:55 |
|
Master_Odin posted:I guess Kenneth Reitz is getting out of the game? https://github.com/not-kennethreitz/team/issues/21 I hope someone hires him. Looks like PSF is taking over all of KR's stuff. Also, in reading that issue thread I found out that PSF now administers Black! Eventually, the black repo will be moving out of the python repo and into the psf repo.
|
# ? Jul 19, 2019 00:10 |
|
Thermopyle posted:I hope someone hires him.
|
# ? Jul 19, 2019 00:34 |
|
Thermopyle posted:What you're looking for is called a task queue.
|
# ? Jul 19, 2019 01:17 |
|
mr_package posted:Just noticed you can add arbitrary variables to a dataclass, e.g. you can just do my_dataclass.asdf = "asdf" and then my_dataclass.asdf returns "asdf" should you ever need it. Printing my_dataclass doesn't, because the field isn't added to __repr__ but you can print(my_dataclass.asdf).
|
# ? Jul 19, 2019 02:14 |
|
Thermopyle posted:What you're looking for is called a task queue. Me and another dev just made babbys first deployed web app and this is exactly what we did for a function that takes about 3 minutes to run. As a pro tip, the RQ/Redis/Flask/Dash combo doesn't play nice on Windows 10. The worker.py file we had to grab poo poo out of the queue didn't work, so stuff just stacked up in Redis. The front end "web" worker times out after 30s on Heroku so we also had to figure out how to not use a while loop to ask the queue if our jobs were done yet. Still kinda working on that last bit but it was confusing for me for a while. CarForumPoster fucked around with this message at 02:58 on Jul 19, 2019 |
# ? Jul 19, 2019 02:49 |
|
mr_package posted:Just noticed you can add arbitrary variables to a dataclass, e.g. you can just do my_dataclass.asdf = "asdf" and then my_dataclass.asdf returns "asdf" should you ever need it. Printing my_dataclass doesn't, because the field isn't added to __repr__ but you can print(my_dataclass.asdf). You can do this with most objects, even functions. The whole "consenting adults" thing: Python won't stop you from doing something stupid.
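A quick demonstration of the "consenting adults" point — it works on ordinary instances and even on functions, with built-ins as the notable exception:

```python
class Plain:
    pass

obj = Plain()
obj.asdf = "asdf"      # any ordinary instance accepts new attributes...

def func():
    return None

func.counter = 0       # ...and so do functions, since they're objects too
func.counter += 1

# Built-ins are the exception: (42).asdf = "nope" raises AttributeError,
# as do instances of classes that define __slots__.
```

Dataclasses behave the same way because they're just regular classes with generated methods; the new attribute simply isn't a declared field, which is why it's missing from `__repr__`.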
|
# ? Jul 19, 2019 09:07 |
|
Trivia: In old versions of Python, running with the `--version` arg outputs to stderr. Q: The location inside a venv for executables and custom scripts (eg python, ipython, pip etc) is `bin` on Ubuntu, and `Scripts` on Windows. Are there any other names it could have?
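For what it's worth, CPython's venv module only uses those two names as far as I know: `Scripts` on Windows, `bin` everywhere else (macOS included). A quick way to pick the right one, cross-checked against the running interpreter's own layout via `sysconfig`:

```python
import os
import sysconfig

# "Scripts" on Windows, "bin" on every other platform.
scripts_dir = "Scripts" if os.name == "nt" else "bin"

# The interpreter reports its own scripts directory, which ends in the
# same name; useful as a sanity check rather than hardcoding.
reported = os.path.basename(sysconfig.get_path("scripts"))
```

Third-party tools (conda, for instance) lay things out differently, so if you need to support environments that aren't stdlib venvs, trusting `sysconfig` over the hardcoded pair is safer.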
|
# ? Jul 19, 2019 14:52 |
|
CarForumPoster posted:Still kinda working on that last bit but it was confusing for me for a while. This reminds me of when I was first getting into web stuff...it was very confusing and nebulous and magical for a long time to me. Not that this is you, but it reminds me to post something I try to post every once in awhile to help the next person in my position from years ago: The whole idea that your whole program runs from beginning to completion for every request to the webserver, along with the consequences, just took a long time to sink in. Django or Flask or Whatever has all sorts of fancy trappings to make structuring your software easier, but it all boils down behind the scenes to a single function that a webserver calls. The function takes data from the request like query params, headers, and POST data as arguments and returns a thing containing a string describing the response. However long this function takes mostly determines how fast your application is. I always tell people having a hard time figuring out what's going on to try writing a toy HTTP server; it's not terribly hard, there's lots of tutorials, and it really helps you grok wtf is going on.
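That single function is, for Python, literally the WSGI interface. A complete toy application is just a callable that takes the request environ and a `start_response` callback, and returns the body:

```python
def app(environ, start_response):
    """A complete WSGI application: everything Django/Flask do ultimately
    funnels through a callable shaped like this."""
    name = environ.get("QUERY_STRING", "") or "world"
    body = f"hello {name}".encode()
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [body]

# The stdlib can serve it directly, no framework needed:
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, app).serve_forever()
```

Because it's just a function, you can also call it directly in tests with a fake environ, which is a nice way to demystify what the framework is doing.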
|
# ? Jul 19, 2019 22:26 |
|
Thermopyle posted:This reminds me of when I was first getting into web stuff...it was very confusing and nebulous and magical for a long time to me. This is me
|
# ? Jul 20, 2019 16:14 |
|
If I write a dataclass where one of the fields is a dictionary of other dataclass objects you end up addressing them in a mixed format (dots and brackets e.g. my_dataclass.fruits['apple'].weight). Is there a good way to nest dictionaries without mixing addressing schemes like this? Am I just plain Doing It Wrong by making a dataclass turducken or is it ok and normal to work with them in this way? I've been trying to think of a way using a frozen dataclass as the key to the dictionary but so far do not see a way forward. In some ways the best solution is to use an unordered collection, but that means iteration, and for a large enough value of fruits that will probably become too slow.
|
# ? Jul 20, 2019 22:42 |
|
mr_package posted:If I write a dataclass where one of the fields is a dictionary of other dataclass objects you end up addressing them in a mixed format (dots and brackets e.g. my_dataclass.fruits['apple'].weight). Is there a good way to nest dictionaries without mixing addressing schemes like this? Am I just plain Doing It Wrong by making a dataclass turducken or it's ok and normal to work with them in this way? Since you already have a wrapper class, just give it various attr methods that pass through to the dictionary it possesses. E.g. The parent's __getattr__ could return whatever is returned by the dictionary's get() method. That should let you invoke things like my_dataclass.apple.weight
|
# ? Jul 20, 2019 23:17 |
|
QuarkJets posted:Since you already have a wrapper class, just give it various attr methods that pass through to the dictionary it possesses. E.g. The parent's __getattr__ could return whatever is returned by the dictionary's get() method. That should let you invoke things like my_dataclass.apple.weight Thank you this works perfectly just with a simple two lines: code:
So basically: code:
|
# ? Jul 21, 2019 01:10 |
|
That looks like it should work as-is; Python code:
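Since neither code block survived the archive, here's a guess at the pattern being discussed: a `__getattr__` that forwards unknown names to the dict, converting `KeyError` into `AttributeError` so things like `hasattr` still behave:

```python
from dataclasses import dataclass, field

@dataclass
class Fruit:
    weight: float

@dataclass
class Basket:
    fruits: dict = field(default_factory=dict)

    def __getattr__(self, name):
        # __getattr__ is only called when normal lookup fails, so it
        # never shadows real fields like `fruits`.
        try:
            return self.fruits[name]
        except KeyError:
            raise AttributeError(name) from None

basket = Basket(fruits={"apple": Fruit(weight=0.2)})
# basket.apple.weight now works instead of basket.fruits['apple'].weight
```

Translating the `KeyError` matters: without it, `hasattr(basket, "banana")` would blow up with an unexpected exception type instead of returning False.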
QuarkJets fucked around with this message at 01:32 on Jul 21, 2019 |
# ? Jul 21, 2019 01:29 |
|
No setting in PyCharm (have sprung for Professional) to run all my tests before committing? Or is that a thing you configure in Git itself or something?
|
# ? Jul 21, 2019 03:14 |
|
KICK BAMA KICK posted:No setting in PyCharm (have sprung for Professional) to run all my tests before committing? Or is that a thing you configure in Git itself or something? I don't know about PyCharm but you could do that with a git pre-commit hook. I wouldn't advise it though because you should be committing as often as possible and it'll introduce friction to that. What you should do is test every branch before merging it into master.
|
# ? Jul 21, 2019 03:16 |
|
Just use a pre-merge hook or CI
|
# ? Jul 21, 2019 03:59 |
|
Hey bros. It looks like the reason Pipenv and Poetry take a while to lock is that there's no way to pull the deps of packages without downloading them. #1: WTF. #2: Any reason I couldn't make a database, put it online, impl a JSON api, and have it automate caching this? Would need to download every new release of every package once, but then should be GTG. Am I missing anything? For ref, the warehouse API is p good, but is missing this. edit: related. Perfect's not feasible, but we can do better. Expect an early release within a week. Dominoes fucked around with this message at 16:03 on Jul 22, 2019 |
# ? Jul 22, 2019 09:54 |
|
Dominoes posted:Hey bros. It looks like Pipenv and Poetry take a while to lock is because there's no way to pull the deps of packages without downloading them. #1: WTF. #2: Any reason I couldn't make a database, put it online, impl a JSON api, and have it automate caching this? Would need to download every new release of every package once, but then should be GTG. Am I missing anything? I haven't looked into it in detail, but I thought they downloaded them because they hashed the actual content of the downloads to ensure a matching download.
|
# ? Jul 22, 2019 17:43 |
|
Thermopyle posted:I haven't looked into it in detail, but I thought they downloaded them because they hashed the actual content of the downloads to ensure a matching download. You could otherwise resolve deps in a normal way, download only what you need, then hash as a final-step QC; I don't think that's what they're doing. Pipenv is especially bad about this.
|
# ? Jul 22, 2019 17:52 |
|
Basic API The first time it queries a package/specific version, it downloads the package to the server, and pulls the Metadata (Newer dist-info/wheel format; older egg-info not yet supported). This is sort of like what Poetry/Pipenv do locally. Then it caches it, and returns the cached results on future hits. The first time you hit a package/version combo it'll take a while, but will be faster for future hits. Dominoes fucked around with this message at 19:43 on Jul 22, 2019 |
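The caching layer can be sketched like this; `fetch_requires` here is a hypothetical stand-in for the expensive step (downloading the package and reading its dist-info METADATA), which the hosted service would run once per package/version:

```python
import json
from pathlib import Path

def cached_requires(name, version, fetch_requires,
                    cache_file=Path("dep_cache.json")):
    """Return the dependency list for name==version, fetching at most once.

    `fetch_requires(name, version) -> list[str]` is the slow path; its
    result is memoized in a JSON file keyed by "name==version".
    """
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    key = f"{name}=={version}"
    if key not in cache:                      # first hit is slow...
        cache[key] = fetch_requires(name, version)
        cache_file.write_text(json.dumps(cache))
    return cache[key]                         # ...repeat hits are instant
```

Since published release metadata is immutable, entries never need invalidation, which is what makes the "download each release once, serve forever" model workable.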
# ? Jul 22, 2019 19:37 |
|
KICK BAMA KICK posted:Thanks! Lmao I actually knew about these (I remember looking up Huey when someone mentioned it here a few weeks ago) and had thought about using it to schedule processing some old data during downtime when new stuff wouldn't be coming in. Somehow never occurred to me to use it for the main loop itself. I just stumbled across this today: Understand How Celery Works by Building a Clone
|
# ? Jul 23, 2019 22:15 |
|
I have a small percentage of files out of a batch that don't parse well when opened with the default open(..). I've managed to get around this by checking the encoding on each file with filemagic, and setting the encoding parameter accordingly. Does it feel bad-practicy?
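It's a reasonable approach. If pulling in filemagic ever feels heavy, a stdlib-only alternative is to try candidate encodings in order rather than detecting up front (`latin-1` maps every byte, so the loop always terminates):

```python
def read_text_tolerant(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Read a text file, trying each candidate encoding in turn.

    Returns (text, encoding_used). The candidate list is a guess at a
    sensible order, not something from the original post.
    """
    for encoding in encodings:
        try:
            with open(path, encoding=encoding) as handle:
                return handle.read(), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 decodes any byte sequence, so this is effectively unreachable.
    raise UnicodeDecodeError("all", b"", 0, 1, "no candidate encoding worked")
```

The trade-off versus detection: fallback can silently mis-decode (cp1252 and latin-1 accept almost anything), whereas filemagic gives you an explicit answer you can log, so for auditing a batch the detection route is arguably cleaner.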
|
# ? Jul 24, 2019 11:41 |
|
|
|
In case anyone else had this problem...I started to type quote:Is there a better way than a file I git ignore to store secrets like API keys and what not? Like maybe an AWS service that can only be accessed by whitelisted IPs But then I googled like a good boy and there is and it works fine through boto3: https://aws.amazon.com/blogs/aws/aws-secrets-manager-store-distribute-and-rotate-credentials-securely/
|
# ? Jul 24, 2019 13:11 |