|
Just embrace the Oxford comma and not have to deal with any logic other than if the list is >= 3
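A minimal sketch of what that looks like in practice; the function name and the `conjunction` parameter are made up for illustration. The only real branch is the "3 or more" check, and everything shorter falls through to a plain join.

```python
def join_with_oxford_comma(items, conjunction="and"):
    """Join strings into an English list, using the Oxford comma for 3+ items."""
    items = list(items)
    if len(items) >= 3:
        # "a, b, and c": comma-separate everything, conjunction before the last item
        return ", ".join(items[:-1]) + f", {conjunction} {items[-1]}"
    # 0, 1 or 2 items: no commas needed at all
    return f" {conjunction} ".join(items)
```

So `join_with_oxford_comma(["a", "b"])` gives "a and b" with no special casing for the two-item list.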
|
# ? Nov 30, 2017 13:27 |
|
|
# ? Jun 5, 2024 05:51 |
|
a witch posted:Pycharm. Add the library to your project, set a breakpoint in it and run the debugger.
Thermopyle posted:PyCharm is great, but if you don't want to use it, you can use pdb or ipdb.
Foxfire_ posted:pudb's my favorite if you're on unix
Thanks for the ideas.
|
# ? Nov 30, 2017 14:31 |
|
Hughmoris posted:Is there a good way to step through Python code? I found a Python library to parse torrent names (https://github.com/divijbindlish/parse-torrent-name/blob/master/PTN/parse.py) and I can't quite figure out how it works. That code leans pretty heavily on regular expressions, which Python debuggers aren't going to be as helpful with.
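Since a debugger won't step "inside" a regex, one option is to pull a pattern out and run it on its own, with named groups and re.VERBOSE comments so each piece is visible. The pattern below is a hypothetical season/episode matcher written for illustration; it is not copied from parse-torrent-name.

```python
import re

# Hypothetical pattern for illustration only, not taken from the library
EPISODE = re.compile(
    r"""
    [Ss](?P<season>\d{1,2})    # S followed by a 1-2 digit season number
    [Ee](?P<episode>\d{1,2})   # E followed by a 1-2 digit episode number
    """,
    re.VERBOSE,
)

match = EPISODE.search("Some.Show.S01E05.720p.HDTV")
# groupdict() shows exactly which text each named group captured
print(match.groupdict())
```

Doing this for one pattern at a time in a REPL is usually enough to see what each piece of the parser is picking up.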
|
# ? Nov 30, 2017 16:01 |
|
I'm not sure if this is the right place for this question, but here goes. How do I add a CA certificate so that Python and Python-based apps trust TLS certs generated by that CA? I'm running Ansible on a CentOS 7 box. I have playbooks that connect to Windows devices over WinRM protected by a cert generated by the CA. Ansible works fine but the CA is not trusted so I get a lot of verification errors every time I run the playbooks. Various googling has led me down the path of running code:
Ansible, however, stubbornly refuses to verify the certs. I'm at a bit of a loss as to how to get this working now. Can anyone point me in the right direction?
|
# ? Nov 30, 2017 18:02 |
|
Mr Crucial posted:I'm not sure if this is the right place for this question, but here goes. How do I add a CA certificate so that Python and Python-based apps trust TLS certs generated by that CA? I haven't used this authentication method myself, but do the following host vars help? code:
Tigren fucked around with this message at 18:12 on Nov 30, 2017 |
# ? Nov 30, 2017 18:10 |
|
Tigren posted:I haven't used this authentication method myself, but do the following host vars help? code:
|
# ? Nov 30, 2017 18:45 |
|
Mr Crucial posted:They don't unfortunately, because I'm not using certificate authentication, I'm using CredSSP auth over HTTPS which is a different matter. I did try adding the ansible_winrm_cert_pem like so, but it didn't work: Does setting the REQUESTS_CA_BUNDLE environment variable help? It looks like CredSSP auth is handled by requests-credssp. I love that Ansible even just tells you to ignore cert validation. Super secure! quote:When the Ansible controller is running on Python 2.7.9+ or an older version of Python that has backported SSLContext (like Python 2.7.5 on RHEL 7), the controller will attempt to validate the certificate WinRM is using for an HTTPS connection. If the certificate cannot be validated (such as in the case of a self signed cert), it will fail the verification process. Tigren fucked around with this message at 20:53 on Nov 30, 2017 |
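For anyone wanting to try the REQUESTS_CA_BUNDLE suggestion, here is a rough sketch. The helper name, the bundle path and the playbook name are assumptions; the path is the usual system bundle location on CentOS 7, so point it at whichever file actually contains your CA.

```python
import os

# Assumption: usual location of the system CA bundle on CentOS 7.
# Point this at whichever bundle actually contains your CA certificate.
CA_BUNDLE = "/etc/pki/tls/certs/ca-bundle.crt"

def env_with_ca_bundle(ca_bundle=CA_BUNDLE, base_env=None):
    """Return a copy of the environment with REQUESTS_CA_BUNDLE set."""
    env = dict(os.environ if base_env is None else base_env)
    # requests reads this variable when deciding which CA file to verify against
    env["REQUESTS_CA_BUNDLE"] = ca_bundle
    return env

# Usage (hypothetical playbook name):
#   subprocess.run(["ansible-playbook", "site.yml"], env=env_with_ca_bundle(), check=True)
```

This only affects code paths that go through requests, so it is a thing to try rather than a guaranteed fix.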
# ? Nov 30, 2017 20:35 |
|
Tigren posted:Does setting the REQUESTS_CA_BUNDLE environment variable help? It looks like CredSSP auth is handled by requests-credssp. Good idea, but no that didn't help. I ramped up the verbosity of Ansible and I think the fault is actually in a PowerShell module that's part of Ansible itself. It handles the connectivity to Windows but having a peruse through it there's nothing in the way of certificate handling. I've logged a bug on the Ansible Github page for it. And yes, super secure. Getting into all this devops tooling stuff has really highlighted to me how much a shitshow security is across the entire space.
|
# ? Nov 30, 2017 21:36 |
|
Dominoes posted:Hey dudes: How do you actively test/work on functions in your code? This is a broad question; I hope this context helps:
|
# ? Nov 30, 2017 21:41 |
|
unpacked robinhood posted:Little things: code:
FoiledAgain fucked around with this message at 22:58 on Nov 30, 2017 |
# ? Nov 30, 2017 22:53 |
|
Dominoes posted:This appears to be fixed in the latest PyCharm release! Can do everything in the integrated Ipython terminal. FWIW, I was never really clear on what behavior you were looking for. That might be because I almost always write a unit test for whatever function. Like, I might not write the unit test first like a good TDD disciple, but if I get to the point where I'm wanting to run the function, I write a unit test to run the function and then press Ctrl-Shift-F10 to run it. This gets me more unit tests and lets me do set up work and whatever else needs done.
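A small sketch of that workflow, with a made-up `slugify` function standing in for whatever is being worked on. Instead of poking at the function in a console, the "run it and see" step becomes a test case that sticks around afterwards.

```python
import unittest

def slugify(title):
    """Hypothetical function under development."""
    return "-".join(title.lower().split())

class SlugifyTests(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Hello Big World"), "hello-big-world")

    def test_extra_whitespace_is_collapsed(self):
        self.assertEqual(slugify("  Hello   World "), "hello-world")
```

Outside PyCharm, `python -m unittest` from the project directory runs the same tests.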
|
# ? Dec 1, 2017 00:36 |
|
You’ve always been able to highlight code and execute it in the built-in IPython terminal of PyCharm.
|
# ? Dec 1, 2017 01:10 |
|
Apologies if this has been asked in here before. I'm using Google Cloud SDK (Note: not App Engine SDK although they are very similar). In addition to this, I am using google-cloud-bigquery, google-cloud-pubsub, and some other google things. It appears none of these things work properly with pylint. I get unable to import 'google.cloud' stuff everywhere I'm trying to use this stuff. If I bring up a repl and import from google.cloud it works fine and of course running Google's app engine local dev server it runs fine. Why is this such a mess?
|
# ? Dec 1, 2017 01:47 |
|
vikingstrike posted:You’ve always been able to highlight code and execute it in the built in i python terminal of pycharm. Thermopyle posted:FWIW, I was never really clear on what behavior you were looking for.
|
# ? Dec 1, 2017 07:30 |
|
I'm writing a Flask app that accepts file uploads that I then need to process and publish (upload) elsewhere. What's the best way to fork the "process_uploads" process? I can think of several ways (e.g. subprocess.Popen) but maybe I should be using the multiprocessing module? I want the Flask view to add a job to the queue and then call my "process_uploads" class/function/whatever to take and handle those.
|
# ? Dec 1, 2017 18:38 |
|
xgalaxy posted:Apologies if this has been asked in here before. code:
|
# ? Dec 1, 2017 18:48 |
|
mr_package posted:I'm writing a Flask app that accepts file uploads that I then need to process and publish (upload) elsewhere. What's the best way to fork the "process_uploads" process? I can think of several ways (e.g. subprocess.Popen) but maybe I should be using the multiprocessing module? I want the Flask view to add a job to the queue and then call my "process_uploads" class/function/whatever to take and handle those. You shouldn't fork processes or threads from a view. You need a task queue. A simple one is python-rq, a complex featureful one is celery.
|
# ? Dec 1, 2017 19:12 |
|
Thermopyle posted:You shouldn't fork process or threads from a view. I might be misusing the term 'view'. Flask Route? Assuming that's not the same thing? It's a simple enough use case I just want to call a second process on demand rather than setting up queue/polling/workers. The workload is low (less than 10 hits per day) so my main concern is just preventing users from sitting there asking "is anything happening?" after they click the upload button. Is it so horrible to do it this way? I've written a lot of Python over the years but it's all been scripting-style so this kind of async behaviour is new territory for me. I looked at python-rq while researching approaches to this and it looked very good but overkill, but if it's your recommendation I'll do it this way.
|
# ? Dec 1, 2017 19:42 |
|
Yeah, that's a job for a rq or celery setup.
|
# ? Dec 1, 2017 19:50 |
|
Ok I'll take your advice, python-rq. Can someone tell me in brief why it's wrong to do it the other way? Linking to docs/article/whatever is fine, this is obviously a gap in my knowledge and I want to gain more understanding-- I don't have a comp sci background I'm just "good with computers" and never stopped learning. Is it mostly a Python thing? If you were working in C++ / Java would you use the same approach? (Maybe they have queuing built-in to their standard libraries?)
|
# ? Dec 1, 2017 19:58 |
|
mr_package posted:Ok I'll take your advice, python-rq. Can someone tell me in brief why it's wrong to do it the other way? Linking to docs/article/whatever is fine, this is obviously a gap in my knowledge and I want to gain more understanding-- I don't have a comp sci background I'm just "good with computers" and never stopped learning. There's a myriad of reasons, none of which is too convincing on its own. Also it depends on what your server setup is like...nginx, Apache, threaded requests vs green threads, vs processes, blah blah blah. The first time I got bit by one of these myriad reasons was the fact that I opened myself to a DoS attack because each time a view was hit it would spawn a process to process an image, hit some urls, and update the database. Of course, I could come up with some sort of decentralized system to maintain a limited number of processes only and then a queue of backed-up tasks that needed to run. But that would be re-inventing one of the many existing task queue systems. For something as small as you're talking about there are much more lightweight task queues, like http://django-background-tasks.readthedocs.io/en/latest/
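To make the "limited number of workers plus a backlog" idea concrete, here is a minimal in-process sketch using only the standard library. The function names are hypothetical, and this is not a substitute for a real task queue: queued jobs live in memory, so they are lost if the web process restarts.

```python
from concurrent.futures import ThreadPoolExecutor

# At most two jobs run at once; further submissions wait in the executor's
# internal queue instead of spawning a new worker per request.
executor = ThreadPoolExecutor(max_workers=2)

def process_upload(path):
    """Hypothetical stand-in for the slow processing/publishing step."""
    return f"processed {path}"

def handle_upload_request(path):
    """What a view would do: queue the job and return immediately."""
    return executor.submit(process_upload, path)
```

The cap on workers is what keeps a burst of requests from turning into a burst of processes; python-rq or celery give you the same shape plus persistence and separate worker processes.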
|
# ? Dec 1, 2017 20:07 |
|
I want to add multi-threading to a basic webscraper I've been tasked with. I have a list of URLs to spread across threads, but don't want to hit the same host simultaneously. With a list of URLs, some from the same host, some from different hosts, what's the best way to set up thread Queue()s or some other URL pool so each thread can do simultaneous downloads as long as they're from different hosts? This seems like something simple, and something that would be in stdlib collections or itertools, but I'm not seeing it. If it's actually a tricky issue, that's fine, and I'll work on a solution -- I just don't want to re-invent the wheel.
|
# ? Dec 1, 2017 21:26 |
|
onionradish posted:I want to add multi-threading to a basic webscraper I've been tasked with. I have a list of URLs to spread across threads, but don't want to hit the same host simultaneously. Sort them, then use itertools.groupby to split into groups by host. Separate the tasks by host rather than URL.
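A minimal sketch of the sort-then-groupby approach, assuming the host is taken from the URL's netloc; the helper names are made up for illustration.

```python
from itertools import groupby
from urllib.parse import urlsplit

def host_of(url):
    """Return the host portion (netloc) of a URL."""
    return urlsplit(url).netloc

def group_urls_by_host(urls):
    """Map each host to the list of its URLs, keeping their original order."""
    # groupby only merges *adjacent* items, so sort by the same key first
    ordered = sorted(urls, key=host_of)
    return {host: list(group) for host, group in groupby(ordered, key=host_of)}
```

Each host's list can then be handed to a single worker as one task, so different hosts download in parallel while any one host is only hit sequentially.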
|
# ? Dec 1, 2017 21:32 |
|
I'm new to Pycharm and utilizing virtual environments, and I'm running Windows 10. When creating a new project in Pycharm, I can't find a module that I want to install (https://github.com/divijbindlish/parse-torrent-name). Is my next best option to open up a console window, activate the virtual environment and install the module? Or is there a way to help Pycharm find the module for installation?
|
# ? Dec 2, 2017 00:28 |
|
Hughmoris posted:I'm new to Pycharm and utilizing virtual environments, and I'm running Windows 10. I usually click the Terminal button in PyCharm and install packages that way. It automatically activates the virtualenv or conda env for the project.
|
# ? Dec 2, 2017 00:36 |
|
Thermopyle posted:I usually click the Terminal button in PyCharm and install packages that way. It automatically activates the virtualenv or conda env for the project. That worked, thanks. PyCharm is a bit overwhelming coming from Vim or Atom.
|
# ? Dec 2, 2017 00:46 |
|
Hughmoris posted:I'm new to Pycharm and utilizing virtual environments, and I'm running Windows 10. Phone posting, but you should be able to open the project interpreter settings and install packages there. https://www.jetbrains.com/help/pycharm/installing-uninstalling-and-upgrading-packages.html
|
# ? Dec 2, 2017 00:48 |
|
Tigren posted:Phone posting, but you should be able to open the project interpreter settings and install packages there. Thanks. That was the initial route I pursued but the package I needed wasn't in the available list.
|
# ? Dec 2, 2017 00:54 |
|
Hughmoris posted:Thanks. That was the initial route I pursued but the package I needed wasn't in the available list. Weird, works for me. What is listed when you click on that "Manage Repositories" button? Mine has https://pypi.python.org/simple listed.
|
# ? Dec 2, 2017 01:33 |
|
Tigren posted:Weird, works for me. My "Manage Repositories" list was initially empty. I added the one that you listed and refreshed available packages and no change, still can't find it. It looks like it might only be displaying Conda packages? A quick google search says this might not be an extremely uncommon issue but I haven't found a solution.
|
# ? Dec 2, 2017 02:23 |
|
Umm, I'm not at my pc but there's a button on the right side at the bottom that switches between virtual environments and conda.
|
# ? Dec 2, 2017 03:13 |
|
Thermopyle posted:Umm, I'm not at my pc but there's a button on the right side at the bottom that switches between virtual environments and conda. That was it. I'm able to find the package. Of course, when I go to install it, it errors out. I get the same error when attempting to install it from CMD but I am able to manually install it with setup.py.
|
# ? Dec 2, 2017 03:42 |
|
Not sure this is the exact thread for it, but I'm using Python to build out a prototype. Long story short, I'm building out a document OCR process/pipeline to extract data from PDF documents. Many of these are just straight up scans of structured documents, hence the OCR bit. I'm using Tesseract for the moment and looking for suggestions on any other OCR solutions I can use. Cloud services are a no go. No Russian software companies either. Otherwise, the customer generally prefers buying commercial software in the end, but until then I just need to prove that this would be useful before we start buying things. Anyone have OCR experience, particularly with structured forms, and recognizing data tables? I've been getting OK results for now. Tabula looks interesting. Tesseract's hOCR output format is a nice way to identify exact locations of each word. I can find fields by certain words/phrases that tend to get OCR'd the best and then locate text relative to these locations. For the tabular stuff I was considering even attempting some sort of clustering to see if that helped pull out wrapped text/phrases. I have some check boxes to pull out and have had some luck cropping with OpenCV and then counting the % of black pixels vs white. But at this point I feel like I'm reinventing the wheel.
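For reference, the check box heuristic described above boils down to something like the sketch below. It works on a plain 2D grid of grayscale values (0 = black, 255 = white) so it runs without OpenCV; in the real pipeline the grid would be the cropped region. The function names and both thresholds are assumptions that would need tuning against actual scans.

```python
def dark_fraction(pixels, threshold=128):
    """Fraction of pixels darker than `threshold` in a 2D grayscale grid (0-255)."""
    total = sum(len(row) for row in pixels)
    if total == 0:
        return 0.0
    dark = sum(1 for row in pixels for value in row if value < threshold)
    return dark / total

def is_checked(pixels, min_dark_fraction=0.2):
    """Treat the box as ticked when enough of the crop is dark (assumed cutoff)."""
    return dark_fraction(pixels) >= min_dark_fraction
```

Cropping slightly inside the box border helps, since the printed outline otherwise counts as dark pixels too.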
|
# ? Dec 3, 2017 05:30 |
|
FAGGY CLAUSE posted:Not sure this is the exact thread for it, but I'm using Python to build out a prototype. I don't have a lot of experience but also have this exact problem so I'm curious what you find out. Why no cloud services though?
|
# ? Dec 3, 2017 13:21 |
|
Classified documents.
|
# ? Dec 3, 2017 14:25 |
|
FAGGY CLAUSE posted:Classified documents. Are you this guy: https://www.youtube.com/watch?v=h6TRYcx74qs
|
# ? Dec 3, 2017 14:47 |
|
I don't know how but you found me.
|
# ? Dec 3, 2017 16:23 |
|
It's probably better to ask in the general programming thread as there's nothing specific to python. I've actually been working on something similar for personal usage. However, my scanner has a "Scan to PDF" option that automatically OCRs documents so I haven't had to worry about the OCR part. I've just been using text parsing to identify the structured parts of PDFs I want to pull out.
|
# ? Dec 3, 2017 19:43 |
|
How is python+selenium for filling out lots of repetitive forms? I noticed that some people on my project team are manually entering the data for 2000+ users into a web portal. They've asked for help but my eyes will fall out of my head if I have to manually type in crap. I have all of the user data in a clean csv file. The steps that are needed are basically:
I used AutoIT for a similar job a few years ago but I figured I'd give Python a try for this (plus I forgot AutoIT).
|
# ? Dec 5, 2017 03:00 |
|
It will work fine.
|
# ? Dec 5, 2017 03:09 |