|
Dominoes posted:What resources do you recommend to learn debugging? I've been coding for a few years, but have never tried, or understood it. You set break points, and it tells you what value each variable has without using temporary print statements? Depends on the debugger, but yeah - you can set a point in the code, and when that line gets executed, the debugger does something. You can have it stop, so you can look at the stack and all the variables in context. Or you can make it log a message or value. Or you can set watch points so it only pauses when some condition is met, so you can inspect the state when some bad value shows up. Or just break on exceptions. Or even modify variables and set the thing running again. It's good! Definitely worth learning if only to get rid of the print statements, but you can do a lot with it. Just makes your life easier.
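For anyone who wants to try this without an IDE, here is a minimal sketch using the standard-library pdb module. The function, the DEBUG flag and the "negative value is bad" rule are all made up for illustration; the point is only to show a pause that fires when a condition is met.

```python
import pdb

DEBUG = False  # hypothetical switch; flip to True to drop into the debugger


def running_total(values):
    """Sum a list of numbers, optionally pausing when a suspicious value shows up."""
    total = 0
    for value in values:
        # Hand-rolled conditional breakpoint: only pause on a "bad" value.
        if DEBUG and value < 0:
            # At the (Pdb) prompt: `p value` prints a variable, `w` shows the
            # stack, `n` steps to the next line, `c` continues running.
            pdb.set_trace()
        total += value
    return total


print(running_total([3, 4, 5]))
```

With DEBUG left at False this just prints the sum. IDE debuggers such as PyCharm's do the same thing without touching the code: you click in the gutter and attach the condition to the breakpoint instead.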
|
# ? Aug 28, 2016 18:41 |
|
Thermopyle posted:Do you use PyCharm? It's got great debugging stuff built-in and I can say more about it if you do. You can also use an IPython shell for this, which I think is easier/better personally. https://www.youtube.com/watch?v=Z4oieDoQEk0&t=380s That is a good explanation of it, and also a good talk on the subject.
|
# ? Aug 28, 2016 18:45 |
|
Thanks dudes! Using PyCharm; got the basics of its debugger working. Going to compare it to ipdb. Seems useful!
|
# ? Aug 29, 2016 01:43 |
|
I'm having trouble using pyenv on the latest OpenBSD. I could post the whole error log, but here are the last couple of lines, maybe it's enough. code:
|
# ? Sep 2, 2016 17:58 |
|
IPython is just amazing, it should be the standard.
|
# ? Sep 2, 2016 23:13 |
|
Not sure if this belongs here, but I'm starting to learn programming with Python and am trying to build a web scraper that stores information in some sort of a database. Planning to scrape like 10 different small websites with items and a handful of attributes. What database would be easy to use with Python? I was thinking something like CouchDB, or SQLite.
|
# ? Sep 3, 2016 09:10 |
|
LochNessMonster posted:What database would be easy to use with Python? I was thinking something like CouchDB, or SQLite. SQLite is easiest because it doesn't require a database server. The catch is that you can't have multiple concurrent writers to a SQLite database, so if you want your scraper to be multithreaded then you should use something else. If you're just starting programming then you are probably not writing a multithreaded scraper anyway, so you should use SQLite. You might also want to look into SQLAlchemy, which is an awesome library for working with all kinds of databases (SQLite included), although it's kind of big and complicated and maybe not very easy to set up for a beginner, especially if you've never worked with something similar before. tl;dr: Use SQLite. SQLite is awesome.
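To give an idea of how little setup that takes, here is a minimal sketch using the sqlite3 module from the standard library. The table layout, column names and sample items are invented for illustration; a real scraper would use whatever attributes it actually collects, and a file path instead of ":memory:".

```python
import sqlite3


def save_items(conn, items):
    """Store (site, name, price) tuples, replacing rows that already exist."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS items ("
        "site TEXT NOT NULL, name TEXT NOT NULL, price REAL, "
        "UNIQUE (site, name))"
    )
    # `with conn` commits on success and rolls back if an exception is raised.
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO items (site, name, price) VALUES (?, ?, ?)",
            items,
        )


# ":memory:" keeps the sketch self-contained; use e.g. "scraper.db" for a real file.
conn = sqlite3.connect(":memory:")
save_items(conn, [("example.com", "widget", 9.99), ("example.com", "gadget", 24.5)])
# A later scrape sees a new price for the same item; the old row is replaced.
save_items(conn, [("example.com", "widget", 8.99)])

rows = conn.execute("SELECT name, price FROM items ORDER BY name").fetchall()
print(rows)
```

The `?` placeholders let sqlite3 handle quoting, which matters when the values come from scraped pages.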
|
# ? Sep 3, 2016 10:39 |
|
Nippashish posted:SQLite is easiest because it doesn't require a database server. The catch is that you can't have multiple concurrent writers to a SQLite database, so if you want your scraper to be multithreaded then you should use something else. If you're just starting programming then you are probably not writing a multithreaded scraper anyway, so you should use SQLite. You might also want to look into SQLAlchemy, which is an awesome library for working with all kinds of databases (SQLite included), although it's kind of big and complicated and maybe not very easy to set up for a beginner, especially if you've never worked with something similar before. I don't think I'll make it multithreaded, definitely not straight away. I'm familiar with databases and can write intermediate SQL queries so I might take a look at SQLAlchemy as well. The last time I did some programming was in college (Java, 15 years ago or something). So besides the general concepts I'm completely new. Especially to Python. Is it ok to post my code here to let others have a look at it and point out design flaws and/or improvements?
|
# ? Sep 3, 2016 10:49 |
|
You can post whatever you want but some people might not want to read your entire codebase (others won't mind!). I think that the best thing you can do is to post code snippets along with questions about bugs/style/syntax/etc. "Is this the right way to do this *paste block of code*"
|
# ? Sep 3, 2016 12:06 |
|
LochNessMonster posted:I don't think I'll make it multithreaded, definitely not straight away. I'm familiar with databases and can write intermediate SQL queries so I might take a look at SQLAlchemy as well. go for it, if it's big enough that somebody who doesn't care is going to need to scroll past it then maybe make a project.log thread and just link it here when you want feedback. the sqlalchemy tutorial uses an in-memory sqlite database, so it's worth checking out http://docs.sqlalchemy.org/en/latest/orm/tutorial.html to get familiar with how the ORM works. you can of course just use raw sql queries everywhere if you want to, but that's no fun
|
# ? Sep 3, 2016 12:16 |
|
Thanks for the advice. I did look at the project.log but felt kinda silly for starting one as it's my first ever project. I currently have 1 long Python script that scrapes a site, turns it into a BeautifulSoup object and parses the information I want. For now I print the output to stdout but I want to start storing it, hence my db question. The code is starting to get longer so I was contemplating splitting it into different scripts. One scraper, one parser and one that writes stuff to the database. I don't want to hammer sites and scrape 10 pages 20 times in a row to test parsing code for example. My biggest question is if I should create methods and/or classes to separate code. I haven't quite figured out when I should use those.
|
# ? Sep 3, 2016 12:41 |
|
LochNessMonster posted:Thanks for the advice. I did look at the project.log but felt kinda silly for starting one as it's my first ever project. first project or five thousandth, there's no entry criteria for project.log. start with functions, classes can come later (if they ever need to). try to break down your functions until each one is doing one thing you can easily test, and/or replace easily later. for example, if my_parser.py contains five functions that have absolutely nothing in common other than being generally related to parsing, then there's no need for a MyParser class. once you start breaking things up in a way that makes sense to you, you'll figure out for yourself how your code should be structured. there's no hard and fast rules around "this MUST be a class", and you could easily write an entire app without using any. also do yourself a favour and try to get in the mindset of test driven development. it might seem like serious overkill for a first project ("why would i need a test for something that just returns a url string? waste of time!"), but i find writing tests up front helps me reason about the flow of an application a lot easier than banging out a bunch of code that just does it first and trying to make it sane later.
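As a rough illustration of the "one small function plus a test" idea, here is a sketch using unittest from the standard library. The page_url function and its "?page=N" scheme are hypothetical; the real scraper's URLs will look different.

```python
import unittest


def page_url(base_url, page):
    """Return the URL for one page of a paginated listing (hypothetical scheme)."""
    return "{}?page={}".format(base_url.rstrip("/"), page)


class PageUrlTest(unittest.TestCase):
    def test_first_page(self):
        self.assertEqual(
            page_url("http://example.com/items", 1),
            "http://example.com/items?page=1",
        )

    def test_trailing_slash_is_dropped(self):
        self.assertEqual(
            page_url("http://example.com/items/", 2),
            "http://example.com/items?page=2",
        )


# Run the tests programmatically; from a shell, `python -m unittest` does the same.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(PageUrlTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In the test-first flow described above, the two test methods would be written before page_url exists, fail once, and then pass when the function is filled in.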
|
# ? Sep 3, 2016 13:04 |
|
LochNessMonster posted:Not sure if this belongs here, but I'm starting to learn programming with Python and am trying to build a webscraper that stores information in some sort of a database. i'd recommend sqlite, but i'd also suggest maybe peewee, depending how much sql you want to pick up. ps, i also like lxml and requests, you might too.
|
# ? Sep 3, 2016 13:48 |
|
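LochNessMonster posted:I don't want to hammer sites and scrape 10 pages 20 times in a row to test parsing code for example. You might take a look at vcr.py, which gives an easy way to automatically "record and play back" HTTP traffic so you can run your script/parser as much as you want without hitting the servers while you're testing or doing development on the parser. You just wrap your web request in a with block, or add a decorator (vcr.py supports urllib2, urllib3, requests and others automatically): Python code: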
|
# ? Sep 3, 2016 14:11 |
|
Dex posted:also do yourself a favour and try to get in the mindset of test driven development. it might seem like serious overkill for a first project ("why would i need a test for something that just returns a url string? waste of time!"), but i find writing tests up front helps me reason about the flow of an application a lot easier than banging out a bunch of code that just does it first and trying to make it sane later. Any guides/tutorials/guidelines I can look into on that? What I've done so far is just test every minor thing I'm trying to build as a separate standalone app with static and/or test data. That and print every variable/output/whatever I'm trying to do to confirm it does what I want it to do. tef posted:i'd recommend sqlite, but i'd also suggest maybe peewee, depending how much sql you want to pick up I already found out about requests, it sure makes my life easier. I'll look into lxml and peewee, but my main goal is to learn Python, so I'm not sure if peewee will get a fair chance. onionradish posted:You might take a look at vcr.py, which gives an easy way to automatically "record and play back" HTTP traffic so you can run your script/parser as much as you want without hitting the servers while you're testing or doing development on the parser. I currently write the files to disk and use those files to run my parsing tests again. My parser is pretty straightforward at the moment. I get the first page, see how many pages there are and get those too. I'm not sure how vcr will help me do this easier (without making it too complex for me). Looking at your code snippets confuses me as I have no idea how to use it in my current code. I really am a beginner at the bottom level here.
|
# ? Sep 3, 2016 14:58 |
|
LochNessMonster posted:Any guides/tutorials/guidelines I can look into on that? What I've done so far is just test every minor thing I'm trying to build as a separate standalone app with static and/or test data. the hitchhiker's guide to python is a good starting point: http://docs.python-guide.org/en/latest/writing/tests/ . basically let the code tell you my_cool_function is returning 'i am very cool' by calling it and asserting the value, instead of printing it out and reading it yourself. this is one of the reasons why TDD helps me design my applications, too - instead of "oh i need an app that does x, y, z and also saves the world", break it down to one function you need _now_, write tests for that, then implement it. don't get too hung up on which testing framework you want to use. i use unittest most of the time, because it suits me. if you experiment with a few of them and find one that fits you better, roll with it.
quote:I'm not sure how vcr will help me do this easier (without making it too complex for me). Looking at your code snippets confuses me as I have no idea how to use it in my current code. I really am a beginner at the bottom level here. i actually posted in this thread earlier asking about https://betamax.readthedocs.io/en/latest/ , which is based on vcr. it's very, very cool if your app makes a lot of web requests and you don't want to use stubs or mocks, but might be a bit in-depth if you're not up on unit testing your code yet.
keep in mind for now i guess, but the basic idea for your use case would be:
- you have a function that gets a web page, parses it, then fetches the other eight pages (making up numbers, shh)
- if you have a unit test for this function, it's going to run really slowly since you're waiting for nine network requests to complete
- if you're using vcr/betamax and you've already run that test, the requests (headers, bodies, status codes, the lot) are already written to disk so it'll just replay those instead of making actual web calls
so, an automatic way of doing what you're already doing by saving the files locally. from onionradish's example, synopsis.yaml would contain the request for 'http://www.iana.org/domains/reserved'.
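The record-then-replay idea can be sketched with nothing but the standard library. To be clear, this is not vcr.py's or betamax's API, just a hand-rolled stand-in that caches response bodies only; fetch_cached, fake_fetch and the example URL are all made up. The real libraries also record headers and status codes and hook into the HTTP client for you.

```python
import hashlib
import os
import tempfile


def fetch_cached(url, cache_dir, fetch):
    """Return the body for `url`, calling `fetch` only if no copy is on disk yet."""
    os.makedirs(cache_dir, exist_ok=True)
    # Hash the URL so it is safe to use as a file name.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".html")
    if os.path.exists(path):
        # Replay: the "recording" already exists, so skip the network.
        with open(path, encoding="utf-8") as f:
            return f.read()
    # Record: do the real request once and keep the result.
    body = fetch(url)
    with open(path, "w", encoding="utf-8") as f:
        f.write(body)
    return body


calls = []


def fake_fetch(url):
    """Stand-in for a real HTTP call so the sketch runs offline."""
    calls.append(url)
    return "<html>page for {}</html>".format(url)


cache_dir = tempfile.mkdtemp()
first = fetch_cached("http://example.com/items?page=1", cache_dir, fake_fetch)
second = fetch_cached("http://example.com/items?page=1", cache_dir, fake_fetch)
print(first == second, len(calls))
```

In a real scraper the `fetch` argument would be something like a small wrapper around requests.get that returns the response text; the parsing tests then run against the cached copies as often as needed.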
|
# ? Sep 3, 2016 16:08 |
|
Hey all, quick question: I've got a Python & Flask project (displaying test results pulled from a SQL database) and I'm looking to deploy it to a proper webserver. What would be the best webserver for this, if it has a relatively low workload (only 10 or so users max)? My first thought was Apache/mod_wsgi but I had a look around and saw CherryPy and it looks ridiculously simple to get going. Also is it worth setting this kind of thing up in a Docker container or nah? Running on a CentOS 7 minimal VM in Hyper-V.
|
# ? Sep 8, 2016 05:34 |
|
I also have a Flask webserver question. I have a Bokeh app that involves an HTML5 video and concurrent animation. It works great in Bokeh server, but in order to have more control over the HTML I ported it to Flask. I'm having a hell of a time getting the video element to work. In the dev server, it loads but isn't seekable. Apparently Werkzeug has an issue with partial content requests so I tried gunicorn with 4 workers. The video wouldn't even load (no errors anywhere as far as I could tell). Then I tried gevent and got the same behavior as with Werkzeug. Has anyone ever got HTML5 video to work with Flask? What are my options here? I'm still learning all the webdev aspects--after reading up on HTML5 video issues I'm surprised it works at all with the Bokeh server. (which uses Tornado... so I also tried some hacky way from StackOverflow of using Tornado to run Flask. Video not seekable) And for the alternative, going back to the Bokeh server. I'll ask here rather than the Google Group since BigRedDot will see it either way. Is there a way to use the server Jinja2 template to embed multiple models in different places? Since there are a bunch of other elements as well using one big layout model is not really an option, as far as I can tell. SurgicalOntologist fucked around with this message at 06:56 on Sep 8, 2016 |
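For context on the partial content issue mentioned above: when the user seeks, the browser typically sends a Range header and expects a 206 Partial Content response carrying just those bytes. A server that ignores the header and always sends the whole file with a 200 can usually play the video but not seek in it. Below is a rough, framework-independent sketch of the header handling; parse_range and content_range are hypothetical helper names, not Flask or Werkzeug functions, and only single ranges are handled.

```python
def parse_range(header, size):
    """Return inclusive (start, end) byte offsets for a single-range header, or None."""
    prefix = "bytes="
    if not header or not header.startswith(prefix) or "," in header:
        return None  # missing, unknown unit, or multi-range (not handled here)
    first, sep, last = header[len(prefix):].partition("-")
    if not sep:
        return None
    try:
        if first == "":
            # Suffix form, e.g. "bytes=-500" means the last 500 bytes.
            suffix_length = int(last)
            if suffix_length <= 0:
                return None
            start, end = max(size - suffix_length, 0), size - 1
        else:
            start = int(first)
            # Open-ended form, e.g. "bytes=500-" means from 500 to the end.
            end = int(last) if last else size - 1
    except ValueError:
        return None
    end = min(end, size - 1)
    if start < 0 or start > end:
        return None  # unsatisfiable; a server would answer 416 here
    return start, end


def content_range(start, end, size):
    """Build the Content-Range value that goes with a 206 response."""
    return "bytes {}-{}/{}".format(start, end, size)


span = parse_range("bytes=500-", 1000)
print(span, content_range(span[0], span[1], 1000))
```

Front-end servers such as nginx handle this for static files out of the box, which is one reason to let them serve the video directly.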
# ? Sep 8, 2016 06:51 |
|
priznat posted:Hey all, I prefer gunicorn and nginx for deployment. It's overkill for 10 users, but then so is any web server. I googled this guide up and after a quick scan seems to be good: https://www.digitalocean.com/community/tutorials/how-to-serve-flask-applications-with-gunicorn-and-nginx-on-ubuntu-14-04
|
# ? Sep 8, 2016 14:04 |
|
SurgicalOntologist posted:Has anyone ever got HTML5 video to work with Flask? What are my options here? Can you run Nginx to serve the static files and proxy the other requests to your Python application?
|
# ? Sep 8, 2016 14:24 |
|
rt4 posted:Can you run Nginx to serve the static files and proxy the other requests to your Python application? Doesn't seem to change anything. At least with code:
|
# ? Sep 8, 2016 15:54 |
|
That nginx configuration is just going to proxy everything through flask. You need to route a specific prefix for static files (probably something like ^/static/.*, or by file extension.)
|
# ? Sep 8, 2016 15:58 |
|
priznat posted:Hey all, I prefer using uwsgi instead of gunicorn, but the result is the same and the effort required is not much different. There's this guide by the same person that sets up uwsgi, but I think all the configuration he did was a bit overkill, especially the upstart script.
|
# ? Sep 8, 2016 16:01 |
|
Asymmetrikon posted:That nginx configuration is just going to proxy everything through flask. You need to route a specific prefix for static files (probably something like ^/static/.*, or by file extension.) Thank you, it works. Turned out to be easier than I thought, as is usually the case.
|
# ? Sep 8, 2016 16:08 |
|
SurgicalOntologist posted:Doesn't seem to change anything. At least with You'll need to tell Nginx to attempt to serve static files before proxying. It's called try_files: http://nginx.org/en/docs/http/ngx_http_core_module.html#try_files
|
# ? Sep 8, 2016 16:08 |
|
HardDisk posted:I prefer using uwsgi instead of gunicorn, but the result is the same and the effort required is not much different. uwsgi rules, but be warned that the default configuration is Bad and Wrong. The author has a pretty hard-line stance on backwards compatibility and such, so you have to be very explicit to get it to act in a reasonable manner.
|
# ? Sep 8, 2016 16:13 |
|
Didn't know that, thanks. I don't think I use the defaults myself because I got an init script from someone else that worked with uwsgi. The file wasn't exactly extensive, but it wasn't long enough to warrant (for me) a separate file for configuration.
|
# ? Sep 8, 2016 16:33 |
|
Well, got the video to work but turns out communicating with the Bokeh server is pretty complicated. I'd like to go back to just using the vanilla Bokeh app. BigRedDot, is there a way to get more control over the server's template embedding? Edit: Obviously that was me in the Group, thanks for answering. SurgicalOntologist fucked around with this message at 17:50 on Sep 8, 2016 |
# ? Sep 8, 2016 16:48 |
|
What is the best resource to learn Python if I have no programming knowledge? I was looking at the MIT online class but I can't add another time sensitive class to my schedule.
|
# ? Sep 8, 2016 17:16 |
|
goodness posted:What is the best resource to learn python if I have no programming knowledge? I was looking at the MIT online class but I can't add another time sensitive class to my schedule. I've always liked Think Like a Computer Scientist, but there are other highly recommended resources too that someone else can mention. I'd look into Codecademy maybe?
|
# ? Sep 8, 2016 17:21 |
|
Thanks for the responses all! Also I realized I said containers when I meant virtualenv. One of the annoyances I have had with CentOS is that it defaults to 2.7 and the scripts were developed on 3.4. I did some googling around and it looks like going code:
And then pointing the wsgi file to run the Python script in that virtualenv project directory is the right way to go, correct? I'm explaining it badly.
|
# ? Sep 8, 2016 21:55 |
|
It's worth noting that it feels like the thread regulars are mostly using conda instead of virtualenvs because of...reasons that I don't actually remember. I mean, I pretty much just use conda nowadays, but I can't say I remember why I switched. Though, I actually pretty much never install packages in my conda envs with conda...I still always use pip.
|
# ? Sep 8, 2016 23:08 |
|
goodness posted:What is the best resource to learn python if I have no programming knowledge? I was looking at the MIT online class but I can't add another time sensitive class to my schedule. Besides Think Like A Computer Scientist, another good one is Automate the Boring Stuff, which is not as "CS-ey" as TLACS, but more practical. It has you manipulate Excel sheets, work with JSON APIs and scrape some web pages. A site like codewars is pretty decent for finding small problems to practice your Python with. Also, the MIT class is not time sensitive. Just join now and you can still watch all the videos after the class has ended.
|
# ? Sep 9, 2016 18:58 |
|
Thermopyle posted:It's worth noting that it feels like the thread regulars are mostly using conda instead of virtualenvs because of...reasons that I don't actually remember. I mean, I pretty much just use conda nowadays, but I can't say I remember why I switched. I think it's because the majority of people in this thread seem to be doing scientific / dataviz type work and Anaconda gives you all that stuff. For web dev there's no reason to bother.
|
# ? Sep 9, 2016 19:45 |
|
I think part of the reason I started using conda was that it's easier to get precompiled packages for Windows. The big one I always used to run into long ago when I was doing dev on Windows was lxml.
|
# ? Sep 9, 2016 20:33 |
|
good jovi posted:I think it's because the majority of people in this thread seem to be doing scientific / dataviz type work and Anaconda gives you all that stuff. For web dev there's no reason to bother. The prod team at Paypal would disagree https://www.paypal-engineering.com/2016/09/07/python-packaging-at-paypal/
|
# ? Sep 10, 2016 16:39 |
|
I wish they talked about why they went with Miniconda instead of virtualenv. Reading between the lines it sounds like what I mentioned earlier...packages that require compiling.
|
# ? Sep 10, 2016 19:10 |
|
priznat posted:Thanks for the responses all! FYI the virtualenv tool is deprecated and obsolete. Virtual environment functionality is now built in to Python, so to create a virtualenv you should now do code:
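A minimal sketch of the built-in route, driven from Python itself through the standard-library venv module (the same machinery behind `python3 -m venv`). The temporary directory is only there to keep the example self-contained; in a real project the environment would live somewhere like the project folder.

```python
import os
import tempfile
import venv

# Hypothetical location; in a real project this might be ./venv instead.
env_dir = os.path.join(tempfile.mkdtemp(), "venv")

# with_pip=False keeps the sketch fast; pass with_pip=True to bootstrap pip
# into the new environment, as `python3 -m venv` does by default.
venv.create(env_dir, with_pip=False)

# pyvenv.cfg marks the directory as a virtual environment.
print(os.path.exists(os.path.join(env_dir, "pyvenv.cfg")))
```

The environment is built from whichever interpreter runs this code, so running it with a 3.4 interpreter gives a 3.4 environment regardless of what the system default `python` points to.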
|
# ? Sep 10, 2016 19:41 |
|
I usually just use pyvenv. Is this just an alias for python3 -m venv or does it do something slightly different?
|
# ? Sep 10, 2016 20:06 |
|
pyvenv is an alias for python -m venv, and the pyvenv script has been relatively recently deprecated as per Python issue #25154 -- mostly since in many cases it isn't clear which Python interpreter or version you're creating the new virtualenv from. In short:
|
# ? Sep 10, 2016 20:40 |