spiritual bypass
Feb 19, 2008

Grimey Drawer
Is it just me or does Atom have an unreasonably long startup time?


Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

OnceIWasAnOstrich posted:

Are you intending to make a GUI program that will be high-end and compete partially on the basis of having a nice, smooth GUI after years of development? If so, don't use the web-technologies-based method. Atom, the text editor that framework exists for, has a poo poo GUI that looks superficially pretty but has all sorts of GUI-based annoyances, in my opinion. On the other hand, if your alternative is a hacked-together Qt GUI made by one person who has never done it before, the web framework GUI you come up with will be way better in every way and come together way faster.

Yes, it is reasonable to hold this viewpoint because one example isn't as good as you would like.

After all, you cannot find a Qt GUI that is poo poo.

It is definitely not the case that you have to use your tools correctly.

I mean, all websites are the opposite of "nice, smooth after years of development" and they even have the advantage of working over a lightning fast network instead of being hamstrung by being completely local like we were suggesting here.


(sorry, I just woke up after a rough night)

OnceIWasAnOstrich
Jul 22, 2006

I mean, find me an example of a really well put together smooth HTML based desktop GUI app. Every impressive GUI desktop app I've ever used has been native, and Atom is the best example of a web-based one that I have personal experience with. It's certainly going to be a big deal in the future and will only get better as it gets more work and native development drops off. (Except that native development on mobile platforms is still moving way faster in terms of GUI quality than web-based). Obviously I can find a shitton of awful examples of both, I've just not experienced even a handful of top-tier web framework based GUIs, I was just using the best example I have personal experience with.

Also even most of the best web interfaces would be unacceptably laggy and slow as native apps, even as you sarcastically mentioned they are definitely hamstrung by being network-based for the most part.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

I can't find you an example of a desktop web type GUI with years of development with modern technologies because people haven't been doing that widely for too long and I don't care enough to go searching anyway. Discord is pretty good, but I hardly use it so I hesitate to say it's great and smooth and wonderful, I just don't recall having any problem with it.

Your error wasn't exactly that you were wrong (though you partly were); it was that you thought your one example was the proof you needed.

The argument isn't that websites can be smooth as native, it's that they can be fast enough to not be the shitshow you make them out to be and in fact can be quite good. (I'm not writing off web tech as being just as smooth as Qt or whatever, I just don't know how to do a meaningful comparative benchmark)

In fact, many/most GUIs don't require any sort of "smoothness" superpower because they're simple forms which Qt or whichever web technology can handle without approaching their limits. The differentiating factor is that you may find web tech is easier to use, has way better tooling, has way better widgets, and can be made to look way better...or you may not. I certainly don't think that you will find that you can't make a good and well-performing UI, though (at least in the great majority of cases).

Of course, web tech isn't all sunshine and roses and in many instances it may not be a better choice than something native.


edit: Also...sorry for the post full of snark. I shouldn't have posted when I did.
edit2: Also, wanted to make it clear I'm not particularly arguing for Electron. I'm arguing for the idea in general, there's other ways to do a web tech UI with Python like QtWebKit.

Thermopyle fucked around with this message at 20:59 on Aug 21, 2016

Series DD Funding
Nov 25, 2014

by exmarx

rt4 posted:

Is it just me or does Atom have an unreasonably long startup time?

Can you use it emacs-style where it constantly stays open and you can instantly open a new file in another buffer?

Suspicious Dish
Sep 24, 2011

2020 is the year of linux on the desktop, bro
Fun Shoe

OnceIWasAnOstrich posted:

I mean, find me an example of a really well put together smooth HTML based desktop GUI app. Every impressive GUI desktop app I've ever used has been native, and Atom is the best example of a web-based one that I have personal experience with. It's certainly going to be a big deal in the future and will only get better as it gets more work and native development drops off. (Except that native development on mobile platforms is still moving way faster in terms of GUI quality than web-based). Obviously I can find a shitton of awful examples of both, I've just not experienced even a handful of top-tier web framework based GUIs, I was just using the best example I have personal experience with.

Also even most of the best web interfaces would be unacceptably laggy and slow as native apps, even as you sarcastically mentioned they are definitely hamstrung by being network-based for the most part.

Discord and VS Code are built on Electron and are extremely snappy.

Master_Odin
Apr 15, 2010

My spear never misses its mark...

ladies
Well I'm convinced enough to use electron then and my design needs are way simpler than discord or VS Code. Thanks for the recommendations.

BigRedDot
Mar 6, 2008

rt4 posted:

Is it just me or does Atom have an unreasonably long startup time?

Yes. I ditched Atom and went back to SublimeText specifically because of the horrendous startup times.

Dex
May 26, 2006

Quintuple x!!!

Would not escrow again.

VERY MISLEADING!

Dominoes posted:

Hey, this is probably old news to most of y'all, but using py.test is a clear winner over the builtin unittest. No imports, classes, or inheritance required, just write functions with name starting with 'test', and use an assert statement in the func. Run with 'py.test filename.py'

Compared to builtin unittests, immediate advantages are lack of boilerplate, and not acting finicky about directory structure. I spent a while troubleshooting issues stemming from unittest being picky about relative imports; no issue with py.test.

hypothesis has some weird thing about pytest fixtures that i can't remember off the top of my head so i don't use pytest at all, i'd be curious what you're doing that unittest is boilerplate for you compared to writing your own assertion logic for everything though

for anyone who works heavily with remote APIs, have you used betamax for unit/integration testing? i only discovered it recently but i'm in love, just wondering if anybody has some horror stories with it or if i'm justified in raving about it to everyone

Dominoes
Sep 20, 2007

Dex posted:

hypothesis has some weird thing about pytest fixtures that i can't remember off the top of my head so i don't use pytest at all, i'd be curious what you're doing that unittest is boilerplate for you compared to writing your own assertion logic for everything though

Python code:
import unittest

class Tests(unittest.TestCase):

    def test_things(self):
        self.assertEqual(5, 5)  # assertEqual takes the two values to compare
Python code:
def test_things():
    assert 5 == 5
The code present in the first example but not the second (ie class setup and method names) doesn’t contribute to the code's meaning, and requires lookup/memorization.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Well yeah, but that's because unittest comes with a bunch of abilities that you're not demonstrating there. I mean, if you don't use its advanced abilities that's cool, but unittest using classes isn't just for the hell of it.

SurgicalOntologist
Jun 17, 2004

What's it for, I've always wondered. When choosing between two APIs, and one is built around subclassing, I'm choosing the other one 90% of the time. If the advantage is to be able to use things like setup and teardown methods, I'd rather do it pytest's way. If it turns out I need something like that, add a fixture. No need to build up something more complicated from the beginning just because you might need it later.

Dex
May 26, 2006

Quintuple x!!!

Would not escrow again.

VERY MISLEADING!

Dominoes posted:

Python code:
import unittest

class Tests(unittest.TestCase):

    def test_things(self):
        self.assertEqual(5, 5)  # assertEqual takes the two values to compare
Python code:
def test_things():
    assert 5 == 5
The code present in the first example but not the second (ie class setup and method names) doesn’t contribute to the code's meaning, and requires lookup/memorization.

Python code:
import unittest

class TestThing(unittest.TestCase):

    def test_tuples_equal(self):
        self.assertTupleEqual(tuple1, tuple2)

Python code:
    def test_tuples_equal(self):
        # Spot the bug.
        assert all([a==b for a,b in zip(tuple1,tuple2)])
slightly fairer example and not even really demonstrating why you'd use unittest. use whatever makes you happy, but it's worth keeping in mind that unittest is still used for a reason

SurgicalOntologist posted:

What's it for, I've always wondered. When choosing between two APIs, and one is built around subclassing, I'm choosing the other one 90% of the time. If the advantage is to be able to use things like setup and teardown methods, I'd rather do it pytest's way. If it turns out I need something like that, add a fixture. No need to build up something more complicated from the beginning just because you might need it later.

depends mostly on preference, occasionally what you use in testing(that's the thing i couldn't think of earlier), and what your org is like i guess. i work in a mixed environment so xunit-style tests are a plus, if pytest rocks your world then that's cool too

edit: also depends on your views on subclassing. a lot of my code is made of subclasses so i don't really have a problem doing it in my tests

Dex fucked around with this message at 23:38 on Aug 22, 2016

SurgicalOntologist
Jun 17, 2004

Dex posted:

Python code:
    def test_tuples_equal(self):
        # Spot the bug.
        assert all([a==b for a,b in zip(tuple1,tuple2)])
slightly fairer example and not even really demonstrating why you'd use unittest. use whatever makes you happy, but it's worth keeping in mind that unittest is still used for a reason

But why not just assert a == b?

Anyways I'm specifically curious what is gained from a subclass API, not necessarily trying to join into a py.test vs unittest debate.

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

SurgicalOntologist posted:

What's it for, I've always wondered. When choosing between two APIs, and one is built around subclassing, I'm choosing the other one 90% of the time. If the advantage is to be able to use things like setup and teardown methods, I'd rather do it pytest's way. If it turns out I need something like that, add a fixture. No need to build up something more complicated from the beginning just because you might need it later.

Oh, I'm not claiming that you can't do most/all of what unittest does with other testing frameworks, I'm just talking about providing fair examples.

Nowadays, I always use unittest because it's built-in, super easy to use, and the amount of boilerplate is a little silly to complain about in this age of IDEs and code generation. I mean, depending on how you think about code and what code you normally deal with, I understand why you'd want to use something like pytest. However, you have to really hate boilerplate if the 1 extra line added to your code for each class (plus the 1 import) is enough to cause you to pull in another dependency.

People often miss the fact that modules work a lot like classes, they just move the namespace up one level of abstraction and only let you have one "class" per file.

Personally, I'm completely comfortable with classes and I don't care if my code or other code uses them as long as it's consistent and is using the right tool for the job.

Classes are useful for more than just pure OO programming, they're also useful for grouping related code to help yourself and others understand what goes with what. Modules and packages serve the same purpose, but then you have to context switch to looking at your hierarchy of folders/files, whereas sometimes it just makes more sense to group some stuff into a class. This purpose is why I actually always use unittest classes to make logical bundles of tests even if I'm not using any of the other features a TestCase brings to the table (you can actually use unittest with regular functions, though I've never messed with it).
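For the curious, the "regular functions" route mentioned above goes through unittest.FunctionTestCase. A minimal sketch; the check_* functions and the helper names are made up for illustration, and the results are collected in a TestResult rather than printed:

```python
import unittest


def check_addition():
    # A plain function using a bare assert, no TestCase subclass needed.
    assert 2 + 3 == 5


def check_upper():
    assert "goon".upper() == "GOON"


def build_suite():
    # FunctionTestCase adapts each plain function into a unittest test.
    suite = unittest.TestSuite()
    for func in (check_addition, check_upper):
        suite.addTest(unittest.FunctionTestCase(func))
    return suite


def run_suite():
    # Collect results in a TestResult instead of printing to the terminal.
    result = unittest.TestResult()
    build_suite().run(result)
    return result


if __name__ == "__main__":
    outcome = run_suite()
    print(outcome.testsRun, outcome.wasSuccessful())
```

You lose the assert* helper methods this way, which is probably why it rarely comes up.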

Dex
May 26, 2006

Quintuple x!!!

Would not escrow again.

VERY MISLEADING!

SurgicalOntologist posted:

But why not just assert a == b?

Anyways I'm specifically curious what is gained from a subclass API, not necessarily trying to join into a py.test vs unittest debate.

depends on your test data whether that'll take an age. if you can always, always rely on a == b then unittest's assertions have nothing to offer you though, of course. if you do, then you run the risk of screwing up your assertions like i did in the second block and you now have a passing test for something that isn't right, and might not be immediately obvious to anybody why that is. there's other stuff in unittest too, it's just one of the main reasons i like it

not really sure what your problem with subclassing is so i'm going to say that's really just a preference thing on your part? i prefer subclassing and using setUpClass and setUp over scoping fixtures but it's not better, it's just how i do things. if unittest offers you nothing and pytest suits your needs, happy days, i just don't think a single import and (unittest.TestCase) is significant enough boilerplate to avoid unittest completely
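A rough sketch of the setUpClass / setUp split being described, in case it helps anyone who skipped straight to pytest. The class name, the counters, and the catalog data are invented for the example:

```python
import unittest


class TestInventory(unittest.TestCase):
    class_setups = 0   # how many times setUpClass ran
    test_setups = 0    # how many times setUp ran

    @classmethod
    def setUpClass(cls):
        # Runs once before any test in the class: good for expensive, read-only state.
        cls.class_setups += 1
        cls.catalog = ("apple", "pear")

    def setUp(self):
        # Runs before every test method: each test gets a fresh list.
        type(self).test_setups += 1
        self.basket = []

    def test_add_item(self):
        self.basket.append(self.catalog[0])
        self.assertEqual(self.basket, ["apple"])

    def test_basket_starts_empty(self):
        # Passes no matter what test_add_item did, because setUp rebuilt the basket.
        self.assertEqual(self.basket, [])


def run_inventory_tests():
    # Load and run the class programmatically so the counters can be inspected.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestInventory)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Roughly speaking, setUpClass plays the role of a class-scoped fixture and setUp the role of a function-scoped one.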

SurgicalOntologist
Jun 17, 2004

Dex posted:

depends on your test data whether that'll take an age. if you can always, always rely on a == b then unittest's assertions have nothing to offer you though, of course. if you do, then you run the risk of screwing up your assertions like i did in the second block and you now have a passing test for something that isn't right, and might not be immediately obvious to anybody why that is. there's other stuff in unittest too, it's just one of the main reasons i like it

I still don't understand. What does assertTupleEqual do differently? From the docs I can't figure out any differences including efficiency. Your zip example makes me think that you have in mind short-circuiting, but a quick test shows that Python does short circuit when testing tuple equality (and I'll bite, the bug was not testing the length of the tuples. Although I still have no idea the point you're trying to make as to why you'd ever want to write your test like that). I mean, what's a situation where you cannot "always, always rely on a == b" but are fine testing the elements individually?

And yeah, not liking a subclass-based API is personal preference. I did once have a project where v0.1 required the user to subclass and override certain methods. I eventually changed it to having the user instantiate the class, passing in the necessary functions. That seemed to make the most sense: why make a new class just to create a single instance of it? It also presents a much more straightforward API to the user. I believe someone here pushed me in that direction but I don't remember. Anyways, I feel like a case could be made that in general, APIs should not be based on inheritance, but I am not prepared to make that case myself. Just an intuition. It feels Java-y, I guess. This is all beside the point though. I'm continuing the conversation because I want to find out what these advantages of unittest are that are being implied but not really spelled out.

Using classes to organize functions as Thermopyle discusses makes sense to me, and I've done that before. And personally the boilerplate isn't what led me to py.test, rather it was just getting to write assert statements rather than method calls. But I'm honestly curious what you had in mind when you said "there's a reason unittest uses classes". Just organization?
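The subclass-versus-pass-in-functions change described in this post might look something like the sketch below. Both runner classes are hypothetical stand-ins, not the actual project:

```python
class SubclassRunner:
    """Inheritance-style API: users must subclass and override transform()."""

    def transform(self, value):
        raise NotImplementedError

    def run(self, values):
        return [self.transform(v) for v in values]


class Doubler(SubclassRunner):
    # The user writes a whole class just to supply one behaviour.
    def transform(self, value):
        return value * 2


class CallableRunner:
    """Composition-style API: users pass the behaviour in as a function."""

    def __init__(self, transform):
        self.transform = transform

    def run(self, values):
        return [self.transform(v) for v in values]


via_subclass = Doubler().run([1, 2, 3])
via_callable = CallableRunner(lambda v: v * 2).run([1, 2, 3])
```

Both produce the same result; the second needs no new class when only one instance is ever created.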

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

SurgicalOntologist posted:

I feel like a case could be made that in general, APIs should not be based on inheritance,

APIs should be based on what makes the most sense along many dimensions. Dogmatically claiming one way or the other with regards to inheritance isn't very useful.

So, I know there's this line of thinking in Python-land that says too many things are made with classes. I don't think that's quite right.

I think it's more useful to say that there are things that are designed with classes when there's no benefit and maybe even classes make them worse. There's also things that would benefit from classes.

A more useful thing to believe than "too many things are built with classes" is to think "classes are a powerful and useful tool" and then to understand them inside and out (not just the mechanics of Python classes, but OO programming in general) and then be super-aware of the fact that you might be tempted to overuse them.

Of course, my way is not as easy as being dogmatic, and maybe if you're not going to take the effort to go my way on this, you're better off just thinking they're overused...but I'm not confident about that.

SurgicalOntologist
Jun 17, 2004

Yeah, I agree with you there. There are certainly things that benefit from classes; I'm not a dogmatic "don't use classes" person. I've made a handful of libraries and most of them are heavily based on classes. But an inheritance API is more specific than simply using classes. And I only said "in general" because I didn't want to rule out specific cases where an inheritance API really does make sense. That's why I jumped into this conversation: taking you at face value that unittest is one of those cases, and being curious as to what the benefits are in this specific case. On the surface it looks a lot like a case where the OOP-ness is not really adding anything. I believe you and Dex when you say it does add something, but I would like to know what that is, for my own education. (having never really used unittest but skipped right to pytest).

SurgicalOntologist fucked around with this message at 02:43 on Aug 23, 2016

Dex
May 26, 2006

Quintuple x!!!

Would not escrow again.

VERY MISLEADING!

SurgicalOntologist posted:

I still don't understand. What does assertTupleEqual do differently? From the docs I can't figure out any differences including efficiency. Your zip example makes me think that you have in mind short-circuiting, but a quick test shows that Python does short circuit when testing tuple equality (and I'll bite, the bug was not testing the length of the tuples. Although I still have no idea the point you're trying to make as to why you'd ever want to write your test like that).

i've seen more or less that assert in a codebase before, is why i included it. did i write it? nope. would i write it? probably, but i'm going to pretend not. would it make it through peer review? you'd hope not, but poo poo happens. as it was, that test was green and edited by somebody who didn't _quite_ understand what he was rewriting originally, so the simplified code _should_ have meant a simplified test once it started failing. alas, mistakes were made but green is green. if the standard in place is "use assertTupleEqual when asserting tuples are equal" and so on, it acts as documentation for anybody new scanning through the tests before they do anything. i think this aspect of tests gets overlooked a lot

also i just thought asserting 5 == 5 was a bit of a bullshit example when i asked "what does pytest give you that takes a load of boilerplate in unittest", so i figured i was allowed to use one too :) i've used pytest before and never really had a problem with it, it just doesn't work for me these days so i use unittest.

quote:

Anyways, I feel like a case could be made that in general, APIs should not be based on inheritance, but I am not prepared to make that case myself.

inheritance is about sharing functionality imo, nothing more. if you need functions from three classes in your class, why not just inherit from all of them(assuming your mro isn't haywire)? if those functions could serve the codebase better ripped out of their classes and put somewhere saner, do that instead. if your class requires the user to override stuff and does nothing on its own, then sure, you've probably taken a weird design decision somewhere since that's more of an interface thing. i'm sure somebody else has a fantastic use case for fully abstract classes too, i just don't use any in my own stuff and tend to manage without them

quote:

I want to find out what these advantages of unittest are that are being implied but not really spelled out.

it's built-in, structured nicely and saves me a lot of boilerplate would be the core reasons i like it - i work mostly with json apis now, stuff like self.assertDictContainsSubset is just way simpler to scan through in a PR than converting .keys and .values to sets then using issubset/issuperset, or iterating through expected and comparing to actual, or whatever other method somebody wants to use to say a[keys1-15] are somewhere in dict b and we don't care that the rest doesn't match. if you think the structure sucks and it causes you to write tons of stuff you don't need then that's fine too.
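One caveat worth flagging: assertDictContainsSubset was deprecated back in Python 3.2 and is gone as of 3.12, so on newer interpreters you would need your own helper. A minimal sketch, where assertDictHasItems is an invented name and not part of unittest:

```python
import unittest


class SubsetAssertions(unittest.TestCase):
    def assertDictHasItems(self, expected, actual):
        # Hypothetical helper: every key/value pair in `expected` must appear in `actual`.
        missing = {k: v for k, v in expected.items()
                   if k not in actual or actual[k] != v}
        if missing:
            self.fail("missing or mismatched items: %r" % (missing,))

    def runTest(self):
        # Extra keys in the response (like etag) are ignored by the check.
        response = {"id": 7, "name": "moo", "tags": ["a", "b"], "etag": "xyz"}
        self.assertDictHasItems({"id": 7, "tags": ["a", "b"]}, response)


def subset_check(expected, actual):
    # Returns True when the helper accepts the pair, False when it fails.
    case = SubsetAssertions()
    try:
        case.assertDictHasItems(expected, actual)
    except AssertionError:
        return False
    return True
```

It keeps the "one readable line per assertion in a PR" property without the set juggling.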

i didn't say i thought unittest uses classes for a reason, i said people use unittest for a reason :) (if it does i'm not aware of it, never really thought too strongly about it tbh), but i do think unittest being class based is an xunit thing more than any core philosophy - which has its benefits when you work with people(or are a person) working on other languages more often than not. i think it was some recent episode of python testing(great podcast even if i'm misremembering why i started thinking about it) that had me considering whether or not the xunit origin is a bad thing for people who work mostly in python though, since a lot of the conventions violate what you'd do in your production code

edit: i should refresh pages more often

quote:

I believe you and Dex when you say it does add something, but I would like to know what that is, for my own education. (having never really used unittest but skipped right to pytest).

if you're using pytest and it's all going well you have no real reason to change imo. if you find yourself writing a lot of complicated asserts it might be worth looking at the ones built into unittest instead. that's about it. it can be worthwhile to flip between different test frameworks on different projects just for the sake of it though - doctest is pretty useless for my current work, but it was nice for another project where having the documentation executing itself gave the people using it a degree of confidence that it a) worked b) was documented clearly enough that they didn't have to learn the whole thing just to make a couple of changes later
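For anyone who hasn't seen "documentation executing itself", a small doctest sketch follows. The slugify function is invented for the example, and the runner is driven by hand so the result can be inspected; normally you would just run python -m doctest on the file:

```python
import doctest


def slugify(title):
    """Turn a title into a lowercase, dash-separated slug.

    >>> slugify("Hello Goons")
    'hello-goons'
    >>> slugify("  Spaces   everywhere ")
    'spaces-everywhere'
    """
    return "-".join(title.lower().split())


def run_doctests():
    # Parse the examples out of the docstring and execute them.
    test = doctest.DocTestParser().get_doctest(
        slugify.__doc__, {"slugify": slugify}, "slugify", None, 0)
    results = doctest.DocTestRunner(verbose=False).run(test)
    return results.failed, results.attempted
```

If someone edits the function without updating the docstring, the examples fail, which is the confidence boost being described.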

Dex fucked around with this message at 04:19 on Aug 23, 2016

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

SurgicalOntologist posted:

Yeah, I agree with you there. There are certainly things that benefit from classes; I'm not a dogmatic "don't use classes" person. I've made a handful of libraries and most of them are heavily based on classes. But an inheritance API is more specific than simply using classes. And I only said "in general" because I didn't want to rule out specific cases where an inheritance API really does make sense. That's why I jumped into this conversation: taking you at face value that unittest is one of those cases, and being curious as to what the benefits are in this specific case. On the surface it looks a lot like a case where the OOP-ness is not really adding anything. I believe you and Dex when you say it does add something, but I would like to know what that is, for my own education. (having never really used unittest but skipped right to pytest).

I didn't really claim it was adding anything other than what I mentioned about grouping related tests. I just question that it is costing you anything.

UberJumper
May 20, 2007
woop
Can someone tell me what is the preferred practice for doing imports inside a package for third party packages?

Which is better:

# Foo/bar.py
code:
import moo

def do_something():
    # lots of code
    hats = moo.cat(....)
    # more code
Or doing this?

Foo/utils.py
code:
from moo import cat 
Foo/bar.py
code:
from .utils import cat

def do_something():
    # lots of code
    hats = cat(....)
    # more code
Basically hiding usages of the third party library moo in utils, or using the third party moo explicitly. Some people at work say the first one others say the second one. We already have a pretty big utils sub package that is more or less imported throughout the project.

UberJumper fucked around with this message at 17:38 on Aug 23, 2016

tef
May 30, 2004

-> some l-system crap ->

UberJumper posted:

Can someone tell me what is the preferred practice for doing imports inside a package for third party packages?

it depends :eng101:


quote:

# Foo/bar.py
code:
import moo

def do_something():
    return moo.cat(....)

This is good when you're doing something a lot with moo, or cat isn't a clear function

quote:

Or doing this?

Foo/utils.py
code:
from moo import cat 

This is good when you're only doing cat and it's an obvious name: from urlparse import urlparse

If you're writing a standalone executable script, from x import * is acceptable, but rarely otherwise.

quote:

Foo/bar.py
code:
from .utils import cat

def do_something():
    return cat(....)
Basically hiding usages of the third party library moo in utils, or using the third party moo explicitly. Some people at work say the first one others say the second one. We already have a pretty big utils sub package that is more or less imported throughout the project.
This is good when you might change the moo library later. It's heavyweight when you won't do it. It might not be worth wrapping pytz or requests, but if the API is ugly, a wrapper gives you the chance to make a smaller easier api for the things you need without coupling your code.

It's all a tradeoff.

I tend to

- wrap any big 3rd party module, especially one i'm trying out for size. Often just to set up defaults.
- use import foo over from foo import bar

but usually when i'm writing larger chunks of code, for cheap hacks i'll do whatever
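A thin wrapper "just to set up defaults" could look like the sketch below. Since moo is a placeholder, the standard library's json stands in for the third-party module here, and the module and function names are assumptions for illustration:

```python
# mooutils.py (hypothetical module name; json stands in for the third-party `moo`)
import json as _backend


def dump_config(config):
    # One place to set project-wide defaults instead of repeating them at every call site.
    return _backend.dumps(config, indent=2, sort_keys=True)


def load_config(text):
    # A readable name for callers; swapping the backend later only touches this file.
    return _backend.loads(text)
```

The rest of the codebase imports load_config and dump_config from the wrapper and never mentions the backend by name.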

UberJumper
May 20, 2007
woop

tef posted:

it depends :eng101:


This is good when you're doing something a lot with moo, or cat isn't a clear function


This is good when you're only doing cat and it's an obvious name: from urlparse import urlparse

If you're writing a standalone executable script, from x import * is acceptable, but rarely otherwise.

This is good when you might change the moo library later. It's heavyweight when you won't do it. It might not be worth wrapping pytz or requests, but if the API is ugly, a wrapper gives you the chance to make a smaller easier api for the things you need without coupling your code.

It's all a tradeoff.

I tend to

- wrap any big 3rd party module, especially one i'm trying out for size. Often just to set up defaults.
- use import foo over from foo import bar

but usually when i'm writing larger chunks of code, for cheap hacks i'll do whatever

Thanks!

I am not really wrapping the moo library (it is actually a fairly nice library, just has horrible method names). There is just a bunch of really cryptic method names, that are not up to the PEP8 standard (e.g. 'g_cfg_l' which literally means load_config).

It is also a library that almost every single module in our entire package (probably 100+ files), imports. So i have been debating just putting a bunch of "from moo import g_cfg_l as load_config" into utils/__init__.py then in our code just use:

code:
from .utils import load_config

UberJumper fucked around with this message at 18:15 on Aug 23, 2016

accipter
Sep 12, 2003

UberJumper posted:

Thanks!

I am not really wrapping the moo library (it is actually a fairly nice library, just has horrible method names). There is just a bunch of really cryptic method names, that are not up to the PEP8 standard (e.g. 'g_cfg_l' which literally means load_config).

It is also a library that almost every single module in our entire package (probably 100+ files), imports. So i have been debating just putting a bunch of "from moo import g_cfg_l as load_config" into utils/__init__.py then in our code just use:

code:
from .utils import load_config

I like that last idea. The biggest thing for me is that the import is traceable and the source of the function/class is clear. If you want to rename it, do it once and have all of your references point back to that one renaming.

tef
May 30, 2004

-> some l-system crap ->

UberJumper posted:

Thanks!

I am not really wrapping the moo library (it is actually a fairly nice library, just has horrible method names). There is just a bunch of really cryptic method names, that are not up to the PEP8 standard (e.g. 'g_cfg_l' which literally means load_config).

It is also a library that almost every single module in our entire package (probably 100+ files), imports. So i have been debating just putting a bunch of "from moo import g_cfg_l as load_config" into utils/__init__.py then in our code just use:

code:
from .utils import load_config

never have a file called utils

put it in a file called "moo_wrapper" or "mootools" or "mooutils"

never have a file called utils

it's a broken window and it will attract detritus

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

tef posted:

never have a file called utils

put it in a file called "moo_wrapper" or "mootools" or "mooutils"

never have a file called utils

it's a broken window and it will attract detritus

easy fix: helpers.py

Death Zebra
May 14, 2014

I'm writing a program to try and filter out duplicate job adverts out of my search results.

1) This obviously means going through a lot of pages. Is there some standard procedure for this not being interpreted as an attack e.g. using the sleep command for a certain randomised amount of time before moving on to the next page like a human who was actually reading said page would?

2) Does the close command do a good enough job of closing a page?

e.g.
code:
x = urllib.urlopen('url')
x.close()

Or should I get Qpython3 from GitHub so I can use context library (or whatever it's called)?

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Death Zebra posted:

I'm writing a program to try and filter duplicate job adverts out of my search results.

1) This obviously means going through a lot of pages. Is there a standard way to keep this from being interpreted as an attack, e.g. sleeping for a randomised amount of time before moving on to the next page, as a human actually reading the page would?

2) Does the close command do a good enough job of closing a page?

e.g.
code:
x = urllib.urlopen('url')
x.close()

Or should I get Qpython3 from GitHub so I can use the context library (or whatever it's called)?

1) Just depends on whatever anti-scraping measures said site has implemented.
2) Use the requests library to save yourself time and hassle.
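
On the close question, here is a stdlib-only sketch, assuming Python 3 (where urlopen lives in urllib.request rather than urllib). contextlib.closing calls close() for you even if reading the page raises. The fetch helper is a made-up name, and the data: URL is only there so the example runs without a network:

```python
from contextlib import closing
from urllib.request import urlopen


def fetch(url):
    """Return the raw bytes of a page, closing the response afterwards."""
    # closing() guarantees response.close() runs, even if read() raises.
    with closing(urlopen(url)) as response:
        return response.read()


# A data: URL keeps the demo offline; swap in a real http(s) URL in practice.
body = fetch("data:text/plain,hello")
print(body)
```

requests wraps the same idea: a plain requests.get(url) reads the body and releases the connection for you.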

Series DD Funding
Nov 25, 2014

by exmarx
Look up robots.txt and follow it

onionradish
Jul 6, 2006

That's spicy.

Death Zebra posted:

I'm writing a program to try and filter duplicate job adverts out of my search results.

1) This obviously means going through a lot of pages. Is there a standard way to keep this from being interpreted as an attack, e.g. sleeping for a randomised amount of time before moving on to the next page, as a human actually reading the page would?

Here's a very recent blog post on that topic. It describes best practices then shows how to implement them in Scrapy, but they're easy to implement no matter what library or tools you're using to scrape.

How to Crawl the Web Politely with Scrapy
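
Without Scrapy, the two basics (obey robots.txt, wait a random interval between requests) can be sketched with the standard library alone. This is a rough illustration, not the blog post's code: load_rules, polite_crawl, the "dedupe-bot" agent name and the 2 to 5 second range are all assumptions, and fetch is whatever function you already use to download a page:

```python
import random
import time
from urllib import robotparser


def load_rules(robots_lines):
    """Build a parser from the lines of a site's robots.txt file."""
    rules = robotparser.RobotFileParser()
    rules.parse(robots_lines)
    return rules


def polite_crawl(urls, fetch, rules, agent="dedupe-bot",
                 min_wait=2.0, max_wait=5.0, sleep=time.sleep):
    """Fetch each allowed URL, pausing a random interval after each request."""
    pages = {}
    for url in urls:
        if not rules.can_fetch(agent, url):
            continue  # robots.txt asks crawlers to stay out of this path
        pages[url] = fetch(url)
        sleep(random.uniform(min_wait, max_wait))
    return pages


# Example rules; a real crawler would download the site's own robots.txt.
rules = load_rules(["User-agent: *", "Disallow: /private/"])
print(rules.can_fetch("dedupe-bot", "http://example.com/jobs"))
print(rules.can_fetch("dedupe-bot", "http://example.com/private/x"))
```

The sleep function is passed in as a parameter so the loop can be tried out without actually waiting.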

mike12345
Jul 14, 2008

"Whether the Earth was created in 7 days, or 7 actual eras, I'm not sure we'll ever be able to answer that. It's one of the great mysteries."





I'm trying to run an iruby notebook inside a virtualenv on a Linux VM. My guess is that I need to forward port 8888 so that other devices, including the Windows host, can access it. I did a similar thing to get Jupyter working in a Docker container.

But with virtualenv, I have no clue. I tried googling for two hours, but virtualenv is such a common term that the results were all noise. I give up.

SurgicalOntologist
Jun 17, 2004

What is it exactly that you're having trouble with? Starting the notebook? For accessing it outside the VM the virtualenv should be irrelevant. A virtualenv is just a folder and some path manipulations.

mike12345
Jul 14, 2008

"Whether the Earth was created in 7 days, or 7 actual eras, I'm not sure we'll ever be able to answer that. It's one of the great mysteries."





SurgicalOntologist posted:

What is it exactly that you're having trouble with? Starting the notebook? For accessing it outside the VM the virtualenv should be irrelevant. A virtualenv is just a folder and some path manipulations.

I can't access it from Windows, but I can access it from inside the VM. I found this blog, http://gisellezeno.com/tag/virtualenv.html, that says to forward it using ssh ("ssh -L 54321:localhost:54321 user@server")? Not sure, but anyway it's not working for me.

SurgicalOntologist
Jun 17, 2004

Yes, you already mentioned something about forwarding the port in order to get access to the notebook inside the VM. SSH is one way to do that. I'm not a Windows person so I don't know if you can do that using PuTTY, but as you said in your original post, 8888 is the port you need to forward. The post you linked manually specifies 54321 when launching the notebook, which is why they're forwarding that port.

But you seemed to be already aware you needed to forward 8888, so I was assuming you already had a way to do so... in any case this is all purely to do with the VM and not the virtualenv. Once the notebook is launched, the virtualenv is no longer relevant.

Dex
May 26, 2006

Quintuple x!!!

Would not escrow again.

VERY MISLEADING!

mike12345 posted:

I can't access it from Windows, but I can access it from inside the VM. I found this blog, http://gisellezeno.com/tag/virtualenv.html, that says to forward it using ssh ("ssh -L 54321:localhost:54321 user@server")? Not sure, but anyway it's not working for me.

forwarding from something bound to localhost only like that requires your firewall and selinux policies to allow it. the official docs might make more sense https://ipython.org/ipython-doc/3/notebook/public_server.html#notebook-public-server

but yeah, virtualenv has nothing to do with ports which is why google was leading you nowhere

mike12345
Jul 14, 2008

"Whether the Earth was created in 7 days, or 7 actual eras, I'm not sure we'll ever be able to answer that. It's one of the great mysteries."





Ah, ok. I thought virtualenv creates a virtual container with its own interface. Hmm, then I don't know. I mean docker works.

mike12345
Jul 14, 2008

"Whether the Earth was created in 7 days, or 7 actual eras, I'm not sure we'll ever be able to answer that. It's one of the great mysteries."





Dex posted:

forwarding from something bound to localhost only like that requires your firewall and selinux policies to allow it. the official docs might make more sense https://ipython.org/ipython-doc/3/notebook/public_server.html#notebook-public-server

but yeah, virtualenv has nothing to do with ports which is why google was leading you nowhere

yeah, a simple

c.NotebookApp.ip = '*'

did the trick.

thanks!
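
For anyone landing here later, that line goes in the notebook config file. A sketch, assuming the classic notebook server and its usual config location (~/.jupyter/jupyter_notebook_config.py, which "jupyter notebook --generate-config" creates); the port and open_browser lines are optional extras, not something from the posts above:

```python
# ~/.jupyter/jupyter_notebook_config.py (assumed default location)
c = get_config()  # provided by Jupyter when it loads this file

c.NotebookApp.ip = '*'              # listen on all interfaces, not just localhost
c.NotebookApp.port = 8888           # the default port, stated explicitly
c.NotebookApp.open_browser = False  # no browser to open on a headless VM
```

Listening on all interfaces makes the notebook reachable from anything that can reach the VM, which is why the public-server docs linked above also cover setting a password.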

Dominoes
Sep 20, 2007

What resources do you recommend for learning debugging? I've been coding for a few years, but have never tried it or understood it. You set breakpoints, and it tells you what values your variables have without using temporary print statements?

Thermopyle
Jul 1, 2003

...the stupid are cocksure while the intelligent are full of doubt. —Bertrand Russell

Dominoes posted:

What resources do you recommend for learning debugging? I've been coding for a few years, but have never tried it or understood it. You set breakpoints, and it tells you what values your variables have without using temporary print statements?

Do you use PyCharm? It's got great debugging stuff built-in and I can say more about it if you do.

If not (well, even if you do), pdb (well, really ipdb, because IPython is the best) is pretty simple.

The very simplest thing you can do that's beyond littering print statements throughout your code is:

Python code:
# some code

import ipdb; ipdb.set_trace()

# something troubling here

When you run your code, it will stop at the ipdb line and drop you into a special command line thingamabob. You can inspect values and start stepping line-by-line through your code. A quickly googled reference to some of the commands you can use.
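
If ipdb isn't installed, the same trick works with the standard-library pdb. A small sketch; average and its debug flag are made up for the example, and the flag is only there so the function can also run without stopping:

```python
import pdb


def average(values, debug=False):
    """Return the mean of values, optionally pausing in the debugger."""
    total = sum(values)
    if debug:
        # Execution pauses here and a (Pdb) prompt appears in the terminal.
        pdb.set_trace()
    return total / len(values)


# Runs straight through; pass debug=True to get the interactive prompt.
print(average([1, 2, 3]))
```

At the (Pdb) prompt, "p total" prints a value, "n" runs the next line, "s" steps into a call, "c" continues and "q" quits. On Python 3.7 and later the built-in breakpoint() does the same job as the import-and-set_trace line.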
