|
Thermopyle posted:Of course, Google also has a real hard-on for Go, so maybe it's just the new and shiny factor?
Wasn't Go also created at Google?
|
# ? Jan 4, 2017 23:02 |
|
Thermopyle posted:I always use pip to install stuff in my conda environments
I use conda wherever I can (which is almost always), particularly because 1. I can do conda upgrade --all, and 2. it'll take care of dependencies while respecting my MKL numpy.
|
# ? Jan 4, 2017 23:04 |
|
Cingulate posted:Why do you do that?
Because conda is usually (at least it used to be, I haven't checked in a while) behind on versions of packages I use. Lots of packages aren't on conda, so then I use pip anyway and it is (was?) a pain to maintain packages with both conda and pip. requirements.txt is widely used, so I can easily use an upstream-generated requirements.txt, or downstream can use my requirements.txt. And it's not very good practice to upgrade all your packages in one fell swoop; you should pin your packages at specific versions and only upgrade the ones you need to. Basically, I've found no upside to using conda-provided packages and some downsides.
|
# ? Jan 4, 2017 23:41 |
|
Thermopyle posted:Because conda is usually (at least it used to be, I haven't checked in a while) behind on versions of packages I use. Lots of packages aren't on conda, so then I use pip anyway and it is (was?) a pain to maintain packages with both conda and pip. requirements.txt is widely used, so I can easily use upstream generated requirements.txt or downstream can use my requirements.txt.
Do you work much with numpy/scipy? I found that these basically require using conda for package management on Windows.
|
# ? Jan 5, 2017 00:29 |
|
accipter posted:Do you work much with the numpy/scipy? I found that these basically require using conda for package management on Windows.
Not at all, but I do all my work in Ubuntu virtual machines on my Windows host machine, so I guess I probably wouldn't have a problem anyway. If for some reason I had to work on a Windows machine with numpy, I'd probably use conda to install numpy and still use pip for everything else.
|
# ? Jan 5, 2017 00:36 |
|
I can't think of a simple way to accomplish the following: I have a CSV file that varies in size (currently 25k rows), with each line being a pair of names. I want to count how many times that each pair of names appear throughout the file, regardless of their order: "James,Sarah" is equal to "Sarah,James" for counting purposes. Can I just convert the string to some sort of numerical value, then count how many times each numerical value appears in the file?
|
# ? Jan 5, 2017 01:44 |
|
Hughmoris posted:I can't think of a simple way to accomplish the following: code:
Cingulate fucked around with this message at 01:55 on Jan 5, 2017 |
# ? Jan 5, 2017 01:50 |
|
Cingulate posted:Like this?
I don't think that works - what if you have (James, Sarah) and (Julie, James)? If I understand your code correctly, you would have counts['James'] as 2, and counts['Julie'] and counts['Sarah'] as 1, right? Whereas I believe Hughmoris wants (James, Sarah) to have a count of 1, and (Julie, James) to have a count of 1. I think this will work (although it may be slow?).
code:
Jose Cuervo fucked around with this message at 02:18 on Jan 5, 2017 |
# ? Jan 5, 2017 02:05 |
|
Jose Cuervo posted:I don't think that works - what if you have (James, Sarah) and (Julie, James)? If I understand your code correctly, you would have counts['James'] as 2, and counts['Julie'] and counts['Sarah'] as 1, right? Whereas I believe HughMorris wants (James, Sarah) to have a count of 1, and (Julie, James) to have a count of 1. code:
You can do it more elegantly (... particularly with the dict ordering in python 3.6 ...), but maybe it's a start. As a one-liner: code:
code:
Cingulate fucked around with this message at 02:23 on Jan 5, 2017 |
# ? Jan 5, 2017 02:18 |
This should do it. The trick is to sort your name-tuples. Here I use a simple comparison rather than sorted() because that way we don't need to go tuple -> list -> tuple - we can just go tuple -> tuple. The answer to whether you can convert to some numerical value and count how many times that occurs is 'yes': that's how the __hash__() magic method works under the hood when you index by tuples in a dictionary. Cingulate's solution is just fine, but I prefer using the csv module to read CSVs rather than splitting and joining on my own. I think this is the fastest and best solution so far, but they're all basically fine.
Python code:
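A minimal sketch of the approach described above, assuming every row holds exactly two names. The function name count_pairs and the sample data are made up for illustration, and io.StringIO only stands in for the real file.

```python
import csv
import io
from collections import Counter


def count_pairs(rows):
    """Count name pairs, treating (a, b) and (b, a) as the same pair."""
    counts = Counter()
    for first, second in rows:
        # One comparison puts the pair into a canonical order, so
        # "James,Sarah" and "Sarah,James" end up under the same key.
        key = (first, second) if first <= second else (second, first)
        counts[key] += 1
    return counts


# Stand-in for the real file; with a file on disk you would pass
# csv.reader(open(path, newline="")) instead.
sample = io.StringIO("James,Sarah\nSarah,James\nJulie,James\n")
pair_counts = count_pairs(csv.reader(sample))
print(pair_counts)
```

The tuple keys are hashed by the Counter, which is the "convert to a numerical value" step happening behind the scenes.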
Eela6 fucked around with this message at 03:23 on Jan 5, 2017 |
|
# ? Jan 5, 2017 02:45 |
|
Cingulate posted:As a one-liner: Eela6 posted:
I'll give these a go. Thanks!
|
# ? Jan 5, 2017 04:28 |
|
CPython is really, really bad and even with the new "asyncio" stuff even something like node.js can run circles around it. Python's threading model is the same as Ruby's, which is to say: non-existent. Any object can theoretically touch any object. I like GrumPy's approach because all those restrictions don't matter for good real-world code anyway. PyPy went batshit crazy trying to support all the Python use-cases whereas GrumPy just cares about "Python, the good parts". Perhaps the next thing GrumPy should do is add a better threading model to Python, and maybe some of the newer features like async support, and then we'll get a cool, modern Python 2 runtime.
|
# ? Jan 5, 2017 06:07 |
|
Python the good parts apparently doesn't include the standard library
|
# ? Jan 5, 2017 14:29 |
|
QuarkJets posted:Wasn't Go also created at Google?
Yeah
Suspicious Dish posted:CPython is really, really bad and even with the new "asyncio" stuff even something like node.js can run circles around it.
I said this in a different thread, I think, but Mozilla says you can get a pretty good performance boost by compiling Python to WebAssembly or asm.js (or whatever it's called this year) and running the result on SpiderMonkey. It's probably not a terribly fair comparison, though, because JS interpreters don't have to care about threading or C interop. Still kinda neat, though.
|
# ? Jan 5, 2017 15:05 |
|
more like dICK posted:Python the good parts apparently doesn't include the standard library
Yes, that's correct.
|
# ? Jan 5, 2017 22:44 |
|
Another day, another Bokeh: https://bokeh.github.io/blog/2017/1/6/release-0-12-4/
|
# ? Jan 9, 2017 18:45 |
|
code:
|
# ? Jan 10, 2017 19:27 |
Cingulate posted:
Not with numpy's logical indexing, I don't think. If you're just doing a simple list or generator comprehension you could, eg: (x for x in function_that_generates_d() if condition(x))
But if you're doing logical indexing I believe you need to assign at least once. Is there a specific reason you need not to assign d, or are you just curious?
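Spelled out as something runnable, assuming two hypothetical helpers (generate_values stands in for function_that_generates_d, is_positive for condition):

```python
def generate_values():
    """Hypothetical stand-in for function_that_generates_d()."""
    return [3, -1, 4, -1, 5, -9]


def is_positive(x):
    """Hypothetical stand-in for condition(x)."""
    return x > 0


# The generator expression filters while it iterates, so the unfiltered
# data never needs a name of its own.
filtered = list(x for x in generate_values() if is_positive(x))
print(filtered)
```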
|
|
# ? Jan 10, 2017 19:34 |
|
Eela6 posted:Not with numpy's logical indexing, I don't think. If you're just doing a simple list or generator comprehension you could, eg:
There is no burning need - I'm just curious. I basically just wanted to do something like
code:
|
# ? Jan 10, 2017 20:04 |
|
Is this just to have it be done in a one-liner?
|
# ? Jan 10, 2017 21:38 |
|
QuarkJets posted:Is this just to have it be done in a one-liner?
|
# ? Jan 10, 2017 21:40 |
|
Variables are cheap.
|
# ? Jan 10, 2017 21:44 |
|
more like dICK posted:Variables are cheap.
Also, in cases where the array is large and the rest of the function is long, I have to explicitly delete the reference to free up the memory, or do something like
code:
|
# ? Jan 10, 2017 21:57 |
|
Cingulate posted:Not necessarily cognitively.
I don't understand. Do you mean that the code becomes harder to read by having to assign the array to a variable before indexing with it? I think a complicated one-liner is actually the more difficult option.
If you are indexing the array with itself, that creates a new array, and you can assign that to the same variable name; no need to delete anything, as nothing refers to the old array then.
QuarkJets fucked around with this message at 22:11 on Jan 10, 2017 |
# ? Jan 10, 2017 22:09 |
|
QuarkJets posted:I don't understand. Do you mean that the code becomes harder to read by having to assign the array to a variable before indexing with it? I think a complicated one-liner is actually the more difficult option QuarkJets posted:If you are indexing the array with itself, that creates a new array, and you can assign that to the same variable name
|
# ? Jan 10, 2017 22:16 |
|
Right. And for the earlier post, instead of assigning the result to "out" you assign it to "d" (and ideally you don't use a single character variable name like "d" but that's just me)
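A small sketch of that rebinding with numpy. The make_data helper and the name data are made up, and the extra reference called original exists only so the example can show that indexing produced a separate array; in real code you would leave it out so nothing keeps the old array alive.

```python
import numpy as np


def make_data():
    """Hypothetical stand-in for whatever produces the large array."""
    return np.array([1.5, -2.0, 3.0, -0.5, 4.0])


data = make_data()
original = data          # extra reference, kept only to demonstrate the copy
data = data[data > 0]    # boolean indexing builds a new array; reuse the name

print(data)
print(np.shares_memory(data, original))
```

Without the original reference, the unfiltered array has nothing pointing at it after the rebinding, so no explicit del is needed.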
|
# ? Jan 10, 2017 22:36 |
|
I mostly use conda nowadays just so I can easily install different python versions for different projects. I can't remember the last time I used it to install an actual package that was hard to install on my OS. I've been thinking about switching to pyenv + virtualenv/venv. Does anyone have any thoughts about conda vs pyenv + virtualenv/venv they'd like to share?
|
# ? Jan 11, 2017 01:31 |
|
Does the python len() function actually count every item in the list when it's called, or does it use some shortcut?
|
# ? Jan 11, 2017 15:49 |
|
According to this post, the length of a list is stored in the list object and it looks like len() just returns that value (in CPython, at least.)
|
# ? Jan 11, 2017 15:56 |
|
jon joe posted:Does the python len() function actually count every item in the list when it's called, or does it use some shortcut?
Objects representing a finite collection are generally supposed to adhere to a protocol whereby they define a specially-named method, __len__(), which is called by len(). That way, it is up to a given collection class to implement its own response to len(). A given collection class might cache the number of elements so that it can respond quickly when asked how big it is, or it might just laboriously count its items every time; it's up to the implementer.
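To make the two options concrete, here is a toy sketch. Both classes (CountingBag and CachedBag) are invented for illustration and are not how any built-in type is written.

```python
class CountingBag:
    """Toy collection that walks its items on every len() call."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        # Deliberately laborious: one step per stored item.
        return sum(1 for _ in self._items)


class CachedBag:
    """Toy collection that keeps a running size, so len() is constant time."""

    def __init__(self, items):
        self._items = []
        self._size = 0
        for item in items:
            self.add(item)

    def add(self, item):
        self._items.append(item)
        self._size += 1

    def __len__(self):
        return self._size


print(len(CountingBag("abc")), len(CachedBag("abc")))
```

From the caller's side both look identical; len() just calls whatever __len__() the class provides.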
|
# ? Jan 11, 2017 17:59 |
|
jon joe posted:Does the python len() function actually count every item in the list when it's called, or does it use some shortcut?
How would it know when to stop counting if it doesn't know how many items are in the list?
|
# ? Jan 11, 2017 18:39 |
|
Suspicious Dish posted:How would it know when to stop counting if it doesn't know how many items are in the list?
I don't know how much of Python's standard types are defined by spec or left up to implementation; wouldn't it be technically possible for list to be backed as a linked list?
|
# ? Jan 11, 2017 18:41 |
Asymmetrikon posted:I don't know how much of Python's standard types are defined by spec or left up to implementation; wouldn't it be technically possible for list to be backed as a linked list?
Python has no formal spec; it's defined by implementation. I made a cursory search of the Python Language Reference and I don't see anything stopping you except good taste.
|
|
# ? Jan 11, 2017 19:05 |
|
I am getting into web scraping and trying out BeautifulSoup for the first time. Here is my code for getting the name and address of a particular gas station, along with the prices of all kinds of gas it has information for.
Python code:
|
# ? Jan 11, 2017 20:00 |
|
Jose Cuervo posted:I am getting into web scraping and trying out BeautifulSoup for the first time. Here is my code for getting the name and address of a particular gas station, along with the prices of all kinds of gas it has information for.
The select function is a lot more useful as you won't have to work too much with the specific tags unless the HTML is particularly atrocious.
Python code:
|
# ? Jan 11, 2017 21:07 |
|
StormyDragon posted:The select function is a lot more useful as you won't have to work too much with the specific tags unless the HTML is particularly atrocious.
Wow, that looks much nicer than what I was doing. Questions:
1. I was reading through the BeautifulSoup documentation for using .select and did not see where something like .select("[itemprop=streetAddress]") was defined. How did you know to do that? EDIT: Ah, this must be 'Find tags by attribute value'.
1.b. And how do you know when to use # vs . ? For example in soup.select("#prices .credit-box .price-display")?
1.c. And why does soup.select("#prices .fuel-type .section-title") return an empty list and not the same list as soup.select("#prices .section-title")? EDIT: Is this because .fuel-type and .section-title are both class values of the same div? soup.select("#prices .credit-box .price-display") works because it first locates the 'prices' div, then the 'credit-box' div inside the 'prices' div, and finally the 'price-display' div that is inside the 'credit-box' div, right?
2. When creating the type_price dictionary, does BeautifulSoup guarantee that you will have things returned in the order found on the page? That is, will the order of items in the list returned by soup.select("#prices .credit-box .price-display") necessarily be the same as the order of items in the list returned by soup.select("#prices .credit-box .price-time")?
Jose Cuervo fucked around with this message at 22:48 on Jan 11, 2017 |
# ? Jan 11, 2017 22:25 |
|
They're in there; that documentation is kinda dense and it's hard to get a clear overview of what's available, I think: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#css-selectors
|
# ? Jan 11, 2017 22:35 |
|
Jose Cuervo posted:1.b. And how do you know when to use # vs . ? For example in soup.select("#prices .credit-box .price-display")?
Read about CSS selectors.
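For a concrete picture, here is a short sketch against made-up markup. The HTML below is an assumption, loosely modelled on the selectors quoted in the thread and not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, only meant to exercise the selectors under discussion.
html = """
<div id="prices">
  <div class="fuel-type section-title">Regular</div>
  <div class="credit-box">
    <div class="price-display">2.19</div>
  </div>
  <span itemprop="streetAddress">1 Main St</span>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# "#prices" matches id="prices"; ".credit-box" matches class="credit-box".
# A space means "somewhere inside", so this walks
# prices -> credit-box -> price-display.
nested = soup.select("#prices .credit-box .price-display")

# Empty: no .section-title element sits inside a .fuel-type element.
descendant = soup.select("#prices .fuel-type .section-title")

# Without the space, both classes have to be on the same element.
compound = soup.select("#prices .fuel-type.section-title")

# Square brackets select by attribute value.
by_attribute = soup.select("[itemprop=streetAddress]")

print(len(nested), len(descendant), len(compound), len(by_attribute))
```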
|
# ? Jan 11, 2017 22:45 |
|
I usually end up with a pattern that looks like this:
Python code:
|
# ? Jan 11, 2017 23:06 |
|
Jose Cuervo posted:2. When creating the type_price dictionary, does BeautifulSoup guarantee that you will have things returned in the order found on the page? That is, will the order of items in the list returned by soup.select("#prices .credit-box .price-display") necessarily be the same as the order of items in the list returned by soup.select("#prices .credit-box .price-time")?
You will always get items in the order they were found in the HTML hierarchy that the selector is navigating. You do have to be careful that the page doesn't omit values, though, otherwise you end up zipping together items that don't match. I did try using the class selector .credit-price, but it turns out that this class is not attached to price displays without a price; fortunately there was another CSS class on there.
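One way to sidestep the mismatch is to select each container first and then look inside it. This is a sketch against made-up markup in which the first box has no price at all; the class names follow the thread, the rest is assumed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first box is missing its price element.
html = """
<div id="prices">
  <div class="credit-box">
    <div class="price-time">1 hour ago</div>
  </div>
  <div class="credit-box">
    <div class="price-display">2.19</div>
    <div class="price-time">3 hours ago</div>
  </div>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

prices = soup.select("#prices .credit-box .price-display")
times = soup.select("#prices .credit-box .price-time")
# The lists differ in length here, so zip(prices, times) would pair the
# second box's price with the first box's time.

rows = []
for box in soup.select("#prices .credit-box"):
    price = box.select_one(".price-display")
    when = box.select_one(".price-time")
    rows.append((
        price.get_text(strip=True) if price is not None else None,
        when.get_text(strip=True) if when is not None else None,
    ))
print(rows)
```

Each row stays tied to its own box, and a missing value shows up as None instead of shifting everything after it.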
|
# ? Jan 11, 2017 23:16 |