|
vikingstrike posted:I’m a conda user too that would be up for trying something new that’s supported more generally by the language. Couple of questions: To install packages, you just type pipenv install numpy while you are in your projects directory. It will download the wheel and install it in the correct virtualenv for your project. I just tried numpy on my Windows machine (traditionally where you have the most problems with compiled packages): code:
|
# ? May 15, 2018 15:09 |
|
|
# ? May 30, 2024 13:51 |
|
He asked specifically about MKL linked numpy though. In my experience MKL >>> every other implementation so its kinda important/worth it. conda makes it super easy without having to mess around with $LD_LIBRARY_PATH etc. e: Not saying pipenv is universally bad or that conda is universally better or anything. Just that there is a use case for conda that pipenv doesn’t seem to solve. e2: To use the intel python distribution you just do code:
Boris Galerkin fucked around with this message at 19:13 on May 15, 2018 |
# ? May 15, 2018 19:07 |
|
Oh, I skimmed right over the MKL part.
|
# ? May 15, 2018 19:15 |
|
Though, if you wanted to use pipenv for some reason, you can of course download the numpy-mkl wheel and install it by pipenv install numpy-mkl.whl. To be clear, I'm very torn between several of these attempted solutions. For example, I do not like that pipenv defaults to wildcard versions on installed packages. It should pin the version number by default. In fact, poetry seems to do a lot better with regards to a lot of this stuff. Unfortunately, I like the transparent virtualenv handling that pipenv does so I'm constantly torn between the two. edit: oh another thing that poetry does better than pipenv...you don't have to manage a Pipfile (or requirements.txt) in addition to a setup.py file. edit2: Like I said on that reddit thread, I think its better that a good-enough tool becomes a standard in the python community soon rather than a perfect tool at some later time. Thermopyle fucked around with this message at 19:37 on May 15, 2018 |
# ? May 15, 2018 19:25 |
|
Apparently you can also do thiscode:
|
# ? May 15, 2018 19:42 |
|
Thermopyle posted:Though, if you wanted to use pipenv for some reason, you can of course download the numpy-mkl wheel and install it by pipenv install numpy-mkl.whl.
|
# ? May 15, 2018 21:42 |
|
I am looking into using asyncio coroutinea and I believe the normal event loop is designed to block the thread until all coroutines finish. In particular, I am fixed on run_until_complete. Is this true? If so, has anybody seen an alternate scheduler that is meant to be manually ticked instead?
|
# ? May 15, 2018 22:05 |
|
Is there a reason you don't want to run the event loop in its own thread?
|
# ? May 15, 2018 22:45 |
|
I'm struggling to work out the best way to structure my project. It'll be written in Python, but feel free to point me to a different thread if I should be using that instead. There is a database of about 200 websites. The code will.. 1. Check the database and get the list of sites that it is supposed to be checking based on a value. 2. Access each of those websites and perform 2 functions, load() and scrape() for some Selenium-based magic to scrape some data. 3. Write that data back to the database. Conceptually the simplest is the bruteforce option of ignoring the database and writing 200 very similar python files and then manually running them all. However, that sounds like a very tedious idea, and not at all scalable in the way this should be. The next level of involves a ridiculous if elif elif elif elif else statement saying "if the site is x then run this script elif ..." all the way through the 200 sites. Also, a bad idea. The best idea (probably bad) idea that I've got is to write the load() function, store it in the database, call it and then exec(load_script), but I assume that's bad too.
|
# ? May 15, 2018 22:56 |
|
How similar are the sites you need to scrape? How often do they change format?
|
# ? May 15, 2018 23:06 |
|
Curious if anyone has an idea how to approach something I want to code. I have data that look like this: Ultimately, I want to give the user the ability to select a region on the map and calculate the RMS. For now, I need to figure out how to draw the circle (or rect) and retain the coordinates. I've been looking at the matplotlib event handling stuff but I'm not quite following it.
|
# ? May 15, 2018 23:51 |
|
vikingstrike posted:How similar are the sites you need to scrape? How often do they change format? Very. I'll be logging in to each site and there's not going to be much standardisation. There are updates every now and then but infrequently enough that once I get it all done at first I don't mind tweaking it when they change.
|
# ? May 16, 2018 00:46 |
|
Sad Panda posted:Very. I'll be logging in to each site and there's not going to be much standardisation. If they are all different, the scraping problem is going to be hard to generalize. Hopefully you can generalize it to a few cases and go from there. Although, not knowing more about what you're trying to scrape and how it is presented, this may not be as annoying as it appears at first read. Thermopyle and Boris Galerkin, thanks for the responses. I'm probably going to stick with conda for the time being since it suits my needs quite well. I don't do anything to complex, so just having two environments for 2.7 and 3.6 along with my standard data science/data analysis libraries does it for me. Now that pycharm allows for separate conda environments per project, I may play around with that, but I never have to package code to distribute it, so I don't think I'm doing anything super terrible. I know that pandas, for example, can change code and it will alter the output at times of functions, which would be handled by a single environment per project, but this isn't common and is usually really easy to spot. Maybe for my next round of work, I'll change things up.
|
# ? May 16, 2018 01:06 |
|
Sad Panda posted:Very. I'll be logging in to each site and there's not going to be much standardisation.
|
# ? May 16, 2018 01:14 |
|
Thermopyle posted:Is there a reason you don't want to run the event loop in its own thread? Most every line of code will interact with data in the original thread. It is not performance-intensive; I just want to be able to write them out without throwing around a bunch of context objects.
|
# ? May 16, 2018 01:19 |
|
Lpzie posted:Curious if anyone has an idea how to approach something I want to code. Check out holoviews and/or Bokeh. http://holoviews.org/reference/apps/bokeh/selection_stream.html#bokeh-gallery-selection-stream https://demo.bokehplots.com/apps/selection_histogram
|
# ? May 16, 2018 05:04 |
|
Rocko Bonaparte posted:Most every line of code will interact with data in the original thread. It is not performance-intensive; I just want to be able to write them out without throwing around a bunch of context objects. I'm having a hard time imagining your question even making conceptual sense, but that's probably just me being slow. Like, how do you imagine an event loop not running in another thread while at the same time not blocking the current thread? FWIW, there's a list of some alternative event loops here.
|
# ? May 16, 2018 16:52 |
|
I'm setting up a static website, hopefully so it's community editable. So I'm thinking markdown and GitHub. I only really know Python, so I went for Pelican, but now I'm wondering if I shouldn't just suck it up and use Jekyll. Thoughts from the more experienced people ITT?
|
# ? May 16, 2018 18:05 |
|
Are you sure you want something like that instead of a wiki? A wiki is the first thing I thought of when you said "community editable".
|
# ? May 16, 2018 18:28 |
|
Thermopyle posted:Are you sure you want something like that instead of a wiki? A wiki is the first thing I thought of when you said "community editable". Also I want to force feed GitHub and markdown down their throats anyways
|
# ? May 16, 2018 18:32 |
|
Has everyone ditched os.path in favor of pathlib? For new code/projects would you treat os.path as deprecated or would you just use whatever is easier for you?
|
# ? May 16, 2018 21:06 |
|
mr_package posted:Has everyone ditched os.path in favor of pathlib? For new code/projects would you treat os.path as deprecated or would you just use whatever is easier for you?
|
# ? May 16, 2018 21:25 |
|
I use pathlib 100% of the time now.
|
# ? May 16, 2018 21:50 |
|
I have never used pathlib, is it standard in 3.x or would you need to install it with pip
|
# ? May 17, 2018 04:32 |
|
QuarkJets posted:I have never used pathlib, is it standard in 3.x or would you need to install it with pip Standard as of 3.4
|
# ? May 17, 2018 04:54 |
|
What's the use case for os.path or pathlib? If you're trying to load up a file included in your app/package why not include it in your setup.py as package_data and then use code:
e: I see this in the pathlib docs code:
Boris Galerkin fucked around with this message at 07:02 on May 17, 2018 |
# ? May 17, 2018 06:57 |
|
Boris Galerkin posted:What's the use case for os.path or pathlib? If you're trying to load up a file included in your app/package why not include it in your setup.py as package_data and then use There are lots and lots of useful ways to use os.path (and presumably pathlib) but you could easily never encounter a need for it. Sometimes you're not just working with files that come up with app/package, sometimes you want to derive output file names from input file names, or do something in the same directory as a file, or sometimes you want to check whether a file or directory even exists before trying to do something with it, etc quote:I see this in the pathlib docs I agree, this seems convoluted and bad. I assume (hope?) that people here aren't using it that way
|
# ? May 17, 2018 08:15 |
|
pathlib is nice if you need to navigate around files and directories a lot, but if you're dealing with a fixed path then you don't really have much use for these Path objects. It also abstracts away a lot of OS specific stuff. os.path can do much of the same things but pathlib has a much nicer API imo
|
# ? May 17, 2018 10:41 |
|
QuarkJets posted:I agree, this seems convoluted and bad. I assume (hope?) that people here aren't using it that way This specific case of "fixed path / string / string" does seem convoluted and bad, but I've never really seen anyone do that in practice. I use pathlib very extensively, and I use the overloaded / operator extremely often. It's really nice when having a Path representing a user-supplied or dynamic directory and wanting to refer to some hardcoded filename in that directory. pathlib became much nicer to use in 3.6 also, since builtin IO functions now support path-like objects in addition to string paths. This means you can now do: Python code:
Python code:
|
# ? May 17, 2018 12:37 |
|
It always felt to me they overloaded / because os.path.join() tends to be quite verbose for what it does.
|
# ? May 17, 2018 12:46 |
|
Boris Galerkin posted:What's the use case for os.path or pathlib? If you're trying to load up a file included in your app/package why not include it in your setup.py as package_data and then use You use pathlib or os.path to do things to files. Sometimes you need to get the directory name of a file so you can write a file into the same directory. Sometimes you need to walk all the files in a directory tree. There's a ton of uses. Overloaded division operator works well if you can get past the initial "culture shock". It's more succinct and by abstracting away the edge cases of joining path parts it makes your program clearer.
|
# ? May 17, 2018 16:18 |
|
I've got a real puzzling guy on my hands right now, could use outside thoughts. Say we have some string: '\x1b[1mThis is the string with some random ANSI garbage.\x1b(B\x1b[m' That string has a length of n, but a printed length of x, where x is len(This is the string with some random ANSI garbage.). I need to split lines whose printed len is > terminal width in order to display them properly. My gut instinct is that I should strip all of the ANSI escape sequences from the strings, then use something like textwrap.wrap, then reinsert the ANSI stuff. This would involve some kind of tricky index tracking, and would be generally annoying. Is there any simpler/more elegant way any of y'all can think of to solve this here problem?
|
# ? May 18, 2018 00:28 |
|
vikingstrike posted:It always felt to me they overloaded / because os.path.join() tends to be quite verbose for what it does. It seems about the same level of verbosity to me. It basically swaps a function call and commas for a constructor and slashes Python code:
|
# ? May 18, 2018 00:57 |
|
The March Hare posted:I've got a real puzzling guy on my hands right now, could use outside thoughts.
|
# ? May 18, 2018 01:07 |
|
Master_Odin posted:I'd probably subclass textwrap.TextWrapper to work mostly the same, just in that it ignores ANSI escape sequences in calculating the length of a given work chunk. Yep, that's a good idea.
|
# ? May 18, 2018 01:21 |
|
Master_Odin posted:As vikingstrike said, if it's possible to generalize, you should try and do that. But if each site is unique enough that they require their own parser, your best bet is to store the scraping code in a DB or dynamically load it from a directory (I'd chose the latter as it'll be way easier to update and version control). But even in those cases, see if there isn't generalized helper functions you can write for it. Well for example on one of the sites, my Selenium code looks like... Python code:
For the balance, it is usually visible, but the way of finding the element is always going to be different and also the part retrieved. That uses [1:] because the balance on that site is "£29.01" and I want to strip the £. What do you mean by dynamically loading from the directory? This is the first project I've done that is anything more than a few static python files in a folder.
|
# ? May 18, 2018 12:32 |
|
In numpy (or some interoperable library like pandas) is there a way to create basic shapes like triangles or circles in an ndarray/tensor/table/whatever? I'm featurizing a space of entities represented like {pos = (x,y) , shape = Triangle{base_len = 1.5}, ...} or whatever and while I'm entirely capable of writing this myself* it would be nice to be able to just have python do it. Specifically I have C channels, each of which represent a different aspect, such as hitpoints or a one-hot encoding of things like faction or unit type (domain is a custom RTS), and I'm trying to fill all of these channels based on the current state of the gameboard, but they have different primitive shapes and approximating them all as rectangles would overfill too much. I'm essentially looking to "rasterize" the (continuous-valued) game state, but with data other than colors, and then feed it into a NN. (I'm researching a way to represent this data as different sorts of NN inputs, but I need this crude representation for the time being as a baseline). Unfortunately searching for anything relating to words like "triangle" or "shape" and "numpy" yield results about the shape property or triangular matrices so it's hard to find relevant info. * In fact, I did write this myself in the engine in another language, but it's plagued with issues trying to intuit the "true" shape of the tensor from the python side, and I'm trying to move this to only sending over entity data as described and featurize it into an ndarray on the python side. Linear Zoetrope fucked around with this message at 18:54 on May 18, 2018 |
# ? May 18, 2018 18:51 |
|
Thermopyle posted:I'm having a hard time imagining your question even making conceptual sense, but that's probably just me being slow. Like, how do you imagine an event loop not running in another thread while at the same time not blocking the current thread? Now it's not like I'll have a coroutine yield for 5 seconds and then never get back to it. The idea is that I'd be ticking the event system in the regular Unity 3d event loop. So at least once every frame, everything gets a shot.
|
# ? May 19, 2018 04:56 |
|
I'm serializing classes in JSON to send them over an Ajax request and load those as data for a network graph in a web page. For example: Python code:
I have trouble defining the edges because the library uses "from" and "to" keys, which are reserved words in python. For example this is a valid edge declaration: JavaScript code:
Python code:
e: I've switched to changing the key names in __dict__ before calling json.dumps() unpacked robinhood fucked around with this message at 11:48 on May 19, 2018 |
# ? May 19, 2018 10:43 |
|
|
# ? May 30, 2024 13:51 |
|
unpacked robinhood posted:I'm serializing classes in JSON to send them over an Ajax request and load those as data for a network graph in a web page. Rename the arguments.
|
# ? May 19, 2018 11:36 |