|
LuckySevens posted:Oops, I just copy and pasted his style instead of rewriting, which was silly. What are you trying to achieve with this isinstance that isn’t achieved by just having the one dataclass inherit from the other? Is there something you want to do in __init__ that the dataclass isn’t doing?
Imo you’re exposing too much of the implementation here. traverse_tree and stock_tree should have a leading underscore to indicate they’re ‘private’, and calc_price should be the only ‘public’ method (you might also want to rename it to something like calc_price_using_tree, since there are other methods of calculating an option price and you might want to extend the class to include those).
Also, the trees used in the binomial option pricing method aren’t especially complicated and can easily be represented with an array and some moderately clever indexing - generic tree data structures aren’t necessary (and could be quite tricky to adapt, since nodes in a binomial tree can have two parents). If you wanted, you could create a generic price_option_using_tree() method in the base class that calls methods of the derived class(es) to calculate the payoffs in each node.
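For anyone curious what the array representation looks like in practice, here's a minimal sketch of a Cox-Ross-Rubinstein tree for a European call. The function name and parameters are my own invention, not from the code under discussion:

```python
import numpy as np

def price_call_binomial(S0, K, r, sigma, T, n_steps):
    """European call priced on a CRR binomial tree stored as flat arrays."""
    dt = T / n_steps
    u = np.exp(sigma * np.sqrt(dt))      # up factor
    d = 1.0 / u                          # down factor
    p = (np.exp(r * dt) - d) / (u - d)   # risk-neutral up probability
    disc = np.exp(-r * dt)               # one-step discount factor

    # Terminal stock prices: node j has had j up-moves out of n_steps
    j = np.arange(n_steps + 1)
    stock = S0 * u**j * d**(n_steps - j)
    values = np.maximum(stock - K, 0.0)  # call payoffs at expiry

    # Roll back through the tree: the 'moderately clever indexing' is just
    # pairing each node with its neighbour via two slices of the same array
    for _ in range(n_steps):
        values = disc * (p * values[1:] + (1 - p) * values[:-1])

    return float(values[0])
```

An American-exercise variant only needs an extra np.maximum against the early-exercise payoff inside the loop, which is exactly the step a generic price_option_using_tree() template method in the base class would delegate to the derived class.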
|
# ¿ Feb 7, 2020 13:30 |
|
Hollow Talk posted:Also, depending on your python version (anything >=3.6 I think?), I would replace the string format stuff for logging with f-strings, which are both faster and more readable. The logging module’s printf-style formatting can be even faster, since it won’t format the message if the logging level means it’s not going to be logged. Of course this makes your logging calls a lot uglier.
|
# ¿ Mar 2, 2020 20:27 |
|
fuf posted:I'm confident I'll get the things I need eventually, it's just a slow process!
|
# ¿ Mar 4, 2020 14:40 |
|
From what you’re describing it doesn’t sound like putting the data into a SQL database is going to help you. It might be a better option for storing/retrieving the data in future (though you should think long and hard before creating a table with 200,000 columns), but if you’ve already got the data in a pandas data frame and don’t know what to do next it is not going to help. Some more detail would be useful. What is the structure of your data and what analysis are you trying to run on it?
|
# ¿ Apr 6, 2020 11:41 |
|
Spaced God posted:I'll try to explain in vagueries while hopefully giving the scope of the problem. I don't want anyone knowing a comedy web forum helped me be a good intern lol I can only give vague advice on this kind of vague info, but it sounds like you are trying to do something quite difficult and have a moderate-to-severe case of running before you can walk. A few things stand out:
* When you say that your data is 60*200000, did you get that the wrong way round, or do you really have 60 rows and 200,000 columns?
* What does each row in your data frame represent? Is it an event where someone is called out, with date/time/location?
* Can you do more basic queries, such as ‘how many call outs were there on June 19 2019?’ or ‘how many call outs were there in total in region X during 2018?’ Can you do visualisations of these?
It sounds like your extraction, data cleaning and analysis steps are getting quite mixed up with each other - this is invariably a bad idea.
|
# ¿ Apr 6, 2020 22:26 |
|
Spaced God posted:Yeah this is definitely being popped out of the womb and being told to fly a spaceship. A lot of my time has been spent on stackexchange or deep in the confines of panda reference pages. It sounds like if you can identify (from lat/long co-ordinates) when a call was to out of area then you get a decent chunk of the way there? You could add an extra column ‘is_out_of_area’ to the dataframe then reuse that in later queries? One additional bit of advice I’d give is: don’t try to do too much in a single step, and don’t spend too long scouring the pandas docs hoping to find the one function that does exactly what you want. There’s no shame in just iterating over the dataframe row by row to do calculations, particularly if you’re new to both pandas and Python.
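A sketch of the extra-column idea. The column names and area bounds below are made up for illustration - substitute whatever the real data uses:

```python
import pandas as pd

# Invented sample data: call-out locations
df = pd.DataFrame({
    "lat": [51.50, 52.10, 51.48],
    "lon": [-0.12, -0.50, -0.10],
})

# Assumed bounding box for the 'home' area
LAT_MIN, LAT_MAX = 51.3, 51.7
LON_MIN, LON_MAX = -0.5, 0.3

# Flag rows falling outside the box; later queries can just reuse the flag
df["is_out_of_area"] = ~(
    df["lat"].between(LAT_MIN, LAT_MAX)
    & df["lon"].between(LON_MIN, LON_MAX)
)

n_out = int(df["is_out_of_area"].sum())
```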
|
# ¿ Apr 9, 2020 09:53 |
|
And even if you could, so could the person sending totally_the_file_you_wanted_and_not_a_cryptolocker_check_the_hash_if_you_dont_believe_me.exe
|
# ¿ Apr 28, 2020 14:18 |
|
Phone posting so can’t post sample code, but if the only part of the file you want is:code:
|
# ¿ Sep 3, 2020 18:18 |
|
QuarkJets posted:OP just needs to use pySFDSF Take screenshots of the spreadsheet and train a neural net.
|
# ¿ Sep 4, 2020 08:45 |
|
my bitter bi rival posted:Random question: I am working through Eric Matthes Python Crash Course right now and I've gotten a lot out of it. I'm currently going through the part about importing data from a csv and plotting it with matplotlib. Using two lists has some downsides, but using a dict like this is an absolutely loving terrible idea for a number of reasons. I’ll limit myself to pointing out two. Firstly, and most obviously, if there are multiple observations on the same date then you will lose data, since only the last such observation will be retained. Secondly, pretty much every operation you might want to do on this data is significantly easier if they are in lists. For example - calculating the first differences in temperature highs is as trivial as code:
This second point touches on the related issue that by putting the data into a dict like this you are throwing away the ordering. The source data file is ordered - there is a first row, a second row, etc. Dicts do preserve insertion order in Python 3.7+, but you can’t index into them by position, so for practical purposes the ordering is lost once the data goes in (In this example it may be recoverable if the date stamps are ordered with no duplicates, but that’s a very big if).
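The snippet attached to the post above hasn't survived, but the first-differences calculation it referred to presumably looked something like this (list name and sample values invented):

```python
# Parallel list as read from the CSV (invented sample values)
highs = [61, 63, 60, 58, 64]

# First differences: each day's high minus the previous day's
diffs = [b - a for a, b in zip(highs, highs[1:])]
```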
|
# ¿ Oct 30, 2020 03:03 |
|
Profile it and find out what’s taking the most time, then optimise that part. Repeat until performance is acceptable. If you want more specific guidance you’ll need to post code and profiler output.
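For anyone who hasn't profiled Python before, the standard library's cProfile is the place to start. A minimal sketch with invented stand-in functions:

```python
import cProfile
import pstats

def slow_part():
    # Deliberately does real work so it shows up in the profile
    return sum(i * i for i in range(100_000))

def fast_part():
    return 42

def main():
    slow_part()
    fast_part()

profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Print the five most expensive calls, sorted by cumulative time
stats = pstats.Stats(profiler).sort_stats("cumulative")
stats.print_stats(5)
```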
|
# ¿ Jan 5, 2021 11:00 |
|
Do you want me to give you a hint to point you in the right direction, or solve the whole thing for you? Assuming the former: you don’t need pandas for that. Just put the cipher text into a string and read the docs on string operations and python’s slicing syntax.
|
# ¿ Jan 11, 2021 18:52 |
|
Famethrowa posted:My code currently is attempting that, but I'm having issues with transversal. I need to slice, say, every third letter, and then once I hit the end of the string, loop back to the beginning of the string to finish the count and reslice. I could perhaps just duplicate the string many times over to achieve that, but that feels clunky as hell. You said you didn't want hand holding, but you should use the slicing notation code:
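The snippet attached above was lost to the forum software; the slicing notation in question looks like this - the third, 'step' argument is the relevant bit:

```python
text = "abcdefghij"

# s[start:stop:step] - the third number is the step
every_third = text[::3]     # characters at indices 0, 3, 6, 9
offset_by_one = text[1::3]  # characters at indices 1, 4, 7
offset_by_two = text[2::3]  # characters at indices 2, 5, 8
```

(Deliberately stopping short of the full cipher solution, since the OP wanted to work that out themselves.)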
I'm not the one marking your work, but if I were I would deduct marks if you used pandas for this - it really isn't the right tool for the job. I'd give a bonus point or two if you managed to solve it in one line.
|
# ¿ Jan 12, 2021 10:02 |
|
You shouldn’t use numpy either. QuarkJets posted:I feel like there may be a clever list comprehension that could give a fast, succinct solution Not posting it since OP wants to figure things out themselves, but there’s a one-line solution.
|
# ¿ Jan 12, 2021 11:17 |
|
Zoracle Zed posted:Anyone else noted how awful it is googling for anything python-related these days? SEO means the first couple pages of results are all geeksforgeeks.com and other awful beginner tutorial spam. (Which, like, even for beginners, seem bad.) You want to know the kwargs alternative to the format string, or the location of the parser module? If the latter why not just call it with an invalid string and see where the exception gets thrown from?
|
# ¿ Mar 24, 2021 13:45 |
|
Have I misunderstood pip’s version specifiers, or is it doing something weird here? If I run code:
code:
(I originally found this problem with a requirements.txt file that had a bunch of >= dependencies, also this is installing from an Azure artefacts repository that’s mirroring PyPI if that makes any difference)
|
# ¿ Jun 17, 2021 15:05 |
|
OnceIWasAnOstrich posted:No, you are using it right. Something in the dependency resolution it is doing seems to be calling for cryptography. I would normally say that something else you have installed or are installing has a conflict with the cryptography version number. What version of Python are you using? Maybe there are no cryptography packages for your Python in that range or the repo/your environment is busted. Should have thought of that myself, but I tried it and found that the underlying problem is that there’s no cryptography package on this piece of poo poo azdo mirror of ours, so none of the packages can install. No idea why pip wouldn’t just say that instead of bullshitting about conflicts, but hey ho. In conclusion: gently caress computers
|
# ¿ Jun 17, 2021 16:16 |
|
Hed posted:Isn’t cryptography the one that switched to linking against rusttls recently? Possibly true - I don’t know much about it other than that msal depends on it (msal is the Microsoft authentication library - employer uses OAuth all over the shop so I have to use this package a lot) I also found that if I try to do this with two packages with missing dependencies I can send pip into an infinite loop, which is really stupid.
|
# ¿ Jun 18, 2021 10:58 |
|
Does DataFrame.interpolate() do what you need ?
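For reference, a minimal example of what interpolate() does by default (linear interpolation across NaN gaps):

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0, np.nan, 5.0])
filled = s.interpolate()   # linear interpolation is the default method
```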
|
# ¿ Jul 13, 2021 14:22 |
|
If you’re doing that you might as well write a new context manager that deletes the tempfile on completion. Phone posting so can’t provide a snippet, but could post one later if no one else obliges in the meantime…
|
# ¿ Jul 29, 2021 14:55 |
|
12 rats tied together posted:the context manager is pretty easy and is a great python feature, you basically just define __enter__ and __exit__ methods There's a simpler syntax using the @contextmanager decorator (it's been in contextlib since Python 2.5): Python code:
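The snippet attached to this post hasn't survived; a sketch of what it presumably showed, tying back to the self-deleting tempfile idea upthread (the function name is my own):

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def self_deleting_tempfile(suffix=""):
    """Yield the path of a temp file, deleting it when the block exits."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    try:
        yield path
    finally:
        os.remove(path)

# The file is gone once the with-block completes, even on an exception
with self_deleting_tempfile(".txt") as p:
    with open(p, "w") as f:
        f.write("hello")
```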
|
# ¿ Aug 2, 2021 22:29 |
|
D34THROW posted:it was just passing line_data...which was passing it a False for some reason, instead of None or whatever This means that the name line_data has been defined as False somewhere in your code. It sounds like you didn’t (deliberately) do that, but are you importing any modules using the pattern from foo import *?
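A contrived demonstration of the hazard - 'helpers' is a made-up module simulated in-memory, but the effect is the same as a real star-import shadowing a name you cared about:

```python
import sys
import types

# Fake a module that happens to define line_data = False
helpers = types.ModuleType("helpers")
helpers.line_data = False
sys.modules["helpers"] = helpers

# A star-import silently binds every public name from the module
ns = {}
exec("from helpers import *", ns)
print(ns["line_data"])   # False - quietly shadows any existing line_data
```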
|
# ¿ Sep 1, 2021 22:04 |
|
Do you want the first billion integers in a random order? numpy.random.permutation(n) will give you a random permutation of the integers 0 to n-1, but I have never tested it on a billion integers.
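A small-scale sketch (untested at the billion scale, as noted):

```python
import numpy as np

rng = np.random.default_rng(seed=42)
perm = rng.permutation(10)   # the integers 0..9 in a random order
```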
|
# ¿ Sep 9, 2021 17:54 |
|
Epsilon Plus posted:What I want is to get a list that sometimes starts [1, 3, 4, 7, 9...] and other times starts [1, 2, 3, 5, 8...] or maybe [2, 4, 5, 9, 12...] If that’s all you want then just generate a sequence of random integers between 1 and 5 (or whatever you want the maximum difference to be) then take the cumulative sum of that sequence ( np.cumsum will do the job ).
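A sketch of the cumulative-sum trick (the seed, size, and maximum gap are invented):

```python
import numpy as np

rng = np.random.default_rng(seed=0)
steps = rng.integers(1, 6, size=8)  # random gaps between 1 and 5 inclusive
seq = np.cumsum(steps)              # strictly increasing, randomly spaced
```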
|
# ¿ Sep 10, 2021 00:54 |
|
AfricanBootyShine posted:I have what I think is a really simple question with numPy. I finally have a job where I can actually use it for work, but it's been years since I did any real python work so I'm a bit lost. As posters above have commented, this is easy enough to turn into a pandas DataFrame via the pandas.read_csv() function. This is almost certainly what you actually want to do - a NumPy 3D array is going to be a lot more awkward to retrieve the correct data from. Your data structure does look quite odd, though. Is there any reason you've arranged things as code:
code:
|
# ¿ Sep 17, 2021 21:23 |
|
code:
|
# ¿ Oct 8, 2021 16:49 |
|
setUp() and tearDown() are the recommended ways to share (de-) initialisation stages between tests in unittest, but I agree with just using pytest - apart from anything else it requires less typing (also it can run unittest tests, so no need to rewrite existing tests)
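A side-by-side sketch of the two styles (class and fixture names invented):

```python
import unittest

class TestThing(unittest.TestCase):
    # setUp/tearDown run before/after *each* test method
    def setUp(self):
        self.data = [1, 2, 3]

    def test_sum(self):
        self.assertEqual(sum(self.data), 6)

# The pytest equivalent is a fixture, which is less typing:
#
#     import pytest
#
#     @pytest.fixture
#     def data():
#         return [1, 2, 3]
#
#     def test_sum(data):
#         assert sum(data) == 6
```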
|
# ¿ Jan 7, 2022 23:40 |
|
D34THROW posted:"Why isn't this table getting populated? The data is formed well, the vars() of it looks good, everything is populated...lemme go look at the function." Use type hints. Any decent editor will warn you if a function hinted as returning a value does not return a value (and might also warn about returning an object of the wrong type, depending on how complex the definition is)
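A sketch of exactly the kind of bug a return-type hint catches (the function and names are invented):

```python
# With the '-> float' hint, a type checker or any decent editor will flag
# the code path below that falls off the end and implicitly returns None.
def lookup_price(table: dict, key: str) -> float:
    if key in table:
        return table[key]
    # Bug: no return here, so the caller silently receives None
```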
|
# ¿ Feb 7, 2022 17:50 |
|
Dawncloack posted:I have to backfill information into an SQL database, and I am writing a python script for it. I'm having a hard time believing that the data retrieval works in the way you describe, or that what you're attempting to do would even work, let alone whether it's a good idea. Before you go creating your very own entry for the coding horrors thread, a few questions.
From what you've described, it sounds like the system offers two interfaces to the data - get_historical_data(date) and get_realtime_data() (or similar). To me that screams that the system contains two data storage components - a real-time stream or queue and a data store populated from that stream/queue - and that get_realtime_data() simply retrieves the latest data from the stream. If that's the case then changing your system time won't get you the data you want (in addition to being horrific behaviour in and of itself).
Similarly, the most likely (to me) explanation for the difference in precision between the two methods is that the system has an archive data store that was improperly set up, and the data gets truncated on storage. If that's the case there's nothing you can do to recover the lost precision.
Of course it's possible that get_realtime_data() is for some extremely hosed up reason reading your system time and using that to query the archive, and that the truncation in the other method is happening in the interface instead of the actual storage, but all of that would require some truly galaxy-brained programming from whoever implemented the interface - so much so that I would seriously question whether any of the interfaces were even retrieving the correct data in the first place.
So: have you verified that changing the system time before calling get_latest_data() actually gets you the data you want, and not (eg) just the latest data with the timestamp changed?
|
# ¿ Feb 13, 2022 12:22 |
|
Mycroft Holmes posted:I've run into a problem with my homework. I have to take the following list: There are some *very* compact ways to do this in python that it sounds like you haven't met yet (and the question hint is suggesting you not use). As this is a homework question I'll just give a couple of hints for now. To populate book_data you'll need to iterate over each entry in the current list, transform it into the specified format, then append that transformed value to book_data To do the transformation you should look at either f-strings (if you've met them), or the .join() method. Edit: okay, from your answers above it sounds like we need to take a step back. Look at the first element of the list you've been given. What type of object is it? How would you get the title or author from that object? DoctorTristan fucked around with this message at 19:52 on Feb 24, 2022 |
# ¿ Feb 24, 2022 19:49 |
|
punk rebel ecks posted:You lost me here. The tutorial has no "requirements.txt". Whoever wrote the tutorial might have skipped creating a requirements.txt because the project doesn’t have many dependencies. Try just running pip install flask inside the virtual env - that may be all you need. To run a python file in a virtual environment, you launch a terminal, activate the environment in that terminal (as described by posters above), then run the file as you normally would (python path\to\file.py). You absolutely can use virtual environments within VSCode - it wouldn’t be much use as a python dev environment if you couldn’t. Do ctrl-shift-P and search for the Python: Select Interpreter command - that will allow you to select the virtual environment. Now whenever you create a new terminal window within VSCode it will automatically activate the virtual environment in that window, so code running in that window will run in the virtual environment. Alternatively if you created the virtual environment within the workspace folder (which is what I do), VSCode will automatically detect it and ask if you want to use it.
|
# ¿ Apr 27, 2022 08:24 |
|
QuarkJets posted:Each line of a CSV is usually going to have a newline character at the end of it so it's a pretty safe assumption, yeah. I'm sure there are madmen out there that use some other character for some reason but that's unusual This is not a safe assumption at all for the last (data) line in the file - it’s only true there if the creator of the file ended with a blank line (which you’re supposed to do, but people frequently don’t bother).
|
# ¿ Apr 27, 2022 10:30 |
|
QuarkJets posted:I'm talking about the last character of every line, not the last line of every file Let me rephrase. If the file ends with a blank line then every line ends with a newline character. If the file does not end with a blank line, then the last line does not end with a newline character, so the assumption is false.
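A two-line demonstration, using an in-memory file:

```python
import io

# Two-line 'file' whose final line has no trailing newline
buf = io.StringIO("a,b\nc,d")
raw = buf.readlines()
# Only the first line carries the newline: ['a,b\n', 'c,d']
```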
|
# ¿ Apr 27, 2022 18:59 |
|
Why are you using a regex instead of just slicing each input line?? Edit to be a little more helpful: if you know the widths of the columns, then I’d just call readline() in a loop to get each line out as a string and slice each string using the known fixed column widths. If you are already using pandas/don’t mind using it then the read_fwf() function will do that for you. If you don’t know the column widths and are looking for a ‘smart’ library that can infer the column widths from the file itself then afaik you’re SOL. DoctorTristan fucked around with this message at 10:07 on Jun 27, 2022 |
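Both approaches, sketched on an invented two-column fixed-width file (10-char name, 5-char value):

```python
import io

import pandas as pd

# Invented fixed-width data: name column is 10 chars, value column is 5
raw = "alice     00042\nbob       00007\n"

# pandas does the slicing for you if you give it the column widths
df = pd.read_fwf(io.StringIO(raw), widths=[10, 5],
                 names=["name", "value"], header=None)

# The plain-Python alternative: slice each line at the known offsets
rows = [(line[:10].strip(), int(line[10:15]))
        for line in raw.splitlines()]
```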
# ¿ Jun 27, 2022 07:04 |
|
Falcon2001 posted:I look forward to 14.000000000001 pages about floating point number formatting weirdness.
|
# ¿ Jul 14, 2022 23:28 |
|
You’re partitioning the set into its equivalence classes defined by the relation ‘the timestamps overlap’ - I don’t believe there’s any faster way to do this than the naïve brute force method. Since it looks like you’re doing it on a pandas dataframe it would probably be simplest to use a new column to keep track of the equivalence classes - phone posting so I’ll have to do it in pseudo code, but the idea is:
* create a new integer column of zeros, called ‘equivc’ or whatever
* start with the first row: set .loc[0, ‘equivc’] = 1
* find every row that overlaps with a row having equivc == 1, and set equivc = 1 for those rows
* repeat until you don’t find any more rows
* now find the first row that still has equivc == 0, set equivc = 2 for that row, and repeat the above steps
* keep going until there are no rows left with equivc == 0
Once you’re done with this, every row with equivc == 1 has mutually overlapping intervals, as does every row with equivc == 2, and so on.
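The pseudo code above, translated into a runnable sketch - the column names and sample intervals are invented (real timestamps would take the place of the ints):

```python
import pandas as pd

# Invented sample intervals: rows 0-1 overlap, rows 2-3 overlap, row 4 alone
df = pd.DataFrame({
    "start": [0, 5, 20, 22, 50],
    "end":   [10, 15, 25, 30, 60],
})

def overlaps(a_start, a_end, b_start, b_end):
    return a_start <= b_end and b_start <= a_end

df["equivc"] = 0
label = 0
while (df["equivc"] == 0).any():
    # Seed a new class with the first unlabelled row
    label += 1
    seed = df.index[df["equivc"] == 0][0]
    df.loc[seed, "equivc"] = label
    # Grow the class until no more rows join it
    changed = True
    while changed:
        changed = False
        members = df.index[df["equivc"] == label]
        for i in df.index[df["equivc"] == 0]:
            if any(overlaps(df.at[i, "start"], df.at[i, "end"],
                            df.at[j, "start"], df.at[j, "end"])
                   for j in members):
                df.loc[i, "equivc"] = label
                changed = True
```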
|
# ¿ Jul 28, 2022 18:07 |
|
I don’t have any specific advice on pyinstaller, but I will comment that ‘a bunch of people running a .exe I pass around’ is a solution that may be okay in the short term but absolutely will come back to bite you sooner or later (exactly how quickly depends a bit on how big the organisation is and how quickly requirements change). Hard to give detailed advice on what to do instead without more details on what you’re doing, but it does sound like what you *really* need is a database and some proper ETL tools.
|
# ¿ Oct 14, 2022 15:54 |
|
Jose Cuervo posted:I am looking through free text strings (short hospitalization reasons) for the word 'parathyroidectomy'. I am able to do simple string matching (e.g., looking for 'para' in example_string), but this type of searching assumes that the word has been spelled correctly and will not catch paarthyroidectomy, even though that would be a relevant result. Is there any library which would help me search these strings for misspelled matches? Fuzzywuzzy
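fuzzywuzzy is a third-party install; for a quick sense of the idea, the standard library's difflib can do a serviceable approximation (threshold and sample strings are invented):

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Crude 0-to-1 similarity score between two strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

target = "parathyroidectomy"
# Invented sample free-text hospitalization reasons
reasons = ["paarthyroidectomy follow-up", "appendectomy", "knee replacement"]

# Keep any reason containing a word close enough to the target spelling
matches = [r for r in reasons
           if any(similarity(word, target) > 0.85 for word in r.split())]
```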
|
# ¿ Oct 14, 2022 19:33 |
|
Josh Lyman posted:It says "KeyError: 2". I'll remove that as an argument and see if that helps. Reducing encapsulation is rarely going to help you and I really recommend you don’t do that. I’d say about 40% of the weird python errors I’ve had to help coworkers with were caused by a name collision in a huge monolithic script, where breaking it up into smaller functions would have either prevented it entirely or made the error much more obvious. It’s a bit inelegant, but have you tried catching the KeyError and setting a breakpoint inside the catch statement? That should let you inspect the variables at the point of the error and get a better idea of what’s going on. Also, what’s your source for this data? Your comment about how three notebooks simultaneously hit an error kind of makes me suspect that the issue might originate with your data source (eg a database connection dropped, or JWT token expired) but is not getting caught until later.
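A sketch of the catch-and-break-in technique. The data and key are invented, and the PYTHONBREAKPOINT line is only there to keep this example non-interactive - omit it in real use so breakpoint() actually drops you into pdb:

```python
import os

# Disable the interactive debugger for this self-contained example only
os.environ["PYTHONBREAKPOINT"] = "0"

data = {"a": 1}
caught = None
try:
    value = data[2]          # the failing lookup: KeyError: 2
except KeyError as exc:
    caught = exc
    breakpoint()             # with pdb active you can inspect 'data' here
```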
|
# ¿ Nov 15, 2022 08:55 |
|
Deadite posted:Can anyone help me understand why this example ends in an error: In the first example you’re trying to call astype(float) on the string ‘Missing’
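A minimal reproduction, plus the usual workaround (sample values invented):

```python
import pandas as pd

s = pd.Series(["1.5", "Missing", "2.0"])

# s.astype(float) would raise ValueError on the string 'Missing';
# to_numeric with errors='coerce' maps unparseable entries to NaN instead
nums = pd.to_numeric(s, errors="coerce")
```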
|
# ¿ Nov 24, 2022 01:43 |