|
Not sure if this is the right place to ask, but does anyone know where the phrase `Embarrassingly parallel' was first coined, or by whom? I'm using it in a maths paper and should probably cite it somehow.
|
# ¿ Sep 18, 2009 19:27 |
|
|
# ¿ May 17, 2024 16:23 |
|
I was only going to use it in a throwaway remark at the end of an unimportant section, but now I will use it all over the drat place.
|
# ¿ Sep 20, 2009 03:06 |
|
The acknowledgements section of my thesis now thanks SH/SC "for clarifying the use of the term `embarrassingly parallel' ".
|
# ¿ Sep 23, 2009 21:21 |
|
csammis posted: This is CoC, actually, so I'm going to have to write to your professors and invalidate your thesis immediately

Theses are like software: it doesn't really matter if something is wrong so long as you fix it at some point.
|
# ¿ Sep 23, 2009 21:26 |
|
Silly complexity question: is the worst-case cost of merging n sorted lists (each of size O(1)) into a single sorted list O(n log n)? Obviously this can be done in O(n log n) time by running your favourite O(n log n) sorting algorithm, but I wanted to double-check that I wasn't being stupid and overlooking some better way.
DoctorTristan fucked around with this message at 20:55 on Jan 13, 2010 |
# ¿ Jan 13, 2010 20:38 |
|
Once you lay it out like that it's obvious. Thanks.
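The reply being thanked isn't preserved here, but one standard way to lay it out (assuming a comparison-based model) is that n lists of size 1 are just n unsorted items, so merging them is exactly sorting and needs on the order of n log n comparisons. A minimal sketch using the standard library's heap-based `heapq.merge`; the helper name `merge_sorted_lists` is made up for illustration:

```python
import heapq
import random

def merge_sorted_lists(lists):
    """k-way merge of already-sorted lists using a binary heap (heapq.merge)."""
    return list(heapq.merge(*lists))

# n lists of size 1: merging them is the same task as sorting n items.
values = random.sample(range(1000), 20)
singletons = [[v] for v in values]
merged = merge_sorted_lists(singletons)
```

With k lists and N items in total the heap merge costs O(N log k), which for n singleton lists is O(n log n) again.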
|
# ¿ Jan 13, 2010 22:03 |
|
coffeetable posted: I'm trying to build a primitive election simulator, but one of the things I'm having trouble with is a good way to generate the preference votes: I want them to be "biased" so that some candidates will have a higher ranking on average than others. Anyone know an algorithm for this?

Hooray! A question that's vaguely within my field!

You'll need to define a vector of voting probabilities, so that a given voter votes for candidate 1 with probability p1, for candidate 2 with probability p2, etc. If you already know what voting probabilities you want then you can implement it yourself like AD suggested. Alternatively if you have access to a probability library that has a function for simulating multinomial distributions you can use that.

If you want to simulate a random vector of voting probabilities, then I suggest you model it as a Dirichlet distribution with appropriate parameters. You can simulate from this by renormalising a vector of Gamma random variables - any good probability/stats library should have a function for simulating gamma rvs.

This is also a pretty good reference for random variable simulation, but it's more probability than CS-oriented.
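A rough sketch of the Gamma-renormalisation idea using only Python's standard `random` module. The function names and the `alphas` values are made up for illustration; larger alphas just mean a candidate is more popular on average.

```python
import random

def sample_dirichlet(alphas, rng=random):
    """Draw one probability vector from Dirichlet(alphas) by normalising Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def simulate_single_votes(probs, n_voters, rng=random):
    """Each voter independently picks candidate i with probability probs[i]."""
    return rng.choices(range(len(probs)), weights=probs, k=n_voters)

rng = random.Random(42)
alphas = [5.0, 3.0, 1.0]  # hypothetical popularity parameters
probs = sample_dirichlet(alphas, rng)
votes = simulate_single_votes(probs, 1000, rng)
```

Calling `sample_dirichlet` once per batch gives each batch its own probability vector.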
|
# ¿ Mar 18, 2010 23:29 |
|
coffeetable posted: yeah, i'm doing something like this at the moment. unfortunately i'm wanting to generate 2-3 million preference ballots (in batches of a thousand or so, generating a new set of probabilities for each)

Ok, if you're simulating large numbers of votes using the same probability vector you should definitely simulate that as a multinomial random variable, using an appropriate prob/stats library.

To clarify what I mean by this, if there are r candidates and n voters cast their votes with identical probabilities (p1,..., pr), then the vector of votes for each candidate (n1,...,nr) has a multinomial distribution. Any good implementation of this will be able to simulate (n1,...,nr) in a time that is O(r), rather than O(n).
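One way a library can do this, sketched here as an illustration rather than as any particular library's implementation, is the conditional-binomial decomposition: draw n1 from Binomial(n, p1), then n2 from Binomial(n - n1, p2 / (1 - p1)), and so on, which needs only r - 1 binomial draws. The sketch assumes `random.binomialvariate` (added in Python 3.12) and falls back to a slow O(n) loop on older versions; the helper names are made up.

```python
import random

def _binomial(rng, n, p):
    """Binomial(n, p) draw; uses random.binomialvariate where available (Python 3.12+)."""
    if hasattr(rng, "binomialvariate"):
        return rng.binomialvariate(n, p)
    # Fallback for older Pythons: n Bernoulli trials, fine for a sketch only.
    return sum(rng.random() < p for _ in range(n))

def sample_multinomial(n, probs, rng=random):
    """Vote counts (n1, ..., nr) for n voters sharing the probability vector probs."""
    counts = []
    remaining_votes = n
    remaining_prob = 1.0
    for p in probs[:-1]:
        # Conditional probability of this candidate among the votes still unassigned,
        # clamped to [0, 1] to absorb floating-point rounding.
        q = min(1.0, max(0.0, p / remaining_prob)) if remaining_prob > 0 else 0.0
        k = _binomial(rng, remaining_votes, q)
        counts.append(k)
        remaining_votes -= k
        remaining_prob -= p
    counts.append(remaining_votes)  # the last candidate gets whatever is left
    return counts

counts = sample_multinomial(1000, [0.5, 0.3, 0.2], random.Random(0))
```

If NumPy is available, its `Generator.multinomial` does the same job without any hand-rolled code.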
|
# ¿ Mar 18, 2010 23:51 |
|
coffeetable posted: Ah, I think I messed up - by preference ballot, I meant a ranking of the candidates by preference, not "tick the candidates you like".

Ah ok, I was thinking of a single vote system. The problem isn't so much about programming as it is the model you put on the voter preferences. You model each voter's preference vote as a (multivariate) random variable; and within each batch each voter has the same (probabilistic) distribution on their preferences.

What distribution do you choose? In principle this is up to you. A simple way is to model each voter's 'liking' for each candidate as a univariate random variable (log-Gaussian, Gamma, Beta or whatever) and have them vote according to their liking rank.

Might be better to move this to the SPE thread if you've any further questions.
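A minimal sketch of the 'liking rank' idea, assuming a log-Gaussian liking for each candidate. The function name, the `mean_log_liking` values and `sigma` are all made up; a higher mean makes a candidate rank higher on average.

```python
import random

def ranked_ballot(mean_log_liking, sigma=1.0, rng=random):
    """Return candidate indices ordered from most liked to least liked."""
    likings = [rng.lognormvariate(mu, sigma) for mu in mean_log_liking]
    return sorted(range(len(likings)), key=lambda c: likings[c], reverse=True)

rng = random.Random(7)
mean_log_liking = [1.0, 0.5, 0.0, -0.5]  # hypothetical bias towards candidate 0
ballots = [ranked_ballot(mean_log_liking, rng=rng) for _ in range(1000)]
```

Each ballot is a full ranking, so it can be fed straight into whatever counting rule the simulator uses.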
|
# ¿ Mar 20, 2010 12:33 |
|
Open them with macros disabled? Do it in a VM if you really want to be careful?
|
# ¿ Jan 7, 2019 12:38 |
|
Dominoes posted: Does anyone know of any resources that track financial news article performance? Doing a Google search for "financial news article track records" doesn't produce relevant results. A random scan of a few financial news articles at any point appears to produce a few vague, but quantifiable predictions over time spans ranging from a week to a year. I'm suspicious that all predictions, including ones from respected sources, are no better than random, since my understanding from statistics is that any better-than-random guess can be leveraged into large profits using derivatives. This includes price changes (or lack-thereof) of any kind.

There are any number of (paid) services that do things like scrape news sources for company names and push out a feed with the time stamp, ticker and some kind of sentiment score - the usual suspects in the financial data world (Thomson Reuters, Bloomberg, FactSet, Nasdaq...) all offer something like this.

If you’re asking whether someone has put together a comprehensive historical dataset of financial news stories, combined this with (expensive) intraday equities data and made the results freely available on the internet, then the answer is no, not that I’m aware of.
|
# ¿ Jan 7, 2019 19:10 |
|
Dominoes posted: Nailed it! It wouldn't be too difficult,

Systematically identifying and extracting stock predictions from news articles is a non-trivial NLP problem. Even after that the question of ‘was it accurate?’ is not always obvious (Over what timeframe? Relative to what benchmark?). Such a study would involve several months' work by a skilled team, plus (probably) licensing a few proprietary software libraries and datasets. That is expensive, and finance is not an industry people go into in order to give valuable work away for free.
|
# ¿ Jan 9, 2019 12:53 |
|
On an unrelated note, you currently have class Car inheriting from two other classes, name and model:code:
code:
DoctorTristan fucked around with this message at 09:42 on Apr 16, 2019 |
# ¿ Apr 16, 2019 07:09 |
|
Linear Zoetrope posted: So I'm getting interested in one of my old projects again: making a deckbuilder for Kingdom Hearts Re: Chain of Memories.

Have you considered getting obsessed with a different game?

The most popular/successful set of techniques for highly constrained discrete programming problems like this are so-called ‘state-space relaxation’ methods. Unfortunately I don’t know these nearly well enough to teach them, and iirc the usual approach is ‘Find a professor who knows about state-space relaxation and agree a consulting fee’.

You might be able to get somewhere with a combination of brute force, heuristics and branch and bound. You didn’t say anything about how big the card list is from which you choose the deck, but if it’s (say) 400 cards, then the number of ordered 3 card combos is about 63 million, and presumably most of those aren’t feasible solutions, so you may be able to get a few steps in (e.g. find all feasible 3, 4 card decks) via brute force. After that I’d try tightening the constraints then progressively loosening them, so eg if you want to solve for decks with a maximum value of 100, do something like:

1) brute force solutions for 3,4 card decks, store them
2) set MaxValue = 1
3) find candidate solutions by adding cards to the ‘tighter constraint’ solutions found in previous steps
4) remove any candidate solutions that breach constraints
5) heuristically filter the set of solutions down if necessary (if it’s getting too large)
6) store all the solutions for this MaxValue
7) set MaxValue += 1 and repeat from (3)

This may or may not work, and there are a lot of details that it will be up to you to figure out, but it might give you a reasonable starting point.
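The steps above can be sketched very loosely as follows. Everything game-specific is an assumption: the only constraint modelled is 'total card value <= MaxValue', the per-card values are invented, only 3-card seeds are brute-forced, and the heuristic filter simply keeps the cheapest decks. The real constraints would replace all of that.

```python
from itertools import combinations

def grow_decks(card_values, seed_size=3, max_value_limit=10, keep=1000):
    """Progressively loosen a 'total value <= max_value' constraint.

    card_values[i] is a made-up cost for card i; a deck is a sorted tuple of
    card indices. Returns {max_value: set of feasible decks}.
    """
    n = len(card_values)

    def total(deck):
        return sum(card_values[c] for c in deck)

    # Step 1: brute-force the small decks once and store them.
    seeds = set(combinations(range(n), seed_size))
    solutions = {}
    previous = set()
    for max_value in range(1, max_value_limit + 1):  # steps 2 and 7
        # Step 3: extend solutions found under the tighter constraint by one card.
        candidates = seeds | previous
        for deck in previous:
            for card in range(n):
                if card not in deck:
                    candidates.add(tuple(sorted(deck + (card,))))
        # Step 4: drop anything that breaches the current constraint.
        feasible = {d for d in candidates if total(d) <= max_value}
        # Step 5: crude heuristic filter (assumption: prefer cheaper decks).
        if len(feasible) > keep:
            feasible = set(sorted(feasible, key=lambda d: (total(d), d))[:keep])
        solutions[max_value] = feasible  # step 6
        previous = feasible
    return solutions

decks_by_budget = grow_decks([1, 2, 3, 4], max_value_limit=10)
```

The `keep` cap is where a smarter heuristic would go; without it the candidate set can grow very quickly.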
|
# ¿ Dec 12, 2019 17:32 |
|
Python and its environment/path fuckery is going to drive me to a killing spree one day. Is your path the same before/after running activate/deactivate (try echo %PATH% before and after)? Is there possibly a naming conflict between the HTK files and something in your Python path? Speculating slightly here, but did you install Anaconda for all users or just for a single user?
|
# ¿ Feb 4, 2020 11:43 |
|
That entirely depends on what this event is. If it’s Easter Sunday then the only thing you need is a calendar. If it’s the date of the highest recorded temperature in a particular location then you should probably just forecast a date in the middle of summer. If it’s the date of the high point of the S&P 500 in a given year then you are likely wasting your time.

The most commonly recommended introductory text to ML algorithms is still ‘The Elements of Statistical Learning’ by Hastie, Tibshirani and Friedman. Though before you go too crazy with your 50 observations you should search for ‘Overfitting’, ‘Data Snooping’ and ‘Out-of-sample Validation’.
|
# ¿ Feb 26, 2020 22:12 |
|
AgentCow007 posted: How do you go about converting random bits (like from a hardware RNG) into constrained results? For example if I want to randomly pick from [A-Za-z0-9] (62 values), using 6 bits of random data (64 values), do I have any options besides mod (which would bias the first two characters) or discarding out-of-range values (wasting muh precious entropy)?

You are trying to construct a map from each of 64 possible inputs to one of 62 possible outputs. A simple application of the pigeonhole principle shows that you can’t do this without either (i) at least one output that multiple inputs map to, or (ii) some inputs that map to nothing. As you point out, doing (i) will bias the distribution of the output, so with a single input your only option is (ii) - if you get one of the unmapped values you throw it away and try again with a fresh rng call (this is known as ‘rejection sampling’).

Alternatively you could do what the poster above suggested and consume additional inputs to generate a longer sequence of outputs. Each symbol carries log2(62) ≈ 5.95 bits, so a quick calculation shows the best this can do is about 0.992 six-bit inputs per symbol, against an expected 64/62 ≈ 1.03 for plain rejection sampling. The benefit is that you’re now making slightly fewer calls per symbol to the (presumably expensive) hardware rng.

(Unless you’re doing something critical like generating cryptographic keys, worrying about ‘muh entropy’ is probably overthinking it and some flavour of prng will usually be fine for the vast majority of use cases.)
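A small sketch of the rejection-sampling option. The hardware RNG is stood in for by `secrets.randbits(6)`, which is an assumption, and `read_six_bits` and `random_symbol` are made-up names.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits  # 62 symbols

def read_six_bits():
    """Stand-in for the hardware RNG (assumption): returns an int in [0, 63]."""
    return secrets.randbits(6)

def random_symbol(source=read_six_bits):
    """Rejection sampling: discard the two out-of-range values (62, 63) and retry."""
    while True:
        value = source()
        if value < len(ALPHABET):
            return ALPHABET[value]

token = "".join(random_symbol() for _ in range(16))
```

Only 2 of the 64 values are rejected, so each symbol needs 64/62 calls to the source on average.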
|
# ¿ May 31, 2020 18:51 |
|
Hughmoris posted: I've been trying to teach myself the basics of git and contributing to open source. I found some typos in a popular repo, submitted my first PR, and they merged it. I'm now listed as a contributor.

Feel no shame; the other applicants certainly won’t.
DoctorTristan fucked around with this message at 19:27 on Apr 14, 2021 |
# ¿ Apr 14, 2021 19:16 |
|
Speculation, but is the median returning a NaN value (could conceivably happen if there’s missing or bad data)? That would fail both inequalities and hit the ELSE condition.
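The code under discussion isn't shown, so this is only a Python illustration of the general behaviour with a made-up `classify` function: every ordered comparison against NaN is false, so both branches are skipped and control falls through to the else.

```python
def classify(value, median):
    """Hypothetical three-way branch of the kind described above."""
    if value > median:
        return "above"
    elif value <= median:
        return "at or below"
    else:
        # Reached only when neither comparison is true, e.g. median is NaN.
        return "neither (ELSE branch)"

nan_median = float("nan")
result = classify(1.0, nan_median)
```

Checking the median with `math.isnan` before branching would make the bad-data case explicit.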
|
# ¿ May 26, 2021 23:13 |
|
hbag posted: i uh

What this guy said, also - not saying this is drawn from experience or anything - but you can never really predict where a piece of code you write is going to end up or who might demand to see it, so maybe dance like nobody’s watching but pick variable names like they’ll be read out in a televised courtroom
|
# ¿ Jan 16, 2022 17:18 |
|
KillHour posted: Well that's a lot easier to answer. First, findAll has been deprecated and moved to find_all:

Is this… package maintainers actually deciding to give a poo poo about PEP8?
|
# ¿ Jan 17, 2022 14:16 |
|
Assuming you’re only going to generate at most a few hundred cards per game I don’t see any reason to bother with a hash anyway - just generating a random sequence of cards and retrying any that .equals() any previous card in the sequence will certainly be Good Enough™️
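A sketch of the retry approach in Python, where `==` on tuples plays the role of .equals(). The card representation (a suit plus a number) and the function names are invented for illustration.

```python
import random

SUITS = ("hearts", "spades", "clubs", "diamonds")  # hypothetical card attributes
RANKS = range(10)

def random_card(rng):
    """A card is just a (suit, rank) tuple in this sketch."""
    return (rng.choice(SUITS), rng.choice(RANKS))

def unique_cards(count, rng=None):
    """Generate `count` distinct cards by retrying any duplicate of an earlier card."""
    if rng is None:
        rng = random.Random()
    if count > len(SUITS) * len(RANKS):
        raise ValueError("not enough distinct cards to choose from")
    cards = []
    while len(cards) < count:
        card = random_card(rng)
        if any(card == seen for seen in cards):  # linear scan, no hashing needed
            continue
        cards.append(card)
    return cards

hand = unique_cards(30, random.Random(1))
```

The scan is quadratic overall, which is negligible for a few hundred cards.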
|
# ¿ Nov 17, 2022 17:56 |
|
This is going to be a horrifically vague question, but is there a good tutorial anywhere on Active Directory, OAuth2, client credential flow and all that? Everything I’ve found so far is either uselessly vague or assumes you already know most of the terminology. (Paid is fine - I can probably swing the training budget) Background is that I have never had to think about authentication before, but now I’m trying to deploy some dash apps that need to make api calls that require authorisation against some other AD app and I need to gain some idea of what needs to be done here so I can ask one of the AD specialists the right questions.
|
# ¿ Dec 13, 2022 19:22 |
|
ultrafilter posted: This came up in another thread and it's a nice little puzzle:

If you’re eg 3-5 years out then the range is the two disjoint intervals [y-5, y-3], [y+3, y+5].
|
# ¿ Jun 24, 2023 20:55 |
|
Computer viking posted: It also looks fairly easy to use the Google drive API to ask for a copy of a file in a given format, as long as the export stays under 10 MB. Possibly useful if you need to automate this on a schedule, though I don't know how often you need to do an interactive OAuth login.

Gonna take a wild guess here that OP does not wish to deal with web apis nor OAuth
|
# ¿ Oct 10, 2023 14:54 |
|
It’s pronounced like the ‘i’ in ‘linux’
|
# ¿ Apr 5, 2024 08:19 |
|
Computer viking posted: In case you're joking, that does have one officially correct pronunciation, using i as in Linus as pronounced by a Swedish-speaking Finn. He owns the trademark, and while he didn't name it, it is based on his name; he gets to dictate how it should sound.
|
# ¿ Apr 6, 2024 19:30 |