pmchem
Jan 22, 2010


QuarkJets posted:

I peer review AI/ML papers for several journals, AMA

If you ask me whether your paper is novel or even slightly good, the nice answer is "no" and the mean answer is "lol no"

what AI/ML research do you publish, personally?

pmchem
Jan 22, 2010


I just ran across this:
https://algorithmsbook.com/files/dm.pdf
via
https://news.ycombinator.com/item?id=31123683

“Algorithms for Decision Making”. The PDF will always be free. Looks like a nice survey.

Does anyone else have free (not pirated) e-pdf algorithm books to recommend?

I like to collect them for my little PDF library, covering a range of topics. HPC/Scientific computing, AI/ML, basic comp sci, computational geometry, etc.

pmchem
Jan 22, 2010


bob dobbs is dead posted:

mediocre book you've actually read and did problems from beats great book you hoard in your pdf trove any day of the week

While I don’t disagree, I did my homework years ago and I work in the field. Some of us still like to keep a personal library up to date.

pmchem
Jan 22, 2010


bob dobbs is dead posted:

so have i and so do i, altho this 'relevance eng' job just involves normal software dev nowadays. still applies imo

That’s great. Care to share any resources you’ve found useful?

pmchem
Jan 22, 2010


I'm pondering the following sort of ML problem: a game/simulation with many independent, non-interacting agents, each acting according to the exact same model; continuous input space; continuous output space; dynamic environment; and a continuous (real-valued) reward evaluated ONLY at the end of the game/simulation, not per step. The reward function cannot be used to compute gradients of the model parameters (i.e., no backprop through it). Assume the solution is, say, a PyTorch implementation of whatever flavor of NN you desire.

What training strategies might you consider other than neuroevolution?
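For concreteness, one non-neuroevolution option that fits this setup is a score-function (REINFORCE-style) policy gradient: the terminal reward never needs to be differentiable, since it only multiplies log-probability gradients of the policy's own actions. Here's a minimal pure-Python sketch; the toy environment, target behavior, and all parameter names are my own illustration, not anything from the thread.

```python
import random

random.seed(0)

# Toy sketch (setup and names are illustrative): a one-parameter Gaussian
# policy a ~ N(w * x, SIGMA) should learn w = -1 so that a = -x.
# The environment pays NO per-step reward; a single scalar reward arrives
# only at episode end, and we never differentiate through it.
SIGMA, T, EPISODES, LR = 0.5, 10, 2000, 0.01
w, baseline = 0.0, 0.0

def run_episode(w):
    """Roll out T steps; return per-step score-function grads and terminal reward."""
    grads, sq_err = [], 0.0
    for _ in range(T):
        x = random.uniform(-1.0, 1.0)
        a = random.gauss(w * x, SIGMA)
        # d/dw log N(a; w*x, SIGMA^2) = (a - w*x) * x / SIGMA^2
        grads.append((a - w * x) * x / SIGMA ** 2)
        sq_err += (a + x) ** 2            # hidden objective: act as a = -x
    return grads, -sq_err                 # reward revealed only at the end

for _ in range(EPISODES):
    grads, R = run_episode(w)
    adv = R - baseline                    # baseline cuts gradient variance
    baseline += 0.05 * (R - baseline)
    # REINFORCE: the same terminal return weights every step's log-prob grad
    for g in grads:
        w += LR * adv * g / T

print(w)  # should be near -1.0
```

The same trick is what PPO builds on; with a per-step reward of zero and one terminal payout, the return assigned to every step is just that final scalar.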

pmchem
Jan 22, 2010


mightygerm posted:

Sounds like a Q-learning or PPO problem to me. They should be able to learn a policy even when the reward function is null until the end of an episode.

yeah, I was considering NAF Q-learning for the continuous space, but I thought that required a loss/reward at each iteration, not just at the end of an episode (see Algorithm 1 in Gu et al.). Guess I'll poke at other variants.
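On mightygerm's point: bootstrapping methods don't actually need a per-step reward, because the terminal reward propagates backward through the Q targets. A tiny tabular sketch (toy chain MDP of my own invention, not NAF) makes this visible:

```python
import random

random.seed(1)

# Toy chain MDP: states 0..N, actions 0 = left, 1 = right. Reward is 1.0
# ONLY on reaching state N; every other step pays nothing. Q-learning's
# bootstrapped target r + gamma * max Q(s') still carries that terminal
# reward back to every earlier state.
N, GAMMA, ALPHA, EPS = 6, 0.9, 0.5, 0.2
Q = [[0.0, 0.0] for _ in range(N + 1)]

for _ in range(500):
    s = 0
    while s < N:
        # epsilon-greedy action selection
        if random.random() < EPS:
            a = random.randrange(2)
        else:
            a = 0 if Q[s][0] > Q[s][1] else 1
        s2 = max(0, s - 1) if a == 0 else s + 1
        r = 1.0 if s2 == N else 0.0       # terminal reward only
        target = r + (0.0 if s2 == N else GAMMA * max(Q[s2]))
        Q[s][a] += ALPHA * (target - Q[s][a])
        s = s2

# Greedy policy should now walk right from every state.
print(all(Q[s][1] > Q[s][0] for s in range(N)))  # True
```

The continuous-space variants (NAF, DDPG-style critics) rely on the same backup; the terminal-only reward just means the intermediate targets are pure bootstrap terms.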

ultrafilter posted:

You might look into Bayesian optimization as an alternative to any RL-based approach.

already solved things that way

anyone have a favorite actor/critic approach for delayed rewards?
