|
QuarkJets posted: "I peer review AI/ML papers for several journals, AMA"
What AI/ML research do you publish, personally?
|
# ¿ Feb 10, 2022 13:33 |
|
I just ran across "Algorithms for Decision Making" (https://algorithmsbook.com/files/dm.pdf) via https://news.ycombinator.com/item?id=31123683. The PDF will always be free, and it looks like a nice survey. Does anyone else have free (not pirated) PDF algorithm books to recommend? I like to collect them for my little PDF library, covering a range of topics: HPC/scientific computing, AI/ML, basic comp sci, computational geometry, etc.
|
# ¿ Apr 26, 2022 15:00 |
|
bob dobbs is dead posted: "mediocre book you've actually read and did problems from beats great book you hoard in your pdf trove any day of the week"
While I don't disagree, I did my homework years ago and actively work in the field. Some of us still like to keep a personal library up to date.
|
# ¿ Apr 26, 2022 16:29 |
|
bob dobbs is dead posted: "so have i and so do i, altho this 'relevance eng' job just involves normal software dev nowadays. still applies imo"
That's great. Care to share any resources you've found useful?
|
# ¿ Apr 26, 2022 16:38 |
|
I'm pondering the following sort of ML problem: a game/simulation with many independent, non-interacting agents, each acting according to the exact same model; continuous input space; continuous output space; dynamic environment; and a continuous (real-valued) reward evaluated ONLY at the end of the game/simulation, not per step. The reward function cannot be used to compute gradients of the model parameters (i.e., no backprop through the reward). Assume the solution is, say, a PyTorch implementation of whatever flavor of NN you desire. What training strategies might you consider other than neuroevolution?
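To make the setup concrete, here's a minimal REINFORCE-style sketch of the episode structure; score-function gradients only need the log-probs of sampled actions, never a gradient through the reward itself. Everything named here (DummyEnv, obs_dim, act_dim, terminal_reward()) is a placeholder, not code from my actual project:

```python
import torch
import torch.nn as nn

class DummyEnv:
    """Stand-in for the real simulation: random dynamics, terminal-only reward."""
    def reset(self):
        self.t, self.obs = 0, torch.randn(8)
        return self.obs

    def step(self, action):
        self.t += 1
        self.obs = torch.randn(8)          # fake dynamics, ignores the action
        return self.obs, 0.0, self.t >= 50, {}

    def terminal_reward(self):
        return float(-self.obs.abs().sum())  # arbitrary stand-in score

# Gaussian policy over continuous actions: mean from a small MLP,
# log-std as a free learned parameter.
class Policy(nn.Module):
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.mean = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim)
        )
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def dist(self, obs):
        return torch.distributions.Normal(self.mean(obs), self.log_std.exp())

env = DummyEnv()
policy = Policy(obs_dim=8, act_dim=2)      # dims are placeholders
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

for episode in range(1000):
    obs = env.reset()
    log_probs, done = [], False
    while not done:
        d = policy.dist(torch.as_tensor(obs, dtype=torch.float32))
        action = d.sample()
        log_probs.append(d.log_prob(action).sum())
        obs, _, done, _ = env.step(action.numpy())
    R = env.terminal_reward()              # scalar, only known at the end
    # REINFORCE: gradient flows through log pi(a|s); R is just a scalar
    # weight, so the reward function never needs to be differentiable.
    loss = -R * torch.stack(log_probs).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Since the agents are independent and share one model, rollouts batch naturally across agents, which should help with the variance of getting only one scalar per episode.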
|
# ¿ Apr 10, 2024 20:30 |
|
mightygerm posted: "Sounds like a Q-learning or PPO problem to me. They should be able to learn a policy even when the reward function is null until the end of an episode."
Yeah, I was considering NAF Q-learning for the continuous spaces, but I thought that required a reward at each step rather than only at the end of an episode (see Algorithm 1 in Gu et al.). Guess I'll poke at other variants.
ultrafilter posted: "You might look into Bayesian optimization as an alternative to any RL-based approach."
Already solved things that way. Anyone have a favorite actor/critic approach for delayed rewards?
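In the meantime, here's how I currently read the terminal-only reward fitting a TD-style target: the per-step reward is simply zero until the final transition, and the Bellman backup propagates the end-of-episode reward backward through the critic over repeated updates. A sketch with placeholder names only; ValueNet stands in for NAF's state-value term V(s), which in NAF equals max_a Q(s, a) because Q is quadratic in the action:

```python
import torch
import torch.nn as nn

# Placeholder value head standing in for NAF's V(s); since the max over
# actions is closed-form in NAF, the bootstrap term is just V(s').
class ValueNet(nn.Module):
    def __init__(self, obs_dim):
        super().__init__()
        self.v = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, 1))

    def forward(self, obs):
        return self.v(obs).squeeze(-1)

def td_target(rew, next_obs, done, target_v, gamma=0.99):
    # rew is 0.0 for every transition except the terminal one, where it
    # carries the episode's final reward R; done masks the bootstrap there,
    # so the target at the last step is exactly R and earlier steps pick
    # it up through repeated backups.
    with torch.no_grad():
        return rew + gamma * (1.0 - done) * target_v(next_obs)

# smoke test with random tensors (batch of 32 transitions, obs_dim=8)
target_v = ValueNet(obs_dim=8)
y = td_target(torch.zeros(32), torch.randn(32, 8), torch.zeros(32), target_v)
```

If that reading of Algorithm 1 is right, terminal-only reward isn't a blocker for NAF, just a slower credit-assignment path.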
|
# ¿ Apr 10, 2024 22:18 |