Insurrectionist
May 21, 2007
I don't have a lot of AI experience; a pal and I took ML/NN as an elective during our bachelor's back in spring 2021, and we kept using it for just about every project afterwards since it was fun. This was before the generative-AI explosion though, as we finished in spring 2022 (also the last time I worked on ML), and we focused a lot on recognition/classification using the Keras/TF libraries. We made a cool sudoku-solver app that would solve from photos you took of unsolved or partially solved sudoku, though I mostly worked on the decidedly mundane solving algorithm there, as well as a shameless ripoff of Google's Quick, Draw! and poo poo. To be honest, I haven't kept up at all in the past two years.

Basically, I understand a lot of the underlying principles, but that's about it. So I have a few questions for those who are more involved. Some of these may be too generalised or just industry secrets, but w/e, just call me an idiot and skip those.

- How much progress in the design and understanding of ML models has actually been made in the past, say, 5 years? Pretty difficult to answer, really, especially since my impression is the big companies and foundations were keeping their cards close to the chest before GPT-2. I was mostly concerned with building from scratch, with Google Colab instances as a hardware limitation, so nothing I've worked with compares to what they were using in the biz even back then anyway. Are the recent commercial models reaching a plateau in what can be achieved, and are they using smoke and mirrors to hide weaknesses at all?

- What kinds of network designs are favored for generative NNs? Is there a focus on width (nodes per layer) over depth (# of layers) or vice versa? Do they utilise input-reducing layers like CNNs do (pooling etc.)?
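To be concrete about what I mean by an input-reducing layer, here's a toy 2×2 max pool in plain numpy (shapes and values are just a made-up example, not anything from a real model):

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 over an (H, W) feature map.
    H and W are assumed even for simplicity."""
    h, w = x.shape
    # Group the map into 2x2 windows, then take the max of each window.
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.arange(16, dtype=float).reshape(4, 4)
pooled = max_pool_2x2(fmap)
print(pooled)  # 2x2 output; each value is the max of one 2x2 window
```

This halves each spatial dimension, which is the "input-reducing" part I'm asking about.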

- What kinds of activation functions are modern generative AIs using? Is it still plain old sigmoid and the like? I remember we used ReLU a decent amount, but I can't imagine that would be useful for anything more complex than the data analysis we were working on, since it's hardly a sophisticated function.
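For comparison's sake, here's what I mean by sigmoid vs ReLU, plus the tanh-approximated GELU that I gather GPT-2-era transformer models use (numpy sketch, my own toy inputs):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    # Zeroes out negatives, passes positives through unchanged.
    return np.maximum(0.0, x)

def gelu(x):
    # tanh approximation of GELU (used in e.g. GPT-2)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

xs = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(xs))  # hard cutoff at zero
print(gelu(xs))  # smooth; slightly negative for small negative inputs
```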

- I remember most of the hardware bottleneck for neural networks was in training the NN, not running the completed model. How advanced are the various free, locally runnable generative NNs nowadays? What kind of hardware do you need to run them?
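As back-of-envelope arithmetic for the hardware question: just holding the weights takes roughly params × bytes per param, which is a lower bound since it ignores activations and other runtime overhead (the 7B figure below is a made-up example size, not a specific model):

```python
def weight_memory_gb(n_params_billion, bytes_per_param):
    """Rough memory needed just to hold the weights, in GB.
    Ignores activations and runtime overhead, so treat as a lower bound."""
    return n_params_billion * bytes_per_param  # billions of params * bytes each = GB

# A hypothetical 7-billion-parameter model at different precisions:
for label, nbytes in [("fp32", 4), ("fp16", 2), ("4-bit", 0.5)]:
    print(f"7B @ {label}: ~{weight_memory_gb(7, nbytes):.1f} GB")
```

Which is why quantization matters so much for running things on consumer GPUs.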
