I don't have a lot of AI experience. A pal and I took ML/NN as an elective during our bachelor's back in spring 2021, and we kept using it for basically every project afterwards since it was fun. This was before the generative AI explosion though, as we finished in spring 2022 (also the last time I worked on ML), and we focused a lot on recognition/classification using Keras/TF libraries. We made a cool sudoku-solver app that would solve from photos you took of unsolved or partially solved sudoku, though I mostly worked on the decidedly mundane solving algorithm there, as well as a shameless ripoff of Google's Quick, Draw! and poo poo. To be honest, I haven't kept up at all in the past two years. Basically, I understand a lot of the underlying principles, but that's about it. So I have a few questions for those who are more involved. Some of these may be too generalised or just industry secrets, but w/e, just call me an idiot and skip those.

- How much progress in design and understanding of ML models has actually been made in the past, say, 5 years? Pretty difficult to answer, especially since my impression is the big companies and foundations were keeping their cards close to the chest even before GPT-2. I was mostly building from scratch on Google Colab instances as my hardware limit, so nothing I've worked with compares to what they were using in the biz even back then anyway. Are the recent commercial models reaching a plateau in what can be achieved, and are they using smoke and mirrors to hide weaknesses at all?

- What kinds of network designs are favored for generative NNs? Is there a focus on width (nodes per layer) over depth (# of layers) or vice versa? Do they utilise input-reducing layers like CNNs do (pooling etc.)?

- What kind of activation functions are modern generative AIs using? Is it still plain old sigmoid and the like? I remember we used ReLU a decent amount, but I can't imagine that would be useful for anything more complex than the data analysis we were working with, since it is hardly a sophisticated function.

- I remember most of the hardware bottleneck for neural networks was in training the NN, not running the completed model. How advanced are the various free, locally runnable generative NNs nowadays? What kind of hardware do you need to run them?
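For reference on that activation question, current transformer-based generative models mostly use smooth ReLU relatives like GELU and SiLU/Swish rather than sigmoid; the functions themselves are still simple, as a quick NumPy sketch shows (the GELU here is the tanh approximation, an assumption on my part about which variant to show):

```python
import numpy as np

def relu(x):
    # The classic: zero out negatives, pass positives through
    return np.maximum(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def silu(x):
    # SiLU / Swish: x * sigmoid(x) -- smooth, slightly non-monotonic near zero
    return x * sigmoid(x)

def gelu(x):
    # Tanh approximation of GELU (the variant popularised around GPT-2)
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

x = np.linspace(-4.0, 4.0, 9)
for name, f in [("relu", relu), ("gelu", gelu), ("silu", silu)]:
    print(name, np.round(f(x), 3))
```

All three behave almost identically for large |x| (roughly 0 on the left, roughly x on the right); the differences are in the smooth region around zero, which matters for gradients during training more than for any "sophistication" of the function itself.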
# Mar 5, 2024 10:48