Entropist
Dec 1, 2007
I'm very stupid.
Here's a more accessible and high level description of some issues by Karpathy: https://karpathy.medium.com/yes-you-should-understand-backprop-e2f06eab496b
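One of the pitfalls that article walks through is vanishing gradients from saturated sigmoids. Here's a minimal numpy sketch of that point (my own toy example, not code from the article): for a single weight through a sigmoid, the gradient carries a factor of sigmoid'(z) = a(1-a), which collapses to near zero once the activation saturates.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Forward: z = w*x, a = sigmoid(z). For a loss L = a,
# backprop gives dL/dw = sigmoid'(z) * x = a*(1-a) * x.
def grad_w(w, x):
    a = sigmoid(w * x)
    return a * (1.0 - a) * x

# In the healthy regime the gradient is sizable...
g_ok = grad_w(0.1, 1.0)        # roughly 0.25
# ...but once the sigmoid saturates, the gradient all but vanishes,
# so the weight stops learning.
g_saturated = grad_w(20.0, 1.0)  # effectively zero
```

Same mechanism behind the "dead" units the article describes; stacking layers multiplies these small factors together, which is where vanishing gradients come from.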


Entropist
Dec 1, 2007
I'm very stupid.
The Llama models released by Meta are tuneable, and people have been using them to build things such as Open Assistant. The main problem is whether you have enough data to tune on (and that it's still non-trivial in terms of implementation and computational resources). In particular, you can't really do RLHF, as that requires tons of human labor.

Entropist
Dec 1, 2007
I'm very stupid.

Nektu posted:

Why is recognizing diseases early a use case for generative AI? On first glance it looks like something for a pattern matching algorithm (which recognizes symptoms).

Generative AI is just a fancy new word for pattern matching algorithm, it seems.

People seem to use it for any huge-scale machine learning model that uses the attention mechanism, so by that usage pretty much any machine learning task is a use case for generative AI. Alternatively, it may refer to prompt-based interaction with any large-scale model; in that case it's just an alternative interface to the patterns that are detected.

Entropist
Dec 1, 2007
I'm very stupid.
I do NLP research, including LLM evaluation, but yeah, that doesn't mean I know everything about it, and of course I don't have the resources to make my own.

Entropist
Dec 1, 2007
I'm very stupid.
I would say that using an LLM is overkill for this use case.

As for the context size, it really depends what units of information you're interested in. If it's mainly words, you can use pretrained word embeddings and context doesn't matter much: a word's contextual meaning will come mainly from the pretraining data, not from your context. Otherwise you can make document embeddings, either from the few-word snippets or by merging the snippets into full sentences or chapters of videos. It's really up to you at what level you want a searchable semantic representation. If more contextual info is needed, you could also include, for example, the video title in each document embedding.
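To make the document-embedding idea concrete, here's a minimal sketch: mean-pool word vectors into one vector per snippet, then retrieve by cosine similarity. The toy random vectors are stand-ins for real pretrained embeddings (e.g. GloVe); with real vectors, only the loading step changes.

```python
import numpy as np

# Toy stand-in for pretrained word embeddings; in practice you'd
# load real vectors (e.g. GloVe) keyed by word.
rng = np.random.default_rng(0)
vocab = ["dog", "cat", "pet", "car", "engine", "wheel"]
word_vecs = {w: rng.normal(size=50) for w in vocab}

def embed(text):
    """Mean-pool the word vectors of known words into one document embedding."""
    vecs = [word_vecs[w] for w in text.split() if w in word_vecs]
    return np.mean(vecs, axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Each "document" could be a snippet, a merged sentence, or a video
# chapter; optionally prepend the video title before embedding.
docs = ["dog cat pet", "car engine wheel"]
doc_embs = [embed(d) for d in docs]

def search(query):
    """Return the document whose embedding is closest to the query's."""
    q = embed(query)
    scores = [cosine(q, e) for e in doc_embs]
    return docs[int(np.argmax(scores))]
```

Mean pooling is crude but surprisingly serviceable; the same search code works unchanged if you swap in sentence-level embeddings from a proper encoder.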
