Llama 3 is getting some noise. There's a small version people are running on modest hardware. Blog post: https://ai.meta.com/blog/meta-llama-3/ Try it: https://www.meta.ai/
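If you want to poke at the small one yourself, here's a minimal sketch using Hugging Face transformers. Assumptions: you've been granted access to the gated meta-llama/Meta-Llama-3-8B-Instruct checkpoint, and the prompt is just a placeholder.

```python
# Minimal local chat with Llama 3 8B Instruct via Hugging Face transformers.
# Assumes access to the gated checkpoint has been granted and there's
# enough GPU memory (~16 GB at bf16), or a lot of patience on CPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Explain RAID 5 in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

If you don't have the VRAM, llama.cpp or Ollama with a quantized GGUF is the usual route.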
# ? Apr 24, 2024 16:31 |
|
|
|
I've been doing some basic tests and the 8B model is pretty decent; it does better on some programming and reasoning tasks than larger models like Mixtral 8x7B and Mistral Large. I think the suspicion that earlier models were over-parameterized is right. Microsoft has a 3.8B model coming out soon that they say is good; who knows.
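By "basic tests" I mean nothing rigorous; roughly this kind of throwaway harness against a local Ollama server, then eyeballing the answers side by side. The model tags and prompts here are just examples, swap in whatever you've actually pulled.

```python
# Send the same prompts to several local models via Ollama's REST API
# and print the answers side by side. Assumes an Ollama server on
# localhost:11434 with these (example) model tags already pulled.
import requests

MODELS = ["llama3:8b", "mixtral:8x7b"]
PROMPTS = [
    "Write a Python function that merges two sorted lists.",
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?",
]

for prompt in PROMPTS:
    print(f"=== {prompt[:60]} ===")
    for model in MODELS:
        resp = requests.post(
            "http://localhost:11434/api/generate",
            json={"model": model, "prompt": prompt, "stream": False},
            timeout=600,
        )
        resp.raise_for_status()
        print(f"--- {model} ---")
        print(resp.json()["response"].strip())
```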
|
# ? Apr 24, 2024 16:38 |
|
Yeah, just yesterday Microsoft claimed they used some novel training methods, akin to reading the model children's books, to get GPT-3.5 performance out of 3.8B params. GPT-3.5 reportedly had around 175B params at one point (according to the Internet, although I think the turbo model had about half that), so that would be roughly 1/46 the size. I'm skeptical, because if true it would mean ChatGPT on a Raspberry Pi.
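Napkin math on the size claim and the Raspberry Pi angle (the 175B figure is the usual rumor, not an official number):

```python
# Back-of-envelope: parameter ratio, and raw weight memory for a 3.8B model.
# 175e9 for GPT-3.5 is the widely repeated rumor, not a confirmed figure.
gpt35_params = 175e9
small_params = 3.8e9

print(f"size ratio: 1/{gpt35_params / small_params:.0f}")  # -> 1/46

# Weight memory alone (ignores KV cache and activations).
for bits, label in [(16, "fp16"), (4, "4-bit quant")]:
    gb = small_params * bits / 8 / 1e9
    print(f"{label}: ~{gb:.1f} GB")  # fp16: ~7.6 GB, 4-bit: ~1.9 GB
```

So the weights of a 4-bit 3.8B model would at least fit in an 8 GB Pi's RAM; whether it actually performs like GPT-3.5 is the part I doubt.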
|
# ? Apr 24, 2024 16:51 |
|
Yeah, Microsoft put out a white paper last summer ("Textbooks Are All You Need") kicking off their Phi series, where they trained using only textbook-quality data. One thing I do know is that Microsoft, or Bill Gates personally, has spent decades archiving all kinds of data and trying to find permanent ways to store it for the future; I would imagine those efforts give them access to exactly that kind of corpus.
|
# ? Apr 24, 2024 17:42 |