Mega Comrade
Apr 22, 2004

Listen buddy, we all got problems!
As usual with these projects, they oversell.
It claims to be "the first" but it's not. Several other projects have beaten it to market.
And its unaided solution score is 13%.

Now, 13% if true is actually impressive, but that isn't an autonomous engineer. It's not even a bad junior.
The project seems to have huge money pushing it to VCs though; I've seen it everywhere.

cr0y
Mar 24, 2005



Is that OpenAI robot video floating around real?

E: this one

https://youtu.be/Sq1QZB5baNw?si=n7EztUlGwvqFI-os

Mega Comrade
Apr 22, 2004

Listen buddy, we all got problems!
It's real as in it's a real demo aimed at bringing in VC money.

The robotics are by far the most impressive thing. Everything on top is stuff we've been seeing for a while.

Speech to speech "reasoning" is actually trivial now with LLMs, just take any off the shelf speech to text, slap the text into the prompt, do the reverse on the way out.
Their previous demos on their channel are all more impressive from a technical standpoint, but a talking robot captures people's imaginations.
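
To give you an idea of how trivial: the whole pipeline is basically three off-the-shelf calls glued together. A rough sketch using the OpenAI Python SDK, just because it's the obvious example; the model names are illustrative and any speech-to-text / LLM / text-to-speech stack wires up the same way:

code:
# Sketch only: assumes the openai v1.x Python SDK and an OPENAI_API_KEY in the env.
from openai import OpenAI

client = OpenAI()

def speech_to_speech(audio_path: str, out_path: str = "reply.mp3") -> str:
    # 1. Speech -> text: transcribe whatever the person said
    with open(audio_path, "rb") as f:
        heard = client.audio.transcriptions.create(model="whisper-1", file=f).text

    # 2. Slap the transcript into an LLM prompt and get the "reasoning"
    reply = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a friendly robot. Keep answers short."},
            {"role": "user", "content": heard},
        ],
    ).choices[0].message.content

    # 3. Text -> speech: read the answer back out
    client.audio.speech.create(model="tts-1", voice="alloy", input=reply).stream_to_file(out_path)
    return reply
The hard part in that demo is the robot, not any of this.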

Mega Comrade fucked around with this message at 11:17 on Mar 14, 2024

mobby_6kl
Aug 9, 2009

by Fluffdaddy

Mega Comrade posted:

It's real as in it's a real demo aimed at bringing in VC money.

The robotics are by far the most impressive thing. Everything on top is stuff we've been seeing for a while.

Speech to speech "reasoning" is actually trivial now with LLMs, just take any off the shelf speech to text, slap the text into the prompt, do the reverse on the way out.
Their previous demos on their channel are all more impressive from a technical standpoint, but a talking robot captures people's imaginations.
First time watching that demo. I think you're right in that it's gluing together a lot of the existing components - speech recognition, object recognition, LLM, text to speech, etc. The object recognition layer probably feeds the LLM with stuff like "in front of you is an apple and this guy" (that it reads out for us), and so on. Still, if it can translate prompts into appropriate actions and physically execute them, that's pretty drat neat. If it can do it in a generalized way and not just in this single demo, obviously, which is a big if.
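
If I had to guess at the glue, it's probably not much more than this kind of loop. Pure speculation on my part; every function here is a made-up stand-in, not anything they've published:

code:
# Speculative toy version of the demo's control loop. All of these helpers are
# hypothetical placeholders for whatever vision/LLM/robotics stack they actually use.

def describe_scene(frame) -> str:
    # stand-in for the object-recognition model
    return "an apple, a plate, and a person"

def ask_llm(prompt: str) -> str:
    # stand-in for the GPT-4 call; a real version would hit the API
    return "SAY: Sure, here is the apple. ACTION: HAND_OVER(apple)"

def execute(action: str) -> None:
    # stand-in for the actual robotics stack
    print(f"[robot] executing {action}")

def robot_step(frame, request: str) -> str:
    scene = describe_scene(frame)
    prompt = (
        f"You are a robot. In front of you: {scene}.\n"
        f'The person said: "{request}".\n'
        "Reply as 'SAY: <speech> ACTION: <one of PICK_UP(x), HAND_OVER(x), WAIT>'."
    )
    reply = ask_llm(prompt)
    speech, _, action = reply.partition("ACTION:")
    execute(action.strip())
    return speech.removeprefix("SAY:").strip()

print(robot_step(None, "Can I have something to eat?"))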

Somehow I missed that OpenAI was doing robotics, I think this would explain Musky's sudden obsession with making a Tesla robot.

Mega Comrade
Apr 22, 2004

Listen buddy, we all got problems!

mobby_6kl posted:

Somehow I missed that OpenAI was doing robotics, I think this would explain Musky's sudden obsession with making a Tesla robot.

They aren't, or at least they haven't said they are.

This is a completely unrelated company just using GPT-4 in their stack. Stick "OpenAI" in the demo title, though, and it gets picked up better by search engines, and it doesn't hurt to have people thinking you're connected to the biggest name in AI.

Mola Yam
Jun 18, 2004

Kali Ma Shakti de!
they are connected; they have funding from OpenAI, and a collaboration agreement specifically for the LLMs to be used in humanoid robots

https://www.prnewswire.com/news-releases/figure-raises-675m-at-2-6b-valuation-and-signs-collaboration-agreement-with-openai-302074897.html

Boris Galerkin
Dec 17, 2011

I don't understand why I can't harass people online. Seriously, somebody please explain why I shouldn't be allowed to stalk others on social media!
How good are the LLMs at being multilingual, like how a person raised in a multilingual household would be?

Been thinking about this because the other day I was watching TV and there was a random scene where the language switched to [non-English language]. I understand and speak this language, but there was a word I didn't know.

I said "Siri, what does [foreign word] mean in English?" and Siri couldn't understand that I was switching language for that word. I would have accepted an explanation in English.

I tried "Siri, [in foreign language: what does this word mean in English]?" and Siri transcribed my words into the closest approximate English syllables I guess, which was gibberish. I would have accepted an explanation in the foreign language.

I asked about this in the iPhone thread since it was a Siri question and I know Siri isn't a LLM (right?) but it's spurred some additional discussion about how Siri sometimes can't distinguish perfectly between numbers like "15" and "50" for some people.

This is just an example. But in the real world, when I talk to my family or some friends, we do switch languages like that and it's completely natural/normal.

Should clarify that this is in the context of voice assistants.

mobby_6kl
Aug 9, 2009

by Fluffdaddy

Boris Galerkin posted:

How good are the LLMs at being multilingual, like how a person raised in a multilingual household would be?

Been thinking about this because the other day I was watching TV and there was a random scene where the language switched to [non-English language]. I understand and speak this language, but there was a word I didn't know.

I said "Siri, what does [foreign word] mean in English?" and Siri couldn't understand that I was switching language for that word. I would have accepted an explanation in English.

I tried "Siri, [in foreign language: what does this word mean in English]?" and Siri transcribed my words into the closest approximate English syllables I guess, which was gibberish. I would have accepted an explanation in the foreign language.

I asked about this in the iPhone thread since it was a Siri question and I know Siri isn't a LLM (right?) but it's spurred some additional discussion about how Siri sometimes can't distinguish perfectly between numbers like "15" and "50" for some people.

This is just an example. But in the real world, when I talk to my family or some friends, we do switch languages like that and it's completely natural/normal.

Should clarify that this is in the context of voice assistants.

Here's chatgpt 3.5:

quote:

Q: What does "ernährungswissenschaftler" in German mean?
A: "Ernährungswissenschaftler" is a German word that translates to "nutritionist" in English. It refers to a person who specializes in the study of nutrition and its effects on human health. Ernährungswissenschaftler typically work in various fields related to nutrition, including research, education, counseling, and public health.

This is something I typed in like that, though. I think one issue with Siri might be that it's doing voice recognition and of course tries to match what it hears to the most likely English word.

Tei
Feb 19, 2011

Boris Galerkin posted:

How good are the LLMs at being multilingual, like how a person raised in a multilingual household would be?

I think better than a human being. A human works more in "modes": thinking in English, or thinking in Spanish or German. A person thinking in German will try to understand a word they heard as German.

With ChatGPT, you can write a question in a mix of Spanish and English, and the bot will understand the question and answer it. To me that means all words were evaluated for their potential individual meaning, and in the context of all potential languages, including programming languages.

mawarannahr
May 21, 2019

Tei posted:

I think better than a human being. A human works more in "modes": thinking in English, or thinking in Spanish or German. A person thinking in German will try to understand a word they heard as German.

With ChatGPT, you can write a question in a mix of Spanish and English, and the bot will understand the question and answer it. To me that means all words were evaluated for their potential individual meaning, and in the context of all potential languages, including programming languages.

I don't think this is true. It's not uncommon to switch languages in the middle of a sentence among members of a bilingual household. I'm pretty sure a lot of people think in multiple languages simultaneously, too.

Rappaport
Oct 2, 2013

mawarannahr posted:

I don't think this is true. It's not uncommon to switch languages in the middle of a sentence among members of a bilingual household. I'm pretty sure a lot of people think in multiple languages simultaneously, too.

I often find myself in situations where I can remember a word or a term in one language but not the other, in the middle of a sentence or writing something

Of course it could just be early(?) dementia :corsair:

Main Paineframe
Oct 27, 2010

Boris Galerkin posted:

How good are the LLMs at being multilingual, like how a person raised in a multilingual household would be?

Been thinking about this because the other day I was watching TV and there was a random scene where the language switched to [non-English language]. I understand and speak this language, but there was a word I didn't know.

I said "Siri, what does [foreign word] mean in English?" and Siri couldn't understand that I was switching language for that word. I would have accepted an explanation in English.

I tried "Siri, [in foreign language: what does this word mean in English]?" and Siri transcribed my words into the closest approximate English syllables I guess, which was gibberish. I would have accepted an explanation in the foreign language.

I asked about this in the iPhone thread since it was a Siri question and I know Siri isn't a LLM (right?) but it's spurred some additional discussion about how Siri sometimes can't distinguish perfectly between numbers like "15" and "50" for some people.

This is just an example. But in the real world, when I talk to my family or some friends, we do switch languages like that and it's completely natural/normal.

Should clarify that this is in the context of voice assistants.

Doesn't really matter what context it's in, a LLM is a LLM. And Siri is definitely not a LLM.

How well a LLM handles different languages depends pretty much exclusively on its training set. And I think you're making the mistake of anthropomorphizing it here. LLMs don't work like human thought processes do. They don't really have a sense of "language" in the first place. The only thing they know about a given word is what words it tends to be used along with most often.

They don't know that "ernährungswissenschaftler" is German or that "nutritionist" is English - they just know that "ernährungswissenschaftler" tends to be used along with other words that are way too long and contain way too many consonants. Even though they have no concept of "ernährungswissenschaftler" being a German word, they'll tend to use it with other German words, because that's how it tends to be used in much of their training data.

Unless you use it on an otherwise English sentence along with words like "meaning" or "translation", in which case it'll respond with an English description of what it means, because any instances of "what does ernährungswissenschaftler mean?" in its training data will be followed by an English explanation of its meaning.

Circling back around to your original question, I'll repeat myself: it depends on its training data, because it's basically just repeating back words based on statistical analysis of its training set. They'll tend to use English words with English words and German words with German words, but that's because their training data will most likely use English words with English words and German words with German words. There's no fundamental understanding of language there. If their training data contains a lot of mixing English with German, then they'll be more likely to mix English with German themselves. Simple as that.

That's for responses, anyway. In terms of taking input, if you're typing the words in, then it shouldn't have any issue with multilingual input since it's all just words, and the LLM doesn't have a real concept of language. But if you're speaking words, then you're not putting words directly into the LLM. You have to go through voice recognition first, and that's not an LLM. Moreover, it is usually much more complicated. Voice recognition tends to still be language-specific, because it's probably a couple orders of magnitude more complex than just crunching text.
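
If it helps to see what I mean by "statistical analysis of its training set" in the most stripped-down form possible, here's a toy next-word predictor that only counts which word follows which. It's obviously nothing like a real LLM (which uses a neural net over sub-word tokens), but the point carries over: it has no concept of "English" or "German", yet it stays in one language simply because that's how the words co-occur in its data.

code:
from collections import Counter, defaultdict

# Tiny mixed-language "training set". Nothing marks which words are English
# or German; the model only ever sees which word tends to follow which.
corpus = (
    "the nutritionist recommends vegetables . "
    "der ernährungswissenschaftler empfiehlt gemüse . "
    "the nutritionist recommends fruit . "
    "der ernährungswissenschaftler empfiehlt obst ."
).split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def continue_text(word: str, steps: int = 3) -> list[str]:
    out = []
    for _ in range(steps):
        if word not in follows:
            break
        word = follows[word].most_common(1)[0][0]   # pick the most frequent follower
        out.append(word)
    return out

print(continue_text("the"))   # ['nutritionist', 'recommends', 'vegetables'] - stays "English"
print(continue_text("der"))   # ['ernährungswissenschaftler', 'empfiehlt', 'gemüse'] - stays "German"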

Boris Galerkin
Dec 17, 2011

I don't understand why I can't harass people online. Seriously, somebody please explain why I shouldn't be allowed to stalk others on social media!

Tei posted:

I think better than a human being. A human works more in "modes": thinking in English, or thinking in Spanish or German. A person thinking in German will try to understand a word they heard as German.

In the context of people raised multilingual this is probably not true. It's not true for me, and I assume it's not true for people who have grown up speaking 2+ languages and can switch between them fluently on a word-by-word and phrase-by-phrase basis. I'm not talking about people who learned a language in adulthood and use the crutch of thinking in English and translating to Spanish.

Also anyway, I should have made it more clear but I'm only talking about voice inputs to LLMs. I know Siri isn't a LLM but I use Siri via voice for a lot of things. I'm just wondering what the progress is on voice assistants being able to recognize multiple languages being used in the same context/conversation.

SaTaMaS
Apr 18, 2003

Main Paineframe posted:

Doesn't really matter what context it's in, a LLM is a LLM. And Siri is definitely not a LLM.

How well a LLM handles different languages depends pretty much exclusively on its training set. And I think you're making the mistake of anthropomorphizing it here. LLMs don't work like human thought processes do. They don't really have a sense of "language" in the first place. The only thing they know about a given word is what words it tends to be used along with most often.

They don't know that "ernährungswissenschaftler" is German or that "nutritionist" is English - they just know that "ernährungswissenschaftler" tends to be used along with other words that are way too long and contain way too many consonants. Even though they have no concept of "ernährungswissenschaftler" being a German word, they'll tend to use it with other German words, because that's how it tends to be used in much of their training data.

Unless you use it on an otherwise English sentence along with words like "meaning" or "translation", in which case it'll respond with an English description of what it means, because any instances of "what does ernährungswissenschaftler mean?" in its training data will be followed by an English explanation of its meaning.

Circling back around to your original question, I'll repeat myself: it depends on its training data, because it's basically just repeating back words based on statistical analysis of its training set. They'll tend to use English words with English words and German words with German words, but that's because their training data will most likely use English words with English words and German words with German words. There's no fundamental understanding of language there. If their training data contains a lot of mixing English with German, then they'll be more likely to mix English with German themselves. Simple as that.

That's for responses, anyway. In terms of taking input, if you're typing the words in, then it shouldn't have any issue with multilingual input since it's all just words, and the LLM doesn't have a real concept of language. But if you're speaking words, then you're not putting words directly into the LLM. You have to go through voice recognition first, and that's not an LLM. Moreover, it is usually much more complicated. Voice recognition tends to still be language-specific, because it's probably a couple orders of magnitude more complex than just crunching text.

quote:

Me:
What language is "ernährungswissenschaftler"? What language is "nutritionist"?

ChatGPT:
The word "ernährungswissenschaftler" is German. It translates to "nutrition scientist" in English, closely related to the term "nutritionist" but with a slight emphasis on the scientific study of nutrition. The term typically refers to someone who specializes in the study of nutrition science, including aspects such as diet, nutritional values, and the effects of food on health and disease.

On the other hand, "nutritionist" is an English word. A nutritionist is a person who advises on matters of food and nutrition impacts on health. Depending on the region, the term "nutritionist" can have different levels of regulation regarding its use. In some places, anyone can call themselves a nutritionist without specific qualifications, whereas in others, the title may require certification or licensure.

It's pretty hard to define a sense of "language" other than as a collection of associations between words, which is exactly what LLMs are.

mawarannahr
May 21, 2019

SaTaMaS posted:

It's pretty hard to define a sense of "language" other than as a collection of associations between words, which is exactly what LLMs are.

I'm not gonna define language but it is not a collection of associations between words. At all. Nor is it for LLMs, because tokens are not the same thing as words.
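
You can see the token/word gap directly with OpenAI's own tokenizer, for what it's worth. This uses the tiktoken package; the exact splits depend on which model's encoding you pick:

code:
import tiktoken  # pip install tiktoken

enc = tiktoken.encoding_for_model("gpt-4")

for word in ["nutritionist", "ernährungswissenschaftler"]:
    ids = enc.encode(word)
    pieces = [enc.decode([i]) for i in ids]
    print(f"{word!r} -> {len(ids)} tokens: {pieces}")

# Both words get chopped into sub-word pieces (the long German compound into
# quite a few), and those pieces are what the model actually operates on.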

Liquid Communism
Mar 9, 2004


Out here, everything hurts.




Language is also context dependent, and LLMs are by their nature incapable of grasping context.

Tei
Feb 19, 2011

Boris Galerkin posted:

In the context of people raised multilingual this is probably not true. It's not true for me, and I assume it's not true for people who have grown up speaking 2+ languages and can switch between them fluently on a word-by-word and phrase-by-phrase basis. I'm not talking about people who learned a language in adulthood and use the crutch of thinking in English and translating to Spanish.

Also anyway, I should have made it more clear but I'm only talking about voice inputs to LLMs. I know Siri isn't a LLM but I use Siri via voice for a lot of things. I'm just wondering what the progress is on voice assistants being able to recognize multiple languages being used in the same context/conversation.

Here's this, somewhat of a joke.


Tei fucked around with this message at 22:57 on Mar 14, 2024

Vivian Darkbloom
Jul 14, 2004


https://twitter.com/TechBurritoUno/status/1768363023192768799

OpenAI CTO gives basically the worst responses possible.

Interviewer: "What data was used to train Sora?"
Mira Murati: "We used publicly available data and licensed data."
"So, videos on YouTube?"
"... I'm actually not sure about that."
"Ok, videos from Facebook, Instagram?"
"You know, if they were publically available, available yeah publically, available to use, there might be the data but I'm not sure, I'm not confident about that."

Boris Galerkin
Dec 17, 2011

I don't understand why I can't harass people online. Seriously, somebody please explain why I shouldn't be allowed to stalk others on social media!

Tei posted:

Here's this, somewhat of a joke.



I have no idea what you mean by this but clearly I didn't repeat myself enough if you misunderstood what I posted in the first place.

notwithoutmyanus
Mar 17, 2009

Vivian Darkbloom posted:

https://twitter.com/TechBurritoUno/status/1768363023192768799

OpenAI CTO gives basically the worst responses possible.

Interviewer: "What data was used to train Sora?"
Mira Murati: "We used publicly available data and licensed data."
"So, videos on YouTube?"
"... I'm actually not sure about that."
"Ok, videos from Facebook, Instagram?"
"You know, if they were publically available, available yeah publically, available to use, there might be the data but I'm not sure, I'm not confident about that."

That's about the most conceivably clueless response ever.
"we grab anything people weren't smart enough to lock down, and maybe you know other people's poo poo too? :shrug:"

Tei
Feb 19, 2011

Boris Galerkin posted:

I have no idea what you mean by this but clearly I didn't repeat myself enough if you misunderstood what I posted in the first place.

Are you the CTO of Open AI?

GABA ghoul
Oct 29, 2011

Main Paineframe posted:

Doesn't really matter what context it's in, a LLM is a LLM. And Siri is definitely not a LLM.

How well a LLM handles different languages depends pretty much exclusively on its training set. And I think you're making the mistake of anthropomorphizing it here. LLMs don't work like human thought processes do. They don't really have a sense of "language" in the first place. The only thing they know about a given word is what words it tends to be used along with most often.

They don't know that "ernährungswissenschaftler" is German or that "nutritionist" is English - they just know that "ernährungswissenschaftler" tends to be used along with other words that are way too long and contain way too many consonants. Even though they have no concept of "ernährungswissenschaftler" being a German word, they'll tend to use it with other German words, because that's how it tends to be used in much of their training data.

Unless you use it on an otherwise English sentence along with words like "meaning" or "translation", in which case it'll respond with an English description of what it means, because any instances of "what does ernährungswissenschaftler mean?" in its training data will be followed by an English explanation of its meaning.

Circling back around to your original question, I'll repeat myself: it depends on its training data, because it's basically just repeating back words based on statistical analysis of its training set. They'll tend to use English words with English words and German words with German words, but that's because their training data will most likely use English words with English words and German words with German words. There's no fundamental understanding of language there. If their training data contains a lot of mixing English with German, then they'll be more likely to mix English with German themselves. Simple as that.

That's for responses, anyway. In terms of taking input, if you're typing the words in, then it shouldn't have any issue with multilingual input since it's all just words, and the LLM doesn't have a real concept of language. But if you're speaking words, then you're not putting words directly into the LLM. You have to go through voice recognition first, and that's not an LLM. Moreover, it is usually much more complicated. Voice recognition tends to still be language-specific, because it's probably a couple orders of magnitude more complex than just crunching text.

You are making a lot of assumptions about how language processing works in human brains.

Natively bilingual people can effortlessly jump between speaking language A, language B or a random mixture of them in just a single sentence without any feeling that they are speaking different languages. It feels more like "language" is just another classification in the brain that each word and grammar rule has. Like an apple is associated with the category of "fruits" and broccoli with "vegetables", this word/grammar rule is associated with "English" and that with "German". Like I can effortlessly speak only about fruits, I can effortlessly use only English words.

I think it would be really strange if humans had some specific natural mechanism in the brain to separate languages because AFAIK it wasn't really a common scenario in our evolutionary past that you needed to use more than one language.

SaTaMaS
Apr 18, 2003

Liquid Communism posted:

Language is also context dependent, and LLMs are by their nature incapable of grasping context.

Cool so you have no idea how LLMs or transformers work

SaTaMaS
Apr 18, 2003

mawarannahr posted:

I'm not gonna define language but it is not a collection of associations between words. At all. Nor is it for LLMs, because tokens are not the same thing as words.

People such as Ferdinand de Saussure have already worked to define language. He argues that words acquire meaning through their relational positions in a language system rather than through direct links to the external world (aka structuralism). Tokens aren't the same as words, but words are made up of tokens. LLMs rely on the statistical relationships between words in large datasets to predict and generate text, and the context in which a word appears is crucial for its interpretation.

cat botherer
Jan 6, 2022

I am interested in most phases of data processing.

SaTaMaS posted:

Cool so you have no idea how LLMs or transformers work
"Context" exists on a spectrum. Transformer models still operate with a relatively crude (but still pretty useful) attention mechanism on fixed-length context windows. There's no ability to transfer deep semantic information - e.g. to generalize information between sources and develop internal hypotheses. As an example, look how bad they are at simple arithmetic.

Current models fall especially short of human contextual understanding when you consider the superhuman amounts of information they are trained on. Humans are incredibly efficient at learning. Fundamentally, all current models boil down to advanced nearest-neighbors. They can learn embeddings to make that far more effective, but they cannot extrapolate outside of the space the training data occupies.
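
For the avoidance of doubt about what "attention over a fixed-length window" actually is, the core operation is just this. A minimal numpy version of scaled dot-product attention; real models stack many heads and layers of it with learned projections on top:

code:
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Each of the n query positions gets a weighted average of the n value vectors,
    # with weights from query-key similarity. n is the context window: anything
    # outside it simply does not exist for the model.
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # (n, n) similarity matrix
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
    return weights @ V                               # (n, d) mixed representations

n, d = 8, 16                                         # toy window of 8 positions
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)   # (8, 16)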

SaTaMaS
Apr 18, 2003

cat botherer posted:

"Context" exists on a spectrum. Transformer models still operate with a relatively crude (but still pretty useful) attention mechanism on fixed-length context windows. There's no ability to transfer deep semantic information - e.g. to generalize information between sources and develop internal hypotheses. As an example, look how bad they are at simple arithmetic.

The reason LLMs are bad at arithmetic isn't a matter of deep semantic information. Arithmetic involves almost no semantics; it's entirely syntax. However, it involves executing precise, logically defined algorithmic operations, while LLMs are designed to predict the next word in a sequence based on learned probabilities.

quote:

Current models fall especially short of human contextual understanding when you consider the superhuman amounts of information they are trained on. Humans are incredibly efficient at learning. Fundamentally, all current models boil down to advanced nearest-neighbors. They can learn embeddings to make that far more effective, but they cannot extrapolate outside of the space the training data occupies.

Sure, but most people can't or don't extrapolate beyond "common sense" either.

Main Paineframe
Oct 27, 2010

SaTaMaS posted:

It's pretty hard to define a sense of "language" other than as a collection of associations between words, which is exactly what LLMs are.

It doesn't make sense to talk about a "sense of 'language'". Of course, it's pretty hard to define "language", which is why there are entire scientific fields dedicated to the study of human languages, and they have created several theoretical frameworks for understanding the concept of language. Chomsky usually gets posted here for his political beliefs, but his day job is linguistics. Similarly, Tolkien is best known for his fantasy works, but he was actually a philologist whose fantasy books were heavily influenced by his studies of language and literature.

But even in that context, "a collection of associations between words" is a strikingly poor definition for a language. An individual language is generally understood to be a system, containing not just word associations but several sets of rules (sometimes quite complex ones). And in the context we're talking about, which goes well beyond the scientific study of language in isolation, I think we also need to note the considerable social, cultural, and historical ties languages hold, because these are all relevant to contexts in which people might want to speak one language over another. For example, code-switching.

GABA ghoul posted:

You are making a lot of assumptions about how language processing works in human brains.

Natively bilingual people can effortlessly jump between speaking language A, language B or a random mixture of them in just a single sentence without any feeling that they are speaking different languages. It feels more like "language" is just another classification in the brain that each word and grammar rule has. Like an apple is associated with the category of "fruits" and broccoli with "vegetables", this word/grammar rule is associated with "English" and that with "German". Like I can effortlessly speak only about fruits, I can effortlessly use only English words.

I think it would be really strange if humans had some specific natural mechanism in the brain to separate languages because AFAIK it wasn't really a common scenario in our evolutionary past that you needed to use more than one language.

I'm not quite sure what you read from my post, but it sure as hell doesn't sound like what I wrote. I didn't make any assertions about how language processing works in human brains, nor did I ever claim that there's "some specific natural mechanism in the brain to separate languages". It's a ridiculous claim to make, which is exactly why I didn't make it, nor did I suggest anything even remotely similar to it.

SaTaMaS posted:

People such as Ferdinand de Saussure have already worked to define language. He argues that words acquire meaning through their relational positions in a language system rather than through direct links to the external world (aka structuralism). Tokens aren't the same as words, but words are made up of tokens. LLMs rely on the statistical relationships between words in large datasets to predict and generate text, and the context in which a word appears is crucial for its interpretation.

I think Ferdinand de Saussure might be a bit behind the curve in linguistic study, given that he died more than a century ago. His influence certainly extends into the current day, but now his structural linguistics are just one of many linguistic theories out there. I also think your read of structural linguistics doesn't quite line up with mine (though, granted, I'm not a linguist). As far as I know, structural linguistics is fundamentally about interpreting the rules of a language. It's not that individual words acquire meaning through their relational positions in a language system, it's that grammatical elements and semantics and syntax and other language rules acquire meaning through those relational positions. The "structure" in "structuralism" is the structure of the language system itself.

cat botherer
Jan 6, 2022

I am interested in most phases of data processing.

SaTaMaS posted:

The reason LLMs are bad at arithmetic isn't a matter of deep semantic information. Arithmetic involves almost no semantics; it's entirely syntax. However, it involves executing precise, logically defined algorithmic operations, while LLMs are designed to predict the next word in a sequence based on learned probabilities.
Effective algorithms for actually doing arithmetic do put in a layer of semantics. Separately from that, though, a human can learn arithmetical algorithms from a textbook, but LLMs cannot.

quote:

Sure, but most people can't or don't extrapolate beyond "common sense" either.
"Common sense" extrapolations are precisely what makes humans so much better at reasoning than LLMs. Common sense is an amazing thing, it just seems mundane because of how common it is.

SaTaMaS
Apr 18, 2003

cat botherer posted:

Effective algorithms for actually doing arithmetic do put in a layer of semantics. Separately from that, though, a human can learn arithmetical algorithms from a textbook, but LLMs cannot.
It's not at all true that an LLM can't learn arithmetic problem solving, for example https://machine-learning-made-simple.medium.com/how-google-built-the-perfect-llm-system-alphageometry-ed65a9604eaf

quote:

"Common sense" extrapolations are precisely what makes humans so much better at reasoning than LLMs. Common sense is an amazing thing, it just seems mundane because of how common it is.

Not really; LLMs are quite proficient at common-sense reasoning. GPT-4 is just 0.3% under human proficiency on one benchmark: https://rowanzellers.com/hellaswag/

GABA ghoul
Oct 29, 2011

Main Paineframe posted:

I'm not quite sure what you read from my post, but it sure as hell doesn't sound like what I wrote. I didn't make any assertions about how language processing works in human brains, nor did I ever claim that there's "some specific natural mechanism in the brain to separate languages". It's a ridiculous claim to make, which is exactly why I didn't make it, nor did I suggest anything even remotely similar to it.

:confused:

This is what you wrote:

quote:

And I think you're making the mistake of anthropomorphizing it here. LLMs don't work like human thought processes do. They don't really have a sense of "language" in the first place. The only thing they know about a given word is what words it tends to be used along with most often.

Humans don't have a "sense of language" either when speaking habitually, as any natively bilingual person will tell you. They freely interchange the words and grammar of language A, language B, or the pidgin mixture of both, without even being consciously aware of doing it. At no point is there any reasoning about the abstract concept of language A, B, or the pidgin involved (unless you make a conscious decision to edit the generated speech in your head to strictly conform to one of them before you say it out loud).

You assume that the way LLMs generate speech is different from the way humans do, which we just don't know. It all looks suspiciously similar at a first glance though.

Main Paineframe
Oct 27, 2010

SaTaMaS posted:

It's not at all true that an LLM can't learn arithmetic problem solving, for example https://machine-learning-made-simple.medium.com/how-google-built-the-perfect-llm-system-alphageometry-ed65a9604eaf

That is a) geometry rather than arithmetic, and even more importantly for this conversation, b) not using an LLM to do the math. This is a perfect example of how easily the capabilities of LLMs get enormously exaggerated, and how important it is to remain clear-eyed about them and keep their limitations in mind.

All of the actual mathematics is being done by their symbolic deduction engine, a more traditional machine learning system. The LLM plays an accessory role here by basically restating the initial problem with more and more detail until there's enough detail there for the deduction engine to handle. The exact details use a level of jargon that's a little tough for me to follow, but the situation here appears to be that the deduction engine is extremely picky about its input data and usually requires humans to rewrite the problem in a specialized syntax and language, while also adding much more detail because things have to be extremely specific for the limited capabilities of the deduction engine.

Which is still significant work, of course, but it's very important to note that the LLM isn't actually doing any number-crunching here. All it seems to be doing is rewording the problem so that the dedicated math-solver (which is not a LLM, and therefore has very limited language capabilities) can interpret it.
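
To make the division of labour concrete, the pattern looks roughly like this. This is my own toy sketch, not AlphaGeometry: the "LLM" here is a hard-coded stand-in whose only job is to restate the word problem in the solver's formal language, and sympy plays the part of the symbolic engine that does every bit of the actual math.

code:
import sympy as sp

def fake_llm(word_problem: str) -> str:
    # Stand-in for the LLM's role: translate natural language into the solver's
    # formal syntax. A real pipeline would prompt a model here; this is hard-coded.
    return "Eq(2*x + 3, 11)"

def solve_word_problem(word_problem: str):
    formal = fake_llm(word_problem)              # language -> formal statement (LLM's job)
    x = sp.symbols("x")
    equation = sp.sympify(formal, locals={"x": x})
    return sp.solve(equation, x)                 # all actual math done by the symbolic engine

print(solve_word_problem("Twice a number plus three is eleven. What is the number?"))  # [4]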

GABA ghoul posted:

:confused:

This is what you wrote:

Humans don't have a "sense of language" either when speaking habitually, as any natively bilingual person will tell you. They freely interchange the words and grammar of language A, language B, or the pidgin mixture of both, without even being consciously aware of doing it. At no point is there any reasoning about the abstract concept of language A, B, or the pidgin involved (unless you make a conscious decision to edit the generated speech in your head to strictly conform to one of them before you say it out loud).

You assume that the way LLMs generate speech is different from the way humans do, which we just don't know. It all looks suspiciously similar at a first glance though.

My entire point is that humans can reason about the abstract concept of language, and can consciously choose to edit the generated speech in their head to strictly conform to one language, or mix languages as they please. They don't have to do either of those things, but the fact that they can (and that LLMs can't) is important to note in the context of the specific question that was originally being asked and the specific situation being discussed. A more conventional voice recognition engine is likely to draw far too firm a line between languages and be unable to handle bilingual input at all. On the other hand, LLMs are unable to draw that line at all, but they're good enough at following the example of their training set and the prompt that it's extremely unlikely the user will notice that the line isn't hard.

Mega Comrade
Apr 22, 2004

Listen buddy, we all got problems!

SaTaMaS posted:

It's not at all true that an LLM can't learn arithmetic problem solving, for example https://machine-learning-made-simple.medium.com/how-google-built-the-perfect-llm-system-alphageometry-ed65a9604eaf


Did you read your own source before posting?
The LLM does none of the problem solving.


vvv ok Stretch Armstrong

Mega Comrade fucked around with this message at 21:41 on Mar 15, 2024

Lucid Dream
Feb 4, 2003

That boy ain't right.
If the LLM enables the problem to be solved then it kinda solved it. If I use a calculator to help me do a complicated math problem, I still solved it even if the calculator helped. If the LLM is smart enough to choose to use the calculator then I think it counts.
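
"Choosing to use the calculator" is literally how the tool-calling APIs work now: the model picks a tool and the arguments, your code runs it, and the model never crunches a number itself. Rough sketch with the OpenAI Python SDK (model name etc. just illustrative, and a real version would loop the result back to the model):

code:
import json
from openai import OpenAI  # assumes the v1.x SDK and OPENAI_API_KEY in the env

client = OpenAI()
tools = [{
    "type": "function",
    "function": {
        "name": "calculator",
        "description": "Evaluate an arithmetic expression",
        "parameters": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
            "required": ["expression"],
        },
    },
}]

messages = [{"role": "user", "content": "What is 1234 * 5678?"}]
response = client.chat.completions.create(model="gpt-4", messages=messages, tools=tools)

call = response.choices[0].message.tool_calls[0]   # assumes the model chose to use the tool
expression = json.loads(call.function.arguments)["expression"]
result = eval(expression, {"__builtins__": {}})    # toy calculator; don't eval untrusted input for real
print(expression, "=", result)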

Lucid Dream fucked around with this message at 21:25 on Mar 15, 2024

Quixzlizx
Jan 7, 2007
https://arstechnica.com/security/2024/03/researchers-use-ascii-art-to-elicit-harmful-responses-from-5-major-ai-chatbots/

Chatbots were tricked into ignoring their guardrails with a combination of ASCII art and a simple substitution cipher.

PhazonLink
Jul 17, 2010
thats dumb

Roadie
Jun 30, 2013
It's a relatively intuitive result once you consider that "safety" reinforcement training on all the corpo LLMs is a pretty thin layer of paint on top of a huge mess of ingested content of all kinds, including text porn, reddit posts about blowing stuff up, dril tweets, etc. In theory one could produce a "safe" LLM from the ground up by using a carefully curated set of data, but that would, you know, actually cost a lot of money compared to just scraping the internet and stealing all the creative work of a ton of people.

For a comparison here, Google has a model trained entirely on weather data, and so the only results it will ever give are... obviously, weather data. It can't be 'unsafe' because it doesn't know how to be.

Roadie fucked around with this message at 06:34 on Mar 16, 2024

Rappaport
Oct 2, 2013

Can we make a Bob Ross AI? Not like the digital ghouls Disney keeps conjuring up, just train an AI to chat soothing small nothings and making nice paintings on the user's screen. Maybe throw in some Mister Rogers for the terminally online doom-scrolling 4-year-olds, too.

SaTaMaS
Apr 18, 2003

Main Paineframe posted:

That is a) geometry rather than arithmetic, and even more importantly for this conversation, b) not using an LLM to do the math. This is a perfect example of how easily the capabilities of LLMs get enormously exaggerated, and how important it is to remain clear-eyed about them and keep their limitations in mind.

All of the actual mathematics is being done by their symbolic deduction engine, a more traditional machine learning system. The LLM plays an accessory role here by basically restating the initial problem with more and more detail until there's enough detail there for the deduction engine to handle. The exact details use a level of jargon that's a little tough for me to follow, but the situation here appears to be that the deduction engine is extremely picky about its input data and usually requires humans to rewrite the problem in a specialized syntax and language, while also adding much more detail because things have to be extremely specific for the limited capabilities of the deduction engine.

Which is still significant work, of course, but it's very important to note that the LLM isn't actually doing any number-crunching here. All it seems to be doing is rewording the problem so that the dedicated math-solver (which is not a LLM, and therefore has very limited language capabilities) can interpret it.

My entire point is that humans can reason about the abstract concept of language, and can consciously choose to edit the generated speech in their head to strictly conform to one language, or mix languages as they please. They don't have to do either of those things, but the fact that they can (and that LLMs can't) is important to note in the context of the specific question that was originally being asked and the specific situation being discussed. A more conventional voice recognition engine is likely to draw far too firm a line between languages and be unable to handle bilingual input at all. On the other hand, LLMs are unable to draw that line at all, but they're good enough at following the example of their training set and the prompt that it's extremely unlikely the user will notice that the line isn't hard.

I think you're really underestimating the technology here. There's no reason it can't be trained on arithmetic problems and I assume another LLM already has been. "All" that AlphaZero did in order to beat all humans at Go was combine a Transformer (aka the same fundamental technology as an LLM) that suggested some next likely moves with a Monte Carlo Tree Search to figure out which of the suggested moves would be the best. There's not a lot that looks like human reasoning going on in either case, and the LLM/Transformer was only 50% of the solution.

Main Paineframe
Oct 27, 2010

SaTaMaS posted:

I think you're really underestimating the technology here. There's no reason it can't be trained on arithmetic problems and I assume another LLM already has been. "All" that AlphaZero did in order to beat all humans at Go was combine a Transformer (aka the same fundamental technology as an LLM) that suggested some next likely moves with a Monte Carlo Tree Search to figure out which of the suggested moves would be the best. There's not a lot that looks like human reasoning going on in either case, and the LLM/Transformer was only 50% of the solution.

Well, you showed that as an example of a LLM doing arithmetic, and I pointed out that the paper itself says that the LLM is not actually doing any math there. That doesn't mean that I'm underestimating LLMs, it means that you were wrong about your claim, and it means you tried to "prove" your claim with a paper that directly contradicted what you were trying to use it to prove.

Judging from what you're saying about AlphaZero, I think it's time to suggest that we all agree on an important ground rule: "LLM" is not a synonym of "transformer", "neural network", "machine learning", or "AI". LLMs are a kind of transformer, which is a kind of neural network, which is a kind of machine learning, which is a kind of AI. But that doesn't mean that LLMs can do anything that AI, machine learning, neural networks, or transformers can do.

Nor should they be expected to! Large language models, as the name implies, are models that are designed for and specialized for handling language. Similarly, there are other kinds of machine learning systems, architectures, and models that are specialized for handling other tasks, like math. The symbolic deduction engine used in AlphaGeometry is a machine-learning system, and in fact a substantial portion of the paper you posted was dedicated to innovations and advances they'd made in how to train that specialized machine-learning geometry-solver. But it wasn't a LLM!

AlphaZero may use many of the same fundamental technologies as a LLM, but that doesn't mean it's a LLM, nor does it mean that LLMs can do the exact things that it does! And just as ChatGPT would be quite poor at doing what AlphaZero does, AlphaZero would be quite poor at doing what ChatGPT does. And this is fine! There's no problem with handling different kinds of tasks with different kinds of systems that are specialized for each task. Especially when these systems can be connected together in ways that allow them to pass tasks around to whichever system is best suited for the specific task that's been given to them.

Different tools for different tasks. The problem is when one of the tools is a talking hammer, and suddenly people start thinking everything is a nail.

Boris Galerkin
Dec 17, 2011

I don't understand why I can't harass people online. Seriously, somebody please explain why I shouldn't be allowed to stalk others on social media!

Lucid Dream posted:

If the LLM enables the problem to be solved then it kinda solved it. If I use a calculator to help me do a complicated math problem, I still solved it even if the calculator helped. If the LLM is smart enough to choose to use the calculator then I think it counts.

I get what you're trying to say here but it's also wrong. Every day I use a fancy calculator to solve an equation similar to this:



There is a difference between saying I solved it and that I used a computer to solve it. If I said I solved it, and I actually did, then I'd probably get grant and research money thrown at me left and right and a fast-track to a tenured professorship. Technically the computer doesn't even solve it because there are no solutions to this equation. I use the computer to tell me what the answer could be, but not what it actually is because again there is no solution and it turns out the inputs I give it are like the most important thing.

That weather prediction model someone posted is interesting. People have been using machine learning to fit their data and to get appropriate inputs since forever. I haven't read that paper but from glancing at it, it seems to be an extension of that.

At the end of the day, I have Feelings about using AI in these types of numerical computations. The most important thing about the results that come out of these systems of equations (which are well known!) is that they are only as good as their inputs. People have been making guesses and assumptions about what these inputs are for centuries. It's nothing new. New techniques come along now and then, or old techniques are discovered to be applicable to other fields. But ultimately, the inputs used are well explained and well reasoned. Unless the AI can explain why it decided that the parameter alpha = 3.4 is the best and most appropriate value to use, and how it arrived at this conclusion, it's not useful at all.

For example, that AI that plays Go may be able to say "this move is the best", and that works if you just want to crush your opponent, but why is it the best move? Nobody knows and nobody can explain it.
