gurragadon
Jul 28, 2006

Liquid Communism posted:

A GPT-4 implementation is incapable of experiencing stress, and again has open access to the materials being tested on, so by its very nature a test of its stress management and memory cannot have any meaningful results.

I was asking whether you thought the bar exam had meaningful results for a human. What you see as GPT-4's results not being meaningful, I see as GPT-4 removing the obstacle of stress management from the process and performing in a superior way.

If the bar exam were still timed but open book, would that change your opinion? The information would be available; the difference would be GPT-4 being able to access that information faster.


Good Dumplings
Mar 30, 2011

Excuse my worthless shitposting because all I can ever hope to accomplish in life is to rot away the braincells of strangers on the internet with my irredeemable brainworms.

porfiria posted:

Yeah but it also knows you can't build a real horse out of cabbage (but you can build a statue of one), that horses can't drive because they aren't smart enough and don't have hands, and so on. All this stuff may just be weighted values in a huge matrix or whatever, but it can build associations that are vastly more extensive and subtler than words just tending to appear near other words in the training data.

You edited your response a bit. So just to expand:

I'd say it does "know" what a horse is, but that's for some definition of "know." It doesn't have any kind of audio or visual model for a horse (although it probably will soon, so it's kind of a moot point). And of course it doesn't have any personal, subjective associations with a horse in the way that a human does.

But as a matter of language, I'd say yeah, it can deploy "horse" correctly, and "knows" just about all the facts about horses there are, and how those facts inter-relate to other facts about the world in a comprehensive way that, to my mind, meets a lot of the criteria for "knowing" something.

It doesn't know you can't build a real horse out of cabbage; that's exactly why hallucinations are a thing.

The current generators infamously have problems consistently answering 1+1 = ? correctly, and that's the most obvious sign that pattern-matching is categorically different from reasoning: you can't add processing power or more examples of math and be 100% sure that it won't answer fundamental problems like that wrong. You can be 99% sure, but then that 1% of the time comes up in an engineering spec, or a legal contract, and suddenly the theoretical difference between those concepts is painfully clear.

It's absolutely possible to create learning systems that don't gently caress this up - a system that uses operations as the base content of its model rather than straight data would be perfect for faking 'reasoning' - but it still wouldn't be truly capable of it or creativity since it'd be limited to only the operations we taught its model. Even then, it'd be a very different critter, limited to questions about that specific field of operations rather than the 'jam these two genres together/hallucinate infinite detail' that GPT can do.

Good Dumplings fucked around with this message at 02:32 on Mar 28, 2023

Liquid Communism
Mar 9, 2004


Out here, everything hurts.




gurragadon posted:

I was asking whether you thought the bar exam had meaningful results for a human. What you see as GPT-4's results not being meaningful, I see as GPT-4 removing the obstacle of stress management from the process and performing in a superior way.

If the bar exam were still timed but open book, would that change your opinion? The information would be available; the difference would be GPT-4 being able to access that information faster.

It does have meaningful results for a human, although I'm sure there are better approaches; that may be my own bias against the validity of standardized tests.

If passing the test were the purpose of the test, you would have a point regarding the GPT-4 results.

It is not. The purpose of the bar exam is for a candidate to demonstrate they have the skills necessary to practice law at the standard set by the bar association, as a condition of admission to the bar and licensure to practice law. The ability to manage that stress is part of the point of the test. That GPT-4 cannot experience that stress is not an indicator of superiority so much as a demonstration that it lacks the basic capabilities that are being tested in the first place. So far as I can tell from the article, it was also only tested on the first two portions of the bar exam, the multiple choice and essay portions, and not the significantly more important performance test, where an aspiring lawyer is given a standard task such as generating a brief or memo for a case file set in a fictional state, along with a library of the laws of said fictional state.

I do not expect that, by its design, GPT-4 is capable of the relatively simple task of reasoning over a dataset on which it has not been trained.

All of this is somewhat beside the point, as GPT-4 cannot in fact practice law: by definition a lawyer requires personhood, which a chat algorithm does not have.

PT6A
Jan 5, 2006

Public school teachers are callous dictators who won't lift a finger to stop children from peeing in my plane
I think it's stunning how much an AI being capable of passing the bare minimum requirements of a given vocation is being lauded. Like, loving Rudy Giuliani passed the bar! It's not really that impressive, and it doesn't mean poo poo. The basic licensure requirements of any profession are usually not that difficult, and do not represent expert practice in that profession. What's happening here is people are looking at this thing and saying "it knows poo poo I don't, it must be an expert!!!" when that's really not the case.

BrainDance
May 8, 2007

Disco all night long!

Good Dumplings posted:


The current generators infamously have problems consistently answering 1+1 = ? correctly, and that's the most obvious sign that pattern-matching is categorically different from reasoning: you can't add processing power or more examples of math and be 100% sure that it won't answer fundamental problems like that wrong. You can be 99% sure, but then that 1% of the time comes up in an engineering spec, or a legal contract, and suddenly the theoretical difference between those concepts is painfully clear.

It's not that I disagree (I'm not sure), but I don't think this is all that important. Language models aren't ever going to be a kind of copy of a whole brain, and brains don't work that way either. They're complexly interconnected parts, with some parts capable of some things and others capable of other things.

You probably do just know what 1+1 is without having to calculate it, but that's from exposure to the answer countless times, so it's just a fact to you. For other, more complex calculations, the part of you that knows your name or facts or how to speak doesn't really know the answer either; it gets it from another place. That's why people can have things like dyscalculia and function otherwise completely fine.

I made a script before that passes a prompt to W|A for dice rolls (which is more than you need for that, but it wasn't a serious project, more a proof of concept) and then passes the results to GPT-Neo, and that worked. That seems to be the idea OpenAI has with the plugin things too. And that's basically how actual brains do it, so there's nothing too weird about doing it that way. And if the AI is able to interpret the input from another system or AI correctly, then it is what it is, right? If it works, it works. I guess what I'm saying is, being bad at math isn't a sign of not having the capacity to reason, and if another system can integrate with the AI to give it that ability, that doesn't show it can't reason either.
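A minimal sketch of that kind of hand-off, with a local dice roller standing in for the W|A call and GPT-Neo loaded through the transformers pipeline; this is illustrative only, not the original script, and the prompt format and model size are assumptions:

code:
# Illustrative sketch: route the math to ordinary code, let the language model narrate.
# A local dice roller stands in for the Wolfram|Alpha call described above.
import random
import re

from transformers import pipeline  # assumes the transformers library is installed

generator = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B")

def roll_dice(spec: str) -> int:
    """Evaluate a dice expression like '2d6' with plain Python, not the model."""
    count, sides = map(int, re.fullmatch(r"(\d+)d(\d+)", spec).groups())
    return sum(random.randint(1, sides) for _ in range(count))

def answer_with_tool(user_prompt: str) -> str:
    # Anything that looks like a dice roll goes to the external "tool" first.
    match = re.search(r"\d+d\d+", user_prompt)
    if match:
        user_prompt += f"\n(The roll of {match.group()} came up {roll_dice(match.group())}.)"
    # The model only has to work the precomputed result into its continuation.
    result = generator(user_prompt, max_new_tokens=60, do_sample=True)
    return result[0]["generated_text"]

print(answer_with_tool("Roll 2d6 for the goblin's attack and describe what happens."))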

gurragadon
Jul 28, 2006

Liquid Communism posted:

It does have meaningful results for a human, although I'm sure there are better approaches; that may be my own bias against the validity of standardized tests.

If passing the test were the purpose of the test, you would have a point regarding the GPT-4 results.

It is not. The purpose of the bar exam is for a candidate to demonstrate they have the skills necessary to practice law at the standard set by the bar association, as a condition of admission to the bar and licensure to practice law. The ability to manage that stress is part of the point of the test. That GPT-4 cannot experience that stress is not an indicator of superiority so much as a demonstration that it lacks the basic capabilities that are being tested in the first place. So far as I can tell from the article, it was also only tested on the first two portions of the bar exam, the multiple choice and essay portions, and not the significantly more important performance test, where an aspiring lawyer is given a standard task such as generating a brief or memo for a case file set in a fictional state, along with a library of the laws of said fictional state.

I do not expect that, by its design, GPT-4 is capable of the relatively simple task of reasoning over a dataset on which it has not been trained.

All of this is somewhat beside the point, as GPT-4 cannot in fact practice law: by definition a lawyer requires personhood, which a chat algorithm does not have.

That article is not correct, according to the paper linked from the GPT-4 information on OpenAI's page. GPT-4 took the entire Uniform Bar Exam. The paper is actually pretty interesting; it breaks out the answers to the various questions and the memos GPT-4 wrote for the MEE and MPT components.

From the Paper Abstract posted:

In this paper, we experimentally evaluate the zero-shot performance of a preliminary version of GPT-4 against prior generations of GPT on the entire Uniform Bar Examination (UBE), including not only the multiple-choice Multistate Bar Examination (MBE), but also the open-ended Multistate Essay Exam (MEE) and Multistate Performance Test (MPT) components.

You also can't discount that a major purpose of a test is in fact to demonstrate knowledge, which is shown by passing it. A bar exam candidate needs to have the skills to be a lawyer, but they need to know about being a lawyer too.

I don't expect that a human would be capable of reasoning using a dataset they haven't been trained on either. Isn't that the whole point of going to school? Sure, I could reason about law, but it would be rudimentary compared to somebody who went to law school. Just like ChatGPT can seem to reason about law, only really poorly, GPT-4 with training can seem to reason about law a whole lot better.

It being currently barred from practicing law might not be permanent. Obviously, it would be used as an aid for lawyers for quite a long time if it's used at all. But just because it's barred from something doesn't mean it couldn't perform the task competently.

goferchan
Feb 8, 2004

It's 2006. I am taking 276 yeti furs from the goodies hoard.

Good Dumplings posted:

It doesn't know you can't build a real horse out of cabbage; that's exactly why hallucinations are a thing.

The current generators infamously have problems consistently answering 1+1 = ? correctly, and that's the most obvious sign that pattern-matching is categorically different from reasoning: you can't add processing power or more examples of math and be 100% sure that it won't answer fundamental problems like that wrong. You can be 99% sure, but then that 1% of the time comes up in an engineering spec, or a legal contract, and suddenly the theoretical difference between those concepts is painfully clear.

It's absolutely possible to create learning systems that don't gently caress this up - a system that uses operations as the base content of its model rather than straight data would be perfect for faking 'reasoning' - but it still wouldn't be truly capable of it or creativity since it'd be limited to only the operations we taught its model. Even then, it'd be a very different critter, limited to questions about that specific field of operations rather than the 'jam these two genres together/hallucinate infinite detail' that GPT can do.

As an AI language model, its understanding of horses is based on the textual data and patterns it has been trained on. It does not possess personal experiences, emotions, or a physical presence, so its understanding is limited to the information it has been exposed to during its training.

In that context, its understanding of horses is derived from the descriptions, facts, and relationships between concepts that are found within the text data it has been trained on. It can provide information about horses, answer questions related to them, and discuss various aspects of horses based on that information, but its understanding is ultimately rooted in language and text, rather than personal experience or direct perception.

reignonyourparade
Nov 15, 2012

goferchan posted:

As an AI language model, its understanding of horses is based on the textual data and patterns it has been trained on. It does not possess personal experiences, emotions, or a physical presence, so its understanding is limited to the information it has been exposed to during its training.

In that context, its understanding of horses is derived from the descriptions, facts, and relationships between concepts that are found within the text data it has been trained on. It can provide information about horses, answer questions related to them, and discuss various aspects of horses based on that information, but its understanding is ultimately rooted in language and text, rather than personal experience or direct perception.

This is true, but the same could also be said about a non-negligible number of humans when it comes to horses.

goferchan
Feb 8, 2004

It's 2006. I am taking 276 yeti furs from the goodies hoard.

reignonyourparade posted:

This is true, but the same could also be said about a non-negligible number of humans when it comes to horses.

You are not wrong, but honestly that post you're replying to was just a copy-paste of ChatGPT4's response when I asked it to describe how it understood horses in the 3rd person

reignonyourparade
Nov 15, 2012
chatgpt-ed again

Mega Comrade
Apr 22, 2004

Listen buddy, we all got problems!

reignonyourparade posted:

This is true, but the same could also be said about a non-negligible number of humans when it comes to horses.

Can it? Even if you have never seen a horse, you have likely seen another animal that is similar. That's how we often describe things, 'x is like y', to further understanding, because at some point you will hit a reference a human has personal experience or perception of. Is that comparable to an LLM?

Rogue AI Goddess
May 10, 2012

I enjoy the sight of humans on their knees.
That was a joke... unless..?

Raenir Salazar posted:

Chapter 3: TBD, Solutions aside from the complete overthrow of capitalism?
Make it obsolete with Capitalism 2.0. There's no reason why an AI can't be trained to create and run a business.

Raenir Salazar
Nov 5, 2010

College Slice

Rogue AI Goddess posted:

Make it obsolete with Capitalism 2.0. There's no reason why an AI can't be trained to create and run a business.

It'd probably outperform Musk.

gurragadon
Jul 28, 2006

reignonyourparade posted:

chatgpt-ed again

You can't always tell because it's getting so good. A good tell I've seen is that anytime you ask it questions about emotions, knowledge, love, and "human" things like that, it will begin its response with "As an AI language model" or some derivative. It is programmed to lean on that pretty heavily, which is fair; I think OpenAI wants to keep people from trying to go crazy with it.

Ouroboros
Apr 23, 2011

Rogue AI Goddess posted:

Make it obsolete with Capitalism 2.0. There's no reason why an AI can't be trained to create and run a business.

Way ahead of you. A research group is running experiments where they give GPT-4 access to a bank account with a small amount of money via APIs and see if it can make money: https://www.youtube.com/watch?v=2AdkSYWB6LY&t=649s

Count Roland
Oct 6, 2013

Ouroboros posted:

Way ahead of you. A research group is running experiments where they give GPT-4 access to a bank account with a small amount of money via APIs and see if it can make money: https://www.youtube.com/watch?v=2AdkSYWB6LY&t=649s

I'd be shocked if AI models haven't been working with big money on the stock market for years.

porfiria
Dec 10, 2008

by Modern Video Games

Count Roland posted:

I'd be shocked if AI models haven't been working with big money on the stock market for years.

60 to 70 percent of the trading on any given day is algorithmic. Also look up high frequency trading and boggle at how abstract our economy has gotten.

Count Roland
Oct 6, 2013

porfiria posted:

60 to 70 percent of the trading on any given day is algorithmic. Also look up high frequency trading and boggle at how abstract our economy has gotten.

Yeah the high frequency stuff was years and years ago, I assume it's only gotten more abstract since.

XboxPants
Jan 30, 2006

Steven doesn't want me watching him sleep anymore.

PT6A posted:

I think it's stunning how much an AI being capable of passing the bare minimum requirements of a given vocation is being lauded. Like, loving Rudy Giuliani passed the bar! It's not really that impressive, and it doesn't mean poo poo. The basic licensure requirements of any profession are usually not that difficult, and do not represent expert practice in that profession. What's happening here is people are looking at this thing and saying "it knows poo poo I don't, it must be an expert!!!" when that's really not the case.

The reason it was impressive to me wasn't that it was objectively impressive; it was the rate at which it improved. GPT-3.5 scored in the bottom 10% of test takers on the bar exam, and then just a few months later GPT-4 comes out and scores around the 90th percentile. That's very rapid improvement. It doesn't mean that GPT-4 is capable of practicing law or anything, but it is shockingly fast improvement.

It feeds into a larger narrative where some version of one of these models will be released, and it will seem impressive at first, but quickly people will find weaknesses, things it can't do. And they use those weaknesses as evidence of how far away modern AI is from being able to be useful in a meaningful way. And then very, very rapidly, a new version comes out that can do the exact thing that it couldn't before.

So that makes it difficult for me to make any firm guesses about what the state of AI is going to look like 5, 10, or 20 years from now, and that's exciting and also concerning. We want to try to prepare for what fields of employment are going to be displaced, and when, and that's turning out to be exceedingly difficult to predict.

Count Roland
Oct 6, 2013

Do you guys know people using chatgpt or a similar service regularly?

I was struck recently to learn two friends of mine are, both using it for work. One guy does sports administration; he's organizing a complicated meetup of athletes from across the country. He says it helps him organize stuff, write emails etc.

Another friend is a writer. He says chatgpt helps him modify scripts and story ideas in complicated ways, like by writing loose ideas into an ordered narrative, by offering alternative motivations for characters, or by writing a full article which HR would then go and edit. Says a few minutes with chatgpt can save him half a day.

Aramis
Sep 22, 2009



Of course. I do, and so do most of my colleagues at this point.

If your job involves producing textual content, be it articles, code, recipes, whatever, then ChatGPT can almost certainly be used, today, to make your job easier. It might take a bit of time to establish an efficient workflow, but there's no denying that this is a useful tool as it stands.

I've personally moved on from the web UI to a simple Python script that formats frequent queries into templates I know give good results and talks to the API, but a text file and some copy-pasting can go a long way already.
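A rough sketch of what such a script might look like, using the openai Python library's chat endpoint; this is not the actual script, and the template names and model choice are assumptions for illustration:

code:
# Illustrative templating wrapper around the OpenAI chat API.
import os
import sys

import openai  # pip install openai

openai.api_key = os.environ["OPENAI_API_KEY"]

# Templates that are known to give good results get saved and reused.
TEMPLATES = {
    "summarize": "Summarize the following text in three bullet points:\n\n{text}",
    "email": "Rewrite the following notes as a short, polite business email:\n\n{text}",
}

def run(template_name: str, text: str) -> str:
    prompt = TEMPLATES[template_name].format(text=text)
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.3,
    )
    return response["choices"][0]["message"]["content"]

if __name__ == "__main__":
    # e.g. python gpt_template.py summarize < notes.txt
    print(run(sys.argv[1], sys.stdin.read()))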

Frankly, unless you have specific constraints preventing you from doing so, you would be a fool not to at least give it a shot. All this to say, I'm scratching my head as to why you found this surprising at all.

edit: Big emphasis on it being a tool. It's not something that replaces work entirely yet, but it is something that you can wield very effectively in a myriad of contexts.

Aramis fucked around with this message at 19:51 on Mar 28, 2023

Mega Comrade
Apr 22, 2004

Listen buddy, we all got problems!
Yeah, I use it as a Stack Overflow replacement. Sometimes it's very helpful; sometimes it's rubbish and trying to coax the right answer out of it takes longer than just a Google search. But I'm very careful to never just paste anything in directly.

The number of people probably pasting company IP into these things without questioning it has gotta be staggering.

gurragadon
Jul 28, 2006

Aramis posted:

Of course. I do, and so do most of my colleagues at this point.

If your job involves producing textual content, be it articles, code, recipes, whatever, then ChatGPT can almost certainly be used, today, to make your job easier. It might take a bit of time to establish an efficient workflow, but there's no denying that this is a useful tool as it stands.

I've personally moved on from the web UI to a simple Python script that formats frequent queries into templates I know give good results and talks to the API, but a text file and some copy-pasting can go a long way already.

Frankly, unless you have specific constraints preventing you from doing so, you would be a fool not to at least give it a shot. All this to say, I'm scratching my head as to why you found this surprising at all.

edit: Big emphasis on it being a tool. It's not something that replaces work entirely yet, but it is something that you can wield very effectively in a myriad of contexts.

I can only speak for myself, but it's surprising to me just because of the speed ChatGPT is improving at. What XboxPants was talking about a few posts up, basically.

ChatGPT launched on November 30, 2022. I might just be late to the party, but this level of adoption is remarkable in its speed. GPT-3 was trash at the bar exam; GPT-4 is amazing at it. We just keep moving the goalposts to see what it can do, and it's not slowing down yet.

Carp
May 29, 2002

BrainDance posted:

A thing I've been saying, since I sorta stumbled into finetuning an incredibly small model (1.3B) into being roughly as good as GPT-3 on one specific task (but only that specific task), is that I think transformers can potentially be very useful for a huge variety of things we haven't really tried them on yet, if they're purpose-trained for that specific thing. GPT is very general, so it's going to be a limited jack of all trades. But a model that follows the same principles and is only trained to do one specific task, but of the same overall complexity, might be very, very capable and very useful at that one task.

BrainDance, I meant to ask this earlier: which small model were you fine-tuning? GPT-3 is open source and uses TensorFlow and other common libraries, so it could be fun to translate it from Python to C#. However, that's likely laborious. Alternatively, I could try fine-tuning a small model for a specific task, as you suggested, and spend less time coding.

By the way, have you come across the paper that discusses fine-tuning, or maybe training with supervised learning, a small model using the GPT-3 API? The paper claimed that it performed almost as well as GPT-3.

Count Roland
Oct 6, 2013

Aramis posted:

Of course. I do, and so do most of my colleagues at this point.

If your job involves producing textual content, be it articles, code, recipes, whatever, then ChatGPT can almost certainly be used, today, to make your job easier. It might take a bit of time to establish an efficient workflow, but there's no denying that this is a useful tool as it stands.

I've personally moved on from the web UI to a simple Python script that formats frequent queries into templates I know give good results and talks to the API, but a text file and some copy-pasting can go a long way already.

Frankly, unless you have specific constraints preventing you from doing so, you would be a fool not to at least give it a shot. All this to say, I'm scratching my head as to why you found this surprising at all.

edit: Big emphasis on it being a tool. It's not something that replaces work entirely yet, but it is something that you can wield very effectively in a myriad of contexts.

I just didn't understand how useful it was. I knew it was very good at writing code, for example. But I didn't know it could help a writer so effectively with the creative process. I've only just started to play with it myself, so I've still much to learn.

Aramis
Sep 22, 2009



Count Roland posted:

I just didn't understand how useful it was. I knew it was very good at writing code, for example. But I didn't know it could help a writer so effectively with the creative process. I've only just started to play with it myself, so I've still much to learn.

It's actually the other way around: It's surprisingly good at writing code, but it's still pretty bad at anything non-trivial. It really shines at writing/editing/summarising tasks. Giving it a paragraph and asking it to fluff it up, or crunch it down to its minimal essence works almost all of the time.

Vivian Darkbloom
Jul 14, 2004


From "Sparks of Artificial General Intelligence: Early experiments with GPT-4" at https://arxiv.org/abs/2303.12712



I don't know if the AGI claim makes sense, but the math capability of this LLM is a major advance. ChatGPT has some ability to turn word problems into equations or give a proof of a well-known theorem. For instance, I asked ChatGPT for several proofs that there are infinitely many prime numbers and it provided them, but I think they're part of the training set, so it's just rephrasing proofs it knows, and it didn't take long to start repeating itself. But GPT-4 is giving a novel proof, and while this isn't the world's hardest problem, it is a challenging one.
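For reference, the classic Euclid-style argument that ChatGPT is mostly rephrasing when asked for such a proof goes like this (written out in LaTeX):

code:
% Euclid's proof of the infinitude of primes.
\begin{proof}
Suppose there are only finitely many primes $p_1, p_2, \dots, p_n$ and
let $N = p_1 p_2 \cdots p_n + 1$. Since $N > 1$, it has some prime
divisor $q$. But $q$ cannot equal any $p_i$, because dividing $N$ by
any $p_i$ leaves remainder $1$. So $q$ is a prime outside the list,
contradicting the assumption. Hence there are infinitely many primes.
\end{proof}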

Vivian Darkbloom fucked around with this message at 21:49 on Mar 28, 2023

Lemming
Apr 21, 2008

Gentleman Baller posted:

One thing I've been doing with Bing's AI lately, is thinking of new puns and seeing if it can work out the theme and generate more puns that fit the theme, and it does it extremely well, I think.

When I asked it, "Chairman Moe, Vladimir Lenny, Carl Marx. Think of another pun that fits the theme of these puns." It gave me Homer Chi Minh and Rosa Luxembart (and a bunch of bad ones ofc that still fit the theme.)

I did the same with, "Full Metal Arceus, My Spearow Academia, Dragonite Ball Z" and it gave me Cowboy Beedrill and Tokyo Gimmighoul.

A common refrain I see online and even from ChatGPT itself is that it is just a text predictor, and is incapable of understanding or creating truly new things. But as far as I can tell, it can do something that is at least indistinguishable from understanding and unique creation, right?

Edit: I guess what I have been trying to wrap my head around is, if this isn't understanding and unique creation then what is the difference?

This is the important part, because all the "bad results" were equally valid compared to the "good results" from the perspective of the text generation. You picked out the ones you judged to actually have some value. The "intelligence" that came out was the understanding and curation of the human who was overseeing it, for the same reason it was able to produce any of those results at all: it had a large enough data set of things created by people in the first place.

The difference is that if its input were continually generated from its own output, it would drift further and further into complete garbage. It requires large data sets of intelligent input to be able to produce a facsimile of that intelligent input.

Gentleman Baller
Oct 13, 2013

Lemming posted:

This is the important part, because all the "bad results" were equally valid compared to the "good results" from the perspective of the text generation. You picked out the ones you judged to actually have some value. The "intelligence" that came out was the understanding and curation of the human who was overseeing it, for the same reason it was able to produce any of those results at all: it had a large enough data set of things created by people in the first place.

The difference is that if its input were continually generated from its own output, it would drift further and further into complete garbage. It requires large data sets of intelligent input to be able to produce a facsimile of that intelligent input.

The bad results were still correct, just less enjoyable as puns, because they used less famous references or sounded a bit more forced when spoken aloud. No different from the sort of thing you'd see in a pun spitballing session, in my experience.

In all examples the AI was capable of figuring out the correct formula for my puns without me explaining it, and applied it accurately to create words or phrases that certainly aren't in its dataset. That is the understanding and creation I am referring to. Not trying to imply punsmiths are out of the job already or anything.

Gentleman Baller fucked around with this message at 00:24 on Mar 29, 2023

goferchan
Feb 8, 2004

It's 2006. I am taking 276 yeti furs from the goodies hoard.

Count Roland posted:

I just didn't understand how useful it was. I knew it was very good at writing code, for example. But I didn't know it could help a writer so effectively with the creative process. I've only just started to play with it myself, so I've still much to learn.

Aramis posted:

It's actually the other way around: It's surprisingly good at writing code, but it's still pretty bad at anything non-trivial. It really shines at writing/editing/summarising tasks. Giving it a paragraph and asking it to fluff it up, or crunch it down to its minimal essence works almost all of the time.

Yeah it's very useful at things that take a little bit of brainpower that you could definitely do but just don't feel like doing. "Make this more concise" or "rephrase this in the style of a business email" or whatever typically produce very acceptable results.

BrainDance
May 8, 2007

Disco all night long!

Carp posted:

BrainDance, I meant to ask this earlier: which small model were you fine-tuning? GPT-3 is open source and uses TensorFlow and other common libraries, so it could be fun to translate it from Python to C#. However, that's likely laborious. Alternatively, I could try fine-tuning a small model for a specific task, as you suggested, and spend less time coding.

By the way, have you come across the paper that discusses fine-tuning, or maybe training with supervised learning, a small model using the GPT-3 API? The paper claimed that it performed almost as well as GPT-3.

I was finetuning GPT-Neo, mostly 1.3B, because anything larger needed a lot of RAM (DeepSpeed offloads some of the VRAM load to normal RAM), and if you're using swap instead, the training time jumps from hours to days. LLaMA got a lot of people paying attention to this, though, and now we can use LoRA with the language models, so I've been doing that with LLaMA.
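For anyone curious, this is roughly what that LoRA route looks like with the peft library; the target module names, dataset, and hyperparameters here are illustrative guesses and vary by model:

code:
# Illustrative LoRA fine-tuning sketch with peft + transformers.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "EleutherAI/gpt-neo-1.3B"  # or a LLaMA checkpoint you have access to
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Only small low-rank adapter matrices get trained; the base weights stay frozen,
# which is what keeps the VRAM requirements manageable.
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

data = load_dataset("text", data_files={"train": "my_corpus.txt"})["train"]
data = data.map(lambda x: tokenizer(x["text"], truncation=True, max_length=512),
                remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments("lora-out", per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=3),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_pretrained("lora-out")  # saves just the adapter weights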

Didn't see that paper but if you find it again let me know.

Carp
May 29, 2002

BrainDance posted:

I was finetuning GPT-Neo, mostly 1.3B, because anything larger needed a lot of RAM (DeepSpeed offloads some of the VRAM load to normal RAM), and if you're using swap instead, the training time jumps from hours to days. LLaMA got a lot of people paying attention to this, though, and now we can use LoRA with the language models, so I've been doing that with LLaMA.

Didn't see that paper but if you find it again let me know.

Have you come across "GPT in 60 Lines of NumPy" as a way of understanding the basics? There's a post about it on Hacker News that I found informative. Although I may end up using something like LLaMA in the future, right now I feel like I need to learn from scratch and build a foundation. I'm feeling a bit lost with my usual hacking methods of learning. Also, I realized that I was mistaken earlier - GPT-3 isn't actually open source. There's a repository out there, but it is just data supporting a paper.

Wow, yeah, LoRA sounds very interesting, and I agree that it would make fine-tuning a large model much easier. How has it worked out for you? There is so much new information out there about deep learning and LLMs. If I come across the paper again, I'll be sure to let you know.

tbp
Mar 1, 2008

YOU WILL NEVER MARCH ALONE

Vivian Darkbloom posted:

I don't know if the AGI claim makes sense, but the math capability of this LLM is a major advance. ChatGPT has some ability to turn word problems into equations or give a proof of a well-known theorem. For instance, I asked ChatGPT for several proofs that there are infinitely many prime numbers and it provided them, but I think they're part of the training set, so it's just rephrasing proofs it knows, and it didn't take long to start repeating itself. But GPT-4 is giving a novel proof, and while this isn't the world's hardest problem, it is a challenging one.

This is fascinating. GPT-3, from my experimentation, has a lot of problems with maths. If it seems to have a proof somewhere in its training, it's fine, but I struggled to get coherence out of it when discussing problems that don't currently have proofs. I spent a while talking to it about Goldbach's conjecture, and I realized after a bit that it was treating 4 as a prime number, which was strange.

This proof is pretty good, though. It's about how I'd have tackled it anyway.

cat botherer
Jan 6, 2022

I am interested in most phases of data processing.

Carp posted:

Wow, yeah, LoRA sounds very interesting, and I agree that it would make fine-tuning a large model much easier. How has it worked out for you? There is so much new information out there about deep learning and LLMs. If I come across the paper again, I'll be sure to let you know.
I think transfer learning tools like LoRA are going to be the main way that stuff like ChatGPT gets used in industry. It's certainly been the main (only) way I've used language models in the past.

Quixzlizx
Jan 7, 2007
I got an invitation to Google Bard, and I'm honestly having trouble thinking up useful things to ask it.

It doesn't help that I can't see myself using any results without double-checking them anyway. For a more practical example, I can't see myself ever saying "Hey Google, order me some mustard," and giving up control over what exactly I'm purchasing, from whom, and for how much.

At least for AI art I'd be able to look at the picture and go "I approve/don't approve of this" without having to do outside research to verify factual accuracy.

gurragadon
Jul 28, 2006

Quixzlizx posted:

I got an invitation to Google Bard, and I'm honestly having trouble thinking up useful things to ask it.

It doesn't help that I can't see myself using any results without double-checking them anyway. For a more practical example, I can't see myself ever saying "Hey Google, order me some mustard," and giving up control over what exactly I'm purchasing, from whom, and for how much.

At least for AI art I'd be able to look at the picture and go "I approve/don't approve of this" without having to do outside research to verify factual accuracy.

Because you posted this I checked my email, and I got a Google Bard invite too. I like how it generates the full text and displays it all at once, as opposed to ChatGPT showing word by word (token by token?). I also prefer Google Bard's responses to the questions I'm asking, but that is really subjective, and the stuff I'm asking isn't useful for what an LLM would be used for currently.

Question: If I can't describe qualia, how can I say that ChatGPT/Google Bard doesn't experience qualia?

ChatGPT posted:

Qualia refers to subjective conscious experiences such as the experience of seeing the color red or feeling the sensation of pain. Since ChatGPT is an artificial intelligence language model and does not have a subjective conscious experience, it is reasonable to assume that it does not experience qualia.

However, as an AI language model, ChatGPT can understand and generate text about the concept of qualia and even generate text that might lead to discussions about the nature of qualia. Nevertheless, it is still fundamentally different from a conscious being, as ChatGPT does not have a subjective conscious experience, and all of its processing is based on algorithms and data inputs.

Google Bard posted:

You're right, it's difficult to describe qualia, which are the subjective experiences of the mind. They are the raw feels of our senses, thoughts, and emotions. Qualia are often described as being "what it's like" to experience something. For example, the qualia of seeing red might be described as a bright, warm color that fills your field of vision. The qualia of tasting chocolate might be described as a rich, sweet flavor that lingers on your tongue.

It's also difficult to say whether or not Google Bard experiences qualia. There is no scientific consensus on the existence of qualia, and some scientists believe that they are a real phenomenon, while others believe that they are an illusion. However, qualia are a fascinating and important part of the human experience.

If you can't describe qualia, then you can't say for sure whether or not Google Bard experiences them. However, there are some things that you can look for that might suggest that Google Bard does not experience qualia. For example, Google Bard does not have a physical body, so it cannot experience the same kinds of physical sensations as humans. Additionally, Google Bard does not seem to have the same kind of emotional experiences as humans. For example, Google Bard does not seem to feel happiness, sadness, anger, or fear.

Ultimately, whether or not Google Bard experiences qualia is a question that cannot be answered definitively. However, it is an interesting question to consider, and it is one that is likely to continue to be debated as artificial intelligence technology continues to develop.

Google Bard seems more likely to craft a response that is what I'm looking for. I asked the chatbots "How can I say?" not "Can I say?"

Carp
May 29, 2002

BrainDance posted:

[...]
Didn't see that paper but if you find it again let me know.

This is likely the closest I'm going to get to finding the paper, and it is more recent, but maybe not new to you. GPT4All is a LLaMA model finetuned on prompt-generation pairs generated with the help of OpenAI's GPT-3.5-Turbo API.

https://github.com/nomic-ai/gpt4all
2023_GPT4All_Technical_Report.pdf
Self-Instruct: Aligning Language Model with Self Generated Instructions

[edit] Found it! https://crfm.stanford.edu/2023/03/13/alpaca.html
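The core trick in both of those is pretty small. A loose sketch of the data-generation step they describe, using the GPT-3.5-Turbo API to produce pairs a smaller model can later be finetuned on; the seed prompts and file names are made up for illustration:

code:
# Illustrative sketch: distill prompt/response pairs from the GPT-3.5-Turbo API.
import json
import os

import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

seed_prompts = [
    "Explain what a LoRA adapter is in two sentences.",
    "Write a haiku about horses.",
]

pairs = []
for prompt in seed_prompts:
    completion = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    pairs.append({"prompt": prompt,
                  "response": completion["choices"][0]["message"]["content"]})

# One JSON object per line, ready to be tokenized for fine-tuning.
with open("distilled_pairs.jsonl", "w") as f:
    for pair in pairs:
        f.write(json.dumps(pair) + "\n")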

cat botherer posted:

I think transfer learning tools like LoRA are going to be the main way that stuff like ChatGPT gets used in industry. It's certainly been the main (only) way I've used language models in the past.

What have you used them for in the past, and what do you think of ChatGPT and GPT-4?

Carp fucked around with this message at 13:49 on Mar 31, 2023

cat botherer
Jan 6, 2022

I am interested in most phases of data processing.

Carp posted:

What have you used them for in the past, and what do you think of ChatGPT and GPT-4?
I haven't used either, but I've used older pre-trained language models/vector embeddings like FastText. The gist is that these pretrained models embed each subword as a numeric vector in some latent space (usually around 300-dimensional, considerably lower dimension than the word space), so that words that are close semantically are usually close in the embedding space. Newer and more sophisticated models embed using more context information, as opposed to just having a 1 word = 1 vector situation.

Here's a typical example of where I'd use something like this: Suppose I have a bunch of consumer reviews with 1-5 stars, with each review associated with a product, user demo info, etc., but that also contains a text review field with natural language. There's potentially a lot of good info locked up in the review text field. However, there's probably too little data to train my own text model, most of the text is short (which means the words have little context of their own), and it would be too much of a PITA anyway. So, instead of doing that, I just get the word vector embedding. Each word in each review gets its vector. Then, those vectors are averaged over each review, so now each review has a semantic vector associated with it (there's other/better ways to do that, but whatever). Just like with the words, semantically similar review texts have similar embedding vectors. These embeddings can then be used alongside the other review data in a downstream model to actually predict the rating.
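A toy version of that recipe, using gensim's pretrained FastText vectors and scikit-learn; the reviews and ratings are made-up placeholders, and a real pipeline would do more than naive averaging:

code:
# Illustrative sketch: average pretrained word vectors per review, then feed the
# resulting embeddings into a downstream rating model.
import numpy as np
import gensim.downloader as api
from sklearn.linear_model import LogisticRegression

kv = api.load("fasttext-wiki-news-subwords-300")  # pretrained 300-dim vectors

def review_vector(text: str) -> np.ndarray:
    # One vector per word, averaged into a single 300-dim vector per review.
    words = [w for w in text.lower().split() if w in kv]
    return np.mean([kv[w] for w in words], axis=0) if words else np.zeros(kv.vector_size)

reviews = ["great product, works perfectly", "broke after two days, total junk"]
stars = [5, 1]

X = np.vstack([review_vector(r) for r in reviews])
# The averaged embeddings become ordinary numeric features for a downstream model
# (a logistic regression here stands in for the real rating predictor).
clf = LogisticRegression().fit(X, stars)
print(clf.predict(review_vector("works great").reshape(1, -1)))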

I'm no expert on the newer models, but the main part of the transfer learning task is still going to be text => numeric vectors, except going through a more sophisticated prediction-context-aware transformer model rather than being essentially a dictionary of words to vectors.

cat botherer fucked around with this message at 00:54 on Mar 31, 2023

Carp
May 29, 2002


That's a pretty good summary. Much better than my notes earlier in the thread, which are a little confused.


Noam Chomsky
Apr 4, 2019

:capitalism::dehumanize:


So how long before it puts web developers and programmers out of work? I’m asking for a friend.

It’s me. I’m the friend.
