Cybernetic Vermin
Apr 18, 2005

attended a talk about the state library of record making an llm out of their collections. it was surprisingly compelling: a very self-deprecating guy making the case that they have literally everything printed/published, so while the model will be biased for sure, the bias is necessarily a reflection of a very unbiased sampling procedure. plus they have unique legal standing, since library data retrieval has special copyright carve-outs.

i appreciate it. my faith in random companies collecting garbage and then making token efforts on bias is extremely low, and i kind of expect such a library-of-record model to be, at the very least, way more interesting.

infernal machines
Oct 11, 2012

we monitor many frequencies. we listen always. came a voice, out of the babel of tongues, speaking to us. it played us a mighty dub.
if the output can be trustworthy (as in, an accurate summation or synthesis of the ingested data), that seems like an absolutely ideal use case

Beeftweeter
Jun 28, 2005

OFFICIAL #1 GNOME FAN

infernal machines posted:

if the output can be trustworthy (as in, an accurate summation or synthesis of the ingested data), that seems like an absolutely ideal use case

lol big if. i've yet to see a model that can be trusted to do that 100% of the time, but then again i've only messed around with public implementations trained on god knows what. running a local model seems interesting but it's not really something i can do with the hardware i have here

infernal machines
Oct 11, 2012

we monitor many frequencies. we listen always. came a voice, out of the babel of tongues, speaking to us. it played us a mighty dub.

Beeftweeter posted:

lol big if. i've yet to see a model that can be trusted to do that 100% of the time, but then again i've only messed around with public implementations trained on god knows what. running a local model seems interesting but it's not really something i can do with the hardware i have here

i think it was Cybernetic Vermin saying elsewhere that llms are less prone to "hallucinations" with tighter parameters than what's being used for openai/bing/whatever.

idk, and given what i've seen from the commercial solutions, i'd be disinclined to trust it in general, but i certainly don't know anything about the state of the art

Shaggar
Apr 26, 2006
i dont think its possible for the large models to produce objectively correct results consistently. And the more you clamp down on it with guardrails for the purposes of accuracy, the closer you get to it resembling a text indexer or a database query.

Like you're gonna basically get to the point where all you're using the AI for is to translate "What time is space jam playing?" to SELECT * FROM MovieTimes WHERE movie = 'space jam' AND datetime >= NOW() AND location = '<user location>' in order to guarantee safe results.

for subjective output like images or speech generation or w/e this is less important.
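roughly the shape of that in code, as a sketch: the intent matcher below is a stub standing in for the model, and the MovieTimes schema and template names are made up for illustration. the point is the model only ever picks a known template and fills in slots, and even the slot values get bound as query parameters instead of being spliced into the SQL.

# sketch: the "AI" only selects a template and fills slots; the SQL itself
# is fixed, so the answer can only ever be whatever is in the table.
# classify_intent is a stub standing in for the language model.
import sqlite3
from datetime import datetime

TEMPLATES = {
    "showtimes": (
        "SELECT * FROM MovieTimes "
        "WHERE movie = ? AND datetime >= ? AND location = ?"
    ),
}

def classify_intent(question: str) -> tuple[str, dict]:
    # a real system would have the model emit only this structured output
    if "playing" in question.lower():
        return "showtimes", {"movie": "space jam"}
    raise ValueError("no safe template for this question")

def answer(question: str, conn: sqlite3.Connection, user_location: str):
    intent, slots = classify_intent(question)
    params = (slots["movie"], datetime.now().isoformat(), user_location)
    return conn.execute(TEMPLATES[intent], params).fetchall()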

psiox
Oct 15, 2001

Babylon 5 Street Team

Shaggar posted:

i dont think its possible for the large models to produce objectively correct results consistently. And the more you clamp down on it with guardrails for the purposes of accuracy, the closer you get to it resembling a text indexer or a database query.

Like you're gonna basically get to the point where all you're using the AI for is to translate "What time is space jam playing?" to SELECT * FROM MovieTimes WHERE movie = 'space jam' AND datetime >= NOW() AND location = '<user location>' in order to guarantee safe results.

for subjective output like images or speech generation or w/e this is less important.

you're almost definitely right, and also

i always loved the stories where the 'ai' helped bootstrap itself. epiktistes from lafferty's stories is my favorite. perhaps we can get these idiots to build all the tiny little myriad systems inside themselves to make something useful.

but who cares we already have space jam, butlerian jihad now

Beeftweeter
Jun 28, 2005

OFFICIAL #1 GNOME FAN
shaggar was right

shitface
Nov 23, 2006

well its effectively a database query except for the fact you used 10x the engineering time, 100x the power and you still can’t really trust the answer without verifying it manually. but you can say you used ~AI~ though :smuggo:

Cybernetic Vermin
Apr 18, 2005

infernal machines posted:

if the output can be trustworthy (as in, an accurate summation or synthesis of the ingested data), that seems like an absolutely ideal use case

not really predicting trustworthiness, or suitability for any real-world use-case really, i mostly found it interesting as a research reference point: a model where we know what went into it, where what goes into it is a pretty well-defined (flawed, but very comprehensive) reflection of society, and where there's no bias crudely implanted by a selection/filtering process (to be clear, adjusting for bias is cool and good, but my faith in companies doing a good job of it behind closed doors is pretty much nil).

i'd expect it to be *less* useful for stuff like making an $80 billion chatbot, as the actual things going into it will be packed full of bias and agendas, but it'll be a way more sensible reference point for understanding things both about models and possibly society.

also the copyright exception tickles me a bit. openai/microsoft/google should get beaten down for using other people's data to build models willy-nilly, but i find the specific "you can make chatgpt write a famous bit of shakespeare" complaint to be focusing on the wrong end of the problem.

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


maybe it'd be useful for an open-ended "can you find me a book about farts please?", but it's still a sort of glorified stemming/thesaurus lookup

Chalks
Sep 30, 2009

The thing about making it 100% trustworthy is that you have to massively increase how often it will refuse to answer due to uncertainty, possibly to the point of uselessness.
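mechanically the "refuse when unsure" version looks something like this. just a sketch: the model name and threshold are placeholders, and average token log-probability is a crude, uncalibrated stand-in for uncertainty, which is rather the problem.

# toy sketch: refuse when the generated tokens were low probability on
# average. "gpt2" and the threshold are placeholders; token probability is
# model consensus, not calibrated truth.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def answer_or_refuse(prompt: str, threshold: float = -2.5) -> str:
    inputs = tok(prompt, return_tensors="pt")
    out = model.generate(
        **inputs, max_new_tokens=30, do_sample=False,
        return_dict_in_generate=True, output_scores=True,
    )
    new_tokens = out.sequences[0, inputs["input_ids"].shape[1]:]
    # average log-probability of the tokens the model actually emitted
    logps = [
        torch.log_softmax(step_logits[0], dim=-1)[tok_id].item()
        for step_logits, tok_id in zip(out.scores, new_tokens)
    ]
    if sum(logps) / len(logps) < threshold:
        return "i don't know"
    return tok.decode(new_tokens, skip_special_tokens=True)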

Agile Vector
May 21, 2007

scrum bored



Chalks posted:

The thing about making it 100% trustworthy is that you have to massively increase how often it will refuse to answer due to uncertainty, possibly to the point of uselessness.

*adds "I think" to the start of each reply*

I'll take my one billion dollars now, OpenAI

infernal machines
Oct 11, 2012

we monitor many frequencies. we listen always. came a voice, out of the babel of tongues, speaking to us. it played us a mighty dub.

Cybernetic Vermin posted:

not really predicting trustworthiness, or suitability for any real-world use-case really, i mostly found it interesting as a research reference point: a model where we know what went into it, where what goes into it is a pretty well-defined (flawed, but very comprehensive) reflection of society, and where there's no bias crudely implanted by a selection/filtering process (to be clear, adjusting for bias is cool and good, but my faith in companies doing a good job of it behind closed doors is pretty much nil).

i'd expect it to be *less* useful for stuff like making an $80 billion chatbot, as the actual things going into it will be packed full of bias and agendas, but it'll be a way more sensible reference point for understanding things both about models and possibly society.

also the copyright exception tickles me a bit. openai/microsoft/google should get beaten down for using other people's data to build models willy-nilly, but i find the specific "you can make chatgpt write a famous bit of shakespeare" complaint to be focusing on the wrong end of the problem.

i meant that the ideal use case is having a very feature limited system that can effectively parse fuckloads of known data and return at least "useful" results to natural language queries

as in, it might actually be a task suited to the technology, as opposed to an $80 billion chatbot.

Agile Vector posted:

*adds "I think" to the start of each reply*

I'll take my one billion dollars now, OpenAI

"on balance of probabilities..."

Shaggar
Apr 26, 2006

Chalks posted:

The thing about making it 100% trustworthy is that you have to massively increase how often it will refuse to answer due to uncertainty, possibly to the point of uselessness.

See thats not even it. the problem is it has no concept of objective certainty. Just consensus based on the model

infernal machines
Oct 11, 2012

we monitor many frequencies. we listen always. came a voice, out of the babel of tongues, speaking to us. it played us a mighty dub.
when i say trustworthy, i'm not referring to divining some objective truth from the source data, i mean not simply fabricating information for a response

basically severely limiting the generative aspect

probably this is fundamentally at odds with the concept

Shaggar
Apr 26, 2006
yeah, for something like chatgpt the generative aspect is literally the only aspect. there are no non-generative components here that you could somehow limit it to using.

Like theres no way to ask it to spit out the original data from the training set. The best you can do is get it to generate something approximating the original input.

infernal machines
Oct 11, 2012

we monitor many frequencies. we listen always. came a voice, out of the babel of tongues, speaking to us. it played us a mighty dub.
yet another ai benchmark

Chalks
Sep 30, 2009

Shaggar posted:

See thats not even it. the problem is it has no concept of objective certainty. Just consensus based on the model

Yeah, that's what I mean - the only way it could hope to approach "100% reliability" is with total consensus in the training data, which will leave it unable to answer basically anything - or just leave it responding with vague maybe this, maybe that.

It cannot sensibly choose between the earth being flat or round if its training data contains both arguments, without either hallucinating or being given a specific editorial command on the subject. Hallucination is an essential part of its answers because any certainty is always hallucinated.
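you can see this directly if you poke at the next-token distribution. a small probe, with "gpt2" as a placeholder model: the numbers that come out are just the relative frequencies the training data left behind, and there's nothing else in there to consult about which completion is true.

# probe: the model's "certainty" about a contested completion is just the
# relative probability mass its training data left on each candidate token.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The earth is"
inputs = tok(prompt, return_tensors="pt")
with torch.no_grad():
    next_token_logits = model(**inputs).logits[0, -1]
probs = torch.softmax(next_token_logits, dim=-1)

for word in [" round", " flat"]:
    token_id = tok.encode(word)[0]  # first subword token of the candidate
    print(f"P({word!r} | {prompt!r}) = {probs[token_id].item():.4f}")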

Beeftweeter
Jun 28, 2005

OFFICIAL #1 GNOME FAN

Cybernetic Vermin posted:

not really predicting trustworthiness, or suitability for any real-world use-case really, i mostly found it interesting as a research reference point: a model where we know what went into it, where what goes into it is a pretty well-defined (flawed, but very comprehensive) reflection of society, and where there's no bias crudely implanted by a selection/filtering process (to be clear, adjusting for bias is cool and good, but my faith in companies doing a good job of it behind closed doors is pretty much nil).

i'd expect it to be *less* useful for stuff like making an $80 billion chatbot, as the actual things going into it will be packed full of bias and agendas, but it'll be a way more sensible reference point for understanding things both about models and possibly society.

also the copyright exception tickles me a bit. openai/microsoft/google should get beaten down for using other people's data to build models willy-nilly, but i find the specific "you can make chatgpt write a famous bit of shakespeare" complaint to be focusing on the wrong end of the problem.

i think what the others are getting at — this is my interpretation, anyway — is, what happens when you ask it about something outside of, but extremely similar to, its dataset?

if the answer is an objective "i don't know about that, but here's something that may be related: (actual, objectively true info that is not a direct answer to the query here)", great

if the answer is "here's some information about (subject), (bullshit synthesized from related data)", well. thats not so good

the second response is basically what we've seen from commercial models. i agree that putting extreme guardrails on it basically makes it a glorified database lookup, but i can still see an LLM being useful for translating the retrieval into natural language — but that's about it. i'm less sure it can effectively, and accurately, do a query like the above

n.b. again this is based off of my experiences with the commercial models. but since i've tried several different ones, and they've all had the same basic deficiencies, i don't really have much confidence that what you're describing would be much different
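fwiw the shape i'd imagine for that is below: retrieval does the actual answering, and the model (the rephrase hook, hypothetical here) only restates what was retrieved. the keyword scoring is deliberately naive; the point is just that an empty retrieval becomes "no answer" instead of a fabricated one.

# sketch: facts come from the corpus; the model's only job is to phrase
# already-retrieved text. scoring is naive keyword overlap for brevity.
from dataclasses import dataclass

@dataclass
class Doc:
    doc_id: str
    text: str

def retrieve(query: str, corpus: list[Doc], k: int = 3) -> list[Doc]:
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(d.text.lower().split())), d) for d in corpus]
    scored = [(s, d) for s, d in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]

def answer(query: str, corpus: list[Doc], rephrase=None) -> str:
    hits = retrieve(query, corpus)
    if not hits:
        return "nothing in the collection matches that"
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in hits)
    if rephrase is None:
        return context  # without a model, return the sources verbatim
    # `rephrase` is a hypothetical hook for an LLM told to restate `context`
    # in natural language and cite the doc ids; it is never asked anything
    # the retrieval did not supply
    return rephrase(query, context)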

Shaggar
Apr 26, 2006
I kind of dislike the term hallucination because it implies that theres something "real" that the thing could be generating. All of it is new generated content and none of it is the original content.

Even if the original training set was 100% verified factual data, generative AI like this still has to generate (aka hallucinate) a new response no matter what. The original, verified data IS NOT in the model in any way we would consider safe for the purposes of objective decision making.

Beeftweeter
Jun 28, 2005

OFFICIAL #1 GNOME FAN
i don't like the term "hallucination" because it implies a thought process that is absolutely, unambiguously not present. it's ascribing a capability to the model that it doesn't have.

Shaggar
Apr 26, 2006
yeah that too

Improbable Lobster
Jan 6, 2012

"From each according to his ability" said Ares. It sounded like a quotation.
Buglord

Cybernetic Vermin posted:

what goes into it is a pretty well-defined (flawed, but very comprehensive) reflection of society,

What? No it isn't

Video Nasty
Jun 17, 2003

Cybernetic Vermin
Apr 18, 2005

Improbable Lobster posted:

What? No it isn't

let's put it this way: i think one would struggle to define any better available reflection

and indeed a hugely popular resource for social sciences as things stand

e: to be clear this is more a "library of congress" than a regular library, everything published in the country in the last 400 years, with an enforcement arm going knocking on the doors of anyone not sending them a copy of whatever they're printing/broadcasting/posting (the criteria for internet material are complex). a bias in that for sure, but it is about as non-selective a sampling as you can get. lots of fast food menus too.

Cybernetic Vermin fucked around with this message at 17:04 on Nov 30, 2023

Beeftweeter
Jun 28, 2005

OFFICIAL #1 GNOME FAN
lmao, did it come up with "smooth, cool and childish" on its own?

ngl that's pretty good

Cybernetic Vermin
Apr 18, 2005

Beeftweeter posted:

i think what the others are getting at — this is my interpretation, anyway — is, what happens when you ask it about something outside of, but extremely similar to, its dataset?

starting this out with "when you ask it" is misunderstanding what i am excited for here. granted it is overly specific and i probably should not have messed up the thread with it: sorry.

this is the kind of thing where the "blurry jpeg" aspect is kind of the point. it's useful both to study the data (a lot of linguistic questions are likely to be treatable, since it forms a much better representation of e.g. dialectal nuance than is available elsewhere), and its clearly defined coverage also makes it possible to check the "blurry jpeg" idea as a hypothesis: is it actually capturing something like a random sample, or will researchers discover things *not represented* that seem like they statistically should be?

i.e. this is pretty deeply researchy. though i do also have hopes it'll represent some things you wouldn't get out of a commercial model (precisely because it is not desirable chatbot stuff)
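one crude way to start checking the coverage question, sketched below: score held-out documents from known strata of the catalogue (dialect, region, genre, whatever the metadata supports) and look for strata the model handles much worse than average. the model name, the metadata, and the flagging ratio are all placeholders here.

# rough sketch: per-stratum mean negative log-likelihood on held-out
# catalogue documents, flagging slices the model fits much worse than
# its overall average. all names and thresholds are placeholders.
from collections import defaultdict

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def doc_nll(text: str) -> float:
    # mean negative log-likelihood per token for one document
    ids = tok(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        out = model(**ids, labels=ids["input_ids"])
    return out.loss.item()

def flag_strata(held_out: list[tuple[str, str]], ratio: float = 1.3) -> dict:
    # held_out: (stratum_label, document_text) pairs from the catalogue
    by_stratum = defaultdict(list)
    for stratum, text in held_out:
        by_stratum[stratum].append(doc_nll(text))
    all_scores = [x for scores in by_stratum.values() for x in scores]
    overall = sum(all_scores) / len(all_scores)
    return {
        s: sum(v) / len(v)
        for s, v in by_stratum.items()
        if sum(v) / len(v) > ratio * overall  # much worse than average
    }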

Improbable Lobster
Jan 6, 2012

"From each according to his ability" said Ares. It sounded like a quotation.
Buglord

Cybernetic Vermin posted:

let's put it this way: i think one would struggle to define any better available reflection

and indeed a hugely popular resource for social sciences as things stand

A database full of a small slice of human media is not a reflection of society, it is a reflection of the media that was fed into it.

Video Nasty
Jun 17, 2003

I only added the text

Cybernetic Vermin
Apr 18, 2005

Improbable Lobster posted:

A database full of a small slice of human media is not a reflection of society, it is a reflection of the media that was fed into it.

edited above: not a small slice.

ultimately all reflections are flawed, but i do think this is one of the better ones we have. otherwise, are we shutting down sociology or are you proposing a better reflection?

Cybernetic Vermin fucked around with this message at 17:34 on Nov 30, 2023

infernal machines
Oct 11, 2012

we monitor many frequencies. we listen always. came a voice, out of the babel of tongues, speaking to us. it played us a mighty dub.

Cybernetic Vermin posted:

are we shutting down sociology

probably for the best

Beeftweeter
Jun 28, 2005

OFFICIAL #1 GNOME FAN

Cybernetic Vermin posted:

starting this out with "when you ask it" is misunderstanding what i am excited for here. granted it is overly specific and i probably should not have messed up the thread with it: sorry.

this is the kind of thing where the "blurry jpeg" aspect is kind of the point. it's useful both to study the data (a lot of linguistic questions are likely to be treatable, since it forms a much better representation of e.g. dialectal nuance than is available elsewhere), and its clearly defined coverage also makes it possible to check the "blurry jpeg" idea as a hypothesis: is it actually capturing something like a random sample, or will researchers discover things *not represented* that seem like they statistically should be?

i.e. this is pretty deeply researchy. though i do also have hopes it'll represent some things you wouldn't get out of a commercial model (precisely because it is not desirable chatbot stuff)

ah i see what you mean. interesting indeed, i hope you guys crack that nut

Chalks
Sep 30, 2009

Shaggar posted:

I kind of dislike the term hallucination because it implies that theres something "real" that the thing could be generating. All of it is new generated content and none of it is the original content.

Even if the original training set was 100% verified factual data, generative AI like this still has to generate (aka hallucinate) a new response no matter what. The original, verified data IS NOT in the model in any way we would consider safe for the purposes of objective decision making.

yeah, very true. i only really use it because you often see people talking about "how do we solve the problem of AI hallucinations?" and it's like... buddy, there's nothing else!

Video Nasty
Jun 17, 2003

Bing won't do a prompt with "white lines of powder" or even "lines of powder" but it WILL allow a prompt including simply "powder"

Powerful Two-Hander
Mar 10, 2004

Mods please change my name to "Tooter Skeleton" TIA.


these are brilliant

Beeftweeter
Jun 28, 2005

OFFICIAL #1 GNOME FAN
hmm

Beeftweeter
Jun 28, 2005

OFFICIAL #1 GNOME FAN
lmao what in the world

Video Nasty
Jun 17, 2003

Are you trying to get them to smoke? You can use the word cigarette and it will know what to do from there.

Beeftweeter
Jun 28, 2005

OFFICIAL #1 GNOME FAN
hm,



oh my



that's maggie lol

Beeftweeter
Jun 28, 2005

OFFICIAL #1 GNOME FAN
lmao



ok i think i'm done
