|
Charlz Guybon posted:Things seem to be going...poorly Well, what exactly does he mean by that? I don't think ChatGPT grabbed a bunch of beakers and reagents and started mixing chemicals. If he asked it a basic chem question and it responded, that doesn't mean it "knows chemistry", let alone that it "taught itself advanced chemistry". There are plenty of mentions of chemistry in its training set, I'm sure. As many of the replies point out, in fact: https://twitter.com/NireBryce/status/1640259206392545282 https://twitter.com/KRHornberger/status/1640294884845158401 Without more details, I don't think I can take his extraordinary claim at face value. And in the first place, Chris Murphy is neither a chemistry expert nor a tech expert. He's a lawyer-turned-politician. This whole conversation - not only the starting remark, but how easily and uncritically posters believed it - is a great example of how much of the AI discourse is driven by blind hype.
|
# ¿ Mar 27, 2023 14:19 |
|
BrainDance posted:The "synthesis" was my own guess in my quick one off example that I typed literally as an example of potential capabilities of AI, not as me stating this is the thing that AI will definitely do in exactly this way. But as an example to show the kinds of things emergent abilities in AIs lead to which you took incredibly literally and as a complete thesis for some reason. I'm sure that "AI" will have some role to play in drug discovery, but that's very different from "ChatGPT" playing a role. And that's a very important distinction to make, because the field of machine learning is much larger than ChatGPT. Natural language processors may be able to trick people into thinking they've developed "emergent" abilities and may "eventually" lead to AI, but meanwhile, real-world users have been using highly specialized machine learning setups for all sorts of small but highly practical things for years - and that includes drug discovery. But it's important to note that drug discovery "AIs" aren't a result of training a natural language model on a pharmaceutical library, they're a result of highly specialized machine learning algorithms designed specifically for drug discovery. If anything, I think the term "AI" is actively detrimental to the conversation. It causes people to lump all this stuff together as if there's no difference between them, and draws their attention away from the actual specific capabilities and technical details.
|
# ¿ Mar 27, 2023 16:01 |
|
gurragadon posted:I just want to address this point before the thread gets too far along. When I asked to remake an AI thread because the Chat-GPT thread was gassed I was told to keep it vague. The reason given for gassing the Chat-GPT thread was it was too specific to Chat-GPT in the title and the title was misleading. In the thread I hope people will refer to the specific AI programs they are referring to, but unfortunately or not this thread was directed to stay vague. Yeah, I'm talking about the specific conversation, not the thread title. We just went from someone talking about ChatGPT doing chemistry to someone linking papers about ML drug discovery models as proof that it's plausible. That's a real apples-and-oranges comparison. BrainDance posted:ChatGPT itself likely won't, but it's hard to say what even larger language models will be capable of because, like I was saying, we've seen a bunch of unexpected emergent abilities appear from them as they get larger. And what I'm getting at is that there's no real evidence that ChatGPT is capable of "doing chemistry" (a phrase that, by itself, really deserves to be specifically defined in this context), outside of a senator having a moment. Personally, I'm very wary of any claims about "emergent" abilities from ChatGPT, because the one thing natural language processors have proven to be extremely good at doing is tricking us into thinking they know what they're talking about. Extraordinary claims always need evidence, but that evidence ought to be examined especially closely when it comes to extraordinary claims about ChatGPT.
|
# ¿ Mar 27, 2023 16:54 |
|
BrainDance posted:It's like a couple of you guys are trying to take things extremely literally and completely miss the point. Yes, I am aware it can't really do chemistry. You need arms to do chemistry. ChatGPT is in some sense immaterial and has no real corporeal form which you also need to do chemistry. I'm not just trying to do an inane "well it isn't actually doing physical actions" thing. There are also other questions, like "is this actually a task that requires novel reasoning abilities" or "has any expert validated the output to confirm that ChatGPT isn't just 'hallucinating' the results". For example, plenty of people have gotten ChatGPT3 to play chess, only to find it throwing in illegal moves halfway through. esquilax posted:The GPT-4 paper includes some discussion of its capability to use outside chemistry tools in the context of potentially risky emergent behaviors - specifically the capability to propose modifications to a chemical compound to get a purchasable analog to an unavailable compound. See section 2.10 (starting pdf page 55) in the context of section 2.6. I don't understand the chemistry but I'm guessing this is what the tweet was about, layered through 3-4 layers of the grapevine. Thanks for this! It really helps to be able to examine the claim in more detail than a single sentence from a tweet can provide. The thing that jumps out to me right away is that there's very little reasoning happening - the "question" given to ChatGPT is a list of specific steps it should take...and literally all of those steps are variants on "send search queries to an external service until you get something back that meets your requirements". It's just automatically Googling poo poo. Here are the steps ChatGPT4 actually took in that example:
I'm not a chemist myself so I can't really say what qualifies as "doing chemistry", but what stands out to me is that everything in that list is just querying outside tools and feeding their response to the next outside tool. ChatGPT4 isn't analyzing the chemicals or coming up with analogues, it's just feeding a chemical string to an outside tool that isn't "AI"-powered at all. There's no reasoning here, it's just a bit of workflow automation that could probably have been done with a couple dozen lines of Python. And we can't even say that this has the advantage of being able to skip the programming and do it with natural language, because a bunch of programming already had to be done to connect ChatGPT to all those tools.
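To make that concrete, here's a purely hypothetical sketch of the tool-chaining workflow described above. Every function name and data value below is invented for illustration (this is not OpenAI's actual tooling or the real services from the paper); each "tool" is a stub standing in for an external, non-AI service being queried in sequence:

```python
# Hypothetical sketch: chain external (non-AI) tools until one result
# passes a check. The tool names and canned data are made up.

def search_similar_compounds(smiles):
    """Stand-in for an external chemical similarity-search service."""
    return ["ANALOG-A", "ANALOG-B", "ANALOG-C"]  # canned results

def is_purchasable(compound):
    """Stand-in for an external supplier-catalog lookup."""
    return compound == "ANALOG-B"  # pretend only one is in a catalog

def find_purchasable_analog(smiles):
    """Feed one service's output into the next until something passes."""
    for candidate in search_similar_compounds(smiles):
        if is_purchasable(candidate):
            return candidate
    return None

print(find_purchasable_analog("CCO"))  # prints ANALOG-B
```

Note that no reasoning happens in that loop at all; whatever chemistry knowledge exists lives entirely inside the external services being queried.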
|
# ¿ Mar 27, 2023 21:20 |
|
StratGoatCom posted:I would avoid those people, because generative AI is poison for IP. I think you're exaggerating the Copyright Office's decision a bit. I'd say the Zarya of the Dawn outcome is of little consequence to your average closed-source software company. Even if the actual code itself isn't copyrightable, the code isn't usually being made available in the first place. And, to quote the Zarya decision, even if AI-generated material itself is uncopyrightable, the "selection, coordination, and arrangement" of that material by humans is still copyrightable, which is a big part of software development. Moreover, even the uncopyrightable parts can still become copyrightable if sufficiently edited by humans. When you say "poison", it makes me think of "viral" IP issues like the GPL that will spread and "infect" anything they're mixed with, but the Copyright Office was pretty clear that the uncopyrightable status of generated material is quite limited and doesn't spread like that.
|
# ¿ Mar 31, 2023 17:11 |
|
StratGoatCom posted:Nope, it is very long standing doctrine that machine output cannot be copyrighted, as attempts at brute force of the copyright system would be anticipatable since Orwellian book kaleoscopes. Machine or animal generated, versus stuff touched up with is is not going to be allowed. Machine output can certainly be copyrighted. For example, photographs are machine output. What matters in whether something is copyrightable or not is whether it's the direct result of an expression of human creativity. This isn't due to worries about "brute-forcing" or anything like that, it's a practical result of the fact that only humans are legally entitled to copyright. Since only humans can legally hold copyrights, and since creative involvement with the work is necessary to claim initial copyright over it, a work without human creative involvement has no one who can claim copyright over it at all. But we don't have to speak in these broad, vague terms, because there is a recent and specific Copyright Office ruling covering the use of generative AI output. They clearly state that while Midjourney output itself cannot be copyrighted, human arrangements of Midjourney output are copyrightable, and sufficient human editing would make it copyrightable. quote:The Office also agrees that the selection and arrangement of the images and text in the Work are protectable as a compilation. Copyright protects “the collection and assembling of preexisting materials or of data that are selected, coordinated, or arranged” in a sufficiently creative way. 17 U.S.C. § 101 (definition of “compilation”); see also COMPENDIUM (THIRD) § 312.1 (providing examples of copyrightable compilations). Ms. Kashtanova states that she “selected, refined, cropped, positioned, framed, and arranged” the images in the Work to create the story told within its pages. Kashtanova Letter at 13; see also id. 
at 4 (arguing that “Kashtanova’s selection, coordination, and arrangement of those images to reflect the story of Zarya should, at a minimum, support the copyrightability of the Work as a whole.”). Based on the representation that the selection and arrangement of the images in the Work was done entirely by Ms. Kashtanova, the Office concludes that it is the product of human authorship. Further, the Office finds that the compilation of these images and text throughout the Work contains sufficient creativity under Feist to be protected by copyright. Specifically, the Office finds the Work is the product of creative choices with respect to the selection of the images that make up the Work and the placement and arrangement of the images and text on each of the Work’s pages. Copyright therefore protects Ms. Kashtanova’s authorship of the overall selection, coordination, and arrangement of the text and visual elements that make up the Work. quote:The Office will register works that contain otherwise unprotectable material that has been edited, modified, or otherwise revised by a human author, but only if the new work contains a “sufficient amount of original authorship” to itself qualify for copyright protection. COMPENDIUM (THIRD) § 313.6(D). Ms. Kashtanova’s changes to this image fall short of this standard. Contra Eden Toys, Inc. v. Florelee Undergarment Co., 697 F.2d 27, 34–35 (2d Cir. 1982) (revised drawing of Paddington Bear qualified as a derivative work based on the changed proportions of the character’s hat, the elimination of individualized fingers and toes, and the overall smoothing of lines that gave the quote:To the extent that Ms. Kashtanova made substantive edits to an intermediate image generated by Midjourney, those edits could provide human authorship and would not be excluded from the new registration certificate. Practically, what does all this mean? 
If you went through Midjourney's archive and took the original images that were used in Zarya of the Dawn, you could use them freely; they're public domain. However, you can't print off and sell your own bootleg Zarya of the Dawn comics, because the comic panels and comic pages are copyrighted. Although the individual images are unprotected, the way in which she assembled those images onto comic pages contains sufficient human creativity to qualify for copyright. In other words, AI involvement doesn't wipe away human involvement and spread uncopyrightability through the entire finished work. In fact, it's exactly the opposite - human involvement wipes away AI involvement and removes uncopyrightability from the finished work.
|
# ¿ Mar 31, 2023 18:09 |
|
Count Roland posted:A side effect of Dall-e and similar programs is that there's a lot of AI art being generated, which shows up on the internet, which is trawled for data, which is then presumably fed back into AI models. I wonder if AI generated content is somehow filtered out to prevent feedback loops. The major algorithms put an invisible digital watermark in the images. It's fairly simple to check for that watermark and remove anything with that watermark from the training data. It's not perfectly reliable at the level of "mass-scraping random poo poo off the web", since modifying the image (such as resizing or cropping it) may damage the watermark, but it should at least substantially reduce the amount of AI-generated media in a training set.
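As a toy illustration of the embed-and-detect idea (the naive least-significant-bit scheme below is invented for the example; production generators use more robust frequency-domain watermarks, but even those can be damaged by resizing or cropping, as noted above):

```python
# Toy LSB watermark: stamp a known bit pattern into the low bits of the
# first few pixel values, then check for that pattern later. Purely
# illustrative; real generator watermarks are far more sophisticated.

WATERMARK = [1, 0, 1, 1, 0, 0, 1, 0]  # arbitrary signature bits

def embed(pixels):
    """Return a copy with the signature written into the LSBs."""
    out = list(pixels)
    for i, bit in enumerate(WATERMARK):
        out[i] = (out[i] & ~1) | bit
    return out

def is_watermarked(pixels):
    """True if the signature bits are present in the LSBs."""
    return [p & 1 for p in pixels[:len(WATERMARK)]] == WATERMARK

image = [200, 13, 54, 90, 77, 31, 8, 255, 120]  # fake pixel values
marked = embed(image)
print(is_watermarked(image), is_watermarked(marked))  # False True
```

Note how fragile this particular scheme is: any operation that rewrites pixel values or shifts pixel positions (resizing, cropping, recompression) would wipe out the signature, which is why scraper-side filtering can only ever be a partial solution.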
|
# ¿ Apr 1, 2023 19:08 |
|
gurragadon posted:I'm a little confused as to why OpenAI can't train ChatGPT on copyrighted works as long as they aren't just replicating the work wholesale. Anybody who is training to be a writer will train themselves on copyrighted works, and every work is derivative of some experience the writer had. No writer emerges from the ether to release fully new ideas into the world, they would need to have there own language to do it. If a product can't be made usable without breaking the law and trampling all over people's rights, then I don't see how that's a problem for anyone besides the company that made the product. I know we've all gotten very used to tech startups building business models around breaking the law and betting their lawyers can delay the consequences long enough for them to build a lobbying organization to change the laws, but let's not pretend that's a good thing. But you're making a very big omission in your statement here. It's not that you can't train AIs on copyrighted works, it's that you have to get the permission of the copyright holders to train on their copyrighted works. That might be expensive or difficult, sure, but that's the cost of building something that's entirely dependent on other people's content. If you don't like the cost or the annoyance of buying or negotiating license rights, go make your own content like Netflix eventually ended up doing. Hell, that even applies to human writers. They're paying for much of the copyrighted media they consume, or otherwise complying with the licensing conditions of those works. How much money have you spent on media (or had spent on your behalf, by parents or teachers or libraries) over your entire life? Even if you're really dedicated to piracy, you've still paid for more books (or movie tickets, or Netflix subscriptions, or whatever) than OpenAI Inc has.
|
# ¿ Apr 4, 2023 18:56 |
|
gurragadon posted:Personally, I think copyright law is written in a way that tramples over people's rights and using information that is available in the world isn't trampling on peoples rights. Not from a tech start only point of view either, excessive copyright laws just stifle creativity and innovation in my opinion, which is where my stance is coming from. Everything is built on something else; you can't build something entirely independent. What's excessive about this particular application of copyright law? I think it's totally reasonable to use copyright law to impede a for-profit company which wants to use other people's works for free without a license in its for-profit product, especially when the only argument I've seen in favor of waiving copyright for AI companies is "it's inconvenient and expensive to pay people for the use of their work". In your day-to-day life, you experience copyrighted media that you didn't personally directly pay for, but that doesn't mean no one paid for it. You don't have to put money directly into the TV at the sports bar to watch the big game, but the sports bar is paying for a cable package, and that cost is part of the expenses that are passed on to customers as food prices. Even for stuff you've seen for free, sometimes people make it available for free for some formats or usages but charge for others. That said, trying to seriously nail down how much money you've spent on media throughout your entire life is beside the point. After all, the actual question at hand is "should ChatGPT be paying for the media it's trained on?". It's a yes/no question. The actual amount is none of our business. If the answer is "yes", it's up to the media's owners to decide how much they're going to charge, just as it's up to OpenAI to decide how much to charge for usage of ChatGPT. Another reason not to get hung up on nailing down your exact media spending is that it's unlikely that OpenAI would pay the same price you do. 
Regardless of whether AI training is similar to human learning or not, ChatGPT is not a human. It's a product. A monetized, for-profit product that charges people money to use it, and even its limited free usage is for the purpose of driving public interest and support to the for-profit company that owns it. It's fairly common for media creators to charge a higher price for works intended to be used in for-profit endeavors than they do for pieces for simple non-profit personal entertainment.
|
# ¿ Apr 4, 2023 19:56 |
|
SCheeseman posted:You're continuing to miss the point that the biggest problems generative AI creates won't actually be solved by obstructing only a subset of the businesses wanting to develop and exploit this stuff. I'm all for legislation that actually helps people who had their lives ripped apart by automation (the actual problem that capitalism has been causing for a long time!), but all you're advocating for is maintaining an already hosed status quo. I'm not talking about obstructing business at all. I'm talking about big business having to pay for the stuff they use in their for-profit products, just like everyone else. If that makes generative AI uneconomical, then so be it. But when I say AI companies should respect the rights of media creators and owners, I'm not saying that as some secret backdoor strategy to kill generative AI. I'm just saying that giving AI companies an exception to the rules everyone else has to follow is bullshit. Business shouldn't get an exception from rules and regulations simply because those regulations are inconvenient to their business model. And that's doubly true when it's big business complaining that the rules are getting in the way of their attempts to gently caress over the little guy. gurragadon posted:For this application I think it is inappropriate to apply copyright law at all really. ChatGPT was trained on 45 terabytes of text, which is just an insanely large amount of data. I don't think anyone copyright owner can claim any kind of influence on the program itself. An individual text is just a tiny bit of data that doesn't exert any influence by itself, the program needs a huge amount of text to make patterns. It's silly to say that any individual piece of text has no "influence" on ChatGPT. Clearly the text has some impact, or else OpenAI wouldn't have needed it in the first place. And if OpenAI needs that text, then they should have to pay for it. 
Trying to divine exactly how much of a part any individual item plays in the final dataset is a distraction. It played enough of a part that OpenAI put it in the training data. If OpenAI is using (in any way) content they don't have the rights to, they should have to pay for it (or at least ask permission to use it). Laws, rules, and regulations make plenty of products impossible to market, and they make plenty of business models essentially unworkable. If AI trainers can't figure out a way to train AIs in a manner that complies with existing law, that's their problem, not ours. I don't see any reason that we should give them an exception. My ears are deaf to the cries of corporate executives complaining that regulations are getting in the way of their profit margins. And anyway, you're thinking of it exactly the wrong way around. It's definitely possible to make generative AIs with copyright-safe datasets - again, this isn't some sort of backdoor ban. However, it's only practical to do so if the law is enforced against the companies that don't use copyright-safe datasets! Otherwise, the generative AIs trained on bigger and cheaper pirated datasets will inevitably be more profitable than the generative AIs trained on copyright-safe datasets. The sticker price of ChatGPT is irrelevant. It's a for-profit service run by a for-profit corporation that's currently valued at about 20 billion US dollars. Stability AI is fundraising at a $4 billion valuation. Midjourney is valued at around a billion bucks. There's no plucky underdog in the AI industry, no poor helpless do-gooders being crushed by the weight of government regulation. It's just Uber all over again - ignore the law completely and hope that they'll be able to hold off the lawyers long enough to convince a big customer base to lobby for the laws to be changed.
|
# ¿ Apr 4, 2023 23:10 |
|
SCheeseman posted:Impede, obstruct, whatever. In any case what you want isn't going to make AI art generators uneconomical, it'll make the 'legal' ones economical only for the entrenched IP hoarders. What you want changes nothing about how people will in actuality be exploited and may even serve to make it worse! I don't know what the heck you're responding to, but it's not anything I said. At no point did I propose removing exploitation from the entire media industry as a whole. I'm not pushing some grand utopian reworking of media as we know it, nor am I talking about ways to remove all exploitation from media production. I honestly have no idea what you're talking about. All I said was that AI companies should have to follow current law and respect the existing rights of media creators and media owners. Yes, that won't put a total end to all exploitation of creatives. Nowhere did I say that it would! Honestly, that doesn't even make much sense as a response. Personally, I don't give a poo poo if Disney trains an AI dedicated to spitting out pictures of Mickey Mouse, as long as they only use art they have the right to use. I don't care if they feed all nine Star Wars movies into an AI created for the sole purpose of making a new Star Wars movie every two weeks. It would be a very foolish thing for them to do, but if they're really that determined to shoot themselves in the feet, they have every right to do so. As long as they're only using content they own, and aren't scraping fansites for Luke x Chewbacca slash fanfiction to beef up their dataset, I don't see much real harm from it, except to the very brands they use this on. Gentleman Baller posted:One thing I don't really understand here, is what the legal difference is between text interpreted computer generated works and mouse click interpreted generated works? The randomness of the AI output. 
As the Copyright Office put it, "prompts function closer to suggestions than orders, similar to the situation of a client who hires an artist to create an image with general directions as to its contents".
|
# ¿ Apr 5, 2023 01:25 |
|
cat botherer posted:It's not, though. In Anglo systems, this stuff falls under fair use. IDK much about other systems, but the answer is not to increase the power of IP holders. That just empowers rent-seeking behavior and creates unintended consequences. I think people really need to take a deep breath here. Whether a novel use falls under fair use or not is difficult to say in advance, because the standards for fair use are somewhat vague and arbitrary. It's not a bright-line test where you either meet the conditions or not. It's a mixture of several general factors to be weighed by a judge. There's certainly a very credible argument that AI training is transformative, sure. But transformative works aren't automatically guaranteed to be fair use. That's just one of the many factors that go into a fair-use determination. For example, there's also a presumption that commercial, for-profit works are significantly less likely to be able to claim fair use than non-profit or educational uses. And one of the factors, which will assuredly play an outsized role in any inevitable AI infringement case, is what impact the potentially-infringing work might have on the market for the original work it's infringing against.
|
# ¿ Apr 5, 2023 02:50 |
|
XboxPants posted:Yeah, that doesn't even seem to be the big issue to me. Let's say I'm the small artist who draws commissions for people for their original DnD or comic book or anime characters, and I'm worried how AI gen art is gonna hurt my business. Worried my customers will use an AI model instead of patronizing me. So, we can decide that we're going to treat it as a copyright infringement if a model uses art that they don't have the copyright for. That will at least keep my work from being used as part of the model. If Disney makes an image generation AI, I certainly wouldn't expect them to license that out to just anyone. They'd guard that as closely and jealously as possible. If Disney builds a machine designed exclusively to create highly accurate and authentic images of their most valuable copyrighted characters, they're not gonna let anyone outside the company anywhere near it. StratGoatCom posted:Do you know, have links outside of the AI sphere? https://www.gofundme.com/f/protecting-artists-from-ai-technologies It's a GoFundMe by the Concept Art Association, an advocacy org for film industry concept artists which is dedicated to protecting their interests from both the movie execs and outside threats. One of the things they intend to spend the money on is a Copyright Alliance membership. So an AI "artist" who thinks the opposition to AI art is "fascist" posted some carefully cropped screenshots on Twitter to falsely portray the Copyright Alliance as an organization catering exclusively to the interests of major media companies, and suggested that the entire thing was just a "psyop" by corporate stooges acting in Disney's name. It went viral, naturally, and variations on it got circulated all around by AI art supporters. There isn't actually anything to suggest that there's anything shady about the fundraiser at all, as far as I've seen. 
It's just something that AI art users made up and circulated around to try and deflect the near-unanimous scorn of real artists away from themselves. Since basically no one ever double-checks anything they see in a screenshot attached to a tweet, it was fairly effective at muddying the waters. KwegiboHB posted:I refuse to link directly to the gofundme, I don't want them funded. You can easily find that with a simple web search if you want to. I think I'm just fine with money going to the Authors Guild, the Screen Actors' Guild, the Directors' Guild of America, the Graphic Artists Guild, the Independent Book Publishers Association, the Association of Independent Music Publishers, and Songwriters of North America, which are just some of the numerous unions, trade associations, and artists' rights groups on that page. Yeah, Disney and a few other big wealthy companies are on that list of members, but so are an absolute fuckton of organizations dedicated to defending the rights of individual creators against those very same big businesses. It's worth noting, however, that the only money that fundraiser is sending to the Copyright Alliance is for paying its own membership fees and sponsoring other artists' rights groups to join. The lobbyist isn't being hired on the Copyright Alliance's behalf - the lobbyist will work directly for the Concept Art Association.
|
# ¿ Apr 5, 2023 06:56 |
|
IShallRiseAgain posted:That uh doesn't actually change anything. Like sure there a lot of unions, trade associations, and artists' rights groups that are part of it, but that doesn't change the fact that companies like Disney are on it too. It makes it a bit more unclear about the actual motives, but it doesn't change the fact that companies which hold a lot of rights to art see some advantage to advocating for it. Also, these unions and other artist advocacy groups are pretty much forced to fight for this because its members very much want this even if they don't understand the full implications of what would actually happen. I don't think their members would care or believe it if they tried to explain the actual probable consequences of this going through. The Copyright Alliance is not involved in that fundraiser at all. That fundraiser is run by the Concept Art Association, which has no current ties to the Copyright Alliance. The entire extent of the Copyright Alliance's involvement in that fundraiser is that the Concept Art Association wants to spend 0.2% of the fundraiser amount on buying a membership to the Copyright Alliance. That's all. This is why it's good to be specific about the details. Vague references to "shady fundraisers", "unclear motives", and "unintended consequences" just muddle the issue, allowing misconceptions and inaccuracies like this to slip past unnoticed and become the foundation of handwavey conspiratorial proclamations. Main Paineframe fucked around with this message at 15:44 on Apr 5, 2023 |
# ¿ Apr 5, 2023 15:42 |
|
Owling Howl posted:GPT can't but it seems to mimic one function of the human brain - natural language processing - quite well. Perhaps the methodology can be used to mimic other functions and that map of words and relationships can be used to map the objects and rules of the physical world. If we put a model in a robot body and tasked it with exploring the world like an infant - look, listen, touch, smell, taste everything - and build a map of the world in the same way - what would happen when eventually we put that robot in front of a mirror? Probably nothing. But it would be interesting. If you tasked a learning model with "exploring the world like an infant", it would do an extremely poor job, because sensory data describing the physical world is nothing like the plentiful, cheap, pre-digested text these models are built on. It's not realistically practical to train a machine learning model that way.
|
# ¿ Apr 9, 2023 02:45 |
|
KillHour posted:I should stress to both sides of this conversation that the reason AI researchers are talking about safety is it's really important that we need to solve a lot of very difficult problems before we create AGI. So there is both an acknowledgement that what we have now isn't that and also that we have no idea if or when it will happen. Nobody thinks a LLM as the architecture exists today is going to become an AGI on its own. The problem is we don't know what is missing or when that thing or things will no longer be missing. You're not wrong, but the AI safety people in general seem to be deeply unserious about it. I know that's a bit of an aggressive statement, but while "how do we prevent a hypothetical future AGI from wiping out humanity" sounds extremely important, it's actually a very tiny and unlikely thing to focus on. "AI", whether it's AGI or not, presents a much wider array of risks that largely go ignored in the AI safety discourse. I actually appreciate that the AI safety people (who focus seemingly exclusively on these AGI sci-fi scenarios) have gone out of their way to distinguish themselves from the AI ethics researchers (who generally take a broader view of the effects of AI tech and how humans use it, regardless of whether or not it's independently sentient). NASA did crash a spacecraft into an asteroid, but they also do a fair bit of climate change study, and publish plenty of info on it. They do pay a bit of attention to the unlikely but catastrophic scenarios that might someday happen in the future, such as major asteroid impact, but they're also quite loud about their concerns about climate change, which is happening right now and doing quite a bit of damage already. We can't even stop Elon Musk from putting cars on the road controlled by decade-old ML models with no object permanence, little respect for traffic laws, and no data source aside from GPS and some lovely low-res cameras because sensor fusion was too hard. 
Why even think we can force the industry to handle AGI responsibly when there's been no serious effort to force them to handle non-AGI deployments responsibly? And what happens if industry invents an AGI that's completely benevolent toward humans and tries its best to follow human ethics, but still kills people accidentally because it's been put in charge of dangerous equipment without the sensors necessary to handle it safely? In my experience, the AI safety people don't seem to have any answer to that, because they're so hyperfocused on the sci-fi scenarios that they haven't even considered these kinds of hypotheticals, even though this example is far closer to how "AI" stuff is already being deployed in the real world.
|
# ¿ Apr 9, 2023 04:31 |
|
KillHour posted:Neither of those problems require a deep understanding of AI research and I think most AI researchers would object to being told to work on this as clearly outside of their field of expertise. AI researchers are, and should be, concerned with the issues of the technology itself, not the issues of society at large. The latter is what sociologists do (and there is a large crossover between sociology and technology, but it's a separate thing). Your framing of AI ethicists is more like "Techno ethicist" because the actual technology in use there isn't particularly relevant. I also object to your framing of these issues as more "real" than misalignment. Both are real issues. I think this is a deep misunderstanding. "AI researcher" and "AI ethicist" are different fields, and AI ethics certainly doesn't require a deep understanding of technology. That's because technology is just a tool - the issue is in how humans use it and in how it impacts human society. That applies even to a hypothetical future AGI. Practically speaking, misalignment is only really a problem if some dumbass hooks it up to something important without thinking about the consequences, and that is absolutely a sociological problem. And I can assure you that tech ethicists are extremely familiar with the problem of "some dumbass hooked a piece of tech up to something important without thinking about the consequences". AI ethics researchers don't need to be experts in AI, they need to be experts in ethics. Their job is to analyze the capabilities, externalities, and role of AI tools, and identify the considerations that the AI researchers are totally failing to take into account. This leads to things like coming up with frameworks for ethical analysis and figuring out how to explain the needs and risks to the technical teams (and to the execs). 
Yes, they focus on learning about AI as well, to better understand how it interacts with the various ethical principles they're trying to uphold, but the ethical and moral knowledge is the foundational element of the discipline. I highly recommend looking into the kind of work that Microsoft's AI ethics team was doing (at least until the entire team was laid off last month, since they're just an inconvenience now that ChatGPT is successful, lmao).
|
# ¿ Apr 9, 2023 07:56 |
|
Leon Trotsky 2012 posted:His argument kind of seems like a situation where he is saying: "Don't make AI evil" and is not really a useful assessment on a practical level. He doesn't seem to be saying anything about shutting down research, at least not in this particular article. It's hard to say, because there doesn't seem to be a transcript of his actual words anywhere and all the articles are mostly just paraphrasing him, but I don't see anything saying he's calling for a research halt. Rather, what he seems to be concerned about is companies buying into the AI hype, abandoning all safeguards and ethical considerations, and widely deploying it in increasingly irresponsible and uncontrolled ways in a race to impress the investors. quote:Until last year, he said, Google acted as a “proper steward” for the technology, careful not to release something that might cause harm. But now that Microsoft has augmented its Bing search engine with a chatbot — challenging Google’s core business — Google is racing to deploy the same kind of technology. The tech giants are locked in a competition that might be impossible to stop, Dr. Hinton said. Moreover, he seems particularly concerned about the risk of companies letting AI tools operate on their own without humans checking to make sure their output doesn't have unexpected or undesirable side effects. quote:Down the road, he is worried that future versions of the technology pose a threat to humanity because they often learn unexpected behavior from the vast amounts of data they analyze. This becomes an issue, he said, as individuals and companies allow A.I. systems not only to generate their own computer code but actually run that code on their own.
|
# ¿ May 2, 2023 19:33 |
|
Leon Trotsky 2012 posted:He says he didn't sign on to one of the letters calling for a moratorium on AI research because he was still working at Google, but he agreed with it. It says that he didn't sign those letters, and it also says that he didn't want to publicly criticize Google until he quit his job. It doesn't actually state that those two points are related, nor does it actually state that he agreed with the letter. This goes back to my point about how it's difficult to work out the details of his stance because the reporters are paraphrasing him rather than reporting his words directly.
|
# ¿ May 2, 2023 19:39 |
|
the other hand posted:For those who have the time to listen to (very) lengthy audio, this guy’s YouTube channel has a lot of interviews with leading people in AI research and industry. Some of his recent guests were the Boston Dynamics CEO, a computational biology professor from MIT, and the CEO of OpenAI, which makes the GPT software (video below). I wasn't previously familiar with Lex Fridman, so I Googled him and it took about ten seconds to find that he's an ex-researcher who'd been demoted from research scientist to unpaid intern after some "controversial" studies. So I searched to find out what kinds of controversy he'd been involved in, and it took another ten seconds to find an article titled Peace, love, and Hitler: How Lex Fridman's podcast became a safe space for the anti-woke tech elite. Hell of a title! A little more Googling brings up plenty of results suggesting that people in the AI and machine learning industries largely regard him as a grifter who doesn't understand half as much as he claims to. That Business Insider article is by far the best source I've found, so let's pull it out from behind that paywall: quote:Peace, love, and Hitler: How Lex Fridman's podcast became a safe space for the anti-woke tech elite I don't care who guest-stars on this show, I'm not going to listen to Mr. Empathize With Hitler talk about anything with anyone for two hours. It seems like common sense to vet a source before you waste hours listening to them.
|
# ¿ May 3, 2023 02:37 |
|
SCheeseman posted:What interests me about this is where the line for parody and satire is drawn now that AI tools allow for uncannily realistic impressions. Dudesy reportedly implicitly talks about the artifice of their "AI" on the show with their fanbase being in on it, the word kayfabe comes up often on the show and in the fan community, so it's not like this could be framed as some kind of conniving scam. They tipped their hand to anyone paying attention (I wasn't). The short answer is "that's up to the judge". Fair use is inherently a subjective rule; there's no clear and specific line in the sand that everyone can point to. There's plenty of stuff that's obviously fair use, but if you're not sure whether something's fair use and you think it's pretty close to the line, then you're in trouble because it's not really possible to tell exactly where the line is. Doing an impression where it's obviously a parody and you're not making any meaningful money off that person's identity is extremely likely to be fair use. If you're disguising that it's a parody, or if you're making a lot of money off the parody, or if the parody directly competes with the original, it very well might not be fair use. Whether it's funny doesn't really matter, what matters is what your intentions are and what the commercial impact might be. I haven't really seen any indication that the Dudesy thing was parody, though. It's just a human writing their own comedy act, running it through a George Carlin speech synthesizer, and claiming it's from Virtual George Carlin. I haven't seen anyone seriously accuse them of trying to parody Carlin's acts or satirize his views. But if they're not, then they're just stealing a famous name to get people talking and capitalize on his fame, and that's not so great legally. As Carlin's daughter's lawsuit puts it...
quote:"Defendants always presented the Dudesy Special as an AI-generated George Carlin comedy special, where George Carlin was 'resurrected' with the use of modern technology," the lawsuit argues. "In short, Defendants sought to capitalize on the name, reputation, and likeness of George Carlin in creating, promoting, and distributing the Dudesy Special and using generated images of Carlin, Carlin’s voice, and images designed to evoke Carlin’s presence on a stage."
|
# ¿ Jan 29, 2024 07:38 |
|
SCheeseman posted:Claiming it's from Virtual Carlin in the same way that pro wrestling is claiming that it's a sport. An "imitation of a now dead George Carlin who has been resurrected with a (fictional) AI" is still an imitation of George Carlin. And the same would be true of Zombie George Carlin or anything like that, too. The storyline doesn't really matter, and whether the diehard fans were in on it doesn't really matter either. The fact is that they used Carlin's name and likeness in their stuff without permission, and that's a problem regardless of whether they're pretending to be the real person. Something being funny doesn't automatically make it parody. Generally there needs to be some kind of social commentary, or commentary on the thing you copied, or something like that. It doesn't have to be particularly insightful or well-done, and it doesn't even particularly have to be funny. It's not enough to demonstrate that it was A Bit. They have to be able to demonstrate that the Bit was about Carlin in some sense. That using his name and likeness specifically was essential to the joke they were trying to tell, that the jokes in their script wouldn't have landed quite the same way if they switched out all instances of "George Carlin" with some other name and used somebody else's voice. I don't know if that's the case or not, but in all the considerable commentary about it, I haven't seen anyone talking about how using Carlin in particular made the jokes extra funny because of all the Carlin references, or anything like that. And that's bad for Dudesy, because Carlin's family is going to argue that the jokes themselves didn't benefit from the usage of Carlin at all, and that his name and likeness was included solely to capitalize on his fame and draw more attention to the podcast.
|
# ¿ Jan 29, 2024 08:55 |
|
SCheeseman posted:This reads like commentary to me and it takes up about a third of the act, using the premise of an AI of George Carlin as part of it's commentary, going on to touch on issues related to using AI generated likenesses of real people and other politics associated with it. They're making clear points about the implications of what they're doing (or appearing to do). Well, it's good for them that fair use doesn't require something to be funny, because I've seen D&D posts funnier than that. Looks more like satire than parody to me. That might be enough for fair use. Or it might not. Depends on the judge. The decisive factor that would bring this into the clear is whether the parody relies on using George Carlin specifically, such that it wouldn't have landed nearly as well if it had come from someone else. As the Supreme Court puts it, "Parody needs to mimic an original to make its point, and so has some claim to use the creation of its victim’s (or collective victims’) imagination, whereas satire can stand on its own two feet and so requires justification for the very act of borrowing". In other words, the parody exception isn't a blanket "it's okay as long as you're trying to be funny and incisive". It's an acknowledgement of the fact that when you're lampooning or playing off or making fun of something, you inherently need to include at least a little bit of the original to make the reference clear. Is this generated voice bitching about people hating AI funnier or more interesting because it came from Virtual George Carlin specifically, rather than Virtual Andy Kaufman or Virtual Charlie Chaplin or Virtual Goku or Virtual Original Character? I don't know, but that's the question that's going to be going in front of the judge. Another case to look at for an idea of how all this settles out is Midler v. Ford Motor Company.
It's not really a fair use case, but it's one of the cases that established how the law treats voices and whether it's okay to use impersonators, since what the Carlin dispute is really about is the use and violation of his identity. quote:The purpose of the media's use of a person's identity is central. If the purpose is "informative or cultural" the use is immune; "if it serves no such function but merely exploits the individual portrayed, immunity will not be granted." Note that this doesn't say that the work as a whole has to be informative or cultural. It says that the use of the person's identity in particular has to be informative or cultural. There needs to be a specific reason to use that specific identity, and that reason can't just be "because we think it'll be more popular that way". Of course, even then, "is it parody" is not the only condition that matters in fair use. There's other factors that are important, like how much commercial benefit Dudesy might have seen from their use of his identity (I'm inclined to say "a fair bit", since I'd never heard of Dudesy before the rash of media articles about the AI resurrection of George Carlin). But overall, voice transformers don't bring anything new to the table legally. There's already caselaw about voice imitation, and nothing really changes by using a computer-generated voice instead of hiring an impersonator.
|
# ¿ Jan 29, 2024 18:02 |
|
The press managed to find and interview this poor oompa loompa too. https://twitter.com/davidmackau/status/1762981623115465156 highlights:
https://www.vulture.com/article/glasgow-sad-oompa-loompa-interview.html quote:The internet loves a fiasco. Whether it be 2017’s infamous Fyre Festival, 2014’s sad ball pit at Dashcon, or last year’s muddy hike for freedom at Burning Man, we love to marvel at events that make big promises but flop spectacularly. It’s the online equivalent of slowing down in your car to look at a giant wreck.
|
# ¿ Feb 29, 2024 16:49 |
|
Boris Galerkin posted:How good are the LLMs at being multilingual, like how a person raised in a multilingual household would be? Doesn't really matter what context it's in, a LLM is a LLM. And Siri is definitely not a LLM. How well a LLM handles different languages depends pretty much exclusively on its training set. And I think you're making the mistake of anthropomorphizing it here. LLMs don't work like human thought processes do. They don't really have a sense of "language" in the first place. The only thing they know about a given word is what words it tends to be used along with most often. They don't know that "ernährungswissenschaftler" is German or that "nutritionist" is English - they just know that "ernährungswissenschaftler" tends to be used along with other words that are way too long and contain way too many consonants. Even though they have no concept of "ernährungswissenschaftler" being a German word, they'll tend to use it with other German words, because that's how it tends to be used in much of their training data. Unless you use it on an otherwise English sentence along with words like "meaning" or "translation", in which case it'll respond with an English description of what it means, because any instances of "what does ernährungswissenschaftler mean?" in its training data will be followed by an English explanation of its meaning. Circling back around to your original question, I'll repeat myself: it depends on its training data, because it's basically just repeating back words based on statistical analysis of its training set. They'll tend to use English words with English words and German words with German words, but that's because their training data will most likely use English words with English words and German words with German words. There's no fundamental understanding of language there. If their training data contains a lot of mixing English with German, then they'll be more likely to mix English with German themselves. Simple as that.
That's for responses, anyway. In terms of taking input, if you're typing the words in, then it shouldn't have any issue with multilingual input since it's all just words, and the LLM doesn't have a real concept of language. But if you're speaking words, then you're not putting words directly into the LLM. You have to go through voice recognition first, and that's not an LLM; it's a separate and usually much more complicated pipeline. Voice recognition tends to still be language-specific, because processing audio is probably a couple orders of magnitude more complex than just crunching text.
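To make the "it's just statistics over the training set" point concrete, here's a toy sketch in Python. A table of word-pair counts is nowhere near a real transformer, and the corpus here is made up for illustration, but it shows how "staying in one language" can fall out of co-occurrence counts without the model having any concept of language at all:

```python
from collections import Counter, defaultdict

# Toy training data: monolingual sentences, with no language labels anywhere.
corpus = [
    "the nutritionist recommends a balanced diet",
    "the doctor recommends a healthy diet",
    "der ernährungswissenschaftler empfiehlt eine ausgewogene ernährung",
    "der arzt empfiehlt eine gesunde ernährung",
]

# Count which word follows which. This table is the model's entire "knowledge".
follows = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for prev, nxt in zip(words, words[1:]):
        follows[prev][nxt] += 1

def most_likely_next(word: str) -> str:
    # Return the statistically most common continuation of a word.
    return follows[word].most_common(1)[0][0]

# "der" is only ever followed by German words in the training data, so the
# model "stays in German" without any concept of German existing at all.
print(most_likely_next("der"))  # some German word from the corpus
print(most_likely_next("the"))  # some English word from the corpus
```

Scale that table up into billions of parameters of compression and generalization and you get the behavior described above: German words follow German words because that's what the data does, not because the model knows what German is.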
|
# ¿ Mar 14, 2024 15:08 |
|
SaTaMaS posted:It's pretty hard to define a sense of "language" other than as a collection of associations between words, which is exactly what LLMs are. It doesn't make sense to talk about a "sense of 'language'". Of course, it's pretty hard to define "language", which is why there are entire scientific fields dedicated to the study of human languages, which have created several theoretical frameworks for understanding the concept of language. Chomsky usually gets posted here for his political beliefs, but his day job is linguistics. Similarly, Tolkien is best known for his fantasy works, but he was actually a philologist whose fantasy books were heavily influenced by his studies of language and literature. But even in that context, "a collection of associations between words" is a strikingly poor definition for a language. An individual language is generally understood to be a system, containing not just word associations but several sets of rules (sometimes quite complex ones). And in the context we're talking about, which goes well beyond the scientific study of language in isolation, I think we also need to note the considerable social, cultural, and historical ties languages hold, because these are all relevant to contexts in which people might want to speak one language over another. For example, code-switching. GABA ghoul posted:You are making a lot of assumptions about how language processing works in human brains. I'm not quite sure what you read from my post, but it sure as hell doesn't sound like what I wrote. I didn't make any assertions about how language processing works in human brains, nor did I ever claim that there's "some specific natural mechanism in the brain to separate languages". It's a ridiculous claim to make, which is exactly why I didn't make it, nor did I suggest anything even remotely similar to it. SaTaMaS posted:People such as Ferdinand de Saussure have already worked to define language.
He argues that words acquire meaning through their relational positions in a language system rather than through direct links to the external world (aka structuralism). Tokens aren't the same as words, but words are made up of tokens. LLMs rely on the statistical relationships between words in large datasets to predict and generate text, and the context in which a word appears is crucial for its interpretation. I think Ferdinand de Saussure might be a bit behind the curve in linguistic study, given that he died more than a century ago. His influence certainly extends into the current day, but now his structural linguistics are just one of many linguistic theories out there. I also think your read of structural linguistics doesn't quite line up with mine (though, granted, I'm not a linguist). As far as I know, structural linguistics is fundamentally about interpreting the rules of a language. It's not that individual words acquire meaning through their relational positions in a language system, it's that grammatical elements and semantics and syntax and other language rules acquire meaning through those relational positions. The "structure" in "structuralism" is the structure of the language system itself.
|
# ¿ Mar 15, 2024 16:34 |
|
SaTaMaS posted:It's not at all true that an LLM can't learn arithmetic problem solving, for example https://machine-learning-made-simple.medium.com/how-google-built-the-perfect-llm-system-alphageometry-ed65a9604eaf That is a) geometry rather than arithmetic, and even more importantly for this conversation, b) not using an LLM to do the math. This is a perfect example of how easily the capabilities of LLMs get enormously exaggerated, and how important it is to remain clear-eyed about them and keep their limitations in mind. All of the actual mathematics is being done by their symbolic deduction engine, a more traditional machine learning system. The LLM plays an accessory role here by basically restating the initial problem with more and more detail until there's enough detail there for the deduction engine to handle. The exact details use a level of jargon that's a little tough for me to follow, but the situation here appears to be that the deduction engine is extremely picky about its input data and usually requires humans to rewrite the problem in a specialized syntax and language, while also adding much more detail because things have to be extremely specific for the limited capabilities of the deduction engine. Which is still significant work, of course, but it's very important to note here that the LLM isn't actually doing any number-crunching. All it seems to be doing is rewording the problem into a form that the dedicated math-solver (which is not a LLM, and therefore has very limited language capabilities) can interpret. GABA ghoul posted:
My entire point is that humans can reason about the abstract concept of language, and can consciously choose to edit the generated speech in their head to strictly conform to one of them, or mix them as they please. They don't have to do either of those things, but the fact that they can (and that LLMs can't) is important to note in the context of the specific question that was originally being asked and the specific situation being discussed. A more conventional voice recognition engine is likely to draw far too firm a line between languages and be unable to handle bilingual input at all. On the other hand, LLMs are unable to draw that line at all, but they're good enough at following the example of their training set and the prompt that it's extremely unlikely the user will notice that the line isn't hard.
|
# ¿ Mar 15, 2024 21:03 |
|
SaTaMaS posted:I think you're really underestimating the technology here. There's no reason it can't be trained on arithmetic problems and I assume another LLM already has been. "All" that AlphaZero did in order to beat all humans at Go was combine a Transformer (aka the same fundamental technology as an LLM) that suggested some next likely moves with a Monte Carlo Tree Search to figure out which of the suggested moves would be the best. There's not a lot that looks like human reasoning going on in either case, and the LLM/Transformer was only 50% of the solution. Well, you showed that as an example of a LLM doing arithmetic, and I pointed out that the paper itself says that the LLM is not actually doing any math there. That doesn't mean that I'm underestimating LLMs, it means that you were wrong about your claim, and it means you tried to "prove" your claim with a paper that directly contradicted what you were trying to use it to prove. Judging from what you're saying about AlphaZero here, I think it's time to suggest that we all agree on an important ground rule: "LLM" is not a synonym of "transformer", "neural network", "machine learning", or "AI". LLMs are a kind of transformer, which is a kind of neural network, which is a kind of machine learning, which is a kind of AI. But that doesn't mean that LLMs can do anything that AI, machine learning, neural networks, or transformers can do. Nor should they be expected to! Large language models, as the name implies, are models that are designed for and specialized for handling language. Similarly, there are other kinds of machine learning systems, architectures, and models that are specialized for handling other tasks, like math. The symbolic deduction engine used in AlphaGeometry is a machine-learning system, and in fact a substantial portion of the paper you posted was dedicated to innovations and advances they'd made in how to train that specialized machine-learning geometry-solver. But it wasn't a LLM!
AlphaZero may use many of the same fundamental technologies as a LLM, but that doesn't mean it's a LLM, nor does it mean that LLMs can do the exact things that it does! And just as ChatGPT would be quite poor at doing what AlphaZero does, AlphaZero would be quite poor at doing what ChatGPT does. And this is fine! There's no problem with handling different kinds of tasks with different kinds of systems that are specialized for each task. Especially when these systems can be connected together in ways that allow them to pass tasks around to whichever system is best suited for the specific task that's been given to them. Different tools for different tasks. The problem is when one of the tools is a talking hammer, and suddenly people start thinking everything is a nail.
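That ground rule is basically a chain of strict subset relationships, and a quick toy sketch makes it explicit (the class names here are just illustrative labels for the taxonomy, not real libraries):

```python
# Toy illustration of the taxonomy: each class is a strict subset of the
# one above it, so the "is a" relationship only runs in one direction.
class AI: pass
class MachineLearning(AI): pass
class NeuralNetwork(MachineLearning): pass
class Transformer(NeuralNetwork): pass

class LLM(Transformer): pass                # language-specialized transformers
class VisionTransformer(Transformer): pass  # a transformer that is NOT a LLM

assert issubclass(LLM, Transformer)      # every LLM is a transformer...
assert not issubclass(Transformer, LLM)  # ...but not every transformer is a LLM
assert not issubclass(VisionTransformer, LLM)
print("taxonomy holds")
```

Saying "LLMs can do X because some transformer-based system does X" is reading those arrows backwards.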
|
# ¿ Mar 16, 2024 21:39 |
|
Lucid Dream posted:It's really not very complicated. The LLM itself is not very good at math, but it can be trained to utilize external tools such as a calculator. Whether you think that counts as the LLM "learning to do math" or whether it counts as the LLM "solving the problem" is obviously subjective. I think it counts, but we can agree to disagree. It obviously doesn't. If someone said "this 5-year-old human knows how to do division", you would expect that human to be able to solve "20 / 5" without a calculator. If their math demonstration consisted solely of just entering the problem into a calculator and repeating the result, and if they were completely unable to solve the problem without the help of a calculator, then I think most people would get annoyed at the parent for wildly exaggerating their child's abilities. "The LLM can do math" and "a system including both a LLM and a math-solver can do math" are very different statements. I don't see why so many people are so incredibly eager to treat them as synonymous. quote:Well, they could attempt to calculate the answer and they'd probably do a hell of a lot better than me if you gave me the same problems See, this is what I mean with people losing track of the limitations of LLMs. LLMs are fundamentally incapable of doing this sort of mathematical calculation. They can't attempt to calculate arithmetic, because they're highly specialized tools that are capable of doing exactly one thing: statistically calculating what word a human would use next. That's basically the only thing they can do. That's an enormously powerful thing, of course, because it turns out that drat near everything we do can be expressed in words. But despite how broad their abilities seem to be, they are not general-purpose AIs. Word-prediction is a technique that can be enormously versatile, but there are fields where it just doesn't hold up, and one of them is arithmetic. 
If you ask ChatGPT to solve "two plus two", it might be able to answer "four", but only because "two plus two" shows up fairly frequently in human writings and is typically followed by the words "four" or "equals four". It can't actually do numerical addition.
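For what it's worth, the "LLM with a calculator" setup can be sketched in a few lines. The fake_llm below is a hard-coded stand-in rather than a real model, but it captures the division of labor: the model's only job is to emit text requesting a tool call, and the surrounding system does the actual arithmetic:

```python
import re

def fake_llm(prompt: str) -> str:
    # Stand-in for a real model. A tool-using LLM is trained to emit text
    # that requests a calculator call instead of guessing at the digits.
    return 'CALL calculator("20 / 5")'

def calculator(expression: str) -> float:
    # The actual math happens here, entirely outside the language model.
    return eval(expression)  # toy only; never eval untrusted input

def run(prompt: str) -> str:
    # The harness, not the model, spots the tool request and executes it.
    reply = fake_llm(prompt)
    match = re.match(r'CALL calculator\("(.+)"\)', reply)
    if match:
        return str(calculator(match.group(1)))
    return reply

print(run("What is 20 divided by 5?"))  # prints 4.0
```

Whether that setup counts as "the LLM doing math" is exactly the disagreement here, but note that the division itself never touches the model.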
|
# ¿ Mar 17, 2024 17:00 |