|
Also, the thread needs to be open to shunt AI art slapfights into, otherwise they are going to pop up all over the forum.
|
# ? Dec 19, 2023 22:15 |
|
|
edit: eh, wasn't really an original or interesting thought
Verviticus fucked around with this message at 22:47 on Dec 19, 2023 |
# ? Dec 19, 2023 22:41 |
|
Lemming posted:No, it's not. This goes back to the original point that using the word "hallucination" has made everyone discuss this situation in a really dumb way, but a great example is the fact that most people don't hallucinate (because hallucinations are a manifestation of some kind of mental illness, where your brain isn't working the way it's supposed to), and ALL LLMs "hallucinate" necessarily as a function of how they work (because they're text predictors and don't have any "understanding" of the underlying truth of a situation) Hallucinations can happen easily with visual illusions, even on demand, for people without a mental illness. The probable cause of visual illusions is optimizations in our vision system that get caught off guard when the optimization fires at the wrong time. Anyway, they are hallucinations: we think we see something, but it is not there; it is hallucinated. Another common hallucination for humans and artificial vision systems is pareidolia. Both AI systems and humans suffer from pareidolia hallucinations. And not crazy people. Tei fucked around with this message at 23:08 on Dec 19, 2023 |
# ? Dec 19, 2023 23:04 |
|
I set up a makeshift version of chatgpt using the gemini api, and gemini pro seems kind of lovely. It's basically free though, so that's nice I guess. I haven't really experimented with the multimodal capabilities, but I haven't been particularly impressed by chatgpt's ability to "see" images so I'm curious to compare at least.
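For anyone curious what a makeshift client like that amounts to: it's mostly one REST call. A minimal sketch, assuming the public v1beta `generateContent` endpoint; `API_KEY` and the prompt are placeholders, and nothing is sent over the network unless you actually call `urlopen`:

```python
import json
import urllib.request

# Sketch of a bare-bones Gemini Pro request (assumes the public v1beta
# REST endpoint; API_KEY is a placeholder for your own key).
ENDPOINT = ("https://generativelanguage.googleapis.com/v1beta/"
            "models/gemini-pro:generateContent")

def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    # The API expects a "contents" list of messages, each made of text "parts".
    payload = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        f"{ENDPOINT}?key={api_key}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("Summarize this thread in one sentence.", "API_KEY")
# Sending it would be urllib.request.urlopen(req); the reply text sits at
# candidates[0].content.parts[0].text in the response JSON.
```

As far as I can tell, the multimodal side goes through the gemini-pro-vision model instead, with an extra inline image part in the same overall request shape.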
Lucid Dream fucked around with this message at 23:21 on Dec 19, 2023 |
# ? Dec 19, 2023 23:19 |
|
Tei posted:Both AI systems and humans suffer from pareidolia hallucinations. And not crazy people. Wanna be that mug of coffee's friend.
|
# ? Dec 19, 2023 23:20 |
|
Nervous posted:Wanna be that mug of coffee's friend. He’s got a lot on right now, between visiting his speaker friend with the nasogastric tube in the hospital, evidently loving the absolute *hell* out of that washing machine and doing it all under the judgmental eyes of those power sockets.
|
# ? Dec 20, 2023 02:59 |
|
whoops, turns out the base dataset that most of the big image generation models are trained on contains at least 3000 instances of child sexual abuse material https://twitter.com/jason_koebler/status/1737460292299190371
|
# ? Dec 20, 2023 23:34 |
|
https://x.com/jason_koebler/status/1737469369154773301?s=20 Hey they are exactly as careless as everyone assumes people who scrape the internet are. "We cannot check every image manually" You mean you don't want to. "Since we are not distributing or deriving other images from originals, I do not think the image licensing apply" Setting up a nice space where maybe they aren't technically doing anything wrong but the law hasn't caught up to there being multiple digital middlemen in the process.
|
# ? Dec 21, 2023 00:26 |
|
Space Skeleton posted:https://x.com/jason_koebler/status/1737469369154773301?s=20 "There's no way we can review every image in this dataset...that would require us to do our research at a more measured pace" "There's no way we can be responsible for what we publish on our social media site...I mean without paying a bunch of people to moderate it all properly" "We can't be expected to account for all the impacts of AI on the wider community, that would cost us money and time" "It's not my fault your house is underwater. Congress passed an act that says I can tear down any municipal dam without liability provided I use one of the new Techdozers. This will stimulate investment in the exciting new Techdozer sector"
|
# ? Dec 21, 2023 08:44 |
|
I do not read that as "our dataset contains child porn" and more like "who knows, lol, I don't think so, and we took actions to actively filter horrible stuff, but that can happen". Also, it contains a genuine, naive and direct answer of: "Do we train our bot on copyrighted data, ignoring the license of such data? Yes." If there's an image online with a license saying "any use except being fed to AI bot datasets", that image is going to be fed to AI bot datasets.
|
# ? Dec 21, 2023 14:17 |
|
Tei posted:If there's an image online with a license saying "any use except being fed to AI bot datasets", that image is going to be fed to AI bot datasets. A copyright license dictates how you can copy (share) an image, not how you can use it.* If I put a license on an image saying "any use except jacking off to it" and you jack off to it, I can't sue you for doing that. The legal question at issue is if and when the image is being copied as part of running an AI system trained on that image. *Yes, I know there are clickwrap agreements on software that try to dictate the manner in which it is used, but everyone pretty much ignores them anyways, and the legal figleaf is that installing software is technically making a copy, which is stupid, but the law can be stupid sometimes. Also, technically there are some restrictions on use, like breaking DRM, even if you don't share it, but that's explicitly illegal based on the DMCA (in the US), not based on the license. KillHour fucked around with this message at 15:07 on Dec 21, 2023 |
# ? Dec 21, 2023 15:04 |
|
Tei posted:I do not read that as "our dataset contains child porn" and more like "who knows, lol, I don't think so, and we took actions to actively filter horrible stuff, but that can happen". Oh, they're well aware it does, and have been for years. Stanford just published a study where they went through the Canadian Centre for Child Protection and matched known CSAM to existing parts of the LAION-5B dataset. In any reasonable society, this is a moment where regulators slam the loving brakes and ask how exactly this happened, and why known safeguarding measures weren't applied.
|
# ? Dec 22, 2023 07:05 |
|
KillHour posted:A copyright license dictates how you can copy (share) an image, not how you can use it.* If I put a license on an image saying "any use except jacking off to it" and you jack off to it, I can't sue you for doing that. I am not a lawyer, so I know I am on quicksand, but USE licenses also exist. That makes me think... what are the conditions required for a land grab?
- The previous owners don't have the strength to stop it (indigenous people in America, all the lands that had their treasures stolen by the British to put in a museum in England, the artists in 2023)
- The enforcement arm of the land is okay with it
- Racism / hate towards the minority group that owns the property that is going to be stolen by the larger, more powerful group
Art does not really have value under capitalism. But you can charge a monthly fee for access to an AI bot trained on that art. A forest has no value under capitalism. But if you burn it and sell the burned trees as cheap coal, somebody can make money from that.
|
# ? Dec 22, 2023 10:18 |
|
Tei posted:I am not a lawyer, so I know I am on quicksand, but USE licenses also exist. In most cases, if I buy a thing, you can't tell me what to do or not do with that thing. An exception would be if you were leasing or renting it instead of selling it. Another exception is real estate, because of old laws from 16th century Saxony or whatever. But if I sell you a book or a painting, I have no legal way to tell you that you can't wipe your rear end with the paper it's made of. That doesn't change if I give it to you for free either. If I created that work myself, what I can do is tell you not to make copies of it and sell them for a dollar or whatever. In theory, digital art should be the same, but it turns out that laws that make sense in the context of a physical item don't necessarily make logical sense for a bunch of abstract bits of information that get "copied" every time you use them. KillHour fucked around with this message at 15:54 on Dec 22, 2023 |
# ? Dec 22, 2023 15:47 |
|
.
|
# ? Dec 22, 2023 15:52 |
|
Back when ACDSee was one of the popular shareware image viewers for Windows, I recall its license agreement specified that you couldn't use it for viewing porn. It was always funny imagining a world where that was in any way enforced. I mean, leaving out those who didn't just crack it anyway.
|
# ? Dec 22, 2023 16:08 |
|
KillHour posted:In most cases, if I buy a thing, you can't tell me what to do or not do with that thing. An exception would be if you were leasing or renting it instead of selling it. Another exception is real estate because of old laws from 16th century Saxony or whatever. But if I sell you a book or a painting, I have no legal way to tell you that you can't wipe your rear end with the paper it's made of. That doesn't change if I give it to you for free either. If I created that work myself, what I can do is tell you not to make copies of it and sell them for a dollar or whatever. While this is a fun simplification, it completely ignores the concept of how derivative works interact with copyright, which is the real thing in question here. The primary issues:
1. Per the U.S. Copyright Office's present ruling, AI-generated works cannot be copyrighted because they lack human authorship, although human-generated works that use AI-generated work may be.
2. The question of whether using others' works without license as training data for generative AI is infringement is still being worked out through the courts, although the primary claim from OpenAI et al seems to be 'we can't grow our business fast enough if we're expected to vet and license the data we use'. See my prior post above as to the moral hazard involved in not vetting datasets.
There's a pretty solid summary prepared by the Congressional Research Service. My personal reading is that we're going to see the ruling come down that it is in fact infringement, as there have been multiple demonstrations of ways to make generative AI (ChatGPT in particular) regurgitate training data in whole or in part, which OpenAI's argument in the lawsuit hinges on not being possible.
|
# ? Dec 22, 2023 17:23 |
|
Liquid Communism posted:While this is a fun simplification, it completely ignores the concept of how derivative works interact with copyright, which is the real thing in question here. It ignores it because derivative works aren't relevant to the original question of doing the model training in the first place. It's running the trained model that is alleged to create potentially derivative works. I'm saying the training is unrelated to copyright, which is why you can't make a license that says not to.
|
# ? Dec 22, 2023 18:01 |
|
Liquid Communism posted:1. Per the U.S. Copyright Office's present ruling, AI-generated works cannot be copyrighted because they lack human authorship, although human-generated works that use AI-generated work may be. quote:There's a pretty solid summary prepared by the Congressional Research Service. My personal reading is that we're going to see the ruling come down that it is in fact infringement, as there have been multiple demonstrations of ways to make generative AI (ChatGPT in particular) regurgitate training data in whole or in part, which OpenAI's argument in the lawsuit hinges on not being possible. SCheeseman fucked around with this message at 18:27 on Dec 22, 2023 |
# ? Dec 22, 2023 18:22 |
|
e: oops, quoting from several pages and days ago. Tei posted:Maybe part of the reason the human brain is so slow is because it is mechanical. Biological cells must actually build new connections, and chemistry changes (molecules) actually have to move. isn't the "computer" in this example a massive server farm? like sure it can do things fast, but it's probably taking more energy/resources and space (for now??) than an unpaid intern PhazonLink fucked around with this message at 20:07 on Dec 22, 2023 |
# ? Dec 22, 2023 20:00 |
|
Tei posted:Art does not really have value under capitalism. But you can charge a monthly fee for access to an AI bot trained on that art. What the gently caress are you talking about? What's with everyone thinking they are so smart by lumping every phenomenon of human valuation in with capitalism? Art obviously has value, regardless of the economic "system" (I consider true "laissez-faire" capitalism to be the virtual lack of coherent economic system, basically being just freedom under robust property rights and a rule of law) one lives under; that's why we have been making it for as long as we have records of ourselves as a species. You are talking about human beings being reflected in markets/capitalism, not the markets/capitalism themselves. The freer the markets, the more they will reflect the actual desires of people. A forest "doesn't have value" (which is untrue) and coal does because forests are everywhere and coal is not. You are deeply confused, or stupid, or both.
|
# ? Dec 22, 2023 21:44 |
|
Serotoning posted:Art obviously has value, regardless of the economic "system" (I consider true "laissez-faire" capitalism to be the virtual lack of coherent economic system, basically being just freedom under robust property rights and a rule of law) one lives under, that's why we have been making it for as long as we have records of ourselves as a species.
|
# ? Dec 22, 2023 22:11 |
|
I'm preparing a longer write-up addressing several major AI topics. One significant development I can briefly bring up is the next pending legal case that could set important future precedent. This will go before a jury to decide. Thomson Reuters v. Ross Intelligence https://copyrightlately.com/why-a-little-known-copyright-case-may-shape-the-future-of-ai/ At the heart of the matter is that simple lists of facts are not copyrightable, only the creative aspects, layout, or inclusion or exclusion of those facts. The case law organized by Thomson Reuters Westlaw is not copyrightable, only their organizing and summaries of it. Ross Intelligence training their AI model on the case law itself is not an issue, as that is public domain data. I don't know the extent of further training Ross Intelligence did on their model, as I'm still learning about this case, but it seems to be what the entire case is about and why it's important and worth following. One thing I am working on writing about is what is actually inside the AI model, now that I've had some time to do a deeper look after attempting to make my own. A key issue behind AI training: the actual saved, frozen AI weights are only a list of relations of facts from the training data. This is an important distinction, because even if copyrighted material was in a dataset used for training, the actual result of training is only a list of facts about the trained material and not the actual copyrighted material itself. They are not a mishmash of stored images or compressed files like was suggested in the first headline-grabbing lawsuit that ended up dismissed. There is obviously a lot more to be said about this, but it's enough to start with. I will say that copyright seems a poor method to handle such a fundamentally transformative, multi-faceted, society-changing issue.
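If it helps make the "weights are just numbers" point concrete, here's a toy pure-Python sketch (everything illustrative, not any real model): fit a one-weight line by gradient descent, then look at what the "checkpoint" actually contains.

```python
# Toy illustration: learn y = w*x + b by gradient descent on four points,
# then inspect the "frozen" model. All names here are made up for the example.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]  # points on y = 2x + 1

w, b = 0.0, 0.0
lr = 0.02
for _ in range(5000):
    grad_w = grad_b = 0.0
    for x, y in data:
        err = (w * x + b) - y          # prediction error on this point
        grad_w += 2 * err * x / len(data)
        grad_b += 2 * err / len(data)
    w -= lr * grad_w
    b -= lr * grad_b

checkpoint = {"w": w, "b": b}  # this is all that gets "saved"
print(checkpoint)  # two floats near w=2, b=1; the data points themselves aren't in it
```

Whether those numbers can nonetheless encode a memorized training example once a big model is over-fitted is, of course, exactly the contested part.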
|
# ? Dec 22, 2023 23:32 |
|
KwegiboHB posted:One thing I am working on writing about is what is actually inside the AI Model now that I've had some time to do a deeper look after attempting to make my own. well they can be if the model is over-fitted, as demonstrated by the new midjourney V6 model which seems to be especially prone to regurgitating near-perfect replicas of images from the training set for some reason in that last example the name and creator of the original piece aren't even included in the prompt, and MJv6 still zeroed in on replicating that piece in particular repiv fucked around with this message at 00:03 on Dec 23, 2023 |
# ? Dec 22, 2023 23:43 |
|
repiv posted:well they can be if the model is over-fitted, as demonstrated by the new midjourney update which seems to be especially prone to regurgitating near-perfect replicas of images from the training set for some reason This is why I need a much longer write-up because those are not pixel perfect recreations and the explanation of why that matters is important. Yes, I know that distinction is not going to matter for most people when it's 99.999999% the same. Pixel-perfect recreations are actually mathematically possible regardless of what a model was trained on and that is a big deal. I need time and I'm going to take what time I need to write this up proper.
|
# ? Dec 22, 2023 23:54 |
|
repiv posted:well they can be if the model is over-fitted, as demonstrated by the new midjourney V6 model which seems to be especially prone to regurgitating near-perfect replicas of images from the training set for some reason This is a weird one to me. Like, if you tell the AI to draw a picture of Mona Lisa and it draws a good Mona Lisa that isn't a bug, except in that it might be legally problematic. If you had a perfect AGI and asked it to output "Mona Lisa" it would be a bug if it *wasn't* a perfect representation.
|
# ? Dec 23, 2023 00:04 |
|
Lucid Dream posted:This is a weird one to me. Like, if you tell the AI to draw a picture of Mona Lisa and it draws a good Mona Lisa that isn't a bug, except in that it might be legally problematic. If you had a perfect AGI and asked it to output "Mona Lisa" it would be a bug if it *wasn't* a perfect representation. in the case of the mona lisa i'd agree, since that's a specific still image you would expect a sufficiently large model to reproduce it exactly if asked for the mona lisa. it's only notable as an example that models can store and reproduce the images they were trained on, contrary to some claims that the training data is always atomized beyond recognition so it doesn't count as reproducing it. the other two examples though - the joker prompt doesn't ask for a specific known image but the model decided to regurgitate a specific frame from the movie rather than interpolating over the broad space of "joker movie images", and in the last example the piece being copied isn't referenced in the prompt whatsoever. that to me indicates over-fitting, biasing the model towards being less creative and more plagiarizey in an effort to improve quality. repiv fucked around with this message at 00:54 on Dec 23, 2023 |
# ? Dec 23, 2023 00:11 |
|
repiv posted:in the case of the mona lisa i'd agree, since that's a specific still image you would expect a sufficiently large model to reproduce it exactly if asked for the mona lisa. it's only notable as an example that models can store and reproduce the images they were trained on, contrary to some claims that the training data is always atomized beyond recognition so it doesn't count as reproducing it. Hmm, I still think the Joker one is similar to the Mona Lisa example in that it was asking for a screenshot from the film and we don't actually know how many attempts it took. The third one is pretty damning though I suppose.
|
# ? Dec 23, 2023 02:18 |
|
Lucid Dream posted:This is a weird one to me. Like, if you tell the AI to draw a picture of Mona Lisa and it draws a good Mona Lisa that isn't a bug, except in that it might be legally problematic. If you had a perfect AGI and asked it to output "Mona Lisa" it would be a bug if it *wasn't* a perfect representation. I think creating copies of movies, songs, cars, etc. won't be a problem if it is for personal use. But if you start selling Ferrari models you 3D printed on your computer, that can count as a counterfeit product.
|
# ? Dec 23, 2023 03:23 |
|
I think the potential issue here, to me at least, is that having a piece of software that shows you the Mona Lisa when you type in "mona lisa" isn't exactly a novel issue - Google does that already. It's also not enough to say that a tool can produce an exact or near-exact replica of something copyrighted when you give it specific instructions - you've always been able to do that in Photoshop or whatever, even from a blank canvas, if you happen to know what instructions to give it. I'm not really sure there is any way to make a standard about AI in the general case - it basically has to be a case-by-case basis each time. In a lot of ways this is the worst of all worlds, because people will absolutely use AI in place of human artists, but I think you have to prove they used it instead of a PARTICULAR artist (the claimant) to really be able to win. On the other hand, companies can't trust that AI outputs will be safe, unless you are Disney or whoever and have enough of your own IP to train a model by yourself. Everyone else will have to live in constant fear that some panel or another is directly lifted from a comic book or whatever, and it turns out that people with good lawyers have a slam-dunk case against you. I don't see any way that you can ban AI in general on this basis either, to be clear. As KwegiboHB said, it's basically just stored statistics about the overall dataset, like "in 80% of images associated with the word 'steeple', there was a pair of sharp borders forming a 20 degree angle". Even that is assigning more intent/comprehension to it than actually exists. The issue with the three images listed there is that exact duplicates, or images that are basically the same but with different aspect ratios, probably exist hundreds of times in the dataset. On top of THAT, the Mona Lisa in particular is going to have a bunch of poo poo like "this is what an AI generated when I told it 'mysterious smile', look how close it got!" and "here is what researchers think the Mona Lisa would have looked like at the time it was painted, accounting for the age of the paint" and "look, I photoshopped my face into the Mona Lisa with this filter". I'm pretty sure the Dorothea Lange photo has been in every high school history textbook ever published in the US, which means it also is on every educational website about the era. It's obviously less ubiquitous than the Mona Lisa internationally, but assuming they are matching photos found on the Internet with text found on the same webpage, it makes perfect sense this would happen - I probably had that photo in at least 4 different textbooks (photography, journalism, US history, economics). The Joker one is really the most problematic example. In principle it probably amounts to something similar, with a lot of posts about people becoming "jokerfied" about things, where they might have posted the screen cap with different aspect ratios, jpegs that are just "top text, joker face", etc. I think there was an issue/running joke about early image generators thinking that photos of cats are supposed to have Impact font text on the top and bottom, so you'd get garbled not-quite-letters if you didn't plan around it. In the same sense, it's not surprising to me that "2019 joker movie" is going to give you the most-memed picture. BougieBitch fucked around with this message at 07:24 on Dec 23, 2023 |
# ? Dec 23, 2023 07:21 |
|
BougieBitch posted:you've always been able to do that in Photoshop or whatever even from a blank canvas if you happen to know what instructions to give it. Photoshop didn't have copyrighted material fed into it, though, so I don't think the comparison works here. Photoshop is worlds closer to traditional art creation than AI image generators are. I think the law will end up in a place where the large model companies have to do everything they can to restrict copyright regurgitation, but with everyone understanding it can't be totally prevented. Similar to how social media companies aren't held liable for hate speech on their platforms as long as they show they are actively trying to stop it. Mega Comrade fucked around with this message at 10:21 on Dec 23, 2023 |
# ? Dec 23, 2023 10:18 |
|
repiv posted:well they can be if the model is over-fitted, as demonstrated by the new midjourney V6 model which seems to be especially prone to regurgitating near-perfect replicas of images from the training set for some reason Related, we had a guy try to sell us a generative model to produce synthetic data from our locked-down private data. Turns out the optimal strategy to produce synthetic data that looks real is to output data identical to the original!
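A cheap first-pass audit for that failure mode is just measuring how much of the "synthetic" output collides with real rows. A sketch with made-up data (a real audit would also look for near-duplicates, not just exact copies):

```python
def memorization_rate(real_rows, synthetic_rows):
    """Fraction of synthetic rows that are exact copies of real rows."""
    real = {tuple(r) for r in real_rows}  # set for O(1) membership checks
    hits = sum(1 for r in synthetic_rows if tuple(r) in real)
    return hits / len(synthetic_rows)

# Made-up example: a "generator" that copied 3 of its 4 outputs verbatim.
real = [(41, "a"), (17, "b"), (99, "c")]
synthetic = [(41, "a"), (17, "b"), (99, "c"), (12, "d")]
print(memorization_rate(real, synthetic))  # 0.75
```

If that number is anywhere near 1.0, the model is doing exactly what the sales guy's model did: reproducing the locked-down data it was supposed to be protecting.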
|
# ? Dec 23, 2023 11:23 |
|
Mega Comrade posted:I think the law will end up in place where the large model companies have to do everything they can to restrict copyright regurgitation but with everyone understanding it can't be totally prevented. Similar to how social media companies aren't held liable for hate speech on their platforms as long as they show they are actively trying to stop it. yeah i could see the AI vendors implementing a "you can copy my homework but don't make it too obvious" filter if they're pressured to. they could build a database of image fingerprints from the training set (similar to how GIS/tineye works), then check the generated output against that and re-roll with a different seed if it's too close to a training image, within some threshold. that would be an additional compute burden though, and inevitably make the model's quality worse as it's forced to throw away good (stolen) images, so they'd rather not if they don't have to
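As a sketch of what that filter could look like: toy average-hash fingerprints over made-up 2x2 "images" (a real system would use something like pHash or embedding similarity, but the re-roll logic is the same):

```python
def average_hash(pixels):
    """Tiny perceptual hash: 1 bit per pixel, set if above the image's mean.
    `pixels` is a flat list of grayscale values (e.g. a downscaled thumbnail)."""
    mean = sum(pixels) / len(pixels)
    return [1 if p > mean else 0 for p in pixels]

def hamming(a, b):
    # number of differing bits between two fingerprints
    return sum(x != y for x, y in zip(a, b))

def should_reroll(candidate, training_hashes, threshold=3):
    """True if the generated image's fingerprint is within `threshold` bits
    of any training-set fingerprint, i.e. too close to a training image."""
    h = average_hash(candidate)
    return any(hamming(h, t) <= threshold for t in training_hashes)

# Made-up fingerprints: one near-copy of a training image, one unrelated.
training = [average_hash([10, 200, 30, 220])]
print(should_reroll([12, 198, 28, 225], training))  # True  (near-duplicate)
print(should_reroll([200, 10, 220, 30], training))  # False (different layout)
```

The threshold is the knob repiv's trade-off lives in: set it tight and obvious copies slip through, set it loose and you burn compute re-rolling legitimate outputs.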
|
# ? Dec 23, 2023 15:53 |
|
Mega Comrade posted:Photoshop didnt have copyrightable material fed into it though so I don't think the comparison works here. Photoshop is worlds closer to traditional art creation than AI image generators are. Alternatively, a regulatory capture situation could result where regulation on AI will gradually and continually grow more expensive so that only the largest AI companies will be able to effectively comply, effectively banning any new competitors.
|
# ? Dec 23, 2023 16:09 |
|
esquilax posted:Alternatively, a regulatory capture situation could result where regulation on AI will gradually and continually grow more expensive so that only the largest AI companies will be able to effectively comply, effectively banning any new competitors. Yeah, it's pretty easy to see a path where generative AI exists and becomes a standard tool of creative professions but it's impossible to use without a large corporate middleman holding their hand out.
|
# ? Dec 23, 2023 16:51 |
|
BougieBitch posted:I think the potential issue here, to me at least, is that having a piece of software that shows you the mona lisa when you type in "mona lisa" isn't exactly a novel issue - Google does that already. It's also not enough to say that a tool can produce an exact or near exact replica of something copyrighted when you give it specific instructions - you've always been able to do that in Photoshop or whatever even from a blank canvas if you happen to know what instructions to give it. Something similar happens with Gartic Phone https://garticphone.com/ where if you draw something that vaguely resembles Mario, it will become Mario, or something blue with, like, lines on its back becomes Sonic. It's like these images have a magnetic power to attract drawings toward themselves. Memetic power.
|
# ? Dec 23, 2023 19:17 |
|
After watching some YouTube videos about robots and how there are ChatGPT-powered ones being developed, a question sprang up in my mind: when will we see the first murderbots? I mean, eventually, tech-savvy people will be able to buy a robot and dunk an open source, unrestricted AI into it. Will we see a robot war à la "I, Robot" soon?
|
# ? Dec 24, 2023 14:51 |
|
We already have human-controlled drone warfare. The various militaries around the world have been experimenting with machine-learning AIs controlling them for a while.
|
# ? Dec 24, 2023 15:00 |
|
If you're asking whether or not a self-aware robot army is about to rise up against humanity anytime soon, then the answer is no.
|
# ? Dec 24, 2023 19:00 |
|
|
Quixzlizx posted:If you're asking whether or not a self-aware robot army is about to rise up against humanity anytime soon, then the answer is no. What about a non self aware murder swarm with faulty/breached IFF coding that's running amok on a civilian population center?
|
# ? Dec 24, 2023 21:45 |