Tunicate
May 15, 2012

A Strange Aeon posted:

What would SD do with that prompt? That's the open source one, right?

without seasonings, not very well


with some of the seasonings I was doing sci-fi stuff with


Rinkles
Oct 24, 2010

What I'm getting at is...
Do you feel the same way?

A Strange Aeon posted:

What would SD do with that prompt? That's the open source one, right?

not too great. SD prefers verbose prompts

Hadlock
Nov 9, 2004

Wheany posted:

One problem with just dumping a shitload of images into the learning process is that generally images on the internet are not labeled. Booru-based image generation is

Was reasonably impressed that Wiktionary had a valid definition for booru

Hadlock
Nov 9, 2004


Cool, I have an idea for a TIE Fighter DOS remake

Hadlock
Nov 9, 2004


This model was removed; is there a new location, or can someone PM me a link?

What is the process to train more images and add them to the model? Like, very, very specifically, is there a github repo or technical document on how this works? $160 of cloud compute time is equivalent to... something like 2-4 weeks of desktop computer time? Maybe. That was for ~50k images

All 9 star wars movies at 24 fps is ~2.1 million images, which is... 43x larger, so like $7000 to train on every frame of all 9 movies. If you train on only 1/4 frames it's like $1750
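
Back-of-envelope in Python if anyone wants to sanity check me; the $160 / ~50k image figure is my own rough recollection, so treat everything downstream of it as hand-waving:

code:

# Rough cost scaling; every number here is a guess, not a quote from a cloud provider.
images_trained = 50_000      # images covered by the earlier ~$160 run
cost_usd = 160               # rough cloud spend for that run

movies = 9
avg_runtime_min = 162        # assuming ~2.7 hours per movie, which lands near 2.1M frames
fps = 24

total_frames = movies * avg_runtime_min * 60 * fps
scale = total_frames / images_trained

print(f"{total_frames:,} frames, about {scale:.0f}x the original run")
print(f"every frame:    ~${cost_usd * scale:,.0f}")
print(f"1 in 4 frames:  ~${cost_usd * scale / 4:,.0f}")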

Not suggesting star wars is an ideal training target here, just trying to put into perspective how easy it would be to train the model to improve on certain things it's weak on (bugs bunny, for example), or to specialize it if you're building a video game or writing a novel that needs cover art. At $160, it might be cheaper to add training data to the model than to hire an artist to do it for you.

edit: download links: https://github.com/harubaru/waifu-diffusion/blob/main/docs/en/weights/danbooru-7-09-2022/README.md

What looks like the source code (haven't tested it yet): https://github.com/harubaru/waifu-diffusion

Hadlock fucked around with this message at 09:53 on Sep 16, 2022

frumpykvetchbot
Feb 20, 2004

PROGRESSIVE SCAN
Upset Trowel

Hadlock posted:

All 9 star wars movies at 24 fps is ~2.1 million images, which is...

...untagged.

Wheany posted:

One problem with just dumping a shitload of images into the learning process is that generally images on the internet are not labeled. Booru-based image generation is pretty much easy mode since people spend countless hours tagging every possible detail in the images.

Training a few specific new concepts with textual inversion still requires the individual concepts to be consistently labeled. The LAION-5B team automated that on a very large scale, so I guess it's possible. The amount of pop culture that's in SD already is impressive enough... maybe you could do the same for specific properties so it can e.g. tag all thematically important characters, settings, vehicles, and props correctly.

Wheany posted:

If you wanted to use the same power for good, you could take tagged photos on flickr for example, then try to detect untagged features in other photos and possibly suggest them to the users. "Does this photo have Vibrant Foliage in it?", possibly even adding an overlay to the photo where it thinks it's seeing the thing.

I remember when FB started urging people to tag their friends in their image posts. Like, one-click tag. Is this so and so? Click yes or no. It was super creepy. I bet it's still in service behind the scenes. No escaping the panopticon.
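
The "does this photo have Vibrant Foliage in it?" part is basically zero-shot classification, and you can get a crude version of it out of CLIP today. A minimal Python sketch with Hugging Face transformers; the tag list, filename, and cutoff are all made up:

code:

# Crude tag suggester: score one photo against a handful of candidate tags with CLIP.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

candidate_tags = ["vibrant foliage", "beach", "dog", "city skyline", "food"]
image = Image.open("photo.jpg")

inputs = processor(text=candidate_tags, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    probs = model(**inputs).logits_per_image.softmax(dim=-1)[0]

for tag, p in zip(candidate_tags, probs.tolist()):
    if p > 0.3:  # arbitrary cutoff for "worth suggesting"
        print(f"Does this photo have {tag} in it? ({p:.0%})")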

frumpykvetchbot fucked around with this message at 11:22 on Sep 16, 2022

Hadlock
Nov 9, 2004

Kind of curious where all that "click all the photos with school buses/stop lights/crosswalks/airplanes in them" anti-bot captcha data is getting funneled to

Tagging a bunch of photos from a movie could easily be streamlined; many shots are close-ups of actors talking, or relatively static shots. Due to budget constraints there aren't that many dynamic shots in most films outside of money shots like battle sequences etc. It wouldn't be a walk in the park, but a 20 second dialogue scene might just be the same three recycled shots over and over in a diner with the same two plates of food. I did notice that 99% of all the restaurant photos Stable Diffusion creates look like the inside of a 1970s Denny's: wood veneer walls and amber lighting

Wheany
Mar 17, 2006

Spinyahahahahahahahahahahahaha!

Doctor Rope

A Strange Aeon posted:

What would SD do with that prompt? That's the open source one, right?

Stable Diffusion has a very vague idea of what a Bugs Bunny is:

Wheany posted:

Stable diffusion doesn't know who Bugs Bunny is.

Compare the prompts mickey mouse and bugs bunny:




This almost makes me want to run a bunch of cartoon characters through Stable Diffusion just to see which ones it recognizes

edit: lol, donald duck


Wheany
Mar 17, 2006

Spinyahahahahahahahahahahahaha!

Doctor Rope
It would be kind of interesting and not especially difficult to make a script that generates pictures of a few hundred well known cartoon characters just to see which ones Stable Diffusion recognizes.
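
Roughly this, assuming a local GPU, the CompVis v1.4 weights from Hugging Face (you may need to pass your access token), and a character list you'd obviously want to pad out to a few hundred names:

code:

# Generate one image per character name so you can eyeball which ones SD actually knows.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

characters = ["bugs bunny", "mickey mouse", "donald duck", "scooby doo", "homer simpson"]

for name in characters:
    image = pipe(f"{name}, cartoon").images[0]
    image.save(f"{name.replace(' ', '_')}.png")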

BoldFace
Feb 28, 2011
How far are we from the situation where AI is able to create and label its own training data, creating an infinite learning feedback loop?

BoldFace fucked around with this message at 14:26 on Sep 16, 2022

Objective Action
Jun 10, 2007



Hadlock posted:


What is the process to train more images and add them to the model? Like, very, very specifically, is there a github repo or technical document on how this works? $160 of cloud compute time is equivalent to... something like 2-4 weeks of desktop computer time? Maybe. That was for ~50k images


The easiest way to just try and add a style or character is to use Textual Inversion https://github.com/rinongal/textual_inversion or https://github.com/nicolai256/Stable-textual-inversion_win if you are on Windows. This isn't actually training the model, though; it's looking at what the model already knows about and seeing if you can use that vector space to construct an embedding for the new concept and add it to the lexicon. The upshot is you only need ~5 images to train; the downside is that if the vector space doesn't span what you need for the concept, you can't learn it successfully.
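
If it helps to see the core trick in code: the whole model stays frozen and all you're optimizing is one new embedding row in the text encoder. Very rough Python sketch with transformers; the placeholder token is made up and the actual training loop lives in the repos above:

code:

# Textual inversion in miniature: add one pseudo-word and make only its embedding trainable.
import torch
from transformers import CLIPTokenizer, CLIPTextModel

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

# Register a brand new pseudo-word and give it its own embedding row.
placeholder = "<my-concept>"
tokenizer.add_tokens(placeholder)
text_encoder.resize_token_embeddings(len(tokenizer))

# Freeze everything, then let gradients reach the embedding table again.
text_encoder.requires_grad_(False)
text_encoder.get_input_embeddings().weight.requires_grad_(True)

new_id = tokenizer.convert_tokens_to_ids(placeholder)
# Training then runs the usual diffusion loss on your ~5 images captioned with the
# placeholder, zeroing out the gradient on every embedding row except new_id so the
# rest of the model stays untouched.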

To actually train the model, the only publicly available way I know of right now is https://github.com/Jack000/glid-3-xl-stable which has explicit instructions for how to take the SD checkpoint file, rip it apart into its components, train the bits, and stitch the whole thing back together. This is very powerful because you can add new vectors to the space, but it needs way more training time and a hell of a lot more example images.

Elotana
Dec 12, 2003

and i'm putting it all on the goddamn expense account

BoldFace posted:

How far are we from the situation where AI is able to create and label its own training data, creating an infinite learning feedback loop?
This kind of recursion is already part of the training process

Elotana
Dec 12, 2003

and i'm putting it all on the goddamn expense account

Objective Action posted:

The easiest way to just try and add a style or character is to use Textual Inversion https://github.com/rinongal/textual_inversion or https://github.com/nicolai256/Stable-textual-inversion_win if you are on Windows. This isn't actually training the model, though; it's looking at what the model already knows about and seeing if you can use that vector space to construct an embedding for the new concept and add it to the lexicon. The upshot is you only need ~5 images to train; the downside is that if the vector space doesn't span what you need for the concept, you can't learn it successfully.
Is it possible to do something like Midjourney face prompting with this? Could I feed it a set of four headshots from my "character bank" and have it recognizably spit back that same character in variable costumes, poses, and art styles?



I love the poo poo out of Midjourney but they've been burbling about hiking the filters from PG-13 to G like Dalle-2 so I should probably start learning to roll my own SD sooner rather than later

mcbexx
Jul 4, 2004

British dentistry is
not on trial here!



Day 3, still not tired of creating celebrities hulking out.

I added about 50 new images since my last post.

Updated gallery here, including:

Danny Trejo



Robert de Niro



and guest starring, Bruce Willis as Thanos

Rutibex
Sep 9, 2001

by Fluffdaddy

Elotana posted:

I love the poo poo out of Midjourney but they've been burbling about hiking the filters from PG-13 to G like Dalle-2 so I should probably start learning to roll my own SD sooner rather than later

God damnit midjourney this isn't what I'm paying you for. I guess I need to get my supply of sexy witches now before it's too late :argh:

TheWorldsaStage
Sep 10, 2020

The self censorship would really bother me if it wasn't certain there will eventually be fine tuning for whatever you want in SD. Thank God for open source

pixaal
Jan 8, 2004

All ice cream is now for all beings, no matter how many legs.


TheWorldsaStage posted:

The self censorship would really bother me if it wasn't certain there will eventually be fine tuning for whatever you want in SD. Thank God for open source

And it's always hard to push back against it, because it's met with "so you just want to make porn". No, but sometimes a prompt will randomly have something in it, and you can only work around it if you know what got flagged and can see it. Also the classification tends to be poo poo (hello Shrek being like a 30% hit rate for dildo).

Elotana
Dec 12, 2003

and i'm putting it all on the goddamn expense account

Rutibex posted:

God damnit midjourney this isn't what I'm paying you for. I guess I need to get my supply of sexy witches now before it's too late :argh:
It's really silly. They expanded way too fast with a skeleton dev crew and all-volunteer moderators and rolled out their SD fork to over a million users (in *their* discord, god knows how many in scuzzier ones) relying solely on word blocks and manual bans.

The resulting entirely predictable faceful of /d/ seems to have scarred them and now the CEO can't stop musing aloud about banning the word "bikini" because he's offended that horndogs are using his New Tool For Human Expression™ as a cheesecake factory. (You would think SWE types would have learned some cynicism by now.)

Objective Action
Jun 10, 2007



Elotana posted:

Is it possible to do something like Midjourney face prompting with this? Could I feed it a set of four headshots from my "character bank" and have it recognizably spit back that same character in variable costumes, poses, and art styles?



I love the poo poo out of Midjourney but they've been burbling about hiking the filters from PG-13 to G like Dalle-2 so I should probably start learning to roll my own SD sooner rather than later

That's pretty much the intended way to use textual inversion, yeah. Generally it works best for stuff with lots of vectors/examples. So people work well because there are lots of photos of people in the SD training set. Anime, cartoons, comics etc are more hit and miss but can sometimes be interesting. Painting styles are all over the place.

If you want to play with existing embeddings before you worry about trying to train one, just to see which ones work and in what circumstances, Hugging Face has a bank of hundreds of them that people are contributing at https://huggingface.co/spaces/sd-concepts-library/stable-diffusion-conceptualizer.

If you use the https://github.com/AUTOMATIC1111/stable-diffusion-webui web-ui, you can drop them in a folder and any embedding whose filename appears in the prompt will be picked up and loaded automatically, which is very convenient.
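
If you'd rather stay in plain diffusers than the webui, newer releases can pull those concept files straight off the hub; a small sketch, assuming your diffusers version has load_textual_inversion and using the cat-toy concept from that library as the example:

code:

# Load a community textual-inversion embedding and use its placeholder token in a prompt.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")
pipe.load_textual_inversion("sd-concepts-library/cat-toy")  # adds the <cat-toy> token

image = pipe("a <cat-toy> on a beach towel, watercolor").images[0]
image.save("cat_toy.png")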

Objective Action fucked around with this message at 17:12 on Sep 16, 2022

Objective Action
Jun 10, 2007



Looks like the Facebook research guys figured out how to speed up the Diffusers attention layer ~2x by optimizing the memory throughput on the GPU, https://github.com/huggingface/diffusers/pull/532.

Exciting for when this gets through to mainline because this should speed up Stable Diffusion models by quite a lot even on consumer cards.
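
For reference, this is the kind of switch it should end up being once it's in a release; the method names below are from later diffusers versions, so treat them as an assumption about whatever build you're on (and the xformers one needs the xformers package installed):

code:

# Opt in to the memory-efficient attention kernel, with attention slicing as the older
# lower-VRAM fallback.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

pipe.enable_xformers_memory_efficient_attention()  # the optimization from the PR above
# pipe.enable_attention_slicing()                  # fallback: slower, but lower memory

image = pipe("a photo of an astronaut riding a horse").images[0]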

Rutibex
Sep 9, 2001

by Fluffdaddy

Elotana posted:

It's really silly. They expanded way too fast with a skeleton dev crew and all-volunteer moderators and rolled out their SD fork to over a million users (in *their* discord, god knows how many in scuzzier ones) relying solely on word blocks and manual bans.

The resulting entirely predictable faceful of /d/ seems to have scarred them and now the CEO can't stop musing aloud about banning the word "bikini" because he's offended that horndogs are using his New Tool For Human Expression™ as a cheesecake factory. (You would think SWE types would have learned some cynicism by now.)

:hai:
Princess Peach Eating a Burger, League of Legends, Pin-up


Topaz Clown

Siguy
Sep 15, 2010

10.0 10.0 10.0 10.0 10.0
I was excited to get access to the DallE2 beta because I heard it was more powerful than Stable Diffusion, but it’s so restricted in terms of content that it’s hard for me to find anything I wouldn’t rather just do in SD on my own machine without having to worry about running out of credits.

The Sausages
Sep 30, 2012

What do you want to do? Who do you want to be?
trying out new hair with img2img on dreamstudio while I wait for my haircut




Sucks that the barber can't fix my face like that tho

IShallRiseAgain
Sep 12, 2008

Well ain't that precious?

Siguy posted:

I was excited to get access to the DallE2 beta because I heard it was more powerful than Stable Diffusion, but it’s so restricted in terms of content that it’s hard for me to find anything I wouldn’t rather just do in SD on my own machine without having to worry about running out of credits.

Yeah, I think it's going to be left behind despite being really good, because it's the most expensive and it restricts what you can make so much. It's the least open of the image generation AIs. I hope somebody creates an open source GPT-3 equivalent too.

Tunicate
May 15, 2012

Dalle works really well for specific prompts in a way that I haven't been able to replicate on SD/MJ.

That said, with negative weights and the power of leaving my computer running overnight I can get good stuff out of SD, but it's scattershot in the general region of what I want, and not nearly as precisely directed as Dalle.

Hadlock
Nov 9, 2004

So apparently searching for stable diffusion poo poo has flagged me as someone Very Likely to click on purple haired, big boob anime girls fronting for random mobile games. I don't think I'd ever seen an ad like that before in my feed. I'm seeing it on social media and even on CNN article pages, MSNBC etc. Apparently it is a very strong marker

Did not realize how important this stuff was to anime people

That guy making a bunch of customized lifelike Pokemon for a living is mega hosed, r.i.p.

Elotana
Dec 12, 2003

and i'm putting it all on the goddamn expense account
Emad was openly talking about including https://www.gwern.net/Danbooru2021 in the training dataset, and waifudiffusion was one of the first generally "accessible without learning to code" SD builds, so that's not too surprising

IShallRiseAgain
Sep 12, 2008

Well ain't that precious?

I used img2img to convert a certain creepy painting into a photo.

The Sausages
Sep 30, 2012

What do you want to do? Who do you want to be?

IShallRiseAgain posted:

I hope somebody creates an open source GPT-3 equivalent too.
EleutherAI has been working on that with GPT-Neo and GPT-NeoX.
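
The smaller Neo checkpoints are easy to poke at locally through transformers; quick Python sketch (model size picked arbitrarily, and the 20B NeoX one needs far more than a desktop GPU):

code:

# Generate a continuation with one of EleutherAI's open GPT-3-style models.
from transformers import pipeline

generator = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B")
out = generator(
    "proposal: a forum for the most stupid, pointless, and ridiculous posts",
    max_new_tokens=60,
    do_sample=True,
    temperature=0.9,
)
print(out[0]["generated_text"])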

Boba Pearl
Dec 27, 2019

by Athanatos

Hadlock posted:

That guy making a bunch of customized lifelike Pokemon for a living is mega hosed, r.i.p.

That guy was getting on Twitter telling AI people that their art would never replicate his work, so like thousands of people got set on doing exactly that and then spammed his Twitter with it.

Bottom Liner
Feb 15, 2006


a specific vein of lasagna
https://twitter.com/arvalis/status/1569843616662851587

lmao

Bottom Liner fucked around with this message at 08:21 on Sep 17, 2022

Wheany
Mar 17, 2006

Spinyahahahahahahahahahahahaha!

Doctor Rope

The Sausages posted:

EleutherAI has been working on that with GPT-Neo and GPT-NeoX.

Yes, and they work fairly well already. After feeding it that one FYAD post, this is one of the results I got:

quote:

proposal: a forum for the most stupid, pointless, and ridiculous posts you've ever seen in your life

reality: a moderator tried to kick a guy for wearing a T-shirt that said "i love cilantro."

And these are some that other people got:

quote:

proposal: a thread about the best way to make a fire

reality: a man who was a member of the "fire department" and had a degree in fire science, spent a year and a half building a machine that would burn a person alive

quote:

proposal: a forum for "fair trade" and "coffee shops"

reality: a goon misappropriated funds to buy new bicycles for himself and a few friends

quote:

proposal: a thread about how the internet is the new opium den

reality: a goon once told me that a girl's only job is to "wet the bed"

Wheany
Mar 17, 2006

Spinyahahahahahahahahahahahaha!

Doctor Rope

Objective Action posted:

Looks like the Facebook research guys figured out how to speed up the Diffusers attention layer ~2x by optimizing the memory throughput on the GPU, https://github.com/huggingface/diffusers/pull/532.

Exciting for when this gets through to mainline because this should speed up Stable Diffusion models by quite a lot even on consumer cards.

Hell yeah, looks like the low-end card got a ~50% speed boost and the higher-end ones got ~100%

Hadlock
Nov 9, 2004

Seems like a pretty gnarly existential hell, though

Imagine, for example, you're making a pretty good living as an electrician, wiring up houses and fixing electrical sockets, etc

Suddenly someone writes a bash script that causes a self sustaining resonance that channels energy from a 10th dimensional object to all electronic devices simultaneously, interdimensionally, with no wires, perfectly, forever

You just wake up one day and everything electric works perfectly without being wired up

Not only

1) you're hopelessly out of the job, the replacement is instant and infinitely better in every way
2) the psychic shock of your livelihood being replaced in an instant, with no warning

I'd love to interview that guy

TIP
Mar 21, 2006

Your move, creep.



Wheany posted:

proposal: a thread about the best way to make a fire

reality: a man who was a member of the "fire department" and had a degree in fire science, spent a year and a half building a machine that would burn a person alive

I laughed way too hard at that one

Tunicate
May 15, 2012

Hadlock posted:

Seems like a pretty gnarly existential hell, though

Imagine, for example, you're making a pretty good living as an electrician, wiring up houses and fixing electrical sockets, etc

Suddenly someone writes a bash script that causes a self sustaining resonance that channels energy from a 10th dimensional object to all electronic devices simultaneously, interdimensionally, with no wires, perfectly, forever

You just wake up one day and everything electric works perfectly without being wired up

Not only

1) you're hopelessly out of the job, the replacement is instant and infinitely better in every way
2) the psychic shock of your livelihood being replaced in an instant, with no warning

I'd love to interview that guy

I mean, welcome to working any factory job

Hadlock
Nov 9, 2004

Yeah but robots and automation have been on the horizon since they were first envisioned, progressing at a steady pace

The latest round of image AI is several generations ahead of the crude bullshit people were hacking together before

This is more like the steam engine, but everyone got an engine in their factory, for free, on the same day, six weeks after it was announced

People in 100 years are gonna wake up and we're gonna have a "her" movie moment

Bottom Liner
Feb 15, 2006


a specific vein of lasagna

Hadlock posted:

People in 100 years are gonna wake up and we're gonna have a "her" movie moment

There are already stories of this very thing happening with AI chat bots.

Tunicate
May 15, 2012

Hadlock posted:

Yeah but robots and automation have been on the horizon since they were first envisioned, progressing at a steady pace

The latest round of image AI is several generations ahead of the crude bullshit people were hacking together before

This is more like the steam engine, but everyone got an engine in their factory, for free, on the same day, six weeks after it was announced

People in 100 years are gonna wake up and we're gonna have a "her" movie moment

The crappy public-facing stuff from 2017 was still pretty passable. I guess it looks different if you haven't been following it


Prolonged Panorama
Dec 21, 2007
Holy hookrat Sally smoking crack in the alley!



Bottom Liner posted:

the fact that he added his signature to this one but it still has the Legendary Pictures watermark in the top left is just chef's kiss.

It's pretty normal for released concept art to include studio logos?
