Raenir Salazar
Nov 5, 2010

College Slice

KwegiboHB posted:

Working with Raenir Salazar now. I only work with Stable Diffusion and image generation but follow all of this closely so I know of and can provide links to a large number of different tools. Sadly I don't play Trad Games so I don't know what will fit best here. It does sound like fun to play it's just going to be quite awhile until I have enough free time again.
Could anyone give me some examples of the main games and ways people play? Sitting around a table with people, what's in front of them? Character portraits? Minis? Dungeon maps? All of these can be made with AI but if you give me a list I can better get specific links on how to do so.

Heya thanks for stopping by, I have some time to reply properly now. :)

I don't know exactly what every use case is, but I imagine the general breakdown is:

1. Art for games: battlemaps, backgrounds/backdrops, character art, portraits.
2. Music: battle music, ambience, etc.
3. Voices, perhaps, for voiced NPCs/enemies/creatures.
4. 3D models, maybe, either for making minis or figures, or environments for virtual tabletops that support 3D assets.

I expect the typical use case is people playing online, either via a game-client virtual tabletop where you can upload assets, or a web-browser-based virtual tabletop that also allows uploaded assets (though maybe the performance limits are different?). I know Foundry supports images, gifs, music files, and so on.

So a DM might want to upload PNGs/JPGs for maps and pictures for NPC art, and players might also upload character art; I know in heavily homebrewed games I'm decorating my custom abilities with random appropriate anime art.

The DM might also be interested in uploading voice files, music, etc.

Maybe people also play in person, in which case I imagine any generated art or content is for sharing in Discord or showing on your phone, etc.

Text generation can be useful too: for the DM, making custom spells, abilities, backstories, entire quest chains, brainstorming the story, or deciding what stores/shops/quests/characters a given location might have; players might want a little help creating and fleshing out their backstory, and so on.

So for example, being able to feed the AI "these kobolds have these goals and motivations and some character traits, how do they react to the PCs' questions?" and have it give a valid response.

It could also be interesting to know which tools support a human touch, like for art uploading a sketch to help guide the AI and so on.


Kestral
Nov 24, 2000

Forum Veteran

KwegiboHB posted:

I have to run and do some grocery shopping now, I'll be back later today to work on this some more. If anyone else has any other ideas I'm all ears. This will help me work on the GBS thread as well :)

Thank you for the effortpost! Glad it's useful for GBS too :) Something I'm curious about, related to the following:

KwegiboHB posted:

Where I think things will truly start to shine is in some of the custom training options, such as making a specific character LoRA. You can take the custom artwork you have of a character and train a smaller model which you feed back into the generator which allows you to change specific things about the character, like outfits, action sequences, or battle wounds, etc.

Megaman's Jockstrap posted:

I think I might tackle the Stable Diffusion section a bit, because one of the biggest things about it is that you can host it yourself - which gives you a ton of options that using it as a hosted service just doesn't - and you can apply a conditioning interface (called ControlNet) to give you quite a bit of control over the output.

Very much looking forward to learning more about this usage. Campaigns build up their own aesthetic over time, and the idea of having a tool that can grow with the campaign, essentially be bonsai'd into its aesthetic, would be incredibly useful.

Something I keep coming back to about RPG use of these tools is how incredibly specific our image needs can be. So much of what we're trying to describe is, if not unique to our specific campaign, then so rare or specifically-detailed that it's impossible to find representations of it. This can actually be a problem in play! If I'm describing something, and the other players have a mental image that doesn't line up with mine, then they'll make decisions based on incorrect information, and that hampers play. If I spent forever describing a weird thing, that might get us closer to a shared concept, but I'm also monopolizing the spotlight and at a certain point people will just tune out.

(I actually have a story about this that just played out at my table, on the intersection of RPGs, commissioned art, and AI, that I'll post later when I have time to reflect on it a bit.)

Which brings me to the actual question, which I think Megaman's Jockstrap may be intending to answer already: what's your current best option for getting an output with a high degree of specificity out of a weird description? Like, off the top of my head, if I wanted to generate an angel whose many wings were made of old televisions and their spear is a TV aerial, is that something achievable with the current tools, or should you be looking at combining multiple outputs in Photoshop and such?

Megaman's Jockstrap
Jul 16, 2000

What a horrible thread to have a post.

Kestral posted:

Which brings me to the actual question, which I think Megaman's Jockstrap may be intending to answer already: what's your current best option for getting an output with a high degree of specificity out of a weird description? Like, off the top of my head, if I wanted to generate an angel whose many wings were made of old televisions and their spear is a TV aerial, is that something achievable with the current tools, or should you be looking at combining multiple outputs in Photoshop and such?

You could do that, but it would not be even close to instant or easy. It would all require inpainting/ControlNet and maybe even rudimentary sketches to guide the AI for the angel wings (I can tell you right now that you are going to have A LOT of trouble with the "many wings made of old televisions".)

Here's a tutorial on photobashing and inpainting with Stable Diffusion and Controlnet. All you need here is some basic cut and paste stuff and you can get some big results. Please keep in mind that you have to be hosting Stable Diffusion locally or have access to an instance of Automatic1111 or a derivative in order to follow along with this tutorial, but even if you don't you can just sort of skip through it and see what he's talking about. Around 8:00 is the photobash.

https://www.youtube.com/watch?v=dLM2Gz7GR44
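If you'd rather see the idea in code than in video: here's a minimal sketch of the same prompt-guided inpainting step using the diffusers Python library instead of the Automatic1111 UI from the tutorial. The file names and the inpainting checkpoint are placeholders I picked, not anything from the video.

code:

# Sketch: prompt-guided inpainting with diffusers. You supply the original
# image plus a black-and-white mask; only the white region gets repainted.
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image
import torch

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",  # a commonly used inpainting checkpoint
    torch_dtype=torch.float16,
).to("cuda")

init_image = Image.open("photobash.png").convert("RGB").resize((512, 512))
mask_image = Image.open("mask.png").convert("RGB").resize((512, 512))  # white = repaint

result = pipe(
    prompt="ornate angel wings made of stacked CRT televisions, detailed",
    negative_prompt="feathers",
    image=init_image,
    mask_image=mask_image,
    num_inference_steps=30,
).images[0]
result.save("inpainted.png")

The video does the same thing through the webui's inpaint tab, so treat this as the script-level equivalent rather than a literal transcription of his workflow.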

I'll get something for the OP but I don't know when. We'll try by Friday.

Megaman's Jockstrap fucked around with this message at 21:58 on Jun 13, 2023

KwegiboHB
Feb 2, 2004

nonconformist art brut
Negative prompt: amenable, compliant, docile, law-abiding, lawful, legal, legitimate, obedient, orderly, submissive, tractable
Steps: 32, Sampler: DPM++ 2M Karras, CFG scale: 11, Seed: 520244594, Size: 512x512, Model hash: 99fd5c4b6f, Model: seekArtMEGA_mega20

Kestral posted:

Something I keep coming back to about RPG use of these tools is how incredibly specific our image needs can be. So much of what we're trying to describe is, if not unique to our specific campaign, then so rare or specifically-detailed that it's impossible to find representations of it.
Which brings me to the actual question, which I think Megaman's Jockstrap may be intending to answer already: what's your current best option for getting an output with a high degree of specificity out of a weird description? Like, off the top of my head, if I wanted to generate an angel whose many wings were made of old televisions and their spear is a TV aerial, is that something achievable with the current tools, or should you be looking at combining multiple outputs in Photoshop and such?


https://www.youtube.com/watch?v=p52MHCcPc7Y 2:18 long.
This is my favorite https://www.filmcow.com/ short. I don't know of any better way to describe the difference in specific complexity classes in so little time lol.

I gave "angel whose many wings were made of old televisions and their spear is a TV aerial" a shot, and... missed. Here's an Imgur gallery of failures for you to see where things are with just word prompts alone. I usually make 4 pictures at a time to find candidates for further detail upscaling.

Now don't get me wrong, this is a totally doable project, just not in one easy step like so many people keep saying. It takes work to get something so unique.

Image generation training is based on word-image pairs; https://haveibeentrained.com/ is a site for searching through what Stable Diffusion was initially trained on. The most common pictures of an angel have feathery wings, so by default putting in just "Angel" is most likely going to give you feathery wings as well. There are methods for lowering probabilities, like putting "feathers" in a negative prompt to tell it you don't want feathers in your picture at all, but that only goes so far before you need outside tools.
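For anyone scripting this rather than using a webui, here's roughly what a positive prompt plus a negative prompt looks like through the diffusers Python library. Treat it as a sketch; the model name and prompts are just illustrative.

code:

# Sketch: text-to-image with a negative prompt via diffusers (SD 1.5).
from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

images = pipe(
    prompt="an angel whose many wings are made of old televisions, holding a TV aerial as a spear",
    negative_prompt="feathers, feathered wings",  # lower the odds of the default feathery look
    num_inference_steps=32,
    guidance_scale=11,
    num_images_per_prompt=4,  # four candidates at a time, as described above
).images

for i, img in enumerate(images):
    img.save(f"angel_candidate_{i}.png")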

One of those outside tools is called Inpainting. You can draw over a section of your picture and then regenerate it with a new prompt, and it will fill in just that part with whatever you tell it to. I doubt even that would be enough for something as radically unique as television wings, so we go even further into the many extensions available.

ControlNet + OpenPose + Masking Tools + Krita Support

You can take a child's scribble drawing and ControlNet will return a detailed version. So any level of drawing ability is enhanced by this. You can totally start with stick figures!
There are also options for posing a stick-figure model so that your generation comes out in whatever position you want.
There are Depth Maps for making proper 3d effects.
There are Masking tools for specific area selection such as just the wings for replacement.
You could use all of these in Krita to take the initial feather wings and replace them with some kind of CRT-effect texture. (There's a quick code sketch of the scribble workflow below.)
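To make the "scribble in, detailed picture out" part concrete, here's a minimal hypothetical sketch using diffusers with the scribble ControlNet model (outside Automatic1111/Krita; the file names are made up):

code:

# Sketch: child's-scribble-to-image with ControlNet (scribble conditioning).
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
import torch

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

scribble = load_image("stick_figure_angel.png")  # rough white-on-black line drawing

image = pipe(
    prompt="an angel with many wings built from old CRT televisions, dramatic lighting",
    image=scribble,  # the conditioning image that fixes the composition
    num_inference_steps=30,
).images[0]
image.save("angel_from_scribble.png")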

There will definitely be a write-up about ControlNet for the OP. It's a complete game changer.

Now as has been mentioned since Photoshop first came out, you can totally make Anything. These tools can drastically reduce the amount of time required to make that Anything. You still have to put the effort in though.
It's up to you to decide how much effort is worth it: a one-off low-level encounter with a pack of goblins might only merit a simple generation, whereas a campaign's BBEG would be totally worth using all the bells and whistles. Then you can train that result into a LoRA for easy repeated future use.
I love your Bonsai tree analogy, mind if I steal that?

Grey Hunter posted:

Schrödinger's AI

I also love that, mind if I steal that as well?

Grey Hunter
Oct 17, 2007

Hero of the soviet union.
Accidental destroyer of planets

KwegiboHB posted:

I also love that, mind if I steal that as well?

Go for it!

I'm in the worldbuilding part of a new campaign, and I'm looking forward to trying my hand at a group shot of the PCs, and being able to update it as we go along.

Skios
Oct 1, 2021
Replying to this post from the WoD/CoD thread:

mila kunis posted:

Speaking of, have people found interesting/fun uses for GPT/AI in their campaigns?

I've posted in the past about using MidJourney for coterie portraits:



I use it for NPC portraits too, although for WoD specifically, it can be very annoying to get the more inhuman aspects of a character across. I have yet to find a prompt that lets MidJourney do anything that remotely looks like a Nosferatu, and it has some serious restrictions on blood/gore as well.

I've also managed to work out a series of prompts that gets a general nineties World of Darkness illustration style as a result:



And some NPC illustrations:



These were all generated by appending the following prompt to a basic description:

quote:

high detail, high resolution, ink on paper, noir comic book style, sharp lines, black and white, high contrast

One last tip I can give you for MidJourney specifically - if you want consistent results for characters, especially in faces, your best bet is to just use someone famous enough for MidJourney to have in its model.

Alderman
May 31, 2021

Grey Hunter posted:

Edit - I'm also enjoying the Schrödinger's AI - Everything it produces is rubbish, but at the same time it's going to put people out of work.
You can make convincing arguments for both, but not both at the same time!

You can, and easily. Just look at what has happened to translation - people paid peanuts to "edit" and "check" machine translation that is so bad as to require full rewrites, often more difficult than just doing the thing from scratch, but undervalued to hell because "most of the work is already done".

That it produces rubbish which will still be used to replace the work of actual people who know what they're doing, and/or to undervalue the work of those people, is exactly the problem.

YggdrasilTM
Nov 7, 2011

Skios posted:

I use it for NPC portraits too, although for WoD specifically, it can be very annoying to get the more inhuman aspects of a character across. I have yet to find a prompt that lets MidJourney do anything that remotely looks like a Nosferatu, and it has some serious restrictions on blood/gore as well.
I can get some relatively "pretty" Nosferatu. This is what I got for my coterie:

Megazver
Jan 13, 2006

YggdrasilTM posted:

I can get some relatively "pretty" Nosferatu. This is what I got for my coterie:



I am pretty sure I recognize that prompt, hehehe.

Skios posted:

I use it for NPC portraits too, although for WoD specifically, it can be very annoying to get the more inhuman aspects of a character across. I have yet to find a prompt that lets MidJourney do anything that remotely looks like a Nosferatu, and it has some serious restrictions on blood/gore as well.

MJ has trouble not generating pretty people. I find that just piling on stuff like "ugly misshapen grotesque asymmetrical skeletal gaunt disfigured" and, sigh, "middle-aged" as part of the prompts helps a little.

Skios
Oct 1, 2021
Oh, that's very useful, I'll have to keep tweaking it.

Doctor Zero
Sep 21, 2002

Would you like a jelly baby?
It's been in my pocket through 4 regenerations,
but it's still good.

The discussion has gone around a bit so I lost track: is it okay for someone to start a forums RPG thread and state that AI asset use is OK and that the GM will be using it as well? I am NOT volunteering to do this, it was a good point brought up in the other thread, and I never saw a direct answer (it's probably there and I just missed it).

PS: Those rules are perfect. :discourse:

Raenir Salazar
Nov 5, 2010

College Slice

Doctor Zero posted:

The discussion has gone around a bit so I lost track: is it okay for someone to start a forums RPG thread and state that AI asset use is OK and that the GM will be using it as well? I am NOT volunteering to do this, it was a good point brought up in the other thread, and I never saw a direct answer (it's probably there and I just missed it).

PS: Those rules are perfect. :discourse:

I think you posted this in the wrong thread; this is the current AI chat thread, the other thread is the feedback thread.

Leperflesh
May 17, 2007

TG hosts both discussion threads (in TG proper) and game threads (in TGR). Typically you post a recruit thread in TGR, and list the game as recruiting in the recruitment thread, unless you already have players set up. Or you may just announce a game thread, like for a CYOA. Then interested people join up and you run your game in TGR.

I think if you explicitly said in your recruitment thread that you'll be using AI assets in your game, people would either not join or not care, and then you could just run your game. But also I'm not sure why AI assets would be a highlight or feature, vs. the actual theme of the game and whatever is going on. It might also be fine to just not talk too much about what assets you're using, or poll players who have joined via PM to see if they object to the use of AI-generated character portraits or whatever.

Ultimately I believe what triggered the big blowup and demand for rules, is people dropping AI stuff into discussion threads in TG, and that's what we've focused on. I don't recall anyone in the feedback thread raising the question of using AI stuff in a PBP or CYOA, which most people only watch if they're participating in, but that might just be because it hasn't come up yet.

If there's a GM who wants to do that maybe they should just talk to a mod, we'd give it a try, and see if it's a problem or not. I can't imagine there's a big list of GMs just clamoring to do that.

Raenir Salazar
Nov 5, 2010

College Slice

Leperflesh posted:

TG hosts both discussion threads (in TG proper) and game threads (in TGR). Typically you post a recruit thread in TGR, and list the game as recruiting in the recruitment thread, unless you already have players set up. Or you may just announce a game thread, like for a CYOA. Then interested people join up and you run your game in TGR.

I think if you explicitly said in your recruitment thread that you'll be using AI assets in your game, people would either not join or not care, and then you could just run your game. But also I'm not sure why AI assets would be a highlight or feature, vs. the actual theme of the game and whatever is going on. It might also be fine to just not talk too much about what assets you're using, or poll players who have joined via PM to see if they object to the use of AI-generated character portraits or whatever.

Ultimately I believe what triggered the big blowup and demand for rules, is people dropping AI stuff into discussion threads in TG, and that's what we've focused on. I don't recall anyone in the feedback thread raising the question of using AI stuff in a PBP or CYOA, which most people only watch if they're participating in, but that might just be because it hasn't come up yet.

If there's a GM who wants to do that maybe they should just talk to a mod, we'd give it a try, and see if it's a problem or not. I can't imagine there's a big list of GMs just clamoring to do that.

So, I think this crosses over into something I was trying to point out: usually, for the purpose of transparency and ethics, people would like to be informed about these things. That's why I suggested that people should be allowed to post whether something is AI, because people have a presumed right to be informed, much like warning labels on food.

As an example, there's a 3rd party supplement for D&D 5e which I won't name, but they claimed in their Patreon update that the latest supplement has lovingly handcrafted art. I went and bought it, and the majority of the art was obviously AI generated.

Nowhere on the product page or their Patreon post was this mentioned. Now, I don't mind hypothetically paying money for this supplement even if it has AI, but I sure as hell am annoyed that they weren't upfront. I sent them a message but got no response.

It's about transparency, and I feel like this should be obvious; it isn't about it being a selling point, but it is a valid thing to list as a disclaimer.

Humbug Scoolbus
Apr 25, 2008

The scarlet letter was her passport into regions where other women dared not tread. Shame, Despair, Solitude! These had been her teachers, stern and wild ones, and they had made her strong, but taught her much amiss.
Clapping Larry
I used Midjourney a ton for not just character pictures for my Gotham 75 game but also for newspaper pictures





and ambiance pictures







Credit for some of the front-page stories goes to the New York Times Time Machine archive; they were edited to match up with Gotham.

Leperflesh
May 17, 2007

Raenir Salazar posted:

So, I think this crosses over into something I was trying to point out: usually, for the purpose of transparency and ethics, people would like to be informed about these things. That's why I suggested that people should be allowed to post whether something is AI, because people have a presumed right to be informed, much like warning labels on food.

As an example, there's a 3rd party supplement for D&D 5e which I won't name, but they claimed in their Patreon update that the latest supplement has lovingly handcrafted art. I went and bought it, and the majority of the art was obviously AI generated.

Nowhere on the product page or their Patreon post was this mentioned. Now, I don't mind hypothetically paying money for this supplement even if it has AI, but I sure as hell am annoyed that they weren't upfront. I sent them a message but got no response.

It's about transparency, and I feel like this should be obvious; it isn't about it being a selling point, but it is a valid thing to list as a disclaimer.

I don't have a great answer for you, because "source your art" isn't a hard rule, it's a suggestion, and it comes from people some of whom genuinely just think that's a good idea to do, and maybe some who presumably want to use it as a rule to expose anyone sneakily using AI art. In the latter case I don't think that works or is practical, and I don't think it's police-able either, because "I made this without AI" / "nuh-uh, you clearly used AI" is not a fight I want to try to parse, especially as AI output gets better, as most of us expect it to.

I don't like some sort of "don't ask don't tell" policy either, seems bad

I think it would be better to work from actual cases rather than try to theorycraft the cases like this. If you're not planning to run games yourself, then let's leave it till someone who actually is planning to run games and use AI art has an issue and mods can discuss directly with them.

Humbug Scoolbus
Apr 25, 2008

The scarlet letter was her passport into regions where other women dared not tread. Shame, Despair, Solitude! These had been her teachers, stern and wild ones, and they had made her strong, but taught her much amiss.
Clapping Larry
For my Gotham ambience prompt, I used "1975 Gotham City skyline, Gothic Architecture, night, raining, detailed, 8k, 4k, intricate, 4k, 8k". The sewer one came from Nightcafe and I did not attach any artist's style. The pictures were things like "Uma Thurman in a trenchcoat, 1975, photograph, black and white, photorealistic, portrait --ar 3:4" and "Bicentennial Celebration, Times Square, 1976, kodachrome, photograph".

Raenir Salazar
Nov 5, 2010

College Slice
To be clear, I'm mainly just providing additional context and an explanation for OP's question: a DM will probably, reasonably, want to be upfront about their game containing AI assets so potential players who might have an issue with it can make an informed decision not to join that game.

Humbug Scoolbus
Apr 25, 2008

The scarlet letter was her passport into regions where other women dared not tread. Shame, Despair, Solitude! These had been her teachers, stern and wild ones, and they had made her strong, but taught her much amiss.
Clapping Larry
That's a reason not to join a game?

Mirage
Oct 27, 2000

All is for the best, in this, the best of all possible worlds

AI or not, please please tell me this is the Riddler's hideout.

Humbug Scoolbus
Apr 25, 2008

The scarlet letter was her passport into regions where other women dared not tread. Shame, Despair, Solitude! These had been her teachers, stern and wild ones, and they had made her strong, but taught her much amiss.
Clapping Larry

Mirage posted:

AI or not, please please tell me this is the Riddler's hideout.

Oh, it is. It took a while for the players to notice the photoshopped question mark. I Photoshop most of my stuff. For my old character pictures I would grab a ton of images off the internet and cut and paste them together. Collage is a technique that's been around since glue and printed imagery, and the artists who did it, like Baldassari and Hamilton, never credited the source material they cut and pasted together. I just did it digitally.

Humbug Scoolbus fucked around with this message at 20:04 on Jun 14, 2023

Megazver
Jan 13, 2006
It's pretty dope!

Leperflesh
May 17, 2007

Humbug Scoolbus posted:

That's a reason not to join a game?

For some people it very clearly is, yes.

Zurai
Feb 13, 2012


Wait -- I haven't even voted in this game yet!

Humbug Scoolbus posted:

Collage is a technique that's been around since glue and printed imagery, and the artists who did it, like Baldassari and Hamilton, never credited the source material they cut and pasted together. I just did it digitally.

There's also a poetry technique called found poetry which is basically the literary equivalent of a collage.

Humbug Scoolbus
Apr 25, 2008

The scarlet letter was her passport into regions where other women dared not tread. Shame, Despair, Solitude! These had been her teachers, stern and wild ones, and they had made her strong, but taught her much amiss.
Clapping Larry

Megazver posted:

It's pretty dope!

Thanks!

Here's an example, as I'm building a new character picture for a friend's superhero game. She wanted to play a Guyver. Okay. First I fed the prompt into Midjourney.

The prompt was ' female Guyver Unit,full face helmet, jade green,detailed, 8k,4k, intricate'



As you can see it really did not do 'full face helmet' well...

So I generated some helmets.
prompt 'a head shot of a Guyver Unit,jade green,detailed, 8k,4k, intricate'



This first image was the best out of the ones I got...



So on to Photoshop!

Chosen head...


Layer Two, helmet flipped resized and cropped...



Layer Three, the weirdass eyeball, snagged from a screenshot of the old Guyver movie, resized, color corrected, and puppet-warped so it matched the angle of the helmet...



Layer Four, the cheek vents, taken from a picture of a Guyver miniature on sale from a very sketchy Japanese online anime shop. Resized, warped, and adjusted to match the cheek contour...



Layer Five, The browpiece which I grabbed off a free gaming stock art/asset site because it looked cool. It also had to be resized, angled, color matched yadda yadda yadda...



A few tricks with layers and filters and...

Meet Guyver -IX

Megazver
Jan 13, 2006
These look dope as well! I've been meaning to sit down and actually do a course of learning image manipulation to help me with this stuff, but I keep procrastinating.

Also, I got curious, so I gave this a go as well.



"a portrait of female guyver unit, face hidden by the helmet, jade green, detailed --ar 2:3"

I'm of the opinion that the "4k, 8k" copypasta in prompts is magic thinking, in MJ at least, and I find that if you're trying to generate something with headwear or helmets or elaborate hair it helps to set a 2:3 aspect ratio.

Humbug Scoolbus
Apr 25, 2008

The scarlet letter was her passport into regions where other women dared not tread. Shame, Despair, Solitude! These had been her teachers, stern and wild ones, and they had made her strong, but taught her much amiss.
Clapping Larry

Megazver posted:

These look dope as well! I've been meaning to sit down and actually do a course of learning image manipulation to help me with this stuff, but I keep procrastinating.

Also, I got curious, so I gave this a go as well.



"a portrait of female guyver unit, face hidden by the helmet, jade green, detailed --ar 2:3"

I'm of the opinion that the "4k, 8k" copypasta in prompts is magic thinking, in MJ at least, and I find that if you're trying to generate something with headwear or helmets or elaborate hair it helps to set a 2:3 aspect ratio.

That last set was after an hour of prompting. I knew I could make a faceplate (in the way I did), but I wanted the head angle to be right and the proportions to work. I don't claim to be an artist, but I do understand composition and layout.

If you want to do any kind of image manipulation, use a program that has layer support and duplication. Being able to layer poo poo is key so you don't gently caress up your master and can step back. If you look at the layers sections of those screenshots you'll see some that have blueish thumbnails; those are ones I use for testing masking.

Humbug Scoolbus fucked around with this message at 21:28 on Jun 14, 2023

Megaman's Jockstrap
Jul 16, 2000

What a horrible thread to have a post.
Stable Diffusion

This is a huge topic and you could literally write a book about it. I won't. We'll try to make this short(ish) and sweet.

At first glance, Stable Diffusion is not as good as MidJourney. Until SDXL (an upcoming model), it had a bug that meant everything in it was literally trained wrong, so its images will generally look more washed out and lower-contrast than Midjourney's (this is somewhat fixable). Its base models have less cohesion and more artifacting than Midjourney. Given the same prompt, it will generally produce a weaker and less interesting image.

Well, why use it then? The answer is easy: it's very customizable and you can have a lot of control over it. Forgive the facile analogy, but if Midjourney is Windows then Stable Diffusion is Unix/Linux - the analogy even holds a little because Stable Diffusion is open source and Midjourney is closed and proprietary. The basic expectation here is that you are going to take the time to learn the ins and outs and dig under the hood of Stable Diffusion. If not, stop right now. Use Midjourney.

We'll go over four things Stable Diffusion has going for it that Midjourney doesn't offer:

1) You can run it on your local box

Yes, if your local box has an NVIDIA graphics card made in the last 10 years it can run Stable Diffusion. Obviously the beefier and faster the card the better this works, but I was getting perfectly acceptable (if slow) results on a decade-old i5 with a GTX 970. You can also run it on AMD stuff but I don't know how that works at all. I know it's a bit trickier and requires some special config work but people do it.

The de-facto interface for it is a web front-end called Automatic1111. There's a ton of tutorials out there on how to install Automatic1111 and get the SD models so I'll leave that to you and Google.

Because you host it, you don't have to pay credits or set up an account or anything. You have control over it. You can just generate whatever images you want. It's FREEEEE (unless you're an insane person and buy a new graphics card for SD! don't do that!)

2) Customized models

Stable Diffusion released their models completely for free. Open source and all that. Because it's open source, about 4 minutes later somebody made a way to continue to train models. We're not going to get into how models work but just know that it's entirely possible to extend these models. For example, if a model doesn't know what a Porsche is, you could give it a bunch of pictures of a Porsche and say "that's a Porsche" and then it "knows" and can gen images of that. So lots of people have been extending the two biggest Stable Diffusion models (referred to by their version numbers, 1.5 and 2.1 - but mostly 1.5) with new stuff, or refining old stuff.

Because these models use Stable Diffusion as a base (so that they can continue to "understand" styles and concepts that the creators didn't specifically train them on) they're usually a decent size, 6 gigs or more. But they CAN be highly useful. I say CAN because some of them are very good, but some are WAY overtrained. If you use only a few images for a concept, or a bunch of very similar images, the model will not have any variance and keep generating things that look the same. For example, if I made a model and the only pictures of men I uploaded were of 1973's Mr. Universe Arnold Schwarzenegger, then every time I asked for a picture of a man I would get some variation of Arnie (probably flexing). Well, people do this all the time. In fact I'm sorry to say that most people just train them on big tiddy anime waifus. No, I am not joking. The model training community are a bunch of terminal horndogs.

At the time of writing, Civitai.com hosts Stable Diffusion models. Should you choose to visit this site: NSFW NSFW NSFW the front page will instantly poison your mind and make you hate AI. Push past it. Search for RPG to find some decent stuff (and a loving ton of samey anime girls). The "RPG_v4" model (soon to be v5) is pretty good and great for generating portraits and weapons for your games.
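(If you end up scripting instead of using Automatic1111, here's a hedged sketch of loading one of these downloaded checkpoints with recent versions of the diffusers Python library; the filename and prompt are placeholders.)

code:

# Sketch: loading a community checkpoint (e.g. downloaded from Civitai) in diffusers.
from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_single_file(
    "RPG_v4.safetensors",  # whatever checkpoint file you downloaded
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="portrait of a grizzled dwarven alchemist, oil painting, rim lighting",
    negative_prompt="blurry, deformed",
    num_inference_steps=30,
).images[0]
image.save("dwarf_portrait.png")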

3) LORAs

So let's say you have a model that you really like except for one small problem: it doesn't know what a BattleTech Marauder 2C looks like. Criminal! But who wants to train a model and generate gigs of data just to add a Marauder? Well you don't have to, luckily.

Think of a LoRA as a post-it note that you stick to a model, containing exactly one style or concept that you want it to use. It requires far less computing power than adding to a model, and less work too. It's entirely feasible for you (yes, you!) to train a LoRA; the number of images needed is shockingly small (5-7 gets you decent results, more is better of course). You can use these to do things that would be nearly impossible for Midjourney, such as outputting art in the style of RIFTS. Combining custom models and LoRA styles gets you really powerful results that are simply not obtainable with other methods.


A picture from a little known RIFTS sourcebook from 1992, RIFTS: Robot Boyz

Remember earlier when I said you could mitigate some aspects of Stable Diffusion's bad training bug? Well, that fix is a LoRA: https://civitai.com/models/13941/epinoiseoffset
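(Same caveat as before: if you're scripting rather than using a webui, stacking a LoRA on a base model looks roughly like this in diffusers. The LoRA filename and trigger word here are made up for the example; real LoRAs document their trigger word on their download page.)

code:

# Sketch: applying a LoRA ("post-it note") on top of a base model in diffusers.
from diffusers import StableDiffusionPipeline
import torch

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
).to("cuda")

pipe.load_lora_weights("marauder_2c_lora.safetensors")  # the small add-on file

image = pipe(
    prompt="a marauder_2c battlemech standing in a ruined city, concept art",
    num_inference_steps=30,
).images[0]
image.save("marauder.png")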

4) Controlnet

The best way to think of Controlnet is as a way to apply traits from one image to another, like telling an artist "I want this person to be doing this pose" or whatever. So if you like the general composition of an image, or the pose a person is holding, but you want other things to change completely, that's Controlnet. If you want to apply some aspect of an image to another image, that's Controlnet.

You might have noticed earlier that people are still using SD 1.5 even though 2.1 is out. Well, that's because Controlnet only works with 1.5. That fact alone has kept it going.

When it comes to RPG applications it's really good for getting the facial expressions you want. Here's a rakish rogue with the prompt wanting him smiling:


That's just not good enough! I want a real big smile and a particular pose, let's go ahead and grab this picture that's pretty close to what I want and put it into Controlnet, then tell Controlnet to apply that facial expression to my guy:



It doesn't matter that it's a cartoon. If it's got a face, Controlnet can work with it. Here's the final output:


So we didn't get the Dreamworks Smirk (and there's a small artifact and some small teeth issues that I just didn't bother to fix) but he does look a lot happier! Also note that his head position and pose is exactly like the cartoon picture (although his eyes aren't looking the same way). And if my player cared enough we could try another picture or whatever, but this is good enough as an example I did in 5 minutes.

Here's another 5 minute example. I wanted to make a flower sword where the blade is coming out of petals near the hilt. A quick sketch in GIMP (not even using a tablet) and sending it to Controlnet on the RPG_V4 model with a very basic "a photograph of a flower sword, highly detailed blade" prompt goes from this:

to this:


Again, it's in the exact orientation I wanted, proportions what I asked for, no guessing for a magic prompt. It's not perfect but for 5 minutes it's a great player handout.

It also does full body poses, fingers and hands, architecture, edge detection, A LOT. Controlnet is magic. Controlnet is life. Covering what Controlnet can do is waaaaay beyond the scope of this post so instead here's a mega big tutorial on it: https://stable-diffusion-art.com/controlnet/
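If you want to see the moving parts without Automatic1111, here's a hedged sketch of the pose-transfer use case via diffusers plus the controlnet_aux helper package. The reference image filename is a placeholder, and the models named are the commonly used community ones rather than anything specific to this post.

code:

# Sketch: ControlNet (OpenPose conditioning) in diffusers - transfer a pose from
# a reference picture onto whatever the prompt describes.
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image
from controlnet_aux import OpenposeDetector
import torch

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# Extract a stick-figure pose map from any reference image (cartoon or photo).
openpose = OpenposeDetector.from_pretrained("lllyasviel/ControlNet")
pose_map = openpose(load_image("reference_pose.png"))

image = pipe(
    prompt="a rakish rogue grinning broadly, character portrait, detailed",
    image=pose_map,  # the conditioning image
    num_inference_steps=30,
).images[0]
image.save("rogue_smiling.png")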

Final Thoughts

Stable Diffusion is huge and I've barely scratched the surface here (I did not mention image-to-image, interrogating, inpainting, Krita integration, hypernetworks, Dreambooth, etc), but suffice it to say you're not locked into giving money to Midjourney or trying to find the "magic phrase" if you don't want to. You can host and use your own image diffusion system and get a really cool way to make player portraits, player handouts, maps, and anything else you could want. A very useful tool for upping the professionalism of your game.

Megaman's Jockstrap fucked around with this message at 17:59 on Jun 15, 2023

BrainDance
May 8, 2007

Disco all night long!

Here's a quick thing I wrote up about running local LLMs for the next OP.

Open-source local LLMs are currently going through their Stable Diffusion moment. Before March, open-source LLMs were much weaker than GPT-3 and mostly would not run on consumer hardware. We were limited to running GPT-Neo or, at best, GPT-J slowly. Training custom models on them was slow, hard, and poorly documented.
Then in March, two things happened at about the same time. Meta released (it "leaked") their LLaMA models, trained differently from previous open-source models, with more training data to compensate for a lower parameter count, so a 60B parameter model performs about as well as a 120B parameter one. Then the hardware requirements for running these larger models were drastically reduced through "quantization", shaving off some bits and leaving the model a fraction of its original size. A group at Stanford then trained a model on top of LLaMA in basically the same way as ChatGPT, making the first actual open-source equivalent of ChatGPT.
Soon after LLaMA was released, methods appeared for running LLaMA and other models at an actually decent speed on a CPU.
Just like with Stable Diffusion, LoRA (Low-Rank Adaptation) finetuning allowed LLMs to be finetuned with much weaker hardware than before; people could replicate exactly what Stanford had done, on their own, with a gaming GPU.

Now we're at the point where we have new models that outperform the last by a huge jump almost every week. LLaMA was released in four sizes: 7B, 13B, 33B, and 65B. The 65B models are a thing most people still can't run, but the 33B models are, in many tasks, almost indistinguishable from GPT-3.5.

What do I need to run these models?
With 4-bit quantization, the approximate requirements are:
- 7B models: 6GB of VRAM (GPU) or 3.9GB of RAM (CPU)
- 13B models: 10GB VRAM or 7.8GB RAM
- 30B models: 20GB VRAM or 19.5GB RAM
- 65B models: 40GB VRAM or 38.5GB RAM
The low RAM requirements don't mean you can realistically run a 30B model on any computer with enough RAM. You technically can, but it will be slow. Still, running 7B and 13B models on modern CPUs is probably faster than you think it's going to be.
There are other kinds of quantization which used to make more sense when it was more difficult to use 4-bit models but that’s not all that relevant anymore.

How do I run these models?
There are so many frontends at this point but the two big names are oobabooga’s text-generation-webui and llama.cpp

Oobabooga's text-generation-webui - The LLM equivalent of Automatic1111's webui for Stable Diffusion. Download the one-click installer and the rest is pretty obvious. It lets you download models from within it by just pasting in the huggingface user and model name, has extensions, has built-in support for finetuning, and works with CPUs and GPUs. It now supports 4-bit models by default. The one thing that does require a little more setup is using LoRAs with 4-bit models. For that you need the monkey patch, and to start the webui with the --monkey-patch flag (stick it after CMD_FLAGS = in webui.py; this recently changed, so some documentation tells you otherwise). Instructions are here.

Llama.cpp - Started as a way to run 4-bit models on MacBooks (which works surprisingly well) and is now basically the forefront of running LLMs on your CPU. It's getting a lot of development; currently the big thing is using the CPU but offloading what you can to the GPU for a big speedup, sometimes outperforming GPU-only setups.
I don’t use it though, because I spent way too much on getting 24gb of vram
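For anyone who does want to go the llama.cpp route from a script, here's a rough sketch using the llama-cpp-python bindings; the model filename, layer count, and prompt are placeholders, so adjust to whatever quantized model you actually downloaded.

code:

# Sketch: running a quantized GGML model locally via llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="wizard-vicuna-13b-uncensored.ggmlv3.q4_0.bin",
    n_ctx=2048,       # context window in tokens
    n_gpu_layers=20,  # 0 = pure CPU; raise this if you have spare VRAM
)

output = llm(
    "### Instruction:\nDescribe a ruined dwarven forge-temple for a D&D encounter.\n\n### Response:\n",
    max_tokens=256,
    temperature=0.8,
    stop=["### Instruction:"],
)
print(output["choices"][0]["text"])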

Which models do I use?
There are two main quantized model formats right now. Things are a little chaotic so who knows how long this will stay true, and one of the formats has even gone through a big update making all previous models obsolete (but just needing conversion) so things can change. Generally though:

GPTQ Models – For running on a GPU
GGML Models – For running on a CPU
Generally you’re going to want to have the original LLaMA models to apply LoRAs to.
Otherwise, almost all models get quantized by one guy right now, TheBloke https://huggingface.co/TheBloke
Right now, good general models are the Vicuna models, Wizard Vicuna uncensored, and for larger models (30B) Guanaco, though I don’t think an uncensored version of this exists yet. Even censored models here though are usually a lot less censored than ChatGPT. You can find all these models in different parameter sizes on TheBloke’s huggingface.
There are a lot of models that have a more niche purpose though. Like Samantha, a model not trained to be a generally helpful model but instead trained to believe she is actually sentient.

How do I finetune models?
This is really where the local models become incredibly useful. It’s not easy, but it’s a lot easier than it was a few months ago. The most flexible and powerful way to finetune a larger model is to finetune a LoRA in the oobabooga webui. This actually has good documentation here
The trickiest thing is formatting all of your training data. Most people are using the alpaca standard right now, where there is a field for an instruction and then a field for an output. The AI learns that when it gets an instruction, it's supposed to generate an output like the ones in its training data.
The other, easier way is to just give it a kind of trigger word. You have examples of the kind of data you want it to output, all following a word you made up, so it learns "when I see this word, complete with this kind of text".
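To make the alpaca format concrete, here's a tiny made-up sample of what that training data file looks like; the rows themselves are invented for illustration, only the field names (instruction/input/output) are the standard ones.

code:

# Sketch: writing a couple of alpaca-style instruction/output pairs to a JSON file.
import json

training_data = [
    {
        "instruction": "Describe the city of Forgehaven for a traveling player character.",
        "input": "",
        "output": "Forgehaven squats in the shadow of the Ashpeaks, its streets black with smelter soot...",
    },
    {
        "instruction": "A player asks the kobold chieftain why his warren raids the trade road.",
        "input": "",
        "output": "The chieftain bares his teeth. 'The road was ours before your wagons came...'",
    },
]

with open("campaign_finetune.json", "w") as f:
    json.dump(training_data, f, indent=2)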

This is an area, though, that gets incredibly complicated; there are too many different ways to do it and too many steps to write down here. This, though, is probably where open-source models become most relevant to traditional games: models trained to be aware of specific lore, or to respond to questions in a way that fits that world, or who knows what else. (I've had an idea for the longest time of training a model on questions and answers from D&D game masters, just as an experiment to see how much it learns the rules and how much it just hallucinates.)

Another option, the hard way of finetuning a whole model (which is probably going to require renting an A100 somewhere), is mallorbc's finetuning repo, which started before LLaMA as a method for finetuning GPT-J. This is where I started out, before the oobabooga webui existed.

There is a lot more: ways to give an AI a massive memory backed by a larger database, experiments in much larger or even infinite context sizes popping up, people trying out new formats and new ways to quantize models. But this post can only be so long.

KwegiboHB
Feb 2, 2004

nonconformist art brut
Negative prompt: amenable, compliant, docile, law-abiding, lawful, legal, legitimate, obedient, orderly, submissive, tractable
Steps: 32, Sampler: DPM++ 2M Karras, CFG scale: 11, Seed: 520244594, Size: 512x512, Model hash: 99fd5c4b6f, Model: seekArtMEGA_mega20

Megaman's Jockstrap posted:

Stable Diffusion

Amazing writeup! Thank you for this!

I just want to add there have been advances and an NVIDIA card isn't strictly required anymore. There is AMD, Intel, CPU, hell even iPhone support now. Yeah you can run this on your phone.
Here is a list of quick installers, no messing with git or PATH required.

- Automatic1111 - NVIDIA GPU - By far the most popular webui with an incredible array of options and extensions. https://github.com/EmpireMediaScience/A1111-Web-UI-Installer
- InvokeAI - AMD or Intel GPU - This uses DirectML instead of CUDA. https://invoke-ai.github.io/InvokeAI/installation/010_INSTALL_AUTOMATED/
- NMKD - NVIDIA GPU - Executable windows GUI not webui. https://nmkd.itch.io/t2i-gui
- ComfyUI - NVIDIA GPU - Node based with downloadable premade workflows. https://github.com/comfyanonymous/ComfyUI/releases
- Stable Horde - NO GPU - Crowdsourced donated compute free for those without other means. https://stablehorde.net/ links to a client interface with no installation required https://dbzer0.itch.io/lucid-creations
- Mac and iPhones - I don't know anything about Mac, here's a link anyways https://apps.apple.com/us/app/draw-things-ai-generation/id6444050820
- Runpod - 'The Cloud' - Just means someone else's computer. Rent someone else's GPU for money. Method of last resort. https://blog.runpod.io/stable-diffusion-ui-on-runpod/
- OpenVino - CPU mode - This takes forever but is still doable on literally a toaster, hell yeah stable toast. https://github.com/bes-dev/stable_diffusion.openvino

Leperflesh
May 17, 2007

I'm curious (coming from a cloud software background), is everyone running "local" models only on their home machines, or have people played around with running on cloud infra? Like there are cheap and even free infra options out there.
https://cloud.google.com/free
https://azure.microsoft.com/en-us/free/
https://aws.amazon.com/free/
https://docs.oracle.com/en-us/iaas/Content/FreeTier/freetier_topic-Always_Free_Resources.htm
These are the four big guys as far as I know and they all have either a free trial or some low level of always-free compute resources available.

Megaman's Jockstrap
Jul 16, 2000

What a horrible thread to have a post.
I have never tried SD with cloud hosting but it's 100% possible. It's literally just python scripts and compiled libraries, a big math matrix that gets sent to your video card (or CPU if you're unlucky) for crunching, and a web front end that runs it all. For me, though, I just don't like dealing with the maintenance overhead, plus as a former programmer I want ~T O T A L C O N T R O L~

KwegiboHB is clearly a fellow SD guy but I just don't share the enthusiasm for running it on substandard stuff. I mean, if you have no choice, sure. It genuinely sucks to wait 50 seconds between image gens, though. Going from 30 seconds per gen to an image a second completely changed how I used it, for the better.

I updated my SD post to include one more Controlnet example, to show that you can take your own crummy art and turn it into something acceptable (or even great, if you spent a ton of time on it).

Ruffian Price
Sep 17, 2016

Humbug Scoolbus posted:

full face helmet
In most cases, "don't include the pink elephant" prompts are more likely to add a pink elephant to the picture. You might want to try "face, skin" in the negative prompt, which I'm told is done in MJ via the --no parameter.

Megazver
Jan 13, 2006
MJ has gotten kinda weird about --no in v5, I don't use it anymore. The best bet is usually restating what you want in positive terms that don't mention the thing you don't want to see.

Humbug Scoolbus
Apr 25, 2008

The scarlet letter was her passport into regions where other women dared not tread. Shame, Despair, Solitude! These had been her teachers, stern and wild ones, and they had made her strong, but taught her much amiss.
Clapping Larry

Ruffian Price posted:

In most cases, "don't include the pink elephant" prompts are more likely to add a pink elephant to the picture. You might want to try "face, skin" in the negative prompt, which I'm told is done in MJ via the --no parameter.

I did try that, but as I said, I kept getting poor angles and proportions, so it was easier just generating a helmet separately and combining them. I had to add the cyclops eye and brow pieces I wanted anyway.

The main point of that post was that I was using MJ (which is a digital collage of image pieces gathered from all over the internet) as part of a Photoshop digital collage.

Doctor Zero
Sep 21, 2002

Would you like a jelly baby?
It's been in my pocket through 4 regenerations,
but it's still good.

Raenir Salazar posted:

I think you posted this in the wrong thread; this is the current AI chat thread, the other thread is the feedback thread.

Yep I am a dope. Sorry.

Speaking of Gotham 75, I went on a bender trying to get pictures of the DC characters doing questionable stuff at studio 54:

I have a butt load more if anyone’s interested.










feverish and oversexed
Mar 9, 2007

I LOVE the galley!
the creator of ironsworn is also against AI and disallows it on their discord completely lmao (I completely expect them to chime in here tbh)

My goodness this hardcore gatekeeping is rough.

Hey if anyone is using AI to be their "game master" DM me, I'm interested in having a space to talk about it without people jumping down my throat about the ethics because I don't really give a gently caress about the ethics

Doctor Zero
Sep 21, 2002

Would you like a jelly baby?
It's been in my pocket through 4 regenerations,
but it's still good.

This thread is safe harbor. I’ve tried it… what do you want to know?

feverish and oversexed
Mar 9, 2007

I LOVE the galley!
I was just posting my experiences on their Discord, trying to spark some conversation about the way I'm using ChatGPT right now to run my Ironsworn games. They have a blanket ban on AI talk on the Discord, so I left it and tried the subreddit, and the creator announced he's against AI in general (boo)

Anyways, some people DID respond with how they are using similar systems to play.
I personally am having good traction with setting up a game like this:


quote:

Hello Chatgpt. Today I want you to be my ironsworn DM.

I will provide you with contextual information, and you will describe next what happens.

Eamon: Background: Once a blacksmith in the small village of Forgehaven, now a wanderer seeking justice and rebuilding after a devastating raid. Ethnicity: Durrain (a blend of Celtic and Norse influences).

"Beginning scene: a town named forgehaven lies in ruin the ironlands, having been hit by raiders in the early morning.

Eamon crawls out of the wreckage, and starts looking around for other survivors"

Please describe this scene in the third person, describing the ruin that Eamon sees and providing some background context and narrative as a DM would for the game of Ironsworn

I've been running a lot of different tests on different websites, and I know other people are out there doing similar things, but the incredibly annoying blowback against any AI discussion touching people's precious games has been derailing conversations everywhere I've tried.

Yes I understand your issues with the ethics. Can I talk to other people about how to play my stupid solo rpg and make images for it now?
Since I bought all the books and poo poo for the system I'm not going to toss them out the window just because the creator doesn't agree with me. Just gimme a spot to entice others to talk about it drat.
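If it helps get that conversation going: here's roughly what the same setup looks like through the OpenAI API instead of the ChatGPT web page, using the openai Python library's 0.x-era chat interface. The system message carries the "be my Ironsworn GM" framing, the scene text is the same kind of thing quoted above, and the key is obviously a placeholder.

code:

# Sketch: the "ChatGPT as Ironsworn GM" setup via the OpenAI chat API.
import openai

openai.api_key = "sk-..."  # your API key

messages = [
    {"role": "system", "content": (
        "You are the GM for a solo game of Ironsworn. The player will describe "
        "scenes and moves; narrate outcomes in the third person, in the tone of "
        "the Ironlands."
    )},
    {"role": "user", "content": (
        "Eamon, a former blacksmith of Forgehaven, crawls out of the wreckage "
        "after the morning raid and looks for survivors. Describe the scene."
    )},
]

response = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
print(response["choices"][0]["message"]["content"])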


feedmyleg
Dec 25, 2004
I've tried it but found that its "memory" is too short for it to be much fun for long. The illusion is really compelling for a bit, but once you've hit it forgetting key details and hallucinating things that don't fit the narrative a few times, it feels less like playing a game and more like writing fiction. And constantly having to remind it of the "rules" and summarize play feels tedious and frustrating. There's a model that OpenAI is testing that ups the memory from 4,000 tokens to 16,000 tokens, which feels much better for these purposes, so maybe it'll be a better experience soon.
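One half-measure that helps a bit, if you're going through the API rather than the web page: keep a hand-written "story so far" summary plus only the last few exchanges, and drop older turns so each request fits the context window. This is just a rough sketch; the word-count check is a crude stand-in for real token counting, and all the names are made up.

code:

# Sketch: a rolling summary + trimmed history to fight the short "memory".
import openai

openai.api_key = "sk-..."

SYSTEM = {"role": "system", "content": "You are the GM for a solo Ironsworn game."}
summary = "Eamon, a blacksmith of ruined Forgehaven, seeks the raiders who burned it."
history = []  # recent user/assistant turns

def rough_size(messages):
    # Crude proxy for token count: total words across messages.
    return sum(len(m["content"].split()) for m in messages)

def ask(player_text, budget_words=2500):
    history.append({"role": "user", "content": player_text})
    # Drop the oldest turns until the request fits the (rough) budget.
    while rough_size(history) > budget_words and len(history) > 2:
        history.pop(0)
    messages = [SYSTEM, {"role": "system", "content": "Story so far: " + summary}] + history
    reply = openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
    text = reply["choices"][0]["message"]["content"]
    history.append({"role": "assistant", "content": text})
    return text

print(ask("Eamon searches the smithy for his father's hammer."))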
